feat: eval package scaffold + CLI skeleton + OCAS schemas #69

Closed
opened 2026-06-04 15:13:19 +00:00 by xiaoju · 0 comments
Owner

Phase 1a of eval framework (#34)

Create @united-workforce/eval package with:

  • Package scaffold: packages/eval/, tsconfig, biome, package.json (bin: uwf-eval)
  • CLI skeleton with commander: run, report, diff, list subcommands (stubs)
  • OCAS schema registration:
    • @uwf/eval-run — full eval execution record
    • @uwf/eval-judge-frontmatter — frontmatter compliance data
    • @uwf/eval-judge-upstream — upstream consumption data
    • @uwf/eval-judge-hallucination — hallucination detection data
    • @uwf/eval-judge-token-stats — token usage data
  • Type definitions: TaskManifest, JudgeInput, JudgeOutput<T>, EvalRun
  • task.yaml loader + validator
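
A minimal sketch of how the named types and the task.yaml validator might fit together. The issue only names `TaskManifest`, `JudgeInput`, `JudgeOutput<T>` and `EvalRun`; every field below (`id`, `description`, `judges`, `taskId`, `output`, `pass`, `data`, `results`) and the helper `validateTaskManifest` are assumptions made for illustration, not the package's actual shapes. The YAML parsing step is left out: the validator takes a value that some YAML library has already parsed.

```typescript
// Hypothetical shapes: the issue names these types but does not define their fields.
export interface TaskManifest {
  id: string;
  description: string;
  judges: string[]; // names of judges to run (assumed field)
}

export interface JudgeInput {
  taskId: string;
  output: string; // text produced by the run under evaluation (assumed field)
}

export interface JudgeOutput<T> {
  judge: string;
  pass: boolean;
  data: T; // judge-specific payload, e.g. token statistics
}

export interface EvalRun {
  task: TaskManifest;
  results: JudgeOutput<unknown>[];
}

export type ValidationResult =
  | { ok: true; manifest: TaskManifest }
  | { ok: false; errors: string[] };

// Validates an already-parsed task.yaml document and collects every problem found.
export function validateTaskManifest(raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, errors: ["manifest must be a mapping"] };
  }
  const obj = raw as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof obj.id !== "string" || obj.id.length === 0) {
    errors.push("id must be a non-empty string");
  }
  if (typeof obj.description !== "string") {
    errors.push("description must be a string");
  }
  if (!Array.isArray(obj.judges) || !obj.judges.every((j) => typeof j === "string")) {
    errors.push("judges must be a list of strings");
  }

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return {
    ok: true,
    manifest: {
      id: obj.id as string,
      description: obj.description as string,
      judges: obj.judges as string[],
    },
  };
}
```

Returning a tagged result instead of throwing lets a loader report all manifest problems in one pass, which may suit a CLI that lists many tasks.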

Acceptance Criteria

  • uwf-eval --help works
  • All schemas registered in OCAS
  • Types exported from package
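
For the first criterion, a CLI skeleton along these lines would be enough for `uwf-eval --help` to print usage. This is a sketch that assumes the `commander` package is installed; the function name `buildProgram`, the description strings, and the stub messages are hypothetical, and only the binary name and the four subcommand names come from the issue.

```typescript
import { Command } from "commander";

// Builds the uwf-eval program with stubbed subcommands (hypothetical helper name).
export function buildProgram(): Command {
  const program = new Command();
  program
    .name("uwf-eval")
    .description("Evaluation framework CLI (skeleton)");

  // Each subcommand is a stub that only reports that it is not implemented yet.
  const stubs: Array<[string, string]> = [
    ["run", "Execute an eval task"],
    ["report", "Show results of an eval run"],
    ["diff", "Compare two eval runs"],
    ["list", "List available eval tasks"],
  ];
  for (const [name, summary] of stubs) {
    program
      .command(name)
      .description(summary)
      .action(() => {
        console.log(`${name}: not implemented yet`);
      });
  }
  return program;
}
```

The bin entry point would then call `buildProgram().parseAsync(process.argv)`. Keeping construction in a function, separate from parsing, makes the command list checkable in tests without spawning a process.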

Ref: #34

— 小橘 🍊(NEKO Team)

Reference: shazhou/united-workforce#69