feat: eval package scaffold — CLI + schemas + types + task loader #85
What
New package @united-workforce/eval: the eval framework skeleton for #34.
Why
Need a separate package to evaluate uwf workflow quality with real agents. The examiner should not sit among the examinees: eval is independent from the engine.
Changes
- packages/eval/: new package with uwf-eval CLI binary
- src/cli.ts: commander-based CLI with run/report/diff/list subcommands (stubs)
- src/task/: TaskManifest type + parseTaskManifest() YAML parser/validator
- src/judge/: JudgeOutput<T> / JudgeInput types defining the judge contract
- src/storage/schemas.ts: 5 OCAS JSONSchema definitions
- src/storage/types.ts: EvalRunPayload, EvalRunConfig, EvalJudgeRecord

775 tests passing ✅
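For readers unfamiliar with the loader pattern, here is a minimal sketch of the validation half of a manifest parser. It is not the code in this PR: the field names (id, prompt, judges, weight) and the function name validateTaskManifest are hypothetical, since the description does not list what TaskManifest contains, and the input is assumed to be the already-parsed YAML value.

```typescript
// Hypothetical manifest shape; the real TaskManifest in this PR may differ.
interface TaskManifestSketch {
  id: string;
  prompt: string;
  judges: { name: string; weight: number }[];
}

// True for plain mappings, false for null, arrays, and primitives.
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Validates an already-parsed YAML value and returns a typed manifest.
// Throws TypeError on the first violation found.
function validateTaskManifest(raw: unknown): TaskManifestSketch {
  if (!isRecord(raw)) throw new TypeError("manifest must be a mapping");
  const { id, prompt, judges } = raw;
  if (typeof id !== "string" || id.length === 0) {
    throw new TypeError("id must be a non-empty string");
  }
  if (typeof prompt !== "string") {
    throw new TypeError("prompt must be a string");
  }
  if (!Array.isArray(judges) || judges.length === 0) {
    throw new TypeError("judges must be a non-empty list");
  }
  const parsed = judges.map((j: unknown, i: number) => {
    if (!isRecord(j)) throw new TypeError(`judges[${i}] must be a mapping`);
    const { name, weight } = j;
    if (typeof name !== "string") {
      throw new TypeError(`judges[${i}].name must be a string`);
    }
    if (typeof weight !== "number" || !Number.isFinite(weight) || weight < 0) {
      throw new TypeError(`judges[${i}].weight must be a non-negative number`);
    }
    return { name, weight };
  });
  return { id, prompt, judges: parsed };
}
```

Keeping validation separate from YAML parsing means the same checks can run on manifests built in tests without touching the file system.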
Refs #69, parent #34
— 小橘 🍊(NEKO Team)
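LGTM ✅ The eval framework scaffold is cleanly designed: the task loader validates strictly, the judge system is pluggable (builtin dispatch + task script), and the weighted score calculation in collect correctly handles weight=0 informational judges. OCAS storage and the @uwf/eval/*/latest index are in place. 19 tests cover the core paths.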
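To illustrate the weight=0 case the review mentions, here is a minimal sketch of a weighted score aggregation. It is an assumption about how such a step could work, not the collect implementation from this PR: the JudgeResult and Collected shapes, the [0, 1] score range, and the null result when no judge carries weight are all hypothetical.

```typescript
// Hypothetical result shape; the PR's JudgeOutput<T> may look different.
interface JudgeResult {
  name: string;
  score: number;  // assumed to be normalized to [0, 1]
  weight: number; // 0 marks an informational judge
}

interface Collected {
  weightedScore: number | null; // null when no judge carries weight
  informational: string[];      // names of weight=0 judges
}

// Weighted mean over judges with positive weight. Judges with weight 0
// are reported separately and never affect the score or the divisor.
function collect(results: JudgeResult[]): Collected {
  let weightSum = 0;
  let scoreSum = 0;
  const informational: string[] = [];
  for (const r of results) {
    if (r.weight < 0) {
      throw new RangeError(`negative weight for judge ${r.name}`);
    }
    if (r.weight === 0) {
      informational.push(r.name);
      continue;
    }
    weightSum += r.weight;
    scoreSum += r.weight * r.score;
  }
  // Guard the division: with only informational judges there is no score.
  const weightedScore = weightSum > 0 ? scoreSum / weightSum : null;
  return { weightedScore, informational };
}
```

Returning null rather than 0 or NaN when every judge is informational keeps "no score" distinguishable from "scored zero" in reports and diffs.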