fix: frontmatter judge handles parsed object output #90

Merged
xiaoju merged 1 commit from fix/frontmatter-judge-object-output into main 2026-06-05 03:01:30 +00:00

What

Fix frontmatter-compliance judge scoring 0 on valid step outputs.

Why

The extract pipeline stores step output as a JSON object in CAS ({"$status": "done", ...}), but the judge only accepted raw markdown strings (---\n$status: done\n---). Every eval run scored 0 on frontmatter-compliance even when $status was correctly set.

Changes

  • frontmatter.ts — check for parsed object first (direct $status lookup), fall through to YAML extraction for raw strings
  • builtin-judges.test.ts — add 2 tests for object output (valid + missing $status)
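
The two-branch check described above could look roughly like the sketch below. This is not the code from `frontmatter.ts`: the names `judgeFrontmatter` and `extractFrontmatter`, the `JudgeResult` shape, and the 1/0 scoring are assumptions made for illustration, and a small line-based parser stands in for whatever YAML extraction the real judge uses.

```typescript
// Hypothetical sketch; names, result shape, and scoring are assumptions.
type JudgeResult = { score: number; reason: string };

// Pull flat `key: value` pairs out of a leading `---` ... `---` block.
// Stand-in for real YAML extraction; handles simple scalars only.
function extractFrontmatter(text: string): Record<string, string> | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(text);
  if (match === null) return null;
  const fields: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx > 0) {
      fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  }
  return fields;
}

function judgeFrontmatter(output: unknown): JudgeResult {
  // Branch 1: parsed object (the form stored in CAS), direct $status lookup.
  if (typeof output === "object" && output !== null && !Array.isArray(output)) {
    const status = (output as Record<string, unknown>)["$status"];
    if (typeof status === "string" && status.length > 0) {
      return { score: 1, reason: "object output has $status: " + status };
    }
    return { score: 0, reason: "object output is missing $status" };
  }

  // Branch 2: raw markdown string, fall through to frontmatter extraction.
  if (typeof output === "string") {
    const fields = extractFrontmatter(output);
    if (fields === null) {
      return { score: 0, reason: "no frontmatter block found" };
    }
    const status = fields["$status"];
    if (status) {
      return { score: 1, reason: "frontmatter has $status: " + status };
    }
    return { score: 0, reason: "frontmatter is missing $status" };
  }

  return { score: 0, reason: "unsupported output type" };
}
```

Checking the object case first means the string path only runs for raw string input, so an object can no longer score 0 just because it is not a markdown string.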

Test

11/11 builtin-judges tests pass, 804 total pass

xiaoju added 1 commit 2026-06-05 02:58:14 +00:00
fix: frontmatter judge handles parsed object output
CI / check (pull_request) Successful in 2m38s
a08775896f
The extract pipeline stores step output as a JSON object in CAS,
but the frontmatter judge only checked for raw markdown strings.
Now accepts both formats: parsed objects check $status directly,
raw strings go through YAML frontmatter extraction.

Fixes eval frontmatter-compliance scoring 0 on valid outputs.
xiaoju merged commit a0e139935e into main 2026-06-05 03:01:30 +00:00
xiaoju deleted branch fix/frontmatter-judge-object-output 2026-06-05 03:01:30 +00:00

Reference: shazhou/united-workforce#90