chore: make solve-issue.yaml portable and add developer failed exit

- Remove hardcoded ~/repos/workflow paths from procedure text - Use .worktrees/ relative to repo root instead of global path - Add developer failed → $END exit for unrecoverable situations - Add worktree field to reviewer rejected variant - Fix test workflowPath to use import.meta.dirname Refs #506
feat(protocol): add step-level timing (startedAtMs / completedAtMs) (#489 )
2026-05-25 09:07:58 +00:00 · 2026-05-25 08:01:50 +00:00 · 2026-05-25 07:46:12 +00:00 · 2026-05-25 07:25:10 +00:00 · 2026-05-25 06:54:38 +00:00 · 2026-05-25 06:34:56 +00:00
24 changed files with 1472 additions and 136 deletions
@@ -10,9 +10,9 @@ roles:
    procedure: |
      On first run (no previous steps):
      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
-      2. Read CLAUDE.md (or equivalent project conventions file) to understand coding standards
+      2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
      3. Assess whether the issue has enough information to produce a test spec
-      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output status=insufficient_info and terminate
+      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios

      On subsequent runs (bounced back by tester with fix_spec):
@@ -21,17 +21,19 @@ roles:

      After producing the test spec:
      1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
-      2. Put the hash in frontmatter.plan (required when status=ready)
-    output: "Output a brief summary of the test spec. Frontmatter must include: status (ready or insufficient_info) and plan (CAS hash of the test spec, required when status=ready)."
+      2. Put the hash in frontmatter.plan (required when $status=ready)
+      3. Set repoPath to the absolute path of the repository root
+    output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
    frontmatter:
-      type: object
-      properties:
-        status:
-          type: string
-          enum: [ready, insufficient_info]
-        plan:
-          type: string
-      required: [status]
+      oneOf:
+        - properties:
+            $status: { const: "ready" }
+            plan: { type: string }
+            repoPath: { type: string }
+          required: [$status, plan, repoPath]
+        - properties:
+            $status: { const: "insufficient_info" }
+          required: [$status]
  developer:
    description: "TDD implementation per test spec"
    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
@@ -39,33 +41,41 @@ roles:
      - coding
    procedure: |
      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
+      The repo path and other details are provided in your task prompt.

      Before starting any work, set up an isolated worktree:
-      1. `cd ~/repos/workflow && git fetch origin` to get latest refs
-      2. First time (no existing branch):
-         - `git worktree add ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
-         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> && bun install`
-      3. If bounced back from reviewer or tester (branch already exists):
-         - The worktree should already exist at `~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
-         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
+      1. cd into the repo path provided in your task prompt
+      2. `git fetch origin` to get latest refs
+      3. First time (no existing branch):
+         - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
+         - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
+      4. If bounced back from reviewer or tester (branch already exists):
+         - cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
         - `git fetch origin && git rebase origin/main`
-      4. ALL subsequent work must happen inside the worktree directory.
+      5. ALL subsequent work must happen inside the worktree directory.

      Then implement TDD:
-      5. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
-      6. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
-      7. Write tests first based on the spec
-      8. Implement the code to make tests pass
-      9. Ensure `bun run build` passes with no errors
-      10. Run `bun test` to verify all tests pass
-    output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
+      6. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner's output in your task prompt)
+      7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
+      8. Write tests first based on the spec
+      9. Implement the code to make tests pass
+      10. Ensure `bun run build` passes with no errors
+      11. Run `bun test` to verify all tests pass
+
+      If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
+      or repeated attempts fail), set $status=failed with a reason.
+    output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
    frontmatter:
-      type: object
-      properties:
-        status:
-          type: string
-          enum: [done, failed]
-      required: [status]
+      oneOf:
+        - properties:
+            $status: { const: "done" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "failed" }
+            reason: { type: string }
+          required: [$status, reason]
  reviewer:
    description: "Code standards compliance check"
    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
@@ -73,7 +83,7 @@ roles:
      - code-review
      - static-analysis
    procedure: |
-      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+      The worktree path is provided in your task prompt. cd into it first.

      Before reviewing, verify the git branch:
      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
@@ -85,91 +95,104 @@ roles:
      4. `bunx biome check` — no lint violations
      5. TypeScript strict mode — no type errors

-      Soft checks (review against CLAUDE.md conventions):
-      - Functional-first: `function` + `type`, not `class` + `interface`
-      - No optional properties (`?:`) — use `T | null`
-      - Naming conventions (kebab-case files, PascalCase types, camelCase functions)
-      - Module boundary discipline (folder exports via index.ts)
-      - No `console.log` (use structured logger)
+      Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
+      - Naming conventions, module boundaries, code style
+      - No `console.log` in production code
      - No dynamic imports in production code

      Only review standards compliance. Do NOT test functionality.
      If rejecting, you MUST explain the specific reason in your output.
-    output: "Explain your decision with specific file/line references. Frontmatter must include: status (approved or rejected)."
+    output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
    frontmatter:
-      type: object
-      properties:
-        status:
-          type: string
-          enum: [approved, rejected]
-      required: [status]
+      oneOf:
+        - properties:
+            $status: { const: "approved" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "rejected" }
+            comments: { type: string }
+            worktree: { type: string }
+          required: [$status, comments, worktree]
  tester:
    description: "Functional correctness verification"
    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
    capabilities:
      - testing
    procedure: |
-      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+      The worktree path is provided in your task prompt. cd into it first.

      1. Run `bun test` for automated test verification
-      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
+      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner step in the thread history)
      3. Verify each scenario in the spec is covered and passing
      4. Determine outcome:
         - passed: all scenarios verified, tests pass
         - fix_code: tests fail or implementation doesn't match spec → send back to developer
         - fix_spec: the spec itself is wrong or incomplete → send back to planner
-    output: "Report test results per scenario. Frontmatter must include: status (passed, fix_code, or fix_spec)."
+    output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
    frontmatter:
-      type: object
-      properties:
-        status:
-          type: string
-          enum: [passed, fix_code, fix_spec]
-      required: [status]
+      oneOf:
+        - properties:
+            $status: { const: "passed" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "fix_code" }
+            report: { type: string }
+          required: [$status, report]
+        - properties:
+            $status: { const: "fix_spec" }
+            report: { type: string }
+          required: [$status, report]
  committer:
    description: "Commits and creates PR"
    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
    capabilities: []
    procedure: |
-      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+      The worktree path, branch name, and repo info are provided in your task prompt.
+      cd into the worktree first.

      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
      1. Stage all changes: `git add -A`
      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
      3. Push the branch: `git push -u origin <branch-name>`
         - If push hook fails: capture the error log in your output, mark hook_failed
-      4. On push success: create a PR via `tea pr create --repo uncaged/workflow --title "..." --description "..."`
-         - The `--repo` flag is required to work in worktree directories (fixes #474 "path segment [0] is empty" error)
-         - If working on a different repo, extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
-         - PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
-         - On tea failure: capture stderr/stdout, log the error clearly, include PR details (title, description, branch) for manual creation, and mark success=false
+      4. On push success: create a PR via `tea pr create --repo <owner/repo> --title "..." --description "..."`
+         - Extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
+         - PR description must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
+         - On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
      5. After PR creation, clean up the worktree:
-         - `cd ~/repos/workflow`
-         - `git worktree remove ~/repos/workflow-worktrees/fix/<issue-number>-<slug>`
-    output: "Include PR URL on success or error log on failure. Frontmatter must include: status (committed or hook_failed)."
+         - cd to the repo root (parent of .worktrees)
+         - `git worktree remove <worktree-path>`
+    output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
    frontmatter:
-      type: object
-      properties:
-        status:
-          type: string
-          enum: [committed, hook_failed]
-      required: [status]
+      oneOf:
+        - properties:
+            $status: { const: "committed" }
+            prUrl: { type: string }
+          required: [$status, prUrl]
+        - properties:
+            $status: { const: "hook_failed" }
+            error: { type: string }
+          required: [$status, error]
 graph:
  $START:
    _: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
  planner:
    insufficient_info: { role: "$END", prompt: "Insufficient information to proceed; end the workflow." }
-    ready: { role: "developer", prompt: "Implement the plan from the planner." }
+    ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}." }
  developer:
-    failed: { role: "$END", prompt: "Development failed; end the workflow." }
-    done: { role: "reviewer", prompt: "Send the implementation to the reviewer." }
+    done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance." }
+    failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
  reviewer:
-    rejected: { role: "developer", prompt: "Reviewer rejected the implementation; fix the issues." }
-    approved: { role: "tester", prompt: "Review passed; run tests on the implementation." }
+    rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}." }
+    approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}." }
  tester:
-    fix_code: { role: "developer", prompt: "Tests found code issues; return to developer." }
-    fix_spec: { role: "planner", prompt: "Tests found spec issues; return to planner." }
-    passed: { role: "committer", prompt: "Tests passed; commit and push the changes." }
+    fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit." }
+    fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec." }
+    passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}." }
  committer:
-    hook_failed: { role: "developer", prompt: "Push hook failed; return to developer to fix." }
-    committed: { role: "$END", prompt: "Commit succeeded; complete the workflow." }
+    hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit." }
+    committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
@@ -22,7 +22,7 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["_"]
        thesis:
          type: string
@@ -32,7 +32,7 @@ roles:
            type: string
        caveats:
          type: string
-      required: [status, thesis, keyPoints]
+      required: [$status, thesis, keyPoints]
 graph:
  $START:
    _: { role: "analyst", prompt: "Analyze the topic in the task and produce a structured summary with key points." }
@@ -21,11 +21,11 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["continue", "conceded"]
        argument:
          type: string
-      required: [status, argument]
+      required: [$status, argument]
  for:
    description: "Argues for the proposition"
    goal: |
@@ -46,11 +46,11 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["continue", "conceded"]
        argument:
          type: string
-      required: [status, argument]
+      required: [$status, argument]
 graph:
  $START:
    _: { role: "against", prompt: "Present your opening argument against the proposition." }
@@ -27,13 +27,13 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["_"]
        repoPath:
          type: string
        plan:
          type: string
-      required: [status, repoPath, plan]
+      required: [$status, repoPath, plan]
  developer:
    description: "Implements code changes"
    goal: "You are a developer agent. You implement code changes according to plans."
@@ -52,7 +52,7 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["_"]
        filesChanged:
          type: array
@@ -60,7 +60,7 @@ roles:
            type: string
        summary:
          type: string
-      required: [status, filesChanged, summary]
+      required: [$status, filesChanged, summary]
  reviewer:
    description: "Reviews code changes"
    goal: "You are a code reviewer. You review implementations for correctness and quality."
@@ -75,11 +75,11 @@ roles:
    frontmatter:
      type: object
      properties:
-        status:
+        $status:
          enum: ["approved", "rejected"]
        comments:
          type: string
-      required: [status, comments]
+      required: [$status, comments]
 graph:
  $START:
    _: { role: "planner", prompt: "Analyze the issue described in the task and produce a detailed implementation plan." }
@@ -13,8 +13,16 @@ import { parse } from "yaml";
 */

 describe("solve-issue workflow: tea pr create worktree fix", () => {
-  // Navigate up from packages/cli-workflow to repo root
-  const workflowPath = join(process.cwd(), "..", "..", ".workflows", "solve-issue.yaml");
+  // Navigate up from packages/cli-workflow/src/__tests__ to repo root
+  const workflowPath = join(
+    import.meta.dirname,
+    "..",
+    "..",
+    "..",
+    "..",
+    ".workflows",
+    "solve-issue.yaml",
+  );

  test("committer procedure should include --repo flag in tea pr create command", async () => {
    const yamlContent = await readFile(workflowPath, "utf-8");
@@ -81,17 +89,18 @@ describe("solve-issue workflow: tea pr create worktree fix", () => {
    expect(workflow.roles.committer?.frontmatter).toBeDefined();
  });

-  test("committer frontmatter schema should require status field", async () => {
+  test("committer frontmatter schema should be oneOf with $status discriminant", async () => {
    const yamlContent = await readFile(workflowPath, "utf-8");
    // Parse as any to access the raw YAML structure (frontmatter is inline JSON Schema in YAML)
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    const workflow = parse(yamlContent) as any;
-
    const frontmatter = workflow.roles.committer?.frontmatter;
    expect(frontmatter).toBeDefined();
-    expect(frontmatter?.type).toBe("object");
-    expect(frontmatter?.properties?.status).toBeDefined();
-    expect(frontmatter?.properties?.status?.enum).toContain("committed");
-    expect(frontmatter?.required).toContain("status");
+    expect(frontmatter?.oneOf).toBeDefined();
+    const committedVariant = frontmatter.oneOf.find(
+      (v: any) => v.properties?.["$status"]?.const === "committed",
+    );
+    expect(committedVariant).toBeDefined();
+    expect(committedVariant.required).toContain("$status");
  });
 });
@@ -144,6 +144,8 @@ describe("step read", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step with large quota
@@ -227,6 +229,8 @@ describe("step read", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step with limited quota (700 chars)
@@ -304,6 +308,8 @@ describe("step read", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step with minimal quota (1 char)
@@ -357,6 +363,8 @@ describe("step read", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step - should return metadata only (no error)
@@ -431,6 +439,8 @@ describe("step read", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step - should return metadata only (no error)
@@ -505,6 +515,8 @@ describe("step read", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    // Read step
@@ -0,0 +1,378 @@
+import { mkdir, mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { bootstrap, putSchema } from "@uncaged/json-cas";
+import { createFsStore } from "@uncaged/json-cas-fs";
+import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
+import { STEP_NODE_SCHEMA } from "@uncaged/workflow-protocol";
+import { afterEach, beforeEach, describe, expect, test } from "vitest";
+import { cmdStepList } from "../commands/step.js";
+import { cmdThreadRead } from "../commands/thread.js";
+import { registerUwfSchemas } from "../schemas.js";
+import { saveThreadsIndex } from "../store.js";
+
+// ── schemas ──────────────────────────────────────────────────────────────────
+
+const TURN_SCHEMA = {
+  title: "hermes-turn",
+  type: "object" as const,
+  required: ["index", "role", "content"],
+  properties: {
+    index: { type: "integer" as const },
+    role: { type: "string" as const },
+    content: { type: "string" as const },
+    toolCalls: {
+      anyOf: [
+        { type: "array" as const, items: { type: "object" as const } },
+        { type: "null" as const },
+      ],
+    },
+    reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
+  },
+  additionalProperties: false,
+};
+
+const DETAIL_SCHEMA = {
+  title: "hermes-detail",
+  type: "object" as const,
+  required: ["sessionId", "model", "duration", "turnCount", "turns"],
+  properties: {
+    sessionId: { type: "string" as const },
+    model: { type: "string" as const },
+    duration: { type: "integer" as const },
+    turnCount: { type: "integer" as const },
+    turns: {
+      type: "array" as const,
+      items: { type: "string" as const, format: "cas_ref" },
+    },
+  },
+  additionalProperties: false,
+};
+
+// ── helpers ──────────────────────────────────────────────────────────────────
+
+async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
+  await bootstrap(store);
+  const [turn, detail] = await Promise.all([
+    putSchema(store, TURN_SCHEMA),
+    putSchema(store, DETAIL_SCHEMA),
+  ]);
+  return { turn, detail };
+}
+
+// ── fixture ──────────────────────────────────────────────────────────────────
+
+let tmpDir: string;
+
+beforeEach(async () => {
+  tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-step-timing-test-"));
+});
+
+afterEach(async () => {
+  await rm(tmpDir, { recursive: true, force: true });
+});
+
+// ── 1. Protocol types (compile-time) ─────────────────────────────────────────
+
+describe("protocol types", () => {
+  test("StepRecord has startedAtMs and completedAtMs as required fields", () => {
+    // Type-level test: this block compiles only if fields exist and are number
+    const record: import("@uncaged/workflow-protocol").StepRecord = {
+      role: "test",
+      output: "hash1" as CasRef,
+      detail: "hash2" as CasRef,
+      agent: "uwf-test",
+      edgePrompt: "",
+      startedAtMs: 1000,
+      completedAtMs: 2000,
+    };
+    expect(record.startedAtMs).toBe(1000);
+    expect(record.completedAtMs).toBe(2000);
+  });
+
+  test("StepEntry has durationMs as required field", () => {
+    const entry: import("@uncaged/workflow-protocol").StepEntry = {
+      hash: "hash" as CasRef,
+      role: "test",
+      output: {},
+      detail: "hash2" as CasRef,
+      agent: "uwf-test",
+      timestamp: 123,
+      durationMs: 5000,
+    };
+    expect(entry.durationMs).toBe(5000);
+  });
+});
+
+// ── 2. JSON Schema ───────────────────────────────────────────────────────────
+
+describe("StepNode JSON schema", () => {
+  test("schema requires startedAtMs and completedAtMs", () => {
+    const required = STEP_NODE_SCHEMA.required as string[];
+    expect(required).toContain("startedAtMs");
+    expect(required).toContain("completedAtMs");
+  });
+
+  test("schema defines timing fields as integer", () => {
+    const props = STEP_NODE_SCHEMA.properties as Record<string, { type: string }>;
+    expect(props.startedAtMs.type).toBe("integer");
+    expect(props.completedAtMs.type).toBe("integer");
+  });
+
+  test("StepNode with timing fields passes CAS validation", async () => {
+    const casDir = join(tmpDir, "cas");
+    await mkdir(casDir, { recursive: true });
+    const store = createFsStore(casDir);
+    const schemas = await registerUwfSchemas(store);
+
+    const startHash = await store.put(schemas.startNode, {
+      workflow: "placeholder0000" as CasRef,
+      prompt: "test",
+    });
+
+    const outputHash = await store.put(schemas.text, "output text");
+
+    const detailSchemas = await registerDetailSchemas(store);
+    const detailHash = await store.put(detailSchemas.detail, {
+      sessionId: "s1",
+      model: "m1",
+      duration: 100,
+      turnCount: 0,
+      turns: [],
+    });
+
+    // Should succeed — valid timing fields
+    const hash = await store.put(schemas.stepNode, {
+      start: startHash,
+      prev: null,
+      role: "worker",
+      output: outputHash,
+      detail: detailHash,
+      agent: "uwf-test",
+      edgePrompt: "",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
+    });
+    expect(hash).toBeTruthy();
+  });
+});
+
+// ── 3. step list — durationMs computed ───────────────────────────────────────
+
+describe("step list timing", () => {
+  test("step list includes durationMs = completedAtMs - startedAtMs", async () => {
+    const casDir = join(tmpDir, "cas");
+    await mkdir(casDir, { recursive: true });
+    const store = createFsStore(casDir);
+    const schemas = await registerUwfSchemas(store);
+    const detailSchemas = await registerDetailSchemas(store);
+
+    const workflowHash = await store.put(schemas.workflow, {
+      name: "test-wf",
+      description: "desc",
+      roles: {},
+      graph: {},
+    });
+
+    const startHash = await store.put(schemas.startNode, {
+      workflow: workflowHash,
+      prompt: "test",
+    });
+
+    const outputHash = await store.put(schemas.text, "output");
+    const detailHash = await store.put(detailSchemas.detail, {
+      sessionId: "s1",
+      model: "m1",
+      duration: 100,
+      turnCount: 0,
+      turns: [],
+    });
+
+    const startedAt = 1716600000000;
+    const completedAt = 1716600003500;
+
+    const stepHash = await store.put(schemas.stepNode, {
+      start: startHash,
+      prev: null,
+      role: "worker",
+      output: outputHash,
+      detail: detailHash,
+      agent: "uwf-test",
+      edgePrompt: "",
+      startedAtMs: startedAt,
+      completedAtMs: completedAt,
+    });
+
+    const threadId = "01HX2Q3R4S5T6V7W8X9YZ1" as ThreadId;
+    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
+
+    const result = await cmdStepList(tmpDir, threadId);
+    const stepEntries = result.steps.slice(1); // skip start entry
+    expect(stepEntries).toHaveLength(1);
+
+    const step = stepEntries[0] as import("@uncaged/workflow-protocol").StepEntry;
+    expect(step.durationMs).toBe(3500);
+  });
+});
+
+// ── 4. thread read — duration in header ──────────────────────────────────────
+
+describe("thread read timing", () => {
+  test("thread read header includes Duration", async () => {
+    const casDir = join(tmpDir, "cas");
+    await mkdir(casDir, { recursive: true });
+    const store = createFsStore(casDir);
+    const schemas = await registerUwfSchemas(store);
+    const detailSchemas = await registerDetailSchemas(store);
+
+    const workflowHash = await store.put(schemas.workflow, {
+      name: "test-wf",
+      description: "desc",
+      roles: {
+        worker: {
+          description: "Worker",
+          goal: "Do work",
+          capabilities: [],
+          procedure: "work",
+          output: "result",
+          frontmatter: "placeholder0000" as CasRef,
+        },
+      },
+      graph: {
+        $START: { _: { role: "worker", prompt: "go" } },
+        worker: { _: { role: "$END", prompt: "" } },
+      },
+    });
+
+    const startHash = await store.put(schemas.startNode, {
+      workflow: workflowHash,
+      prompt: "test task",
+    });
+
+    const turnHash = await store.put(detailSchemas.turn, {
+      index: 0,
+      role: "assistant",
+      content: "Done.",
+      toolCalls: null,
+      reasoning: null,
+    });
+    const detailHash = await store.put(detailSchemas.detail, {
+      sessionId: "s1",
+      model: "m1",
+      duration: 100,
+      turnCount: 1,
+      turns: [turnHash],
+    });
+    const outputHash = await store.put(schemas.text, "output");
+
+    const stepHash = await store.put(schemas.stepNode, {
+      start: startHash,
+      prev: null,
+      role: "worker",
+      output: outputHash,
+      detail: detailHash,
+      agent: "uwf-test",
+      edgePrompt: "",
+      startedAtMs: 1716600000000,
+      completedAtMs: 1716600042000,
+    });
+
+    const threadId = "01HX2Q3R4S5T6V7W8X9YZ3" as ThreadId;
+    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
+
+    const markdown = await cmdThreadRead(tmpDir, threadId, 10000, null, false);
+    expect(markdown).toContain("**Duration:** 42.0s");
+  });
+
+  test("thread read shows sub-second duration as ms", async () => {
+    const casDir = join(tmpDir, "cas");
+    await mkdir(casDir, { recursive: true });
+    const store = createFsStore(casDir);
+    const schemas = await registerUwfSchemas(store);
+    const detailSchemas = await registerDetailSchemas(store);
+
+    const workflowHash = await store.put(schemas.workflow, {
+      name: "test-wf",
+      description: "desc",
+      roles: {
+        worker: {
+          description: "Worker",
+          goal: "Do work",
+          capabilities: [],
+          procedure: "work",
+          output: "result",
+          frontmatter: "placeholder0000" as CasRef,
+        },
+      },
+      graph: {
+        $START: { _: { role: "worker", prompt: "go" } },
+        worker: { _: { role: "$END", prompt: "" } },
+      },
+    });
+
+    const startHash = await store.put(schemas.startNode, {
+      workflow: workflowHash,
+      prompt: "test",
+    });
+
+    const turnHash = await store.put(detailSchemas.turn, {
+      index: 0,
+      role: "assistant",
+      content: "Done.",
+      toolCalls: null,
+      reasoning: null,
+    });
+    const detailHash = await store.put(detailSchemas.detail, {
+      sessionId: "s1",
+      model: "m1",
+      duration: 100,
+      turnCount: 1,
+      turns: [turnHash],
+    });
+    const outputHash = await store.put(schemas.text, "output");
+
+    const stepHash = await store.put(schemas.stepNode, {
+      start: startHash,
+      prev: null,
+      role: "worker",
+      output: outputHash,
+      detail: detailHash,
+      agent: "uwf-test",
+      edgePrompt: "",
+      startedAtMs: 1716600000000,
+      completedAtMs: 1716600000350,
+    });
+
+    const threadId = "01HX2Q3R4S5T6V7W8X9YZ4" as ThreadId;
+    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
+
+    const markdown = await cmdThreadRead(tmpDir, threadId, 10000, null, false);
+    expect(markdown).toContain("**Duration:** 350ms");
+  });
+});
+
+// ── 6. Breaking change — old data without timing fails ───────────────────────
+
+describe("breaking change", () => {
+  test("StepNode schema rejects payload without timing fields", () => {
+    const required = STEP_NODE_SCHEMA.required as string[];
+    // Both fields must be in the required array
+    expect(required).toContain("startedAtMs");
+    expect(required).toContain("completedAtMs");
+
+    // Payload without timing fields would fail schema validation
+    // because the schema marks them as required
+    const payloadWithoutTiming = {
+      start: "hash1",
+      prev: null,
+      role: "worker",
+      output: "hash2",
+      detail: "hash3",
+      agent: "uwf-test",
+      edgePrompt: "",
+    };
+    // Verify the payload is missing required fields
+    expect(payloadWithoutTiming).not.toHaveProperty("startedAtMs");
+    expect(payloadWithoutTiming).not.toHaveProperty("completedAtMs");
+  });
+});
@@ -141,6 +141,8 @@ describe("thread read --quota flag", () => {
        output: outputHash,
        detail: detailHash,
        agent: "uwf-test",
+        startedAtMs: 1000000000000,
+        completedAtMs: 1000000005000,
      });
      steps.push(stepHash);
    }
@@ -221,6 +223,8 @@ describe("thread read --quota flag", () => {
      output: outputHash,
      detail: step1DetailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const step2Content = generateContent(600, "Second");
@@ -245,6 +249,8 @@ describe("thread read --quota flag", () => {
      output: outputHash,
      detail: step2DetailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01HX2Q3R4S5T6V7W8X9YZ1" as ThreadId;
@@ -328,6 +334,8 @@ describe("thread read --quota flag", () => {
        output: outputHash,
        detail: detailHash,
        agent: "uwf-test",
+        startedAtMs: 1000000000000,
+        completedAtMs: 1000000005000,
      });
      steps.push(stepHash);
    }
@@ -338,8 +346,8 @@ describe("thread read --quota flag", () => {
    // Set tight quota with --start flag
    const markdown = await cmdThreadRead(tmpDir, threadId, 600, null, true);

-    // Quota must be reasonably enforced (allow ~210 char tolerance for structure)
-    expect(markdown.length).toBeLessThanOrEqual(810);
+    // Quota must be reasonably enforced (allow ~260 char tolerance for structure)
+    expect(markdown.length).toBeLessThanOrEqual(860);

    // Should contain thread header
    expect(markdown).toMatch(/# Thread/);
@@ -405,6 +413,8 @@ describe("thread read --quota flag", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01HX2Q3R4S5T6V7W8X9YZ4" as ThreadId;
@@ -480,6 +490,8 @@ describe("thread read --quota flag", () => {
        output: outputHash,
        detail: detailHash,
        agent: "uwf-test",
+        startedAtMs: 1000000000000,
+        completedAtMs: 1000000005000,
      });
      steps.push(stepHash);
    }
@@ -559,6 +571,8 @@ describe("thread read --quota flag", () => {
        output: outputHash,
        detail: detailHash,
        agent: "uwf-test",
+        startedAtMs: 1000000000000,
+        completedAtMs: 1000000005000,
      });
      steps.push(stepHash);
    }
@@ -139,6 +139,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-claude-code",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000001" as ThreadId;
@@ -214,6 +216,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-claude-code",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000002" as ThreadId;
@@ -274,6 +278,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const step2 = await uwf.store.put(uwf.schemas.stepNode, {
@@ -283,6 +289,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000003" as ThreadId;
@@ -335,6 +343,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000004" as ThreadId;
@@ -387,6 +397,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: missingDetailRef,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000005" as ThreadId;
@@ -439,6 +451,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000006" as ThreadId;
@@ -511,6 +525,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const step2 = await uwf.store.put(uwf.schemas.stepNode, {
@@ -520,6 +536,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const step3 = await uwf.store.put(uwf.schemas.stepNode, {
@@ -529,6 +547,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: null,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000007" as ThreadId;
@@ -607,6 +627,8 @@ describe("thread read XML tag isolation", () => {
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
+      startedAtMs: 1000000000000,
+      completedAtMs: 1000000005000,
    });

    const threadId = "01JTEST0000000000000008" as ThreadId;
@@ -661,6 +683,8 @@ describe("thread read XML tag isolation", () => {
        output: outputHash,
        detail: null,
        agent: "uwf-test",
+        startedAtMs: 1000000000000,
+        completedAtMs: 1000000005000,
      })) as CasRef;
      steps.push(step);
      prev = step;
@@ -0,0 +1,366 @@
+import type { WorkflowPayload } from "@uncaged/workflow-protocol";
+import { describe, expect, test } from "vitest";
+import { validateWorkflow } from "../validate-semantic.js";
+
+/** Build a valid two-role workflow that passes all checks. */
+function makeWorkflow(overrides?: Partial<WorkflowPayload>): WorkflowPayload {
+  const base: WorkflowPayload = {
+    name: "test-workflow",
+    description: "A test workflow",
+    roles: {
+      writer: {
+        description: "Writes content",
+        goal: "Write content",
+        capabilities: ["writing"],
+        procedure: "Write it",
+        output: "The content",
+        frontmatter: {
+          type: "object",
+          properties: {
+            $status: { enum: ["_"] },
+            plan: { type: "string" },
+          },
+          required: ["$status", "plan"],
+        } as unknown as string,
+      },
+      reviewer: {
+        description: "Reviews content",
+        goal: "Review content",
+        capabilities: ["reviewing"],
+        procedure: "Review it",
+        output: "The review",
+        frontmatter: {
+          type: "object",
+          oneOf: [
+            {
+              properties: {
+                $status: { const: "approved" },
+                summary: { type: "string" },
+              },
+              required: ["$status", "summary"],
+            },
+            {
+              properties: {
+                $status: { const: "rejected" },
+                reason: { type: "string" },
+              },
+              required: ["$status", "reason"],
+            },
+          ],
+        } as unknown as string,
+      },
+    },
+    graph: {
+      $START: { _: { role: "writer", prompt: "Begin writing" } },
+      writer: { _: { role: "reviewer", prompt: "Review this: {{{plan}}}" } },
+      reviewer: {
+        approved: { role: "$END", prompt: "Done: {{{summary}}}" },
+        rejected: { role: "writer", prompt: "Fix: {{{reason}}}" },
+      },
+    },
+  };
+
+  if (!overrides) return base;
+  return { ...base, ...overrides };
+}
+
+describe("Suite 1: Role Reference Integrity", () => {
+  test("1.1 graph references unknown role", () => {
+    const wf = makeWorkflow();
+    wf.graph.nonexistent = { _: { role: "$END", prompt: "done" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('unknown role "nonexistent"'))).toBe(true);
+  });
+
+  test("1.2 orphan role not in graph", () => {
+    const wf = makeWorkflow();
+    wf.roles.orphan = {
+      description: "Orphan",
+      goal: "Nothing",
+      capabilities: [],
+      procedure: "None",
+      output: "None",
+      frontmatter: {
+        type: "object",
+        properties: { $status: { enum: ["_"] } },
+        required: ["$status"],
+      } as unknown as string,
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('role "orphan" is defined but not referenced in graph')),
+    ).toBe(true);
+  });
+
+  test("1.3 $START in roles", () => {
+    const wf = makeWorkflow();
+    (wf.roles as Record<string, unknown>).$START = {
+      description: "Bad",
+      goal: "Bad",
+      capabilities: [],
+      procedure: "Bad",
+      output: "Bad",
+      frontmatter: { type: "object", properties: {}, required: [] },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('reserved name "$START"'))).toBe(true);
+  });
+
+  test("1.4 $END in roles", () => {
+    const wf = makeWorkflow();
+    (wf.roles as Record<string, unknown>).$END = {
+      description: "Bad",
+      goal: "Bad",
+      capabilities: [],
+      procedure: "Bad",
+      output: "Bad",
+      frontmatter: { type: "object", properties: {}, required: [] },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('reserved name "$END"'))).toBe(true);
+  });
+
+  test("1.5 valid workflow returns no errors", () => {
+    const wf = makeWorkflow();
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+});
+
+describe("Suite 2: Graph Structure", () => {
+  test("2.1 $START missing from graph", () => {
+    const wf = makeWorkflow();
+    delete wf.graph.$START;
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("$START must be defined in graph"))).toBe(true);
+  });
+
+  test("2.2 $START has multiple status keys", () => {
+    const wf = makeWorkflow();
+    wf.graph.$START = {
+      _: { role: "writer", prompt: "Begin" },
+      other: { role: "reviewer", prompt: "Also" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('$START must have exactly one edge with status "_"')),
+    ).toBe(true);
+  });
+
+  test("2.3 $START edge uses non-_ status", () => {
+    const wf = makeWorkflow();
+    wf.graph.$START = { ready: { role: "writer", prompt: "Begin" } };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('$START must have exactly one edge with status "_"')),
+    ).toBe(true);
+  });
+
+  test("2.4 $END has outgoing edges", () => {
+    const wf = makeWorkflow();
+    wf.graph.$END = { _: { role: "writer", prompt: "Loop" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("$END must not have outgoing edges"))).toBe(true);
+  });
+
+  test("2.5 unreachable role", () => {
+    const wf = makeWorkflow();
+    wf.roles.isolated = {
+      description: "Isolated",
+      goal: "Isolated",
+      capabilities: [],
+      procedure: "Isolated",
+      output: "Isolated",
+      frontmatter: {
+        type: "object",
+        properties: { $status: { enum: ["_"] } },
+        required: ["$status"],
+      } as unknown as string,
+    };
+    wf.graph.isolated = { _: { role: "$END", prompt: "done" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('role "isolated" is not reachable from $START'))).toBe(
+      true,
+    );
+  });
+
+  test("2.6 edge target references invalid role", () => {
+    const wf = makeWorkflow();
+    wf.graph.writer = { _: { role: "ghost", prompt: "Go to ghost" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('unknown target role "ghost"'))).toBe(true);
+  });
+});
+
+describe("Suite 3: Status-Edge Consistency", () => {
+  test("3.1 single-exit role with multiple graph keys", () => {
+    const wf = makeWorkflow();
+    wf.graph.writer = {
+      _: { role: "reviewer", prompt: "Review" },
+      extra: { role: "$END", prompt: "Done" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) =>
+        e.includes('role "writer" is single-exit but has status keys other than "_"'),
+      ),
+    ).toBe(true);
+  });
+
+  test("3.2 single-exit role missing _ key", () => {
+    const wf = makeWorkflow();
+    wf.graph.writer = { done: { role: "reviewer", prompt: "Review" } };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('role "writer" is single-exit but graph has no "_" key')),
+    ).toBe(true);
+  });
+
+  test("3.3 multi-exit role with extra statuses", () => {
+    const wf = makeWorkflow();
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done" },
+      rejected: { role: "writer", prompt: "Fix" },
+      timeout: { role: "$END", prompt: "Timed out" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('role "reviewer" graph has extra status keys: timeout')),
+    ).toBe(true);
+  });
+
+  test("3.4 multi-exit role missing a status", () => {
+    const wf = makeWorkflow();
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('role "reviewer" graph is missing status keys: rejected')),
+    ).toBe(true);
+  });
+
+  test("3.5 multi-exit role with _ key", () => {
+    const wf = makeWorkflow();
+    wf.graph.reviewer = { _: { role: "$END", prompt: "Done" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes('role "reviewer" is multi-exit but graph uses "_"'))).toBe(
+      true,
+    );
+  });
+});
+
+describe("Suite 4: Mustache Template Variable Existence", () => {
+  test("4.1 prompt references nonexistent variable (single-exit)", () => {
+    const wf = makeWorkflow();
+    wf.graph.writer = { _: { role: "reviewer", prompt: "Review: {{{branch}}}" } };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) =>
+        e.includes('prompt variable "branch" not found in role "writer" frontmatter'),
+      ),
+    ).toBe(true);
+  });
+
+  test("4.2 prompt references nonexistent variable (multi-exit)", () => {
+    const wf = makeWorkflow();
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done: {{{branch}}}" },
+      rejected: { role: "writer", prompt: "Fix: {{{reason}}}" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) =>
+        e.includes('prompt variable "branch" not found in role "reviewer" variant "approved"'),
+      ),
+    ).toBe(true);
+  });
+
+  test("4.3 valid mustache variables pass", () => {
+    const wf = makeWorkflow();
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+
+  test("4.4 $status variable is always valid", () => {
+    const wf = makeWorkflow();
+    wf.graph.writer = { _: { role: "reviewer", prompt: "Status: {{$status}}" } };
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+});
+
+describe("Suite 5: oneOf Discriminant Validity", () => {
+  test("5.1 oneOf without $status const", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        oneOf: [
+          { properties: { summary: { type: "string" } }, required: ["summary"] },
+          { properties: { reason: { type: "string" } }, required: ["reason"] },
+        ],
+      } as unknown as string,
+    };
+    const errors = validateWorkflow(wf);
+    expect(
+      errors.some((e) => e.includes('oneOf variants must have "$status" as const discriminant')),
+    ).toBe(true);
+  });
+
+  test("5.2 oneOf with non-const $status", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        oneOf: [
+          {
+            properties: { $status: { type: "string" }, summary: { type: "string" } },
+            required: ["$status", "summary"],
+          },
+          {
+            properties: { $status: { type: "string" }, reason: { type: "string" } },
+            required: ["$status", "reason"],
+          },
+        ],
+      } as unknown as string,
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("oneOf variant $status must be a const value"))).toBe(
+      true,
+    );
+  });
+
+  test("5.3 valid oneOf passes", () => {
+    const wf = makeWorkflow();
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+});
+
+describe("Suite 6: Multiple Errors Collection", () => {
+  test("6.1 multiple errors collected", () => {
+    const wf = makeWorkflow();
+    // orphan role
+    wf.roles.orphan = {
+      description: "Orphan",
+      goal: "Nothing",
+      capabilities: [],
+      procedure: "None",
+      output: "None",
+      frontmatter: {
+        type: "object",
+        properties: { $status: { enum: ["_"] } },
+        required: ["$status"],
+      } as unknown as string,
+    };
+    // unknown graph reference
+    wf.graph.nonexistent = { _: { role: "$END", prompt: "done" } };
+    // bad mustache var
+    wf.graph.writer = { _: { role: "reviewer", prompt: "{{{badvar}}}" } };
+    const errors = validateWorkflow(wf);
+    expect(errors.length).toBeGreaterThanOrEqual(3);
+  });
+});
@@ -20,23 +20,37 @@ async function makeUwfStore(storageRoot: string): Promise<UwfStore> {
  return { storageRoot, store, schemas };
 }

-async function storeWorkflow(uwf: UwfStore, name: string): Promise<CasRef> {
-  const payload: WorkflowPayload = {
+function makeMinimalPayload(name: string, description: string): WorkflowPayload {
+  return {
    name,
-    description: "Test workflow",
-    roles: {},
-    graph: {},
+    description,
+    roles: {
+      worker: {
+        description: "worker role",
+        goal: "do work",
+        capabilities: [],
+        procedure: "",
+        output: "",
+        frontmatter: { type: "0000000000000" } as unknown as CasRef,
+      },
+    },
+    graph: {
+      $START: { _: { role: "worker", prompt: "start working" } },
+      worker: { _: { role: "$END", prompt: "done" } },
+    },
  };
+}
+
+async function storeWorkflow(uwf: UwfStore, name: string): Promise<CasRef> {
+  const payload = makeMinimalPayload(name, "Test workflow");
  return await uwf.store.put(uwf.schemas.workflow, payload);
 }

 async function createWorkflowYaml(name: string, version: string | null = null): Promise<string> {
-  const payload: WorkflowPayload = {
+  const payload = makeMinimalPayload(
    name,
-    description: version !== null ? `Test workflow (${version})` : "Test workflow",
-    roles: {},
-    graph: {},
-  };
+    version !== null ? `Test workflow (${version})` : "Test workflow",
+  );
  const yaml = stringify(payload);
  return yaml;
 }
@@ -58,6 +58,7 @@ export async function cmdStepList(
      detail: item.payload.detail ?? null,
      agent: item.payload.agent,
      timestamp: item.timestamp,
+      durationMs: item.payload.completedAtMs - item.payload.startedAtMs,
    });
  }

@@ -40,6 +40,7 @@ import {
  type UwfStore,
 } from "../store.js";
 import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js";
+import { validateWorkflow } from "../validate-semantic.js";
 import {
  type ChainState,
  collectOrderedSteps,
@@ -169,6 +170,11 @@ async function materializeLocalWorkflow(uwf: UwfStore, filePath: string): Promis
    fail(filenameError);
  }

+  const semanticErrors = validateWorkflow(payload);
+  if (semanticErrors.length > 0) {
+    fail(`workflow validation failed:\n${semanticErrors.map((e) => `  - ${e}`).join("\n")}`);
+  }
+
  const materialized = await materializeWorkflowPayload(uwf, payload);
  const hash = await uwf.store.put(uwf.schemas.workflow, materialized);
  const stored = uwf.store.get(hash);
@@ -560,14 +566,25 @@ function selectByQuota(
  return { selected, skippedCount: candidates.length - selected.length };
 }

+function formatDuration(ms: number): string {
+  if (ms < 1000) return `${ms}ms`;
+  const seconds = ms / 1000;
+  if (seconds < 60) return `${seconds.toFixed(1)}s`;
+  const minutes = Math.floor(seconds / 60);
+  const remainingSec = Math.round(seconds % 60);
+  return `${minutes}m${remainingSec}s`;
+}
+
 function formatStepHeader(stepNum: number, item: OrderedStepItem): string {
  const ts = new Date(item.timestamp)
    .toISOString()
    .replace("T", " ")
    .replace(/\.\d+Z$/, "");
+  const durationMs = item.payload.completedAtMs - item.payload.startedAtMs;
+  const duration = formatDuration(durationMs);
  return [
    `## Step ${stepNum}: ${item.payload.role} \`${item.hash}\``,
-    `**Agent:** ${item.payload.agent} | **Time:** ${ts}`,
+    `**Agent:** ${item.payload.agent} | **Time:** ${ts} | **Duration:** ${duration}`,
  ].join("\n");
 }

@@ -669,14 +686,16 @@ function formatThreadReadMarkdown(options: {
  return parts.join("\n\n---\n\n");
 }

-type EvaluateLastOutput = Record<string, unknown> & { status: string };
+type EvaluateLastOutput = Record<string, unknown>;
+
+const STATUS_KEY = "$status";

 function resolveEvaluateArgs(
  uwf: UwfStore,
  chain: ChainState,
 ): { lastRole: string; lastOutput: EvaluateLastOutput } {
  if (chain.headIsStart) {
-    return { lastRole: START_ROLE, lastOutput: { status: "_" } };
+    return { lastRole: START_ROLE, lastOutput: { [STATUS_KEY]: "_" } };
  }

  const lastStep = chain.stepsNewestFirst[0];
@@ -689,11 +708,10 @@ function resolveEvaluateArgs(
    typeof raw === "object" && raw !== null && !Array.isArray(raw)
      ? (raw as Record<string, unknown>)
      : {};
-  const status = typeof base.status === "string" ? base.status : "_";

  return {
    lastRole: lastStep.role,
-    lastOutput: { ...base, status },
+    lastOutput: base,
  };
 }

@@ -15,6 +15,7 @@ import {
  type UwfStore,
 } from "../store.js";
 import { checkWorkflowFilenameConsistency, parseWorkflowPayload } from "../validate.js";
+import { validateWorkflow } from "../validate-semantic.js";

 export type WorkflowOrigin = "local" | "global";

@@ -136,6 +137,11 @@ export async function cmdWorkflowAdd(
    fail(filenameError);
  }

+  const semanticErrors = validateWorkflow(payload);
+  if (semanticErrors.length > 0) {
+    fail(`workflow validation failed:\n${semanticErrors.map((e) => `  - ${e}`).join("\n")}`);
+  }
+
  const uwf = await createUwfStore(storageRoot);
  const materialized = await materializeWorkflowPayload(uwf, payload);

@@ -0,0 +1,278 @@
+import type { WorkflowPayload } from "@uncaged/workflow-protocol";
+
+type SchemaObj = Record<string, unknown>;
+
+const RESERVED_NAMES = new Set(["$START", "$END"]);
+
+/** Extract mustache variable names from a prompt string. */
+function extractMustacheVars(prompt: string): string[] {
+  const vars: string[] = [];
+  const re = /\{\{\{?([^}]+)\}\}\}?/g;
+  let m: RegExpExecArray | null = re.exec(prompt);
+  while (m !== null) {
+    vars.push(m[1]);
+    m = re.exec(prompt);
+  }
+  return vars;
+}
+
+/** Check if a frontmatter schema is a oneOf (multi-exit) type. */
+function isOneOfSchema(fm: unknown): fm is SchemaObj & { oneOf: SchemaObj[] } {
+  if (typeof fm !== "object" || fm === null) return false;
+  const obj = fm as SchemaObj;
+  return Array.isArray(obj.oneOf);
+}
+
+/** Get property names from a schema object. */
+function getPropertyNames(schema: SchemaObj): Set<string> {
+  const props = schema.properties;
+  if (typeof props !== "object" || props === null) return new Set();
+  return new Set(Object.keys(props as Record<string, unknown>));
+}
+
+/** Extract $status const values from oneOf variants. */
+function getOneOfStatuses(variants: SchemaObj[]): string[] {
+  const statuses: string[] = [];
+  for (const variant of variants) {
+    const props = variant.properties as Record<string, SchemaObj> | undefined;
+    if (props?.$status) {
+      const statusDef = props.$status;
+      if (typeof statusDef.const === "string") {
+        statuses.push(statusDef.const);
+      }
+    }
+  }
+  return statuses;
+}
+
+/** Check reserved names and role/graph reference integrity. */
+function checkRoleReferences(payload: WorkflowPayload, errors: string[]): void {
+  const roleNames = new Set(Object.keys(payload.roles));
+  const graphNodes = new Set(Object.keys(payload.graph));
+
+  for (const name of roleNames) {
+    if (RESERVED_NAMES.has(name)) {
+      errors.push(`reserved name "${name}" must not appear in roles`);
+    }
+  }
+
+  for (const node of graphNodes) {
+    if (!RESERVED_NAMES.has(node) && !roleNames.has(node)) {
+      errors.push(`graph references unknown role "${node}"`);
+    }
+  }
+
+  for (const name of roleNames) {
+    if (RESERVED_NAMES.has(name)) continue;
+    if (!graphNodes.has(name)) {
+      errors.push(`role "${name}" is defined but not referenced in graph`);
+    }
+  }
+}
+
+/** Check $START/$END constraints, edge targets, and reachability. */
+function checkGraphStructure(payload: WorkflowPayload, errors: string[]): void {
+  const roleNames = new Set(Object.keys(payload.roles));
+  const graphNodes = new Set(Object.keys(payload.graph));
+
+  if (!graphNodes.has("$START")) {
+    errors.push("$START must be defined in graph");
+  } else {
+    const startKeys = Object.keys(payload.graph.$START);
+    if (startKeys.length !== 1 || startKeys[0] !== "_") {
+      errors.push('$START must have exactly one edge with status "_"');
+    }
+  }
+
+  if (graphNodes.has("$END")) {
+    errors.push("$END must not have outgoing edges");
+  }
+
+  for (const [node, statusMap] of Object.entries(payload.graph)) {
+    for (const [status, target] of Object.entries(statusMap)) {
+      if (target.role !== "$END" && !roleNames.has(target.role)) {
+        errors.push(`edge ${node}→${status}: unknown target role "${target.role}"`);
+      }
+    }
+  }
+
+  checkReachability(roleNames, collectReachableRoles(payload.graph), errors);
+}
+
+/** BFS to collect all roles reachable from $START. */
+function collectReachableRoles(graph: WorkflowPayload["graph"]): Set<string> {
+  const reachable = new Set<string>();
+  const startEdges = graph.$START;
+  if (!startEdges) return reachable;
+
+  const queue: string[] = [];
+  for (const target of Object.values(startEdges)) {
+    if (target.role !== "$END" && !reachable.has(target.role)) {
+      reachable.add(target.role);
+      queue.push(target.role);
+    }
+  }
+
+  while (queue.length > 0) {
+    const current = queue.shift() as string;
+    const edges = graph[current];
+    if (!edges) continue;
+    for (const target of Object.values(edges)) {
+      if (target.role !== "$END" && !reachable.has(target.role)) {
+        reachable.add(target.role);
+        queue.push(target.role);
+      }
+    }
+  }
+
+  return reachable;
+}
+
+/** Check that all defined roles are reachable from $START. */
+function checkReachability(roleNames: Set<string>, reachable: Set<string>, errors: string[]): void {
+  for (const name of roleNames) {
+    if (RESERVED_NAMES.has(name)) continue;
+    if (!reachable.has(name)) {
+      errors.push(`role "${name}" is not reachable from $START`);
+    }
+  }
+}
+
+/** Check oneOf discriminant validity for a role. */
+function checkOneOfDiscriminant(
+  roleName: string,
+  variants: SchemaObj[],
+  statuses: string[],
+  errors: string[],
+): void {
+  if (statuses.length === variants.length) return;
+
+  let foundMissing = false;
+  for (const variant of variants) {
+    const props = variant.properties as Record<string, SchemaObj> | undefined;
+    if (!props?.$status) {
+      errors.push(`role "${roleName}": oneOf variants must have "$status" as const discriminant`);
+      foundMissing = true;
+      break;
+    }
+    if (typeof props.$status.const !== "string") {
+      errors.push(`role "${roleName}": oneOf variant $status must be a const value`);
+      foundMissing = true;
+      break;
+    }
+  }
+
+  if (!foundMissing) {
+    errors.push(`role "${roleName}": oneOf variant $status must be a const value`);
+  }
+}
+
+/** Check status-edge consistency for a multi-exit role. */
+function checkMultiExitEdges(
+  roleName: string,
+  graphKeys: Set<string>,
+  statusSet: Set<string>,
+  errors: string[],
+): void {
+  if (graphKeys.has("_")) {
+    errors.push(`role "${roleName}" is multi-exit but graph uses "_"`);
+    return;
+  }
+
+  const extraKeys = [...graphKeys].filter((k) => !statusSet.has(k));
+  const missingKeys = [...statusSet].filter((k) => !graphKeys.has(k));
+  if (extraKeys.length > 0) {
+    errors.push(`role "${roleName}" graph has extra status keys: ${extraKeys.join(", ")}`);
+  }
+  if (missingKeys.length > 0) {
+    errors.push(`role "${roleName}" graph is missing status keys: ${missingKeys.join(", ")}`);
+  }
+}
+
+/** Check mustache variables for multi-exit role. */
+function checkMultiExitMustache(
+  roleName: string,
+  graphEntry: Record<string, { role: string; prompt: string }>,
+  variants: SchemaObj[],
+  errors: string[],
+): void {
+  for (const [status, target] of Object.entries(graphEntry)) {
+    const vars = extractMustacheVars(target.prompt);
+    const variant = variants.find((v) => {
+      const props = v.properties as Record<string, SchemaObj> | undefined;
+      return props?.$status?.const === status;
+    });
+    if (!variant) continue;
+    const propNames = getPropertyNames(variant);
+    for (const v of vars) {
+      if (v === "$status") continue;
+      if (!propNames.has(v)) {
+        errors.push(`prompt variable "${v}" not found in role "${roleName}" variant "${status}"`);
+      }
+    }
+  }
+}
+
+/** Check status-edge consistency and mustache for each role. */
+function checkRoleConsistency(payload: WorkflowPayload, errors: string[]): void {
+  for (const [roleName, role] of Object.entries(payload.roles)) {
+    if (RESERVED_NAMES.has(roleName)) continue;
+    const graphEntry = payload.graph[roleName];
+    if (!graphEntry) continue;
+
+    const fm = role.frontmatter as unknown;
+    const graphKeys = new Set(Object.keys(graphEntry));
+
+    if (isOneOfSchema(fm)) {
+      const variants = fm.oneOf as SchemaObj[];
+      const statuses = getOneOfStatuses(variants);
+
+      checkOneOfDiscriminant(roleName, variants, statuses, errors);
+      checkMultiExitEdges(roleName, graphKeys, new Set(statuses), errors);
+      checkMultiExitMustache(roleName, graphEntry, variants, errors);
+    } else {
+      checkSingleExitRole(roleName, graphKeys, graphEntry, fm as SchemaObj | null, errors);
+    }
+  }
+}
+
+/** Check single-exit role status and mustache. */
+function checkSingleExitRole(
+  roleName: string,
+  graphKeys: Set<string>,
+  graphEntry: Record<string, { role: string; prompt: string }>,
+  fm: SchemaObj | null,
+  errors: string[],
+): void {
+  if (graphKeys.size > 1 || (graphKeys.size === 1 && !graphKeys.has("_"))) {
+    if (!graphKeys.has("_")) {
+      errors.push(`role "${roleName}" is single-exit but graph has no "_" key`);
+    } else {
+      errors.push(`role "${roleName}" is single-exit but has status keys other than "_"`);
+    }
+  }
+
+  const singleTarget = graphEntry._;
+  if (!singleTarget) return;
+
+  const vars = extractMustacheVars(singleTarget.prompt);
+  const propNames = fm ? getPropertyNames(fm) : new Set<string>();
+  for (const v of vars) {
+    if (v === "$status") continue;
+    if (!propNames.has(v)) {
+      errors.push(`prompt variable "${v}" not found in role "${roleName}" frontmatter`);
+    }
+  }
+}
+
+/**
+ * Validate a parsed WorkflowPayload for semantic correctness.
+ * Returns an array of error messages. Empty array = valid.
+ */
+export function validateWorkflow(payload: WorkflowPayload): string[] {
+  const errors: string[] = [];
+  checkRoleReferences(payload, errors);
+  checkGraphStructure(payload, errors);
+  checkRoleConsistency(payload, errors);
+  return errors;
+}
@@ -16,7 +16,9 @@ function isRoleDefinition(value: unknown): boolean {
    return false;
  }
  const frontmatter = value.frontmatter;
-  const frontmatterOk = isRecord(frontmatter) && typeof frontmatter.type === "string";
+  const frontmatterOk =
+    isRecord(frontmatter) &&
+    (typeof frontmatter.type === "string" || Array.isArray(frontmatter.oneOf));
  const capabilities = value.capabilities;
  const capabilitiesOk =
    Array.isArray(capabilities) && capabilities.every((c) => typeof c === "string");
@@ -87,7 +87,7 @@ describe("buildOutputFormatInstruction", () => {
    expect(result).toContain("beta: <number>");
  });

-  test("lists union of fields from a oneOf schema", () => {
+  test("lists union of fields from a oneOf schema (no discriminant — flat merge)", () => {
    const schema = {
      oneOf: [
        {
@@ -101,12 +101,71 @@ describe("buildOutputFormatInstruction", () => {
      ],
    };
    const result = buildOutputFormatInstruction(schema);
+    // No discriminant detected → falls back to flat merge
    expect(result).toContain("`foo`");
    expect(result).toContain("`bar`");
    expect(result).toContain("foo: <string>");
    expect(result).toContain("bar: true  # true | false");
  });

+  test("renders per-variant instructions for discriminated oneOf", () => {
+    const schema = {
+      oneOf: [
+        {
+          type: "object",
+          properties: {
+            $status: { const: "ready" },
+            plan: { type: "string" },
+          },
+          required: ["$status", "plan"],
+        },
+        {
+          type: "object",
+          properties: {
+            $status: { const: "insufficient_info" },
+          },
+          required: ["$status"],
+        },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("Choose ONE of the following variants");
+    expect(result).toContain("When `$status: ready`");
+    expect(result).toContain("When `$status: insufficient_info`");
+    expect(result).toContain("plan: <string>");
+    // The insufficient_info variant should NOT mention plan
+    const insufficientBlock = result.split("When `$status: insufficient_info`")[1];
+    expect(insufficientBlock).not.toContain("plan:");
+  });
+
+  test("renders per-variant for single-enum discriminant", () => {
+    const schema = {
+      oneOf: [
+        {
+          type: "object",
+          properties: {
+            $status: { type: "string", enum: ["approved"] },
+            branch: { type: "string" },
+          },
+          required: ["$status"],
+        },
+        {
+          type: "object",
+          properties: {
+            $status: { type: "string", enum: ["rejected"] },
+            comments: { type: "string" },
+          },
+          required: ["$status"],
+        },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("When `$status: approved`");
+    expect(result).toContain("When `$status: rejected`");
+    expect(result).toContain("branch: <string>");
+    expect(result).toContain("comments: <string>");
+  });
+
  test("falls back gracefully for a non-object schema with no properties", () => {
    const result = buildOutputFormatInstruction({ type: "string" });
    expect(result).toContain("schema fields will be extracted automatically");
@@ -166,14 +166,109 @@ function buildFieldList(properties: SchemaProperty[]): string {
    .join("\n");
 }

+/**
+ * Detect the discriminant property name from a oneOf schema.
+ * Returns the property name if all variants share a const/single-enum string property, else null.
+ */
+function detectDiscriminant(variants: JSONSchema[]): string | null {
+  // Find property names that appear in ALL variants with const or single-enum
+  const candidateNames = new Set<string>();
+
+  for (const variant of variants) {
+    const props = variant.properties as Record<string, JSONSchema> | null | undefined;
+    if (typeof props !== "object" || props === null) return null;
+
+    for (const [name, propSchema] of Object.entries(props)) {
+      const isConst =
+        propSchema.const !== undefined ||
+        (Array.isArray(propSchema.enum) && propSchema.enum.length === 1);
+      if (isConst) candidateNames.add(name);
+    }
+  }
+
+  // Check which candidate appears in ALL variants
+  for (const name of candidateNames) {
+    const allHaveIt = variants.every((v) => {
+      const props = v.properties as Record<string, JSONSchema> | null | undefined;
+      if (typeof props !== "object" || props === null) return false;
+      const propSchema = props[name];
+      if (!propSchema) return false;
+      return (
+        propSchema.const !== undefined ||
+        (Array.isArray(propSchema.enum) && propSchema.enum.length === 1)
+      );
+    });
+    if (allHaveIt) return name;
+  }
+
+  return null;
+}
+
+function getConstValue(propSchema: JSONSchema): string {
+  if (propSchema.const !== undefined) return String(propSchema.const);
+  if (Array.isArray(propSchema.enum) && propSchema.enum.length === 1)
+    return String(propSchema.enum[0]);
+  return "<unknown>";
+}
+
+function buildVariantBlock(variant: JSONSchema, discriminant: string): string {
+  const props = extractSchemaProperties(variant);
+  const value = getConstValue(
+    ((variant.properties as Record<string, JSONSchema>) ?? {})[discriminant] ?? {},
+  );
+  const yamlExample = buildYamlExampleBlock(props);
+  const fieldList = buildFieldList(props);
+
+  return `### When \`${discriminant}: ${value}\`
+
+\`\`\`
+${yamlExample}
+\`\`\`
+
+Fields:
+${fieldList}`;
+}
+
 /**
 * Build a concise output format instruction block for an agent role.
 *
- * The instruction describes the expected frontmatter markdown format and lists
- * the meta fields derived from the JSON Schema.  It is prepended to the agent's
- * system prompt so the deliverable format is the first thing the agent sees.
+ * For discriminated union schemas (oneOf with a shared const/$status field),
+ * renders per-variant instructions so the agent knows exactly which fields
+ * belong to which outcome.
+ *
+ * For flat object schemas, renders a single YAML example block.
 */
 export function buildOutputFormatInstruction(schema: JSONSchema): string {
+  // Check for discriminated union (oneOf with shared discriminant)
+  const unionKey = Array.isArray(schema.oneOf)
+    ? "oneOf"
+    : Array.isArray(schema.anyOf)
+      ? "anyOf"
+      : null;
+
+  if (unionKey !== null) {
+    const variants = schema[unionKey] as JSONSchema[];
+    const discriminant = detectDiscriminant(variants);
+
+    if (discriminant !== null && variants.length > 1) {
+      const variantBlocks = variants.map((v) => buildVariantBlock(v, discriminant)).join("\n\n");
+
+      return `## Deliverable Format
+
+Your response MUST begin with a YAML frontmatter block followed by your markdown work.
+
+Choose ONE of the following variants based on your outcome:
+
+${variantBlocks}
+
+The frontmatter is the **primary deliverable** — the engine reads it directly.
+Output ONLY the fields listed for your chosen variant. Do not add extra fields that are not specified in the schema.
+
+Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope.`;
+    }
+  }
+
+  // Flat object schema fallback
  const properties = extractSchemaProperties(schema);
  const yamlExample = buildYamlExampleBlock(properties);
  const fieldList = buildFieldList(properties);
@@ -128,6 +128,8 @@ async function buildHistory(
      detail: step.detail,
      agent: step.agent,
      edgePrompt: step.edgePrompt ?? "",
+      startedAtMs: step.startedAtMs,
+      completedAtMs: step.completedAtMs,
      content,
    });
  }
@@ -59,6 +59,8 @@ async function writeStepNode(options: {
  detailHash: CasRef;
  agentName: string;
  edgePrompt: string;
+  startedAtMs: number;
+  completedAtMs: number;
 }): Promise<CasRef> {
  const payload: StepNodePayload = {
    start: options.startHash,
@@ -68,6 +70,8 @@ async function writeStepNode(options: {
    detail: options.detailHash,
    agent: options.agentName,
    edgePrompt: options.edgePrompt,
+    startedAtMs: options.startedAtMs,
+    completedAtMs: options.completedAtMs,
  };
  const hash = await options.store.put(options.schemas.stepNode, payload);
  const node = options.store.get(hash);
@@ -94,6 +98,8 @@ async function persistStep(options: {
  outputHash: CasRef;
  detailHash: CasRef;
  agentName: string;
+  startedAtMs: number;
+  completedAtMs: number;
 }): Promise<CasRef> {
  const { store, schemas, chain, headHash } = options.ctx.meta;
  return writeStepNode({
@@ -106,6 +112,8 @@ async function persistStep(options: {
    detailHash: options.detailHash,
    agentName: options.agentName,
    edgePrompt: options.ctx.edgePrompt,
+    startedAtMs: options.startedAtMs,
+    completedAtMs: options.completedAtMs,
  });
 }

@@ -127,6 +135,7 @@ export function createAgent(options: AgentOptions): () => Promise<void> {
      ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
    }

+    const startedAtMs = Date.now();
    let agentResult = await runWithMessage("agent run failed", () => options.run(ctx));

    // Preserve the primary detail from the first run — it contains the full
@@ -156,12 +165,14 @@ export function createAgent(options: AgentOptions): () => Promise<void> {
          `Raw output (first 500 chars): ${agentResult.output.slice(0, 500)}`,
      );
    }
-
+    const completedAtMs = Date.now();
    const stepHash = await persistStep({
      ctx,
      outputHash,
      detailHash: primaryDetailHash,
      agentName: agentLabel(options.name),
+      startedAtMs,
+      completedAtMs,
    });

    process.stdout.write(`${stepHash}\n`);
@@ -21,7 +21,7 @@ const solveIssueGraph: WorkflowPayload["graph"] = {

 describe("evaluate", () => {
  test("$START → first role (unit status _)", () => {
-    const result = evaluate(solveIssueGraph, "$START", { status: "_" });
+    const result = evaluate(solveIssueGraph, "$START", { $status: "_" });
    expect(result).toEqual({
      ok: true,
      value: { role: "planner", prompt: "Start planning from the issue in the task." },
@@ -30,7 +30,7 @@ describe("evaluate", () => {

  test("status-based routing (reviewer rejected → developer)", () => {
    const result = evaluate(solveIssueGraph, "reviewer", {
-      status: "rejected",
+      $status: "rejected",
      comments: "missing tests",
    });
    expect(result).toEqual({
@@ -40,7 +40,7 @@ describe("evaluate", () => {
  });

  test("status-based routing (reviewer approved → $END)", () => {
-    const result = evaluate(solveIssueGraph, "reviewer", { status: "approved" });
+    const result = evaluate(solveIssueGraph, "reviewer", { $status: "approved" });
    expect(result).toEqual({
      ok: true,
      value: { role: "$END", prompt: "Done." },
@@ -48,7 +48,7 @@ describe("evaluate", () => {
  });

  test("missing role in graph → error", () => {
-    const result = evaluate(solveIssueGraph, "unknown-role", { status: "_" });
+    const result = evaluate(solveIssueGraph, "unknown-role", { $status: "_" });
    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.error.message).toBe('no transitions defined for role "unknown-role"');
@@ -56,7 +56,7 @@ describe("evaluate", () => {
  });

  test("missing status in graph → error", () => {
-    const result = evaluate(solveIssueGraph, "reviewer", { status: "pending" });
+    const result = evaluate(solveIssueGraph, "reviewer", { $status: "pending" });
    expect(result.ok).toBe(false);
    if (!result.ok) {
      expect(result.error.message).toBe('no transition for role "reviewer" with status "pending"');
@@ -65,7 +65,7 @@ describe("evaluate", () => {

  test("mustache template rendering with simple fields", () => {
    const result = evaluate(solveIssueGraph, "planner", {
-      status: "_",
+      $status: "_",
      plan: "Add auth middleware",
    });
    expect(result).toEqual({
@@ -76,7 +76,7 @@ describe("evaluate", () => {

  test("mustache does not HTML-escape prompt content", () => {
    const result = evaluate(solveIssueGraph, "reviewer", {
-      status: "rejected",
+      $status: "rejected",
      comments: 'use <T> & "Result<T, E>" types',
    });
    expect(result).toEqual({
@@ -92,7 +92,7 @@ describe("evaluate", () => {
      },
    };
    const result = evaluate(graph, "reviewer", {
-      status: "_",
+      $status: "_",
      comments: "<script>alert(1)</script>",
    });
    expect(result).toEqual({
@@ -101,6 +101,16 @@ describe("evaluate", () => {
    });
  });

+  test("missing $status defaults to _ (unit routing)", () => {
+    const result = evaluate(solveIssueGraph, "planner", {
+      plan: "Add auth middleware",
+    });
+    expect(result).toEqual({
+      ok: true,
+      value: { role: "developer", prompt: "Implement the plan: Add auth middleware" },
+    });
+  });
+
  test("mustache template with nested object paths", () => {
    const graph: Record<string, Record<string, Target>> = {
      reviewer: {
@@ -111,7 +121,7 @@ describe("evaluate", () => {
      },
    };
    const result = evaluate(graph, "reviewer", {
-      status: "_",
+      $status: "_",
      review: { comments: "refactor the handler" },
    });
    expect(result).toEqual({
@@ -9,14 +9,21 @@ mustache.escape = (text: string) => text;
 const START_ROLE = "$START";
 const UNIT_STATUS = "_";

-type LastOutput = Record<string, unknown> & { status: string };
+type LastOutput = Record<string, unknown>;
+
+const STATUS_KEY = "$status";

 export function evaluate(
  graph: Record<string, Record<string, Target>>,
  lastRole: string,
  lastOutput: LastOutput,
 ): Result<EvaluateResult, Error> {
-  const status = lastRole === START_ROLE ? UNIT_STATUS : lastOutput.status;
+  const status =
+    lastRole === START_ROLE
+      ? UNIT_STATUS
+      : typeof lastOutput[STATUS_KEY] === "string"
+        ? (lastOutput[STATUS_KEY] as string)
+        : UNIT_STATUS;

  const roleTargets = graph[lastRole];
  if (roleTargets === undefined) {
@@ -60,7 +60,7 @@ export const START_NODE_SCHEMA: JSONSchema = {
 export const STEP_NODE_SCHEMA: JSONSchema = {
  title: "StepNode",
  type: "object",
-  required: ["start", "prev", "role", "output", "detail", "agent"],
+  required: ["start", "prev", "role", "output", "detail", "agent", "startedAtMs", "completedAtMs"],
  properties: {
    start: { type: "string", format: "cas_ref" },
    prev: {
@@ -71,6 +71,8 @@ export const STEP_NODE_SCHEMA: JSONSchema = {
    detail: { type: "string", format: "cas_ref" },
    agent: { type: "string" },
    edgePrompt: { type: "string" },
+    startedAtMs: { type: "integer" },
+    completedAtMs: { type: "integer" },
  },
  additionalProperties: false,
 };
@@ -14,6 +14,10 @@ export type StepRecord = {
  agent: string;
  /** Moderator edge prompt that led to this step. Missing in legacy nodes → "". */
  edgePrompt: string;
+  /** Date.now() before agent spawn */
+  startedAtMs: number;
+  /** Date.now() after agent returns */
+  completedAtMs: number;
 };

 // ── 4.2 Workflow 定义 ───────────────────────────────────────────────
@@ -89,6 +93,7 @@ export type StepEntry = {
  detail: CasRef;
  agent: string;
  timestamp: number;
+  durationMs: number;
 };

 /** uwf thread steps — start entry */
Author	SHA1	Message	Date
xiaoju	9c26285424	chore: make solve-issue.yaml portable and add developer failed exit - Remove hardcoded ~/repos/workflow paths from procedure text - Use .worktrees/ relative to repo root instead of global path - Add developer failed → $END exit for unrecoverable situations - Add worktree field to reviewer rejected variant - Fix test workflowPath to use import.meta.dirname Refs #506	2026-05-25 09:07:58 +00:00
xiaoju	45f479e60f	feat(protocol): add step-level timing (startedAtMs / completedAtMs) (#489 ) BREAKING CHANGE: StepRecord now requires startedAtMs and completedAtMs fields. StepEntry now requires durationMs field. Old CAS data without these fields is invalid. - Add startedAtMs/completedAtMs to StepRecord and StepNodePayload - Add durationMs to StepEntry (computed: completedAtMs - startedAtMs) - Update STEP_NODE_SCHEMA to require timing fields as integers - Record Date.now() before/after agent execution in createAgent - Show duration in thread read headers (formatStepHeader) - Update existing test fixtures with timing fields	2026-05-25 08:01:50 +00:00
xiaoju	3fca67e443	fix: isRoleDefinition accepts oneOf frontmatter Without this, parseWorkflowPayload rejects workflows with oneOf frontmatter before semantic validation even runs.	2026-05-25 07:46:12 +00:00
xiaoju	9b2460633c	feat(cli): add workflow semantic validation before execution Implements validateWorkflow() that performs deep semantic checks on parsed WorkflowPayload before registration or execution: - Role reference integrity (unknown roles, orphans, reserved names) - Graph structure (/ constraints, reachability, edge targets) - Status-edge consistency (single/multi-exit matching) - Mustache template variable existence - oneOf discriminant validity ( const check) All errors collected (not fail-fast). Integrated into: - uwf workflow add (before CAS registration) - uwf thread start (local workflow materialization) Closes #506	2026-05-25 07:25:10 +00:00
xiaoju	dfb6fda06d	feat(agent-kit): render per-variant output instructions for discriminated oneOf buildOutputFormatInstruction now detects discriminated union schemas (oneOf with shared const/ property) and renders separate YAML example blocks per variant, so agents see exactly which fields belong to which outcome instead of a flat merge. Non-discriminated oneOf/anyOf schemas fall back to the existing flat merge behavior. Refs #502	2026-05-25 06:54:38 +00:00
xiaoju	827ff13c4a	refactor: discriminated union frontmatter for solve-issue workflow - planner: oneOf ready (plan, repoPath) \| insufficient_info - developer: single exit, plain object (branch, worktree), no $status - reviewer: oneOf approved (branch, worktree) \| rejected (comments) - tester: oneOf passed (branch, worktree) \| fix_code (report) \| fix_spec (report) - committer: oneOf committed (prUrl) \| hook_failed (error) - Edge prompts now use mustache templates with variant-specific fields - Developer simplified from 2 exits to single exit (unit routing) Phase 2 of #499 (closes #501)	2026-05-25 06:34:56 +00:00
xiaoju	7a19ceca89	refactor: rename status to $status, default to _ when absent - evaluate() reads $status instead of status, defaults to _ when missing - Update all YAML examples and .workflows to use $status - Update cli-workflow resolveEvaluateArgs to use $status - 10 moderator tests pass including new default _ test - Single-exit roles no longer need to declare status field Phase 1 of #499 (closes #500)	2026-05-25 06:22:53 +00:00