fix: explicitly forbid extra frontmatter fields in output format instruction

buildOutputFormatInstruction now includes explicit language telling agents to output ONLY schema-defined fields and to focus on their role's deliverable. Fixes #394
Merge pull request 'feat: validate model connectivity during uwf setup' (#392 ) from feat/335-setup-validate-model into main
2026-05-22 10:49:04 +00:00 · 2026-05-22 10:32:01 +00:00 · 2026-05-22 10:30:39 +00:00 · 2026-05-22 10:11:13 +00:00 · 2026-05-22 10:03:56 +00:00 · 2026-05-22 10:03:45 +00:00
31 changed files with 1475 additions and 142 deletions
@@ -0,0 +1,83 @@
+# Test Spec: uwf setup model connectivity validation (#335)
+
+## Context
+
+File: `packages/cli-workflow/src/commands/setup.ts`
+Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
+
+After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
+
+## Implementation Notes
+
+- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
+- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
+- Use `AbortSignal.timeout(15_000)` for the request
+- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
+- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
+- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
+- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
+
+## Test Cases (vitest)
+
+### 1. `validateModel` — success path
+- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: true, value: undefined }`
+- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
+
+### 2. `validateModel` — HTTP error (401 unauthorized)
+- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: false, error: <string containing "401"> }`
+
+### 3. `validateModel` — HTTP error (404 model not found)
+- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
+- Assert returns `{ ok: false, error: <string containing "404"> }`
+
+### 4. `validateModel` — network timeout
+- Mock `fetch` to throw `DOMException` with name `AbortError`
+- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
+
+### 5. `validateModel` — network error (DNS failure, connection refused)
+- Mock `fetch` to throw `TypeError("fetch failed")`
+- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
+
+### 6. `cmdSetup` — includes validation result on success
+- Mock global `fetch` for `/chat/completions` to succeed
+- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
+- Assert returned object has `validation: { ok: true, value: undefined }`
+- Assert config files are still written (existing behavior preserved)
+
+### 7. `cmdSetup` — includes validation result on failure (config still saved)
+- Mock global `fetch` for `/chat/completions` to return 401
+- Call `cmdSetup({ ... })`
+- Assert returned object has `validation: { ok: false, error: ... }`
+- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
+
+### 8. `cmdSetupInteractive` — prints success message on validation pass
+- Mock `fetch` for both `/models` and `/chat/completions` to succeed
+- Mock stdin to provide valid selections
+- Capture console output
+- Assert output contains a success message like "Model verified" or "✓"
+
+### 9. `cmdSetupInteractive` — prints warning on validation failure
+- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
+- Mock stdin for valid selections
+- Capture console output
+- Assert output contains a warning about model not being reachable and suggests trying a different model
+
+### 10. `validateModel` — request body correctness
+- Mock `fetch` to capture the request body
+- Call `validateModel(baseUrl, apiKey, "test-model")`
+- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
+
+## Export Requirements
+
+- `validateModel` must be exported (for direct unit testing)
+- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
+- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
+
+## Files to Create/Modify
+
+- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
+- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
@@ -0,0 +1,167 @@
+name: "solve-issue"
+description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
+roles:
+  planner:
+    description: "Analyzes issue and outputs a TDD test spec"
+    goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
+    capabilities:
+      - issue-analysis
+      - planning
+    procedure: |
+      On first run (no previous steps):
+      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
+      2. Read CLAUDE.md (or equivalent project conventions file) to understand coding standards
+      3. Assess whether the issue has enough information to produce a test spec
+      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output status=insufficient_info and terminate
+      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
+
+      On subsequent runs (bounced back by tester with fix_spec):
+      1. Read the tester's output from the previous step to understand what's wrong with the spec
+      2. Revise the test spec accordingly
+
+      After producing the test spec:
+      1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
+      2. Put the hash in frontmatter.plan (required when status=ready)
+    output: "Output a brief summary of the test spec. Frontmatter must include: status (ready or insufficient_info) and plan (CAS hash of the test spec, required when status=ready)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [ready, insufficient_info]
+        plan:
+          type: string
+      required: [status]
+  developer:
+    description: "TDD implementation per test spec"
+    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
+    capabilities:
+      - coding
+    procedure: |
+      1. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's meta.plan)
+      2. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
+      3. Write tests first based on the spec
+      4. Implement the code to make tests pass
+      5. Ensure `bun run build` passes with no errors
+      6. Run `bun test` to verify all tests pass
+    output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [done, failed]
+      required: [status]
+  reviewer:
+    description: "Code standards compliance check"
+    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
+    capabilities:
+      - code-review
+      - static-analysis
+    procedure: |
+      Hard checks (must all pass):
+      1. `bun run build` — no build errors
+      2. `bunx biome check` — no lint violations
+      3. TypeScript strict mode — no type errors
+
+      Soft checks (review against CLAUDE.md conventions):
+      - Functional-first: `function` + `type`, not `class` + `interface`
+      - No optional properties (`?:`) — use `T | null`
+      - Naming conventions (kebab-case files, PascalCase types, camelCase functions)
+      - Module boundary discipline (folder exports via index.ts)
+      - No `console.log` (use structured logger)
+      - No dynamic imports in production code
+
+      Only review standards compliance. Do NOT test functionality.
+      If rejecting, you MUST explain the specific reason in your output.
+    output: "Explain your decision with specific file/line references. Frontmatter must include: approved (true or false)."
+    frontmatter:
+      type: object
+      properties:
+        approved:
+          type: boolean
+      required: [approved]
+  tester:
+    description: "Functional correctness verification"
+    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
+    capabilities:
+      - testing
+    procedure: |
+      1. Run `bun test` for automated test verification
+      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's meta.plan)
+      3. Verify each scenario in the spec is covered and passing
+      4. Determine outcome:
+         - passed: all scenarios verified, tests pass
+         - fix_code: tests fail or implementation doesn't match spec → send back to developer
+         - fix_spec: the spec itself is wrong or incomplete → send back to planner
+    output: "Report test results per scenario. Frontmatter must include: status (passed, fix_code, or fix_spec)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [passed, fix_code, fix_spec]
+      required: [status]
+  committer:
+    description: "Commits and creates PR"
+    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
+    capabilities: []
+    procedure: |
+      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
+      1. Stage all changes: `git add -A`
+      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
+      3. Push the branch: `git push -u origin <branch-name>`
+         - If push hook fails: capture the error log in your output, mark hook_failed
+      4. On push success: create a PR via `tea pr create --title "..." --description "..."`
+         - PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
+    output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
+    frontmatter:
+      type: object
+      properties:
+        success:
+          type: boolean
+      required: [success]
+conditions:
+  insufficientInfo:
+    description: "Planner determined there's not enough info to proceed"
+    expression: "$last('planner').status = 'insufficient_info'"
+  devFailed:
+    description: "Developer failed to implement"
+    expression: "$last('developer').status = 'failed'"
+  rejected:
+    description: "Reviewer rejected the implementation"
+    expression: "$last('reviewer').approved = false"
+  fixCode:
+    description: "Tester found code issues"
+    expression: "$last('tester').status = 'fix_code'"
+  fixSpec:
+    description: "Tester found spec issues"
+    expression: "$last('tester').status = 'fix_spec'"
+  hookFailed:
+    description: "Push hook failed"
+    expression: "$last('committer').success = false"
+graph:
+  $START:
+    - role: "planner"
+  planner:
+    - role: "$END"
+      condition: "insufficientInfo"
+    - role: "developer"
+  developer:
+    - role: "$END"
+      condition: "devFailed"
+    - role: "reviewer"
+  reviewer:
+    - role: "developer"
+      condition: "rejected"
+    - role: "tester"
+  tester:
+    - role: "developer"
+      condition: "fixCode"
+    - role: "planner"
+      condition: "fixSpec"
+    - role: "committer"
+  committer:
+    - role: "developer"
+      condition: "hookFailed"
+    - role: "$END"
@@ -5,6 +5,8 @@
      "**",
      "!**/dist",
      "!**/node_modules",
+      "!**/legacy-packages",
+      "!scripts",
      "!packages/workflow/workflow",
      "!xiaoju/scripts/bundle.ts"
    ]
@@ -36,7 +38,7 @@
      }
    },
    {
-      "includes": ["**/*.d.ts"],
+      "includes": ["**/*.d.ts", "**/vitest.config.*"],
      "linter": {
        "rules": {
          "style": {
@@ -44,6 +46,16 @@
          }
        }
      }
+    },
+    {
+      "includes": ["**/cli.ts", "**/setup.ts"],
+      "linter": {
+        "rules": {
+          "suspicious": {
+            "noConsole": "off"
+          }
+        }
+      }
    }
  ],
  "linter": {
@@ -19,7 +19,7 @@ roles:
    output: |
      Provide your analysis as markdown under the frontmatter.
      The frontmatter must include your structured findings.
-    meta:
+    frontmatter:
      type: object
      properties:
        thesis:
@@ -9,7 +9,7 @@ roles:
      - planning
    procedure: "Analyze the issue and create a detailed, actionable implementation plan."
    output: "Output the plan summary and list of concrete steps."
-    meta:
+    frontmatter:
      type: object
      properties:
        plan:
@@ -28,7 +28,7 @@ roles:
      - testing
    procedure: "Implement the plan. Write code, tests, and ensure existing tests pass."
    output: "List all files changed and provide a summary of the implementation."
-    meta:
+    frontmatter:
      type: object
      properties:
        filesChanged:
@@ -46,7 +46,7 @@ roles:
      - static-analysis
    procedure: "Review the implementation against the plan. Check for bugs, edge cases, and style."
    output: "Approve or reject with detailed comments explaining your decision."
-    meta:
+    frontmatter:
      type: object
      properties:
        approved:
@@ -57,7 +57,7 @@ roles:
 conditions:
  notApproved:
    description: "Reviewer rejected the implementation"
-    expression: "steps[-1].output.approved = false"
+    expression: "$last('reviewer').approved = false"
 graph:
  $START:
    - role: "planner"
@@ -0,0 +1,150 @@
+import { mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { afterEach, beforeEach, describe, expect, test, vi } from "vitest";
+import { cmdSetup, validateModel } from "../commands/setup.js";
+
+describe("validateModel", () => {
+  const BASE_URL = "https://api.example.com/v1";
+  const API_KEY = "sk-test-key";
+  const MODEL = "test-model";
+
+  afterEach(() => {
+    vi.restoreAllMocks();
+  });
+
+  test("success path — returns ok on 200", async () => {
+    const mockFetch = vi
+      .spyOn(globalThis, "fetch")
+      .mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result).toEqual({ ok: true, value: undefined });
+    expect(mockFetch).toHaveBeenCalledOnce();
+
+    const [url, opts] = mockFetch.mock.calls[0]!;
+    expect(url).toBe(`${BASE_URL}/chat/completions`);
+    expect((opts as RequestInit).headers).toEqual(
+      expect.objectContaining({ Authorization: `Bearer ${API_KEY}` }),
+    );
+    const body = JSON.parse((opts as RequestInit).body as string);
+    expect(body).toEqual({
+      model: MODEL,
+      messages: [{ role: "user", content: "hi" }],
+      max_tokens: 1,
+    });
+  });
+
+  test("HTTP 401 — returns error containing 401", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
+    );
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error).toContain("401");
+    }
+  });
+
+  test("HTTP 404 — returns error containing 404", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Not Found", { status: 404, statusText: "Not Found" }),
+    );
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error).toContain("404");
+    }
+  });
+
+  test("network timeout — returns error mentioning timeout", async () => {
+    const err = new DOMException("signal timed out", "AbortError");
+    vi.spyOn(globalThis, "fetch").mockRejectedValue(err);
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error.toLowerCase()).toMatch(/timeout|timed out/);
+    }
+  });
+
+  test("network error (DNS/connection) — returns error mentioning connectivity", async () => {
+    vi.spyOn(globalThis, "fetch").mockRejectedValue(new TypeError("fetch failed"));
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error.toLowerCase()).toMatch(/connect|reach|network/);
+    }
+  });
+
+  test("request body correctness", async () => {
+    const mockFetch = vi
+      .spyOn(globalThis, "fetch")
+      .mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
+
+    await validateModel(BASE_URL, API_KEY, "my-special-model");
+
+    const body = JSON.parse((mockFetch.mock.calls[0]![1] as RequestInit).body as string);
+    expect(body).toEqual({
+      model: "my-special-model",
+      messages: [{ role: "user", content: "hi" }],
+      max_tokens: 1,
+    });
+  });
+});
+
+describe("cmdSetup with validation", () => {
+  let storageRoot: string;
+
+  beforeEach(async () => {
+    storageRoot = await mkdtemp(join(tmpdir(), "uwf-setup-validate-"));
+  });
+
+  afterEach(async () => {
+    vi.restoreAllMocks();
+    await rm(storageRoot, { recursive: true, force: true });
+  });
+
+  const setupArgs = () => ({
+    provider: "testprovider",
+    baseUrl: "https://api.test.com/v1",
+    apiKey: "sk-test",
+    model: "test-model",
+    storageRoot,
+  });
+
+  test("includes validation result on success", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    const result = await cmdSetup(setupArgs());
+
+    expect(result.validation).toEqual({ ok: true, value: undefined });
+    // Config files should still be written
+    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
+  });
+
+  test("includes validation failure — config still saved", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
+    );
+
+    const result = await cmdSetup(setupArgs());
+
+    expect(result.validation).toBeDefined();
+    expect((result.validation as { ok: boolean }).ok).toBe(false);
+    // Config files should still be written despite validation failure
+    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
+  });
+});
@@ -0,0 +1,71 @@
+import { execFileSync } from "node:child_process";
+import { join } from "node:path";
+import { describe, expect, test } from "vitest";
+
+const CLI_PATH = join(import.meta.dirname, "..", "cli.js");
+
+function runCli(args: string[]): { stdout: string; stderr: string; exitCode: number } {
+  try {
+    const stdout = execFileSync("bun", ["run", CLI_PATH, ...args], {
+      encoding: "utf8",
+      env: { ...process.env, WORKFLOW_STORAGE_ROOT: "/tmp/uwf-test-nonexistent" },
+      stdio: ["ignore", "pipe", "pipe"],
+    });
+    return { stdout, stderr: "", exitCode: 0 };
+  } catch (e: unknown) {
+    const err = e as NodeJS.ErrnoException & { stdout?: string; stderr?: string; status?: number };
+    return {
+      stdout: err.stdout ?? "",
+      stderr: err.stderr ?? "",
+      exitCode: err.status ?? 1,
+    };
+  }
+}
+
+describe("thread step --count CLI parsing", () => {
+  test("--help shows -c/--count option", () => {
+    const result = runCli(["thread", "step", "--help"]);
+    expect(result.stdout).toContain("--count");
+    expect(result.stdout).toContain("-c");
+  });
+
+  test("description says 'one or more steps'", () => {
+    const result = runCli(["thread", "step", "--help"]);
+    expect(result.stdout).toContain("one or more steps");
+  });
+});
+
+describe("cmdThreadStep count logic", () => {
+  test("count=0 fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("negative count fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("non-integer count fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("count=1 is the default (no -c flag)", () => {
+    // Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID"]);
+    expect(result.exitCode).not.toBe(0);
+    // Should NOT contain "positive integer" error — should fail on thread lookup instead
+    expect(result.stderr).not.toContain("positive integer");
+  });
+
+  test("count=3 passes validation (fails on thread lookup)", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]);
+    expect(result.exitCode).not.toBe(0);
+    // Should NOT contain "positive integer" error — should fail on thread/storage lookup
+    expect(result.stderr).not.toContain("positive integer");
+  });
+});
@@ -7,6 +7,7 @@ import {
  cmdCasGet,
  cmdCasHas,
  cmdCasPut,
+  cmdCasPutText,
  cmdCasRefs,
  cmdCasReindex,
  cmdCasSchemaGet,
@@ -14,6 +15,7 @@ import {
  cmdCasWalk,
 } from "./commands/cas.js";
 import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
+import { cmdSkillCli } from "./commands/skill.js";
 import {
  cmdThreadFork,
  cmdThreadKill,
@@ -107,15 +109,21 @@ thread

 thread
  .command("step")
-  .description("Execute one step")
+  .description("Execute one or more steps")
  .argument("<thread-id>", "Thread ULID")
  .option("--agent <cmd>", "Override agent command")
-  .action((threadId: string, opts: { agent: string | undefined }) => {
+  .option("-c, --count <number>", "Number of steps to run (default: 1)")
+  .action((threadId: string, opts: { agent: string | undefined; count: string | undefined }) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
      const agentOverride = opts.agent ?? null;
-      const result = await cmdThreadStep(storageRoot, threadId, agentOverride);
-      writeOutput(result);
+      const count = opts.count !== undefined ? Number(opts.count) : 1;
+      const results = await cmdThreadStep(storageRoot, threadId, agentOverride, count);
+      if (results.length === 1) {
+        writeOutput(results[0]);
+      } else {
+        writeOutput(results);
+      }
    });
  });

@@ -220,6 +228,15 @@ thread
    });
  });

+const skill = program.command("skill").description("Built-in skill references for agents");
+
+skill
+  .command("cli")
+  .description("Print a markdown reference of all uwf commands")
+  .action(() => {
+    console.log(cmdSkillCli());
+  });
+
 program
  .command("setup")
  .description("Configure provider, model, and agent")
@@ -285,6 +302,17 @@ cas
    });
  });

+cas
+  .command("put-text")
+  .description("Store a plain text string, print its hash")
+  .argument("<text>", "Text content to store")
+  .action((text: string) => {
+    const storageRoot = resolveStorageRoot();
+    runAction(async () => {
+      writeOutput(await cmdCasPutText(storageRoot, text));
+    });
+  });
+
 cas
  .command("has")
  .description("Check if a hash exists")
@@ -1,10 +1,12 @@
 import { readFileSync } from "node:fs";
 import { join } from "node:path";

-import type { Hash, JSONSchema, Store } from "@uncaged/json-cas";
-import { bootstrap, getSchema, refs, walk } from "@uncaged/json-cas";
+import type { JSONSchema, Store } from "@uncaged/json-cas";
+import { bootstrap, getSchema, putSchema, refs, walk } from "@uncaged/json-cas";
 import { createFsStore } from "@uncaged/json-cas-fs";

+import { TEXT_SCHEMA } from "../schemas.js";
+
 // ---- Helpers ----

 function openStore(storageRoot: string): Store {
@@ -121,3 +123,10 @@ export async function cmdCasSchemaGet(storageRoot: string, hash: string): Promis
  }
  return schema;
 }
+
+export async function cmdCasPutText(storageRoot: string, text: string): Promise<{ hash: string }> {
+  const store = openStore(storageRoot);
+  const typeHash = await putSchema(store, TEXT_SCHEMA);
+  const hash = await store.put(typeHash, text);
+  return { hash };
+}
@@ -1,11 +1,46 @@
 import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
-import { homedir } from "node:os";
-import { join, resolve } from "node:path";
+import { join } from "node:path";
 import { stdin as input, stdout as output } from "node:process";
 import { createInterface } from "node:readline/promises";
-
+import type { Result } from "@uncaged/workflow-util";
 import { parse, stringify } from "yaml";

+/**
+ * Send a minimal chat completion request to verify the model is reachable.
+ * Returns ok on 2xx, error with reason string otherwise.
+ */
+export async function validateModel(
+  baseUrl: string,
+  apiKey: string,
+  model: string,
+): Promise<Result<void, string>> {
+  try {
+    const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
+    const res = await fetch(url, {
+      method: "POST",
+      headers: {
+        Authorization: `Bearer ${apiKey}`,
+        "Content-Type": "application/json",
+      },
+      body: JSON.stringify({
+        model,
+        messages: [{ role: "user", content: "hi" }],
+        max_tokens: 1,
+      }),
+      signal: AbortSignal.timeout(15_000),
+    });
+    if (!res.ok) {
+      return { ok: false, error: `HTTP ${res.status} ${res.statusText}` };
+    }
+    return { ok: true, value: undefined };
+  } catch (err: unknown) {
+    if (err instanceof DOMException && err.name === "AbortError") {
+      return { ok: false, error: "Request timed out — model endpoint unreachable" };
+    }
+    return { ok: false, error: `Network error — could not reach endpoint (${String(err)})` };
+  }
+}
+
 /**
 * Preset provider list — embedded to avoid runtime YAML loading dependency.
 * Keep in sync with providers.yaml in cli-workflow.
@@ -164,12 +199,16 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
  envData[envName] = args.apiKey;
  saveEnvFile(envPath, envData);

+  // Validate model connectivity
+  const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
+
  return {
    configPath,
    envPath,
    provider: args.provider,
    model: args.model,
    defaultAgent: merged.defaultAgent,
+    validation,
  };
 }

@@ -329,7 +368,7 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s

    console.log(`  → ${providerName}/${model}\n`);

-    await cmdSetup({
+    const setupResult = await cmdSetup({
      provider: providerName,
      baseUrl,
      apiKey,
@@ -337,6 +376,19 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s
      storageRoot,
    });

+    // Show validation result
+    if (setupResult.validation && typeof setupResult.validation === "object") {
+      const v = setupResult.validation as { ok: boolean; error?: string };
+      if (v.ok) {
+        console.log("✓ Model verified — connection successful.\n");
+      } else {
+        console.log(`\n⚠ Warning: Could not reach model — ${v.error}`);
+        console.log(
+          "  Config saved, but you may want to try a different model or check your API key.\n",
+        );
+      }
+    }
+
    console.log("Setup complete! Get started:\n");
    console.log("  uwf workflow put <workflow.yaml>   Register a workflow");
    console.log('  uwf thread start <name> -p "..."   Start a thread');
@@ -0,0 +1 @@
+export { generateCliReference as cmdSkillCli } from "@uncaged/workflow-util";
@@ -673,6 +673,27 @@ export async function cmdThreadStep(
  storageRoot: string,
  threadId: ThreadId,
  agentOverride: string | null,
+  count: number,
+): Promise<StepOutput[]> {
+  if (count < 1 || !Number.isInteger(count)) {
+    fail(`--count must be a positive integer, got: ${count}`);
+  }
+
+  const results: StepOutput[] = [];
+  for (let i = 0; i < count; i++) {
+    const result = await cmdThreadStepOnce(storageRoot, threadId, agentOverride);
+    results.push(result);
+    if (result.done) {
+      break;
+    }
+  }
+  return results;
+}
+
+async function cmdThreadStepOnce(
+  storageRoot: string,
+  threadId: ThreadId,
+  agentOverride: string | null,
 ): Promise<StepOutput> {
  const index = await loadThreadsIndex(storageRoot);
  const headHash = index[threadId];
@@ -2,7 +2,12 @@ import { readFile } from "node:fs/promises";

 import type { JSONSchema } from "@uncaged/json-cas";
 import { putSchema, validate } from "@uncaged/json-cas";
-import type { CasRef, RoleDefinition, WorkflowPayload } from "@uncaged/workflow-protocol";
+import type {
+  CasRef,
+  RoleDefinition,
+  Transition,
+  WorkflowPayload,
+} from "@uncaged/workflow-protocol";
 import { parse } from "yaml";

 import {
@@ -46,11 +51,28 @@ function isJsonSchema(value: unknown): value is JSONSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
 }

-async function resolveMetaRef(uwf: UwfStore, roleName: string, meta: unknown): Promise<CasRef> {
-  if (!isJsonSchema(meta)) {
-    fail(`role "${roleName}": meta must be a JSON Schema object`);
+/** Normalize graph transitions: ensure condition is null (not undefined) for fallback entries. */
+function normalizeGraph(graph: Record<string, Transition[]>): Record<string, Transition[]> {
+  const result: Record<string, Transition[]> = {};
+  for (const [node, transitions] of Object.entries(graph)) {
+    result[node] = transitions.map((t) => ({
+      role: t.role,
+      condition: t.condition ?? null,
+    }));
  }
-  const schema: JSONSchema = meta.title === undefined ? { ...meta, title: roleName } : meta;
+  return result;
+}
+
+async function resolveFrontmatterRef(
+  uwf: UwfStore,
+  roleName: string,
+  frontmatter: unknown,
+): Promise<CasRef> {
+  if (!isJsonSchema(frontmatter)) {
+    fail(`role "${roleName}": frontmatter must be a JSON Schema object`);
+  }
+  const schema: JSONSchema =
+    frontmatter.title === undefined ? { ...frontmatter, title: roleName } : frontmatter;
  return putSchema(uwf.store, schema);
 }

@@ -60,14 +82,18 @@ export async function materializeWorkflowPayload(
 ): Promise<WorkflowPayload> {
  const roles: Record<string, RoleDefinition> = {};
  for (const [roleName, role] of Object.entries(raw.roles)) {
-    const meta = await resolveMetaRef(uwf, `${raw.name}.${roleName}`, role.meta);
+    const frontmatter = await resolveFrontmatterRef(
+      uwf,
+      `${raw.name}.${roleName}`,
+      role.frontmatter,
+    );
    roles[roleName] = {
      description: role.description,
      goal: role.goal,
      capabilities: role.capabilities,
      procedure: role.procedure,
      output: role.output,
-      meta,
+      frontmatter,
    };
  }
  return {
@@ -75,7 +101,7 @@ export async function materializeWorkflowPayload(
    description: raw.description,
    roles,
    conditions: raw.conditions,
-    graph: raw.graph,
+    graph: normalizeGraph(raw.graph),
  };
 }

@@ -2,10 +2,13 @@ import type { Hash, Store } from "@uncaged/json-cas";
 import { putSchema } from "@uncaged/json-cas";
 import { START_NODE_SCHEMA, STEP_NODE_SCHEMA, WORKFLOW_SCHEMA } from "@uncaged/workflow-protocol";

+export const TEXT_SCHEMA = { type: "string" as const };
+
 export type UwfSchemaHashes = {
  workflow: Hash;
  startNode: Hash;
  stepNode: Hash;
+  text: Hash;
 };

 /**
@@ -13,10 +16,11 @@ export type UwfSchemaHashes = {
 * Idempotent: safe to call on every CLI invocation.
 */
 export async function registerUwfSchemas(store: Store): Promise<UwfSchemaHashes> {
-  const [workflow, startNode, stepNode] = await Promise.all([
+  const [workflow, startNode, stepNode, text] = await Promise.all([
    putSchema(store, WORKFLOW_SCHEMA),
    putSchema(store, START_NODE_SCHEMA),
    putSchema(store, STEP_NODE_SCHEMA),
+    putSchema(store, TEXT_SCHEMA),
  ]);
-  return { workflow, startNode, stepNode };
+  return { workflow, startNode, stepNode, text };
 }
@@ -15,8 +15,8 @@ function isRoleDefinition(value: unknown): boolean {
  if (!isRecord(value)) {
    return false;
  }
-  const meta = value.meta;
-  const metaOk = isRecord(meta) && typeof meta.type === "string";
+  const frontmatter = value.frontmatter;
+  const frontmatterOk = isRecord(frontmatter) && typeof frontmatter.type === "string";
  const capabilities = value.capabilities;
  const capabilitiesOk =
    Array.isArray(capabilities) && capabilities.every((c) => typeof c === "string");
@@ -26,7 +26,7 @@ function isRoleDefinition(value: unknown): boolean {
    capabilitiesOk &&
    typeof value.procedure === "string" &&
    typeof value.output === "string" &&
-    metaOk
+    frontmatterOk
  );
 }

@@ -42,7 +42,10 @@ function isTransition(value: unknown): boolean {
    return false;
  }
  const condition = value.condition;
-  return typeof value.role === "string" && (condition === null || typeof condition === "string");
+  return (
+    typeof value.role === "string" &&
+    (condition === null || condition === undefined || typeof condition === "string")
+  );
 }

 function isStringRecord(value: unknown, itemCheck: (item: unknown) => boolean): boolean {
@@ -1,4 +1,5 @@
 import { spawn } from "node:child_process";
+import type { Store } from "@uncaged/json-cas";

 import {
  type AgentContext,
@@ -10,7 +11,6 @@ import {
 import {
  loadHermesSession,
  parseSessionIdFromStdout,
-  storeHermesRawOutput,
  storeHermesSessionDetail,
 } from "./session-detail.js";

@@ -52,17 +52,8 @@ export function buildHermesPrompt(ctx: AgentContext): string {
  return parts.join("\n");
 }

-function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: string }> {
+function spawnHermes(args: string[]): Promise<{ stdout: string; stderr: string }> {
  return new Promise((resolve, reject) => {
-    const args = [
-      "chat",
-      "-q",
-      prompt,
-      "--yolo",
-      "--max-turns",
-      String(HERMES_MAX_TURNS),
-      "--quiet",
-    ];
    const child = spawn(HERMES_COMMAND, args, {
      env: process.env,
      shell: false,
@@ -94,23 +85,73 @@ function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: stri
  });
 }

+function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: string }> {
+  return spawnHermes([
+    "chat",
+    "-q",
+    prompt,
+    "--yolo",
+    "--max-turns",
+    String(HERMES_MAX_TURNS),
+    "--quiet",
+  ]);
+}
+
+function spawnHermesResume(
+  sessionId: string,
+  message: string,
+): Promise<{ stdout: string; stderr: string }> {
+  return spawnHermes([
+    "chat",
+    "--resume",
+    sessionId,
+    "-q",
+    message,
+    "--yolo",
+    "--max-turns",
+    String(HERMES_MAX_TURNS),
+    "--quiet",
+  ]);
+}
+
+function parseSessionId(stdout: string, stderr: string): string {
+  const sessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
+  if (sessionId === null) {
+    throw new Error(
+      "Failed to parse session_id from hermes output.\n" +
+        `stderr (first 200 chars): ${stderr.slice(0, 200)}\n` +
+        `stdout (first 200 chars): ${stdout.slice(0, 200)}`,
+    );
+  }
+  return sessionId;
+}
+
+async function buildResultFromSession(sessionId: string, store: Store): Promise<AgentRunResult> {
+  const session = await loadHermesSession(sessionId);
+  if (session === null) {
+    throw new Error(`Failed to load hermes session file for session_id: ${sessionId}`);
+  }
+  const { detailHash, output } = await storeHermesSessionDetail(store, session);
+  return { output, detailHash, sessionId };
+}
+
 async function runHermes(ctx: AgentContext): Promise<AgentRunResult> {
  const fullPrompt = buildHermesPrompt(ctx);
  const { stdout, stderr } = await spawnHermesChat(fullPrompt);
-  const { store } = ctx;
+  const sessionId = parseSessionId(stdout, stderr);
+  return buildResultFromSession(sessionId, ctx.store);
+}

-  // --quiet mode: session_id may be on stdout or stderr
-  const sessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
-  if (sessionId !== null) {
-    const session = await loadHermesSession(sessionId);
-    if (session !== null) {
-      const { detailHash, output } = await storeHermesSessionDetail(store, session);
-      return { output, detailHash };
-    }
-  }
-
-  const detailHash = await storeHermesRawOutput(store, stdout);
-  return { output: stdout, detailHash };
+async function continueHermes(
+  sessionId: string,
+  message: string,
+  store: Store,
+): Promise<AgentRunResult> {
+  const { stdout, stderr } = await spawnHermesResume(sessionId, message);
+  // Resume may return a new session_id
+  const newSessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
+  const resolvedId = newSessionId ?? sessionId;
+  return buildResultFromSession(resolvedId, store);
 }

 /** Agent CLI factory: parses argv, runs Hermes, extracts output, writes StepNode. */
@@ -118,5 +159,6 @@ export function createHermesAgent(): () => Promise<void> {
  return createAgent({
    name: "hermes",
    run: runHermes,
+    continue: continueHermes,
  });
 }
@@ -2,13 +2,32 @@ import { describe, expect, test } from "vitest";

 import { buildOutputFormatInstruction } from "../src/build-output-format-instruction.js";

+const PLANNER_SCHEMA = {
+  type: "object",
+  properties: {
+    status: { type: "string", enum: ["ready", "insufficient_info"] },
+    plan: { type: "string" },
+  },
+  required: ["status"],
+  additionalProperties: false,
+};
+
+const REVIEWER_SCHEMA = {
+  type: "object",
+  properties: {
+    approved: { type: "boolean" },
+  },
+  required: ["approved"],
+  additionalProperties: false,
+};
+
 describe("buildOutputFormatInstruction", () => {
  test("always includes the frontmatter example block", () => {
    const result = buildOutputFormatInstruction({});
    expect(result).toContain("---");
-    expect(result).toContain("status: done");
-    expect(result).toContain("confidence:");
-    expect(result).toContain("scope: role");
+    expect(result).not.toContain("status: done");
+    expect(result).not.toContain("confidence:");
+    expect(result).not.toContain("scope: role");
  });

  test("always marks frontmatter as the primary deliverable", () => {
@@ -16,17 +35,36 @@ describe("buildOutputFormatInstruction", () => {
    expect(result).toContain("primary deliverable");
  });

-  test("lists fields from a flat object schema", () => {
+  test("generates planner-specific YAML example from schema", () => {
+    const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
+    expect(result).toContain("status: ready  # required | ready | insufficient_info");
+    expect(result).toContain("plan: <string>");
+    expect(result).not.toContain("status: done");
+    expect(result).not.toContain("confidence:");
+    expect(result).not.toContain("artifacts:");
+  });
+
+  test("generates reviewer-specific YAML example from schema", () => {
+    const result = buildOutputFormatInstruction(REVIEWER_SCHEMA);
+    expect(result).toContain("approved: true  # required | true | false");
+    expect(result).not.toContain("status:");
+  });
+
+  test("lists fields from a flat object schema with required marker", () => {
    const schema = {
      type: "object",
      properties: {
        status: { type: "string" },
        confidence: { type: "number" },
      },
+      required: ["status"],
    };
    const result = buildOutputFormatInstruction(schema);
-    expect(result).toContain("`status`");
+    expect(result).toContain("`status` (required)");
    expect(result).toContain("`confidence`");
+    expect(result).not.toContain("`confidence` (required)");
+    expect(result).toContain("status: <string>  # required");
+    expect(result).toContain("confidence: <number>");
  });

  test("lists union of fields from an anyOf schema", () => {
@@ -45,6 +83,8 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    expect(result).toContain("`alpha`");
    expect(result).toContain("`beta`");
+    expect(result).toContain("alpha: <string>");
+    expect(result).toContain("beta: <number>");
  });

  test("lists union of fields from a oneOf schema", () => {
@@ -63,6 +103,8 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    expect(result).toContain("`foo`");
    expect(result).toContain("`bar`");
+    expect(result).toContain("foo: <string>");
+    expect(result).toContain("bar: true  # true | false");
  });

  test("falls back gracefully for a non-object schema with no properties", () => {
@@ -80,6 +122,45 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    const matches = [...result.matchAll(/`shared`/g)];
    expect(matches.length).toBe(1);
+    expect(result).toContain("shared: <string>");
+  });
+
+  test("marks required when any union variant requires the field", () => {
+    const schema = {
+      anyOf: [
+        {
+          type: "object",
+          properties: { shared: { type: "string" } },
+          required: ["shared"],
+        },
+        { type: "object", properties: { shared: { type: "number" } } },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("`shared` (required)");
+    expect(result).toContain("shared: <string>  # required");
+  });
+
+  test("explicitly forbids extra frontmatter fields", () => {
+    const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
+    expect(result).toMatch(/\b(only|exclusively)\b.*fields/i);
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
+  });
+
+  test("forbids extra fields even for empty schema", () => {
+    const result = buildOutputFormatInstruction({});
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
+  });
+
+  test("forbids extra fields for anyOf/oneOf schemas", () => {
+    const schema = {
+      anyOf: [
+        { type: "object", properties: { alpha: { type: "string" } } },
+        { type: "object", properties: { beta: { type: "number" } } },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
  });

  test("includes focus reminder about role scope", () => {
@@ -18,13 +18,16 @@ describe("buildRolePrompt", () => {
    expect(result).toContain("## Capabilities");
    expect(result).toContain("- cursor-agent");
    expect(result).toContain("- file-edit");
+    expect(result).toContain("## Prepare");
+    expect(result).toContain("uwf CLI Reference");
+    expect(result).toContain("cursor-agent, file-edit");
    expect(result).toContain("## Procedure");
    expect(result).toContain("Implement the feature.");
    expect(result).toContain("## Output");
    expect(result).toContain("Summarize changes.");
  });

-  test("empty fields are omitted", () => {
+  test("empty fields are omitted but Prepare is always present", () => {
    const role: RoleDefinition = {
      description: "A reviewer",
      goal: "You are a code reviewer.",
@@ -35,12 +38,14 @@ describe("buildRolePrompt", () => {
    };
    const result = buildRolePrompt(role);
    expect(result).toContain("## Goal");
+    expect(result).toContain("## Prepare");
+    expect(result).toContain("uwf CLI Reference");
    expect(result).toContain("## Procedure");
    expect(result).not.toContain("## Capabilities");
    expect(result).not.toContain("## Output");
  });

-  test("all empty returns empty string", () => {
+  test("all empty still includes Prepare section", () => {
    const role: RoleDefinition = {
      description: "Minimal",
      goal: "",
@@ -50,7 +55,12 @@ describe("buildRolePrompt", () => {
      meta: "placeholder00000" as string,
    };
    const result = buildRolePrompt(role);
-    expect(result).toBe("");
+    expect(result).toContain("## Prepare");
+    expect(result).toContain("uwf CLI Reference");
+    expect(result).not.toContain("## Goal");
+    expect(result).not.toContain("## Capabilities");
+    expect(result).not.toContain("## Procedure");
+    expect(result).not.toContain("## Output");
  });

  test("capabilities rendered as bullet list", () => {
@@ -29,6 +29,27 @@ const STRICT_SCHEMA = {
  additionalProperties: false,
 };

+/** Role-specific schema (reviewer) — only approved, no standard agent fields. */
+const REVIEWER_SCHEMA = {
+  type: "object",
+  properties: {
+    approved: { type: "boolean" },
+  },
+  required: ["approved"],
+  additionalProperties: false,
+};
+
+/** Role-specific schema (planner) — custom status enum + plan hash. */
+const PLANNER_SCHEMA = {
+  type: "object",
+  properties: {
+    status: { type: "string", enum: ["ready", "insufficient_info"] },
+    plan: { type: "string" },
+  },
+  required: ["status"],
+  additionalProperties: false,
+};
+
 async function makeStoreWithSchema(schema: Record<string, unknown>) {
  const store = createMemoryStore();
  const schemaHash = await putSchema(store, schema);
@@ -134,3 +155,48 @@ describe("tryFrontmatterFastPath — fallback: schema mismatch", () => {
    expect(result).toBeNull();
  });
 });
+
+// ── Role-specific schema fields ───────────────────────────────────────────────
+
+describe("tryFrontmatterFastPath — role-specific fields", () => {
+  test("extracts approved only for reviewer schema (no extra standard fields)", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
+
+    const raw = "---\napproved: true\n---\n\nReview passed.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).not.toBeNull();
+
+    const node = store.get(result!.outputHash);
+    expect(node).not.toBeNull();
+    const payload = node!.payload as Record<string, unknown>;
+    expect(payload).toEqual({ approved: true });
+    expect(payload.status).toBeUndefined();
+    expect(payload.scope).toBeUndefined();
+  });
+
+  test("extracts plan and role-specific status for planner schema", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(PLANNER_SCHEMA);
+
+    const raw = "---\nstatus: ready\nplan: 01HASHPLANNER0001\n---\n\nSpec summary.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).not.toBeNull();
+
+    const node = store.get(result!.outputHash);
+    expect(node).not.toBeNull();
+    const payload = node!.payload as Record<string, unknown>;
+    expect(payload.status).toBe("ready");
+    expect(payload.plan).toBe("01HASHPLANNER0001");
+    expect(payload.scope).toBeUndefined();
+  });
+
+  test("returns null when required role-specific field is missing", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
+
+    const raw = "---\nstatus: done\nscope: role\n---\n\nBody.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).toBeNull();
+  });
+});
@@ -1,5 +1,11 @@
 import type { JSONSchema } from "@uncaged/json-cas";

+type SchemaProperty = {
+  name: string;
+  schema: JSONSchema;
+  required: boolean;
+};
+
 /**
 * Extract top-level property names from a JSON Schema object.
 *
@@ -9,9 +15,44 @@ import type { JSONSchema } from "@uncaged/json-cas";
 *
 * Returns an empty array for schemas with no inspectable property definitions.
 */
-function extractSchemaFields(schema: JSONSchema): string[] {
+export function extractSchemaFields(schema: JSONSchema): string[] {
+  return extractSchemaProperties(schema).map((p) => p.name);
+}
+
+function extractSchemaProperties(schema: JSONSchema): SchemaProperty[] {
+  const objectSchemas = collectObjectSchemas(schema);
+  if (objectSchemas.length === 0) {
+    return [];
+  }
+
+  const byName = new Map<string, SchemaProperty>();
+
+  for (const objectSchema of objectSchemas) {
+    const requiredSet = new Set(
+      Array.isArray(objectSchema.required) ? (objectSchema.required as string[]) : [],
+    );
+    const properties = objectSchema.properties as Record<string, JSONSchema> | null | undefined;
+    if (typeof properties !== "object" || properties === null) {
+      continue;
+    }
+
+    for (const [name, propSchema] of Object.entries(properties)) {
+      const required = requiredSet.has(name);
+      const existing = byName.get(name);
+      if (existing === undefined) {
+        byName.set(name, { name, schema: propSchema, required });
+      } else if (required) {
+        byName.set(name, { ...existing, required: true });
+      }
+    }
+  }
+
+  return [...byName.values()];
+}
+
+function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
  if (typeof schema.properties === "object" && schema.properties !== null) {
-    return Object.keys(schema.properties as Record<string, unknown>);
+    return [schema];
  }

  const unionKey = Array.isArray(schema.anyOf)
@@ -20,18 +61,109 @@ function extractSchemaFields(schema: JSONSchema): string[] {
      ? "oneOf"
      : null;

-  if (unionKey !== null) {
-    const variants = schema[unionKey] as JSONSchema[];
-    const fieldSet = new Set<string>();
-    for (const variant of variants) {
-      for (const field of extractSchemaFields(variant)) {
-        fieldSet.add(field);
-      }
-    }
-    return [...fieldSet];
+  if (unionKey === null) {
+    return [];
  }

-  return [];
+  const variants = schema[unionKey] as JSONSchema[];
+  const result: JSONSchema[] = [];
+  for (const variant of variants) {
+    result.push(...collectObjectSchemas(variant));
+  }
+  return result;
+}
+
+function resolvePropertySchema(prop: JSONSchema): JSONSchema {
+  if (Array.isArray(prop.enum) && prop.enum.length > 0) {
+    return prop;
+  }
+
+  const unionKey = Array.isArray(prop.anyOf) ? "anyOf" : Array.isArray(prop.oneOf) ? "oneOf" : null;
+
+  if (unionKey !== null) {
+    const variants = prop[unionKey] as JSONSchema[];
+    const nonNull = variants.filter((v) => v.type !== "null");
+    if (nonNull.length === 1) {
+      return nonNull[0];
+    }
+  }
+
+  return prop;
+}
+
+function formatYamlScalar(value: unknown): string {
+  if (typeof value === "boolean") {
+    return String(value);
+  }
+  if (typeof value === "number") {
+    return String(value);
+  }
+  return String(value);
+}
+
+function buildPropertyComment(parts: string[]): string {
+  const filtered = parts.filter((p) => p.length > 0);
+  return filtered.length > 0 ? `  # ${filtered.join(" | ")}` : "";
+}
+
+function buildPropertyExampleLine(prop: SchemaProperty): string {
+  const resolved = resolvePropertySchema(prop.schema);
+  const commentParts: string[] = [];
+  if (prop.required) {
+    commentParts.push("required");
+  }
+
+  if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
+    const enumValues = resolved.enum.map((v) => String(v));
+    commentParts.push(...enumValues);
+    const first = resolved.enum[0];
+    return `${prop.name}: ${formatYamlScalar(first)}${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "boolean") {
+    commentParts.push("true", "false");
+    return `${prop.name}: true${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "string") {
+    return `${prop.name}: <string>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "number" || resolved.type === "integer") {
+    return `${prop.name}: <number>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "array") {
+    return `${prop.name}:\n  - <item>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "object") {
+    return `${prop.name}: <object>${buildPropertyComment(commentParts)}`;
+  }
+
+  return `${prop.name}: <value>${buildPropertyComment(commentParts)}`;
+}
+
+function buildYamlExampleBlock(properties: SchemaProperty[]): string {
+  if (properties.length === 0) {
+    return "---\n\n... your markdown work here ...";
+  }
+
+  const lines = properties.map((p) => buildPropertyExampleLine(p));
+  return `---\n${lines.join("\n")}\n---\n\n... your markdown work here ...`;
+}
+
+function buildFieldList(properties: SchemaProperty[]): string {
+  if (properties.length === 0) {
+    return "  (schema fields will be extracted automatically)";
+  }
+
+  return properties
+    .map((p) => {
+      const suffix = p.required ? " (required)" : "";
+      return `  - \`${p.name}\`${suffix}`;
+    })
+    .join("\n");
 }

 /**
@@ -42,28 +174,16 @@ function extractSchemaFields(schema: JSONSchema): string[] {
 * system prompt so the deliverable format is the first thing the agent sees.
 */
 export function buildOutputFormatInstruction(schema: JSONSchema): string {
-  const fields = extractSchemaFields(schema);
-
-  const fieldList =
-    fields.length > 0
-      ? fields.map((f) => `  - \`${f}\``).join("\n")
-      : "  (schema fields will be extracted automatically)";
+  const properties = extractSchemaProperties(schema);
+  const yamlExample = buildYamlExampleBlock(properties);
+  const fieldList = buildFieldList(properties);

  return `## Deliverable Format

 Your response MUST begin with a YAML frontmatter block followed by your markdown work:

 \`\`\`
---
-status: done          # done | needs_input | in_progress | failed
-next: <role-name>     # suggested next role, or omit
-confidence: 0.9       # 0.0–1.0, your self-assessed confidence
-artifacts:            # list of file paths or CAS hashes you produced
-  - path/to/file.ts
-scope: role           # role | thread
---
-
-... your markdown work here ...
+${yamlExample}
 \`\`\`

 The frontmatter is the **primary deliverable** — the engine reads it directly.
@@ -71,5 +191,7 @@ Your meta output must satisfy these fields:

 ${fieldList}

+Output ONLY the fields listed above. Do not add extra fields that are not specified in the schema.
+
 Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope.`;
 }
@@ -1,10 +1,15 @@
 import type { RoleDefinition } from "@uncaged/workflow-protocol";
+import { generateCliReference } from "@uncaged/workflow-util";

 /**
 * Build the role prompt from a RoleDefinition.
 *
- * Assembles structured sections: Goal, Capabilities, Procedure, Output.
+ * Assembles structured sections: Goal, Capabilities, Prepare, Procedure, Output.
 * Empty strings and empty arrays are omitted from the output.
+ *
+ * The Prepare section always inlines the uwf CLI reference so the agent has
+ * workflow knowledge without needing to run an external command. The capabilities
+ * array is rendered as keyword hints for implicit skill loading.
 */
 export function buildRolePrompt(role: RoleDefinition): string {
  const sections: string[] = [];
@@ -18,6 +23,15 @@ export function buildRolePrompt(role: RoleDefinition): string {
    sections.push(`## Capabilities\n\n${list}`);
  }

+  const prepareLines: string[] = [generateCliReference()];
+  if (role.capabilities.length > 0) {
+    const keywords = role.capabilities.join(", ");
+    prepareLines.push(
+      `You have the following capabilities: ${keywords}. Load relevant skills matching these keywords before starting work.`,
+    );
+  }
+  sections.push(`## Prepare\n\n${prepareLines.join("\n\n")}`);
+
  if (role.procedure !== "") {
    sections.push(`## Procedure\n\n${role.procedure}`);
  }
@@ -1,13 +1,139 @@
 import type { Store } from "@uncaged/json-cas";
-import { validate } from "@uncaged/json-cas";
+import { getSchema, validate } from "@uncaged/json-cas";
 import type { CasRef } from "@uncaged/workflow-protocol";
-import { parseFrontmatterMarkdown, validateFrontmatter } from "@uncaged/workflow-util";
+import {
+  type AgentFrontmatter,
+  createLogger,
+  parseFrontmatterMarkdown,
+  validateFrontmatter,
+} from "@uncaged/workflow-util";
+import { parse as parseYaml } from "yaml";
+
+import { extractSchemaFields } from "./build-output-format-instruction.js";
+
+const log = createLogger({ sink: { kind: "stderr" } });
+
+const STANDARD_KEYS = ["status", "next", "confidence", "artifacts", "scope"] as const;
+
+type StandardKey = (typeof STANDARD_KEYS)[number];

 export type FrontmatterFastPathResult = {
  body: string;
  outputHash: CasRef;
 };

+function extractYamlBlock(raw: string): string | null {
+  const fence = "---";
+  if (!raw.startsWith(fence)) {
+    return null;
+  }
+
+  const rest = raw.slice(fence.length);
+  if (rest.length > 0 && rest[0] !== "\n" && rest[0] !== "\r") {
+    return null;
+  }
+
+  const afterOpen = rest.startsWith("\n") ? rest.slice(1) : rest;
+  const closeIndex = afterOpen.indexOf(`\n${fence}`);
+  if (closeIndex === -1) {
+    return null;
+  }
+
+  return afterOpen.slice(0, closeIndex);
+}
+
+function parseRawFrontmatterFields(raw: string): Record<string, unknown> {
+  const yamlText = extractYamlBlock(raw);
+  if (yamlText === null) {
+    return {};
+  }
+
+  try {
+    const parsed = parseYaml(yamlText);
+    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
+      return {};
+    }
+    return parsed as Record<string, unknown>;
+  } catch {
+    return {};
+  }
+}
+
+function defaultCandidate(frontmatter: AgentFrontmatter): Record<string, unknown> {
+  return {
+    status: frontmatter.status,
+    next: frontmatter.next,
+    confidence: frontmatter.confidence,
+    artifacts: [...frontmatter.artifacts],
+    scope: frontmatter.scope,
+  };
+}
+
+function pickStandardField(frontmatter: AgentFrontmatter, key: StandardKey): unknown {
+  switch (key) {
+    case "status":
+      return frontmatter.status;
+    case "next":
+      return frontmatter.next;
+    case "confidence":
+      return frontmatter.confidence;
+    case "artifacts":
+      return [...frontmatter.artifacts];
+    case "scope":
+      return frontmatter.scope;
+  }
+}
+
+function isStandardKey(key: string): key is StandardKey {
+  return (STANDARD_KEYS as readonly string[]).includes(key);
+}
+
+function pickFieldValue(
+  field: string,
+  frontmatter: AgentFrontmatter,
+  rawFields: Record<string, unknown>,
+): unknown | undefined {
+  if (!isStandardKey(field)) {
+    return Object.hasOwn(rawFields, field) ? rawFields[field] : undefined;
+  }
+
+  const coerced = pickStandardField(frontmatter, field);
+  if (field === "artifacts" || field === "scope") {
+    return coerced;
+  }
+  if (coerced !== null) {
+    return coerced;
+  }
+  return Object.hasOwn(rawFields, field) ? rawFields[field] : coerced;
+}
+
+/**
+ * Build a CAS candidate object from schema property keys and parsed frontmatter.
+ *
+ * When the schema has no inspectable properties, falls back to the five standard
+ * agent frontmatter fields for backward compatibility.
+ */
+function buildCandidate(
+  frontmatter: AgentFrontmatter,
+  rawFields: Record<string, unknown>,
+  schemaFields: string[],
+): Record<string, unknown> {
+  if (schemaFields.length === 0) {
+    return defaultCandidate(frontmatter);
+  }
+
+  const candidate: Record<string, unknown> = {};
+
+  for (const field of schemaFields) {
+    const value = pickFieldValue(field, frontmatter, rawFields);
+    if (value !== undefined) {
+      candidate[field] = value;
+    }
+  }
+
+  return candidate;
+}
+
 /**
 * Try to satisfy `outputSchema` from frontmatter fields alone.
 *
@@ -32,16 +158,22 @@ export async function tryFrontmatterFastPath(

  const validationErrors = validateFrontmatter(frontmatter);
  if (validationErrors.length > 0) {
+    log(
+      "9GNPS4WY",
+      `frontmatter validation errors: ${validationErrors.map((e) => e.message).join("; ")}`,
+    );
    return null;
  }

-  const candidate: Record<string, unknown> = {
-    status: frontmatter.status,
-    next: frontmatter.next,
-    confidence: frontmatter.confidence,
-    artifacts: [...frontmatter.artifacts],
-    scope: frontmatter.scope,
-  };
+  const schema = getSchema(store, outputSchema);
+  if (schema === null) {
+    log("8FHMR2QX", `output schema not found in CAS: ${outputSchema}`);
+    return null;
+  }
+
+  const schemaFields = extractSchemaFields(schema);
+  const rawFields = parseRawFrontmatterFields(raw);
+  const candidate = buildCandidate(frontmatter, rawFields, schemaFields);

  let outputHash: CasRef;
  let node: ReturnType<Store["get"]>;
@@ -50,10 +182,12 @@ export async function tryFrontmatterFastPath(
    outputHash = await store.put(outputSchema, candidate);
    node = store.get(outputHash);
  } catch {
+    log("2KMQT7NR", "failed to store frontmatter candidate in CAS");
    return null;
  }

  if (node === null || !validate(store, node)) {
+    log("2KMQT7NR", "stored frontmatter candidate failed schema validation");
    return null;
  }

@@ -12,4 +12,10 @@ export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
 export { createAgent } from "./run.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig } from "./storage.js";
-export type { AgentContext, AgentOptions, AgentRunFn, AgentRunResult } from "./types.js";
+export type {
+  AgentContext,
+  AgentContinueFn,
+  AgentOptions,
+  AgentRunFn,
+  AgentRunResult,
+} from "./types.js";
@@ -3,11 +3,12 @@ import type { CasRef, StepNodePayload, ThreadId } from "@uncaged/workflow-protoc
 import { config as loadDotenv } from "dotenv";
 import { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
 import { buildContextWithMeta } from "./context.js";
-import { extract } from "./extract.js";
 import { tryFrontmatterFastPath } from "./frontmatter.js";
 import type { AgentStore } from "./storage.js";
-import { getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
-import type { AgentContext, AgentOptions, AgentRunResult } from "./types.js";
+import { getEnvPath, resolveStorageRoot } from "./storage.js";
+import type { AgentOptions } from "./types.js";
+
+const MAX_FRONTMATTER_RETRIES = 2;

 function fail(message: string): never {
  process.stderr.write(`${message}\n`);
@@ -66,31 +67,16 @@ async function writeStepNode(options: {
  return hash;
 }

-async function runAgent(options: AgentOptions, ctx: AgentContext): Promise<AgentRunResult> {
-  return runWithMessage("agent run failed", () => options.run(ctx));
-}
-
-async function extractOutput(
+async function tryExtractOutput(
  rawOutput: string,
  outputSchema: CasRef,
-  storageRoot: string,
  ctx: Awaited<ReturnType<typeof buildContextWithMeta>>,
-): Promise<CasRef> {
-  const fastPath = await runWithMessage("frontmatter fast path", () =>
-    tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store),
-  ).catch(() => null);
-
+): Promise<CasRef | null> {
+  const fastPath = await tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store);
  if (fastPath !== null) {
    return fastPath.outputHash;
  }
-
-  const config = await runWithMessage("failed to load config", () =>
-    loadWorkflowConfig(storageRoot),
-  );
-  const extracted = await runWithMessage("extract failed", () =>
-    extract(rawOutput, outputSchema, config),
-  );
-  return extracted.hash;
+  return null;
 }

 async function persistStep(options: {
@@ -112,11 +98,6 @@ async function persistStep(options: {
  });
 }

-/**
- * Create an agent CLI entrypoint.
- * Parses argv (`<thread-id> <role>`), runs the agent, extracts structured output,
- * writes StepNode to CAS, and prints the new node hash to stdout.
- */
 export function createAgent(options: AgentOptions): () => Promise<void> {
  return async function main(): Promise<void> {
    const { threadId, role } = parseArgv(process.argv);
@@ -130,13 +111,36 @@ export function createAgent(options: AgentOptions): () => Promise<void> {
      fail(`unknown role: ${role}`);
    }

-    const metaSchema = getSchema(ctx.meta.store, roleDef.meta);
-    if (metaSchema !== null) {
-      ctx.outputFormatInstruction = buildOutputFormatInstruction(metaSchema);
+    const frontmatterSchema = getSchema(ctx.meta.store, roleDef.frontmatter);
+    if (frontmatterSchema !== null) {
+      ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
+    }
+
+    let agentResult = await runWithMessage("agent run failed", () => options.run(ctx));
+
+    // Try to extract frontmatter; retry via continue if it fails
+    let outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+
+    for (let retry = 0; retry < MAX_FRONTMATTER_RETRIES && outputHash === null; retry++) {
+      const correctionMessage =
+        "Your previous response did not contain valid YAML frontmatter matching the role schema.\n" +
+        "You MUST begin your response with a YAML frontmatter block (--- delimited).\n" +
+        "Please output ONLY the corrected frontmatter block followed by your work.";
+
+      agentResult = await runWithMessage("agent continue failed", () =>
+        options.continue(agentResult.sessionId, correctionMessage, ctx.meta.store),
+      );
+      outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+    }
+
+    if (outputHash === null) {
+      fail(
+        "Agent output does not contain valid YAML frontmatter matching the role schema " +
+          `after ${MAX_FRONTMATTER_RETRIES} retries.\n` +
+          `Raw output (first 500 chars): ${agentResult.output.slice(0, 500)}`,
+      );
    }

-    const agentResult = await runAgent(options, ctx);
-    const outputHash = await extractOutput(agentResult.output, roleDef.meta, storageRoot, ctx);
    const stepHash = await persistStep({
      ctx,
      outputHash,
@@ -17,11 +17,19 @@ export type AgentContext = ModeratorContext & {
 export type AgentRunResult = {
  output: string;
  detailHash: string;
+  sessionId: string;
 };

+export type AgentContinueFn = (
+  sessionId: string,
+  message: string,
+  store: AgentContext["store"],
+) => Promise<AgentRunResult>;
+
 export type AgentRunFn = (ctx: AgentContext) => Promise<AgentRunResult>;

 export type AgentOptions = {
  name: string;
  run: AgentRunFn;
+  continue: AgentContinueFn;
 };
@@ -35,11 +35,11 @@ const solveIssueWorkflow: WorkflowPayload = {
  conditions: {
    needsClarification: {
      description: "Planner requests clarification from user",
-      expression: "$exists(steps[-1].output.needsClarification)",
+      expression: "$exists($last('planner').needsClarification)",
    },
-    notApproved: {
+    rejected: {
      description: "Reviewer rejected the implementation",
-      expression: "steps[-1].output.approved = false",
+      expression: "$last('reviewer').approved = false",
    },
  },
  graph: {
@@ -50,7 +50,7 @@ const solveIssueWorkflow: WorkflowPayload = {
    ],
    developer: [{ role: "reviewer", condition: null }],
    reviewer: [
-      { role: "developer", condition: "notApproved" },
+      { role: "developer", condition: "rejected" },
      { role: "$END", condition: null },
    ],
  },
@@ -72,7 +72,7 @@ describe("evaluate", () => {
    expect(result).toEqual({ ok: true, value: "planner" });
  });

-  test("condition match (notApproved → developer)", async () => {
+  test("condition match (rejected → developer)", async () => {
    const context = makeContext([
      {
        role: "reviewer",
@@ -126,4 +126,116 @@ describe("evaluate", () => {
    const result = await evaluate(solveIssueWorkflow, context);
    expect(result).toEqual({ ok: true, value: "developer" });
  });
+
+  test("$last returns most recent matching role's frontmatter", async () => {
+    const workflow: WorkflowPayload = {
+      ...solveIssueWorkflow,
+      conditions: {
+        devFailed: {
+          description: "Developer failed",
+          expression: "$last('developer').status = 'failed'",
+        },
+      },
+      graph: {
+        $START: [{ role: "developer", condition: null }],
+        developer: [
+          { role: "$END", condition: "devFailed" },
+          { role: "reviewer", condition: null },
+        ],
+      },
+    };
+    const context = makeContext([
+      {
+        role: "developer",
+        output: { status: "done" },
+        detail: "1VPBG9SM5E7WK",
+        agent: "uwf-hermes",
+      },
+      {
+        role: "reviewer",
+        output: { approved: false },
+        detail: "2MXBG6PN4A8JR",
+        agent: "uwf-hermes",
+      },
+      {
+        role: "developer",
+        output: { status: "failed" },
+        detail: "3QNTH7WK8D2PA",
+        agent: "uwf-hermes",
+      },
+    ]);
+    const result = await evaluate(workflow, context);
+    expect(result).toEqual({ ok: true, value: "$END" });
+  });
+
+  test("$first returns earliest matching role's frontmatter", async () => {
+    const workflow: WorkflowPayload = {
+      ...solveIssueWorkflow,
+      conditions: {
+        firstPlanReady: {
+          description: "First planner run was ready",
+          expression: "$first('planner').status = 'ready'",
+        },
+      },
+      graph: {
+        $START: [{ role: "planner", condition: null }],
+        planner: [
+          { role: "$END", condition: "firstPlanReady" },
+          { role: "developer", condition: null },
+        ],
+      },
+    };
+    const context = makeContext([
+      {
+        role: "planner",
+        output: { status: "ready", plan: "ABC123" },
+        detail: "7BQST3VW9F2MA",
+        agent: "uwf-hermes",
+      },
+      {
+        role: "developer",
+        output: { status: "done" },
+        detail: "1VPBG9SM5E7WK",
+        agent: "uwf-hermes",
+      },
+      {
+        role: "planner",
+        output: { status: "revised", plan: "DEF456" },
+        detail: "4RNMK6PX8B3WQ",
+        agent: "uwf-hermes",
+      },
+    ]);
+    const result = await evaluate(workflow, context);
+    expect(result).toEqual({ ok: true, value: "$END" });
+  });
+
+  test("$last returns undefined for unmatched role", async () => {
+    const workflow: WorkflowPayload = {
+      ...solveIssueWorkflow,
+      conditions: {
+        hasReviewer: {
+          description: "Reviewer has run",
+          expression: "$exists($last('reviewer'))",
+        },
+      },
+      graph: {
+        $START: [{ role: "planner", condition: null }],
+        planner: [
+          { role: "$END", condition: "hasReviewer" },
+          { role: "developer", condition: null },
+        ],
+      },
+    };
+    const context = makeContext([
+      {
+        role: "planner",
+        output: { status: "ready" },
+        detail: "7BQST3VW9F2MA",
+        agent: "uwf-hermes",
+      },
+    ]);
+    const result = await evaluate(workflow, context);
+    // no reviewer step → $exists returns false → fallback to developer
+    expect(result).toEqual({ ok: true, value: "developer" });
+  });
 });
@@ -21,12 +21,44 @@ function isTruthy(value: unknown): boolean {
  return true;
 }

+function findByRole(
+  steps: ModeratorContext["steps"],
+  role: string,
+  direction: "first" | "last",
+): unknown {
+  if (direction === "last") {
+    for (let i = steps.length - 1; i >= 0; i--) {
+      if (steps[i].role === role) {
+        return steps[i].output;
+      }
+    }
+  } else {
+    for (const step of steps) {
+      if (step.role === role) {
+        return step.output;
+      }
+    }
+  }
+  return undefined;
+}
+
 async function evaluateJsonata(
  expression: string,
  context: ModeratorContext,
 ): Promise<Result<unknown, Error>> {
  try {
-    const result = await jsonata(expression).evaluate(context);
+    const expr = jsonata(expression);
+    expr.registerFunction(
+      "first",
+      (role: string) => findByRole(context.steps, role, "first"),
+      "<s:x>",
+    );
+    expr.registerFunction(
+      "last",
+      (role: string) => findByRole(context.steps, role, "last"),
+      "<s:x>",
+    );
+    const result = await expr.evaluate(context);
    return { ok: true, value: result };
  } catch (error) {
    return {
@@ -2,14 +2,14 @@ import type { JSONSchema } from "@uncaged/json-cas";

 const ROLE_DEFINITION: JSONSchema = {
  type: "object",
-  required: ["description", "goal", "capabilities", "procedure", "output", "meta"],
+  required: ["description", "goal", "capabilities", "procedure", "output", "frontmatter"],
  properties: {
    description: { type: "string" },
    goal: { type: "string" },
    capabilities: { type: "array", items: { type: "string" } },
    procedure: { type: "string" },
    output: { type: "string" },
-    meta: { type: "string", format: "cas_ref" },
+    frontmatter: { type: "string", format: "cas_ref" },
  },
  additionalProperties: false,
 };
@@ -22,7 +22,7 @@ export type RoleDefinition = {
  capabilities: string[];
  procedure: string;
  output: string;
-  meta: CasRef;
+  frontmatter: CasRef;
 };

 export type Transition = {
@@ -0,0 +1,74 @@
+// MAINTENANCE: This string must be kept in sync with the actual uwf CLI commands.
+// Update whenever commands are added, removed, or their signatures change.
+export function generateCliReference(): string {
+  return `# uwf CLI Reference
+
+## Setup
+
+\`\`\`
+uwf setup                                         # interactive setup wizard
+uwf setup --provider <name> --base-url <url> \\
+           --api-key <key> --model <name>          # non-interactive setup
+           [--agent <name>]                        # optional: default agent alias
+\`\`\`
+
+## Workflow Commands
+
+\`\`\`
+uwf workflow put <file>           # register a workflow from YAML file
+uwf workflow show <id>            # show workflow by name or CAS hash
+uwf workflow list                 # list all registered workflows
+\`\`\`
+
+## Thread Commands
+
+\`\`\`
+uwf thread start <workflow> -p <prompt>           # create a thread (no execution)
+uwf thread step <thread-id>                       # execute one moderator→agent→extract cycle
+               [--agent <cmd>]                    # override agent command
+uwf thread show <thread-id>                       # show thread head pointer
+uwf thread list                                   # list active threads
+               [--all]                            # include archived threads
+uwf thread kill <thread-id>                       # terminate and archive a thread
+uwf thread steps <thread-id>                      # list all steps in a thread
+uwf thread read <thread-id>                       # render thread context as markdown
+               [--quota <chars>]                  # max output characters (default 32000)
+               [--before <step-hash>]             # load steps before this hash (exclusive)
+               [--start]                          # include start step in output
+uwf thread fork <step-hash>                       # fork a thread from a specific step
+uwf thread step-details <step-hash>               # dump full detail node of a step as YAML
+\`\`\`
+
+## CAS Commands
+
+\`\`\`
+uwf cas get <hash>                # read a CAS node (type + payload)
+            [--timestamp]         # include timestamp in output
+uwf cas put <type-hash> <data>    # store a node, print its hash
+                                  # <data>: JSON file path or inline JSON string
+uwf cas put-text <text>           # store a plain text string, print its hash
+                                  # shortcut for put with the built-in text schema
+uwf cas has <hash>                # check if a hash exists
+uwf cas refs <hash>               # list direct CAS references from a node
+uwf cas walk <hash>               # recursive traversal from a node
+uwf cas reindex                   # rebuild type index from all CAS nodes
+uwf cas schema list               # list all registered schemas
+uwf cas schema get <hash>         # show a schema by its type hash
+\`\`\`
+
+## Global Options
+
+\`\`\`
+uwf --format <fmt>                # output format: json (default) or yaml
+uwf -V, --version                 # print version
+\`\`\`
+
+## Key Concepts
+
+- **Workflow**: YAML definition with roles, conditions, and a routing graph; stored as a CAS node identified by its XXH64 hash.
+- **Thread**: A single workflow execution (ULID). State is an immutable CAS chain; active threads are indexed in \`threads.yaml\`.
+- **Step**: One moderator→agent→extract cycle. Run \`uwf thread step\` repeatedly until \`$END\`.
+- **CAS**: Content-Addressed Storage — all nodes are immutable and identified by hash.
+- **Role**: Named actor with goal, capabilities, procedure, output, and meta; the moderator routes between roles.
+`;
+}
@@ -1,4 +1,5 @@
 export { encodeUint64AsCrockford } from "./base32.js";
+export { generateCliReference } from "./cli-reference.js";
 export { env } from "./env.js";
 export type {
  AgentFrontmatter,
Author	SHA1	Message	Date
xiaoju	76fab22827	fix: explicitly forbid extra frontmatter fields in output format instruction buildOutputFormatInstruction now includes explicit language telling agents to output ONLY schema-defined fields and to focus on their role's deliverable. Fixes #394	2026-05-22 10:49:04 +00:00
xiaomo	669875fb46	Merge pull request 'feat: validate model connectivity during uwf setup' (#392 ) from feat/335-setup-validate-model into main	2026-05-22 10:32:01 +00:00
xiaoju	6d94be34a9	feat: validate model connectivity during uwf setup Send a test completion request after configuration to verify the model is reachable. If validation fails, warn the user and suggest trying a different model or checking their settings. Fixes #335	2026-05-22 10:30:39 +00:00
xiaomo	d95fe45a3d	Merge pull request 'feat: add --count/-c flag to uwf thread step' (#390 ) from feat/373-thread-step-count into main	2026-05-22 10:11:13 +00:00
xiaoju	b9252b5ce2	fix: dynamic frontmatter instruction from role schema (closes #389 )	2026-05-22 10:03:56 +00:00
xiaoju	4d47effd39	fix: generate frontmatter instruction dynamically from role schema Replace hardcoded 5-field example with schema-driven generation. Now shows actual enum values, types, and required markers for each role's frontmatter schema. Fixes #389 小橘 <xiaoju@shazhou.work>	2026-05-22 10:03:45 +00:00
xiaoju	7b93ce8f3e	fix: dynamic frontmatter field extraction from role schema (closes #388 )	2026-05-22 09:57:45 +00:00
xiaoju	67870392ab	fix: dynamic frontmatter field extraction from role schema Replace hardcoded 5-field candidate with schema-driven extraction. Now reads outputSchema properties and picks matching fields from parsed frontmatter, supporting role-specific fields like plan, approved, success. Falls back to standard 5 fields when schema has no properties. Fixes #388 小橘 <xiaoju@shazhou.work>	2026-05-22 09:57:30 +00:00
xiaomo	6b9ff9781d	Merge pull request 'fix: revert unnecessary output protocol changes from #385 ' (#386 ) from fix/385-revert-output-protocol into main	2026-05-22 09:40:33 +00:00
xiaoju	487c48effa	fix: revert output protocol changes from #385 Agent CLI outputs plain CAS hash (not JSON), engine parses plain hash. StepOutput no longer carries sessionId — session info is already in CAS detail. Keeps the valuable parts of #385: sessionId in AgentRunResult (process-internal), continue support, and frontmatter retry loop.	2026-05-22 09:39:36 +00:00
xiaomo	4eca2d533c	Merge pull request 'feat: agent session protocol — sessionId, continue, frontmatter retry' (#385 ) from feat/384-agent-session-protocol into main	2026-05-22 09:20:35 +00:00
xiaoju	f0f840e6e0	fix: StepOutput.sessionId → string \| null, legacy fallback → null	2026-05-22 09:16:13 +00:00
xiaoju	7ff90cef4f	feat: agent session protocol — sessionId in result, continue support, frontmatter retry Breaking changes: - AgentRunResult now requires sessionId field - AgentOptions now requires continue function - Agent CLI outputs JSON {stepHash, sessionId} instead of plain CAS hash - Engine parses JSON output (with legacy CAS hash fallback) New features: - Frontmatter validation retry: if agent output lacks valid frontmatter, engine calls agent.continue() up to 2 times with correction message - Session tracking: sessionId flows from agent → engine → StepOutput - Hermes agent: session parse failure is now a hard error (no raw text fallback) - Hermes agent: supports --resume for continue sessions Closes #384	2026-05-22 09:13:05 +00:00
xiaoju	e62d51d845	Merge remote-tracking branch 'origin/feat/remove-llm-extract' into feat/384-agent-session-protocol	2026-05-22 09:06:24 +00:00
xiaoju	a803fcb4fc	fix: solve-issue.yaml meta.plan → frontmatter.plan Follows #375 rename.	2026-05-22 09:04:34 +00:00
xiaomo	d00c93fc19	Merge pull request 'feat: uwf cas put-text for storing plain text in CAS' (#382 ) from feat/cas-put-text into main	2026-05-22 09:02:09 +00:00
xiaoju	99a2890be2	feat: remove LLM extract fallback, require YAML frontmatter Agent output must contain valid YAML frontmatter matching the role schema. If frontmatter parsing fails, the step fails immediately with a clear error instead of falling back to an LLM extraction that can fabricate values. The extract module remains as a public API export but is no longer used in the agent run loop. Breaking change: agents that relied on LLM extraction to produce valid output will now fail. They must output proper frontmatter.	2026-05-22 08:58:01 +00:00
xiaoju	3b7d0564bb	feat: uwf cas put-text for storing plain text in CAS - Register built-in text schema ({type: 'string'}) alongside workflow schemas - Add cmdCasPutText command: uwf cas put-text <text> - Update CLI reference in workflow-util - Update solve-issue.yaml procedure to use put-text Refs #380	2026-05-22 08:53:27 +00:00
xiaoju	45dacf540b	feat: thread step --count/-c <number> to run multiple steps Add --count/-c flag to 'uwf thread step' for running N steps in one invocation, stopping early if $END is reached. - cmdThreadStep now loops up to count times, delegates to cmdThreadStepOnce - CLI parses -c/--count, defaults to 1 (backward compatible single output) - Validation rejects 0, negative, and non-integer counts - 7 new tests covering CLI parsing and count validation Fixes #373 Co-authored-by: uwf-hermes (solve-issue workflow)	2026-05-22 08:06:26 +00:00
xiaomo	2eb5ee0666	Merge pull request 'fix: accept omitted condition in fallback transitions' (#378 ) from fix/fallback-transition-validation into main	2026-05-22 07:56:18 +00:00
xiaoju	e67932c83c	fix: accept omitted condition in fallback transitions Fallback transitions (last entry in graph node) omit the condition field in YAML, resulting in undefined instead of null. The validator and materializer now handle this: - validate.ts: accept undefined as valid condition value - workflow.ts: normalizeGraph() coerces undefined → null before CAS put This was broken by the graph fallback pattern introduced in #370.	2026-05-22 07:38:24 +00:00
xiaomo	04a12231c3	Merge pull request 'feat: register $first/$last JSONata functions in moderator' (#377 ) from feat/376-first-last-jsonata into main	2026-05-22 07:32:17 +00:00
xiaoju	e5ae9a134c	feat: register $first/$last JSONata functions in moderator Register custom $first(role) and $last(role) functions in the JSONata evaluator. These search the steps array and return the matching role's frontmatter (output) directly, replacing verbose steps[-1].output.x expressions with semantic $last('role').field syntax. - workflow-moderator: register functions via expr.registerFunction() - Updated all condition expressions in .workflows/ and examples/ - Added tests for $last, $first, and unmatched role (undefined) Fixes #376	2026-05-22 06:29:56 +00:00
xiaomo	bdafaf3aa1	Merge pull request 'refactor!: rename RoleDefinition.meta → frontmatter' (#375 ) from refactor/374-meta-to-frontmatter into main	2026-05-22 06:06:06 +00:00
xiaoju	02f7f0b708	refactor!: rename RoleDefinition.meta → frontmatter BREAKING CHANGE: All workflow YAML files must use 'frontmatter' instead of 'meta'. - workflow-protocol: RoleDefinition.meta → frontmatter, schema updated - cli-workflow: validate.ts, workflow.ts — resolveMetaRef → resolveFrontmatterRef - workflow-agent-kit: run.ts — metaSchema → frontmatterSchema - All YAML files updated (examples/, .workflows/) Fixes #374	2026-05-22 06:05:07 +00:00
xiaoju	8ea554bb5e	Merge pull request 'feat: create .workflows/solve-issue.yaml' (#372 ) from feat/370-solve-issue-workflow into main	2026-05-22 06:02:15 +00:00
xiaoju	8a425521da	fix: output instructions now specify required frontmatter meta fields	2026-05-22 05:42:17 +00:00
xiaoju	f174f2fd0a	fix: remove redundant condition null from $START	2026-05-22 05:33:39 +00:00
xiaoju	355594d074	refactor: graph fallback pattern + positive condition names - Last transition in each graph node is now the fallback (no condition) - Remove redundant positive conditions (ready, devDone, approved, passed, pushSuccess) - notApproved → rejected (positive naming)	2026-05-22 05:31:43 +00:00
xiaoju	fd7609fe90	fix: address review feedback from xingyue 1. npm/npx → bun/bunx (project standard) 2. Fix tea CLI usage (tea comment + -r flag) 3. cursor-agent → coding (abstract capability) 4. Clarify committer inherits developer's worktree 5. Mark meta.plan required when status=ready 6. PR description must follow What/Why/Changes/Ref template 7. Note maxRounds loop protection in description	2026-05-22 05:27:21 +00:00
xiaoju	dacecfbbb7	feat: create .workflows/solve-issue.yaml TDD-driven issue resolution workflow with 5 roles: - planner: analyzes issue, outputs TDD test spec (stored in CAS) - developer: implements code following TDD - reviewer: code standards compliance check (not functionality) - tester: functional correctness verification - committer: commits and creates PR Graph handles bounce-backs: reviewer→developer, tester→developer, tester→planner (fix_spec), committer→developer (hook_failed). Refs #370	2026-05-22 05:21:19 +00:00
xiaomo	3238eaeddf	Merge pull request 'feat: add uwf skill cli command and Prepare section' (#371 ) from feat/369-uwf-skill-cli into main	2026-05-22 04:50:12 +00:00
xiaoju	995f273fa5	address review: move CLI reference to workflow-util, inline in prompt - Move generateCliReference() to @uncaged/workflow-util - buildRolePrompt inlines CLI reference directly (no agent tool call) - Fix Role terminology to use new field names - Add maintenance comment in cli-reference.ts - Fix test assertions	2026-05-22 03:29:01 +00:00
xiaoju	866154ad73	feat: add uwf skill cli command and Prepare section in role prompt - Add 'uwf skill cli' command that prints markdown CLI reference - buildRolePrompt now generates ## Prepare section: - Always prompts agent to run 'uwf skill cli' (explicit skill) - Renders capabilities as keyword hints for implicit skill loading Fixes #369	2026-05-22 03:20:04 +00:00
xiaomo	8efc5050cb	Merge pull request 'chore: exclude legacy code from biome check' (#368 ) from chore/ignore-legacy-biome into main	2026-05-22 02:10:20 +00:00
xiaoju	3fb60ee649	chore: exclude legacy-packages and scripts from biome check - Add legacy-packages/ and scripts/ to biome ignore - Allow noDefaultExport in vitest.config.* and .d.ts - Allow console in cli.ts and setup.ts (CLI user output) - Fix unused imports in cas.ts and setup.ts	2026-05-22 02:09:18 +00:00
xiaomo	e181f67a2d	Merge pull request 'feat: support project-local workflow discovery' (#367 ) from feat/365-project-local-workflows into main	2026-05-22 02:07:33 +00:00
				`@@ -0,0 +1 @@`
				`export { generateCliReference as cmdSkillCli } from "@uncaged/workflow-util";`