fix: explicitly forbid extra frontmatter fields in output format instruction

buildOutputFormatInstruction now includes explicit language telling agents to output ONLY schema-defined fields and to focus on their role's deliverable. Fixes #394
Merge pull request 'feat: validate model connectivity during uwf setup' (#392 ) from feat/335-setup-validate-model into main
2026-05-22 10:49:04 +00:00 · 2026-05-22 10:32:01 +00:00 · 2026-05-22 10:30:39 +00:00 · 2026-05-22 10:11:13 +00:00 · 2026-05-22 10:03:56 +00:00 · 2026-05-22 10:03:45 +00:00
18 changed files with 981 additions and 112 deletions
@@ -0,0 +1,83 @@
+# Test Spec: uwf setup model connectivity validation (#335)
+
+## Context
+
+File: `packages/cli-workflow/src/commands/setup.ts`
+Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
+
+After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
+
+## Implementation Notes
+
+- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
+- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
+- Use `AbortSignal.timeout(15_000)` for the request
+- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
+- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
+- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
+- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
+
+## Test Cases (vitest)
+
+### 1. `validateModel` — success path
+- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: true, value: undefined }`
+- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
+
+### 2. `validateModel` — HTTP error (401 unauthorized)
+- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: false, error: <string containing "401"> }`
+
+### 3. `validateModel` — HTTP error (404 model not found)
+- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
+- Assert returns `{ ok: false, error: <string containing "404"> }`
+
+### 4. `validateModel` — network timeout
+- Mock `fetch` to throw `DOMException` with name `AbortError`
+- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
+
+### 5. `validateModel` — network error (DNS failure, connection refused)
+- Mock `fetch` to throw `TypeError("fetch failed")`
+- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
+
+### 6. `cmdSetup` — includes validation result on success
+- Mock global `fetch` for `/chat/completions` to succeed
+- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
+- Assert returned object has `validation: { ok: true, value: undefined }`
+- Assert config files are still written (existing behavior preserved)
+
+### 7. `cmdSetup` — includes validation result on failure (config still saved)
+- Mock global `fetch` for `/chat/completions` to return 401
+- Call `cmdSetup({ ... })`
+- Assert returned object has `validation: { ok: false, error: ... }`
+- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
+
+### 8. `cmdSetupInteractive` — prints success message on validation pass
+- Mock `fetch` for both `/models` and `/chat/completions` to succeed
+- Mock stdin to provide valid selections
+- Capture console output
+- Assert output contains a success message like "Model verified" or "✓"
+
+### 9. `cmdSetupInteractive` — prints warning on validation failure
+- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
+- Mock stdin for valid selections
+- Capture console output
+- Assert output contains a warning about model not being reachable and suggests trying a different model
+
+### 10. `validateModel` — request body correctness
+- Mock `fetch` to capture the request body
+- Call `validateModel(baseUrl, apiKey, "test-model")`
+- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
+
+## Export Requirements
+
+- `validateModel` must be exported (for direct unit testing)
+- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
+- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
+
+## Files to Create/Modify
+
+- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
+- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
@@ -20,8 +20,8 @@ roles:
      2. Revise the test spec accordingly

      After producing the test spec:
-      1. Store it via `uwf cas put "<markdown content>"` and capture the returned hash
-      2. Put the hash in meta.plan (required when status=ready)
+      1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
+      2. Put the hash in frontmatter.plan (required when status=ready)
    output: "Output a brief summary of the test spec. Frontmatter must include: status (ready or insufficient_info) and plan (CAS hash of the test spec, required when status=ready)."
    frontmatter:
      type: object
@@ -0,0 +1,150 @@
+import { mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { afterEach, beforeEach, describe, expect, test, vi } from "vitest";
+import { cmdSetup, validateModel } from "../commands/setup.js";
+
+describe("validateModel", () => {
+  const BASE_URL = "https://api.example.com/v1";
+  const API_KEY = "sk-test-key";
+  const MODEL = "test-model";
+
+  afterEach(() => {
+    vi.restoreAllMocks();
+  });
+
+  test("success path — returns ok on 200", async () => {
+    const mockFetch = vi
+      .spyOn(globalThis, "fetch")
+      .mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result).toEqual({ ok: true, value: undefined });
+    expect(mockFetch).toHaveBeenCalledOnce();
+
+    const [url, opts] = mockFetch.mock.calls[0]!;
+    expect(url).toBe(`${BASE_URL}/chat/completions`);
+    expect((opts as RequestInit).headers).toEqual(
+      expect.objectContaining({ Authorization: `Bearer ${API_KEY}` }),
+    );
+    const body = JSON.parse((opts as RequestInit).body as string);
+    expect(body).toEqual({
+      model: MODEL,
+      messages: [{ role: "user", content: "hi" }],
+      max_tokens: 1,
+    });
+  });
+
+  test("HTTP 401 — returns error containing 401", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
+    );
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error).toContain("401");
+    }
+  });
+
+  test("HTTP 404 — returns error containing 404", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Not Found", { status: 404, statusText: "Not Found" }),
+    );
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error).toContain("404");
+    }
+  });
+
+  test("network timeout — returns error mentioning timeout", async () => {
+    const err = new DOMException("signal timed out", "AbortError");
+    vi.spyOn(globalThis, "fetch").mockRejectedValue(err);
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error.toLowerCase()).toMatch(/timeout|timed out/);
+    }
+  });
+
+  test("network error (DNS/connection) — returns error mentioning connectivity", async () => {
+    vi.spyOn(globalThis, "fetch").mockRejectedValue(new TypeError("fetch failed"));
+
+    const result = await validateModel(BASE_URL, API_KEY, MODEL);
+
+    expect(result.ok).toBe(false);
+    if (!result.ok) {
+      expect(result.error.toLowerCase()).toMatch(/connect|reach|network/);
+    }
+  });
+
+  test("request body correctness", async () => {
+    const mockFetch = vi
+      .spyOn(globalThis, "fetch")
+      .mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
+
+    await validateModel(BASE_URL, API_KEY, "my-special-model");
+
+    const body = JSON.parse((mockFetch.mock.calls[0]![1] as RequestInit).body as string);
+    expect(body).toEqual({
+      model: "my-special-model",
+      messages: [{ role: "user", content: "hi" }],
+      max_tokens: 1,
+    });
+  });
+});
+
+describe("cmdSetup with validation", () => {
+  let storageRoot: string;
+
+  beforeEach(async () => {
+    storageRoot = await mkdtemp(join(tmpdir(), "uwf-setup-validate-"));
+  });
+
+  afterEach(async () => {
+    vi.restoreAllMocks();
+    await rm(storageRoot, { recursive: true, force: true });
+  });
+
+  const setupArgs = () => ({
+    provider: "testprovider",
+    baseUrl: "https://api.test.com/v1",
+    apiKey: "sk-test",
+    model: "test-model",
+    storageRoot,
+  });
+
+  test("includes validation result on success", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    const result = await cmdSetup(setupArgs());
+
+    expect(result.validation).toEqual({ ok: true, value: undefined });
+    // Config files should still be written
+    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
+  });
+
+  test("includes validation failure — config still saved", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
+    );
+
+    const result = await cmdSetup(setupArgs());
+
+    expect(result.validation).toBeDefined();
+    expect((result.validation as { ok: boolean }).ok).toBe(false);
+    // Config files should still be written despite validation failure
+    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
+  });
+});
@@ -0,0 +1,71 @@
+import { execFileSync } from "node:child_process";
+import { join } from "node:path";
+import { describe, expect, test } from "vitest";
+
+const CLI_PATH = join(import.meta.dirname, "..", "cli.js");
+
+function runCli(args: string[]): { stdout: string; stderr: string; exitCode: number } {
+  try {
+    const stdout = execFileSync("bun", ["run", CLI_PATH, ...args], {
+      encoding: "utf8",
+      env: { ...process.env, WORKFLOW_STORAGE_ROOT: "/tmp/uwf-test-nonexistent" },
+      stdio: ["ignore", "pipe", "pipe"],
+    });
+    return { stdout, stderr: "", exitCode: 0 };
+  } catch (e: unknown) {
+    const err = e as NodeJS.ErrnoException & { stdout?: string; stderr?: string; status?: number };
+    return {
+      stdout: err.stdout ?? "",
+      stderr: err.stderr ?? "",
+      exitCode: err.status ?? 1,
+    };
+  }
+}
+
+describe("thread step --count CLI parsing", () => {
+  test("--help shows -c/--count option", () => {
+    const result = runCli(["thread", "step", "--help"]);
+    expect(result.stdout).toContain("--count");
+    expect(result.stdout).toContain("-c");
+  });
+
+  test("description says 'one or more steps'", () => {
+    const result = runCli(["thread", "step", "--help"]);
+    expect(result.stdout).toContain("one or more steps");
+  });
+});
+
+describe("cmdThreadStep count logic", () => {
+  test("count=0 fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("negative count fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("non-integer count fails with validation error", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]);
+    expect(result.exitCode).not.toBe(0);
+    expect(result.stderr).toContain("positive integer");
+  });
+
+  test("count=1 is the default (no -c flag)", () => {
+    // Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID"]);
+    expect(result.exitCode).not.toBe(0);
+    // Should NOT contain "positive integer" error — should fail on thread lookup instead
+    expect(result.stderr).not.toContain("positive integer");
+  });
+
+  test("count=3 passes validation (fails on thread lookup)", () => {
+    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]);
+    expect(result.exitCode).not.toBe(0);
+    // Should NOT contain "positive integer" error — should fail on thread/storage lookup
+    expect(result.stderr).not.toContain("positive integer");
+  });
+});
@@ -7,6 +7,7 @@ import {
  cmdCasGet,
  cmdCasHas,
  cmdCasPut,
+  cmdCasPutText,
  cmdCasRefs,
  cmdCasReindex,
  cmdCasSchemaGet,
@@ -108,15 +109,21 @@ thread

 thread
  .command("step")
-  .description("Execute one step")
+  .description("Execute one or more steps")
  .argument("<thread-id>", "Thread ULID")
  .option("--agent <cmd>", "Override agent command")
-  .action((threadId: string, opts: { agent: string | undefined }) => {
+  .option("-c, --count <number>", "Number of steps to run (default: 1)")
+  .action((threadId: string, opts: { agent: string | undefined; count: string | undefined }) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
      const agentOverride = opts.agent ?? null;
-      const result = await cmdThreadStep(storageRoot, threadId, agentOverride);
-      writeOutput(result);
+      const count = opts.count !== undefined ? Number(opts.count) : 1;
+      const results = await cmdThreadStep(storageRoot, threadId, agentOverride, count);
+      if (results.length === 1) {
+        writeOutput(results[0]);
+      } else {
+        writeOutput(results);
+      }
    });
  });

@@ -295,6 +302,17 @@ cas
    });
  });

+cas
+  .command("put-text")
+  .description("Store a plain text string, print its hash")
+  .argument("<text>", "Text content to store")
+  .action((text: string) => {
+    const storageRoot = resolveStorageRoot();
+    runAction(async () => {
+      writeOutput(await cmdCasPutText(storageRoot, text));
+    });
+  });
+
 cas
  .command("has")
  .description("Check if a hash exists")
@@ -2,9 +2,11 @@ import { readFileSync } from "node:fs";
 import { join } from "node:path";

 import type { JSONSchema, Store } from "@uncaged/json-cas";
-import { bootstrap, getSchema, refs, walk } from "@uncaged/json-cas";
+import { bootstrap, getSchema, putSchema, refs, walk } from "@uncaged/json-cas";
 import { createFsStore } from "@uncaged/json-cas-fs";

+import { TEXT_SCHEMA } from "../schemas.js";
+
 // ---- Helpers ----

 function openStore(storageRoot: string): Store {
@@ -121,3 +123,10 @@ export async function cmdCasSchemaGet(storageRoot: string, hash: string): Promis
  }
  return schema;
 }
+
+export async function cmdCasPutText(storageRoot: string, text: string): Promise<{ hash: string }> {
+  const store = openStore(storageRoot);
+  const typeHash = await putSchema(store, TEXT_SCHEMA);
+  const hash = await store.put(typeHash, text);
+  return { hash };
+}
@@ -2,9 +2,45 @@ import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
 import { join } from "node:path";
 import { stdin as input, stdout as output } from "node:process";
 import { createInterface } from "node:readline/promises";
-
+import type { Result } from "@uncaged/workflow-util";
 import { parse, stringify } from "yaml";

+/**
+ * Send a minimal chat completion request to verify the model is reachable.
+ * Returns ok on 2xx, error with reason string otherwise.
+ */
+export async function validateModel(
+  baseUrl: string,
+  apiKey: string,
+  model: string,
+): Promise<Result<void, string>> {
+  try {
+    const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
+    const res = await fetch(url, {
+      method: "POST",
+      headers: {
+        Authorization: `Bearer ${apiKey}`,
+        "Content-Type": "application/json",
+      },
+      body: JSON.stringify({
+        model,
+        messages: [{ role: "user", content: "hi" }],
+        max_tokens: 1,
+      }),
+      signal: AbortSignal.timeout(15_000),
+    });
+    if (!res.ok) {
+      return { ok: false, error: `HTTP ${res.status} ${res.statusText}` };
+    }
+    return { ok: true, value: undefined };
+  } catch (err: unknown) {
+    if (err instanceof DOMException && err.name === "AbortError") {
+      return { ok: false, error: "Request timed out — model endpoint unreachable" };
+    }
+    return { ok: false, error: `Network error — could not reach endpoint (${String(err)})` };
+  }
+}
+
 /**
 * Preset provider list — embedded to avoid runtime YAML loading dependency.
 * Keep in sync with providers.yaml in cli-workflow.
@@ -163,12 +199,16 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
  envData[envName] = args.apiKey;
  saveEnvFile(envPath, envData);

+  // Validate model connectivity
+  const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
+
  return {
    configPath,
    envPath,
    provider: args.provider,
    model: args.model,
    defaultAgent: merged.defaultAgent,
+    validation,
  };
 }

@@ -328,7 +368,7 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s

    console.log(`  → ${providerName}/${model}\n`);

-    await cmdSetup({
+    const setupResult = await cmdSetup({
      provider: providerName,
      baseUrl,
      apiKey,
@@ -336,6 +376,19 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s
      storageRoot,
    });

+    // Show validation result
+    if (setupResult.validation && typeof setupResult.validation === "object") {
+      const v = setupResult.validation as { ok: boolean; error?: string };
+      if (v.ok) {
+        console.log("✓ Model verified — connection successful.\n");
+      } else {
+        console.log(`\n⚠ Warning: Could not reach model — ${v.error}`);
+        console.log(
+          "  Config saved, but you may want to try a different model or check your API key.\n",
+        );
+      }
+    }
+
    console.log("Setup complete! Get started:\n");
    console.log("  uwf workflow put <workflow.yaml>   Register a workflow");
    console.log('  uwf thread start <name> -p "..."   Start a thread');
@@ -673,6 +673,27 @@ export async function cmdThreadStep(
  storageRoot: string,
  threadId: ThreadId,
  agentOverride: string | null,
+  count: number,
+): Promise<StepOutput[]> {
+  if (count < 1 || !Number.isInteger(count)) {
+    fail(`--count must be a positive integer, got: ${count}`);
+  }
+
+  const results: StepOutput[] = [];
+  for (let i = 0; i < count; i++) {
+    const result = await cmdThreadStepOnce(storageRoot, threadId, agentOverride);
+    results.push(result);
+    if (result.done) {
+      break;
+    }
+  }
+  return results;
+}
+
+async function cmdThreadStepOnce(
+  storageRoot: string,
+  threadId: ThreadId,
+  agentOverride: string | null,
 ): Promise<StepOutput> {
  const index = await loadThreadsIndex(storageRoot);
  const headHash = index[threadId];
@@ -2,10 +2,13 @@ import type { Hash, Store } from "@uncaged/json-cas";
 import { putSchema } from "@uncaged/json-cas";
 import { START_NODE_SCHEMA, STEP_NODE_SCHEMA, WORKFLOW_SCHEMA } from "@uncaged/workflow-protocol";

+export const TEXT_SCHEMA = { type: "string" as const };
+
 export type UwfSchemaHashes = {
  workflow: Hash;
  startNode: Hash;
  stepNode: Hash;
+  text: Hash;
 };

 /**
@@ -13,10 +16,11 @@ export type UwfSchemaHashes = {
 * Idempotent: safe to call on every CLI invocation.
 */
 export async function registerUwfSchemas(store: Store): Promise<UwfSchemaHashes> {
-  const [workflow, startNode, stepNode] = await Promise.all([
+  const [workflow, startNode, stepNode, text] = await Promise.all([
    putSchema(store, WORKFLOW_SCHEMA),
    putSchema(store, START_NODE_SCHEMA),
    putSchema(store, STEP_NODE_SCHEMA),
+    putSchema(store, TEXT_SCHEMA),
  ]);
-  return { workflow, startNode, stepNode };
+  return { workflow, startNode, stepNode, text };
 }
@@ -1,4 +1,5 @@
 import { spawn } from "node:child_process";
+import type { Store } from "@uncaged/json-cas";

 import {
  type AgentContext,
@@ -10,7 +11,6 @@ import {
 import {
  loadHermesSession,
  parseSessionIdFromStdout,
-  storeHermesRawOutput,
  storeHermesSessionDetail,
 } from "./session-detail.js";

@@ -52,17 +52,8 @@ export function buildHermesPrompt(ctx: AgentContext): string {
  return parts.join("\n");
 }

-function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: string }> {
+function spawnHermes(args: string[]): Promise<{ stdout: string; stderr: string }> {
  return new Promise((resolve, reject) => {
-    const args = [
-      "chat",
-      "-q",
-      prompt,
-      "--yolo",
-      "--max-turns",
-      String(HERMES_MAX_TURNS),
-      "--quiet",
-    ];
    const child = spawn(HERMES_COMMAND, args, {
      env: process.env,
      shell: false,
@@ -94,23 +85,73 @@ function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: stri
  });
 }

+function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: string }> {
+  return spawnHermes([
+    "chat",
+    "-q",
+    prompt,
+    "--yolo",
+    "--max-turns",
+    String(HERMES_MAX_TURNS),
+    "--quiet",
+  ]);
+}
+
+function spawnHermesResume(
+  sessionId: string,
+  message: string,
+): Promise<{ stdout: string; stderr: string }> {
+  return spawnHermes([
+    "chat",
+    "--resume",
+    sessionId,
+    "-q",
+    message,
+    "--yolo",
+    "--max-turns",
+    String(HERMES_MAX_TURNS),
+    "--quiet",
+  ]);
+}
+
+function parseSessionId(stdout: string, stderr: string): string {
+  const sessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
+  if (sessionId === null) {
+    throw new Error(
+      "Failed to parse session_id from hermes output.\n" +
+        `stderr (first 200 chars): ${stderr.slice(0, 200)}\n` +
+        `stdout (first 200 chars): ${stdout.slice(0, 200)}`,
+    );
+  }
+  return sessionId;
+}
+
+async function buildResultFromSession(sessionId: string, store: Store): Promise<AgentRunResult> {
+  const session = await loadHermesSession(sessionId);
+  if (session === null) {
+    throw new Error(`Failed to load hermes session file for session_id: ${sessionId}`);
+  }
+  const { detailHash, output } = await storeHermesSessionDetail(store, session);
+  return { output, detailHash, sessionId };
+}
+
 async function runHermes(ctx: AgentContext): Promise<AgentRunResult> {
  const fullPrompt = buildHermesPrompt(ctx);
  const { stdout, stderr } = await spawnHermesChat(fullPrompt);
-  const { store } = ctx;
+  const sessionId = parseSessionId(stdout, stderr);
+  return buildResultFromSession(sessionId, ctx.store);
+}

-  // --quiet mode: session_id may be on stdout or stderr
-  const sessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
-  if (sessionId !== null) {
-    const session = await loadHermesSession(sessionId);
-    if (session !== null) {
-      const { detailHash, output } = await storeHermesSessionDetail(store, session);
-      return { output, detailHash };
-    }
-  }
-
-  const detailHash = await storeHermesRawOutput(store, stdout);
-  return { output: stdout, detailHash };
+async function continueHermes(
+  sessionId: string,
+  message: string,
+  store: Store,
+): Promise<AgentRunResult> {
+  const { stdout, stderr } = await spawnHermesResume(sessionId, message);
+  // Resume may return a new session_id
+  const newSessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
+  const resolvedId = newSessionId ?? sessionId;
+  return buildResultFromSession(resolvedId, store);
 }

 /** Agent CLI factory: parses argv, runs Hermes, extracts output, writes StepNode. */
@@ -118,5 +159,6 @@ export function createHermesAgent(): () => Promise<void> {
  return createAgent({
    name: "hermes",
    run: runHermes,
+    continue: continueHermes,
  });
 }
@@ -2,13 +2,32 @@ import { describe, expect, test } from "vitest";

 import { buildOutputFormatInstruction } from "../src/build-output-format-instruction.js";

+const PLANNER_SCHEMA = {
+  type: "object",
+  properties: {
+    status: { type: "string", enum: ["ready", "insufficient_info"] },
+    plan: { type: "string" },
+  },
+  required: ["status"],
+  additionalProperties: false,
+};
+
+const REVIEWER_SCHEMA = {
+  type: "object",
+  properties: {
+    approved: { type: "boolean" },
+  },
+  required: ["approved"],
+  additionalProperties: false,
+};
+
 describe("buildOutputFormatInstruction", () => {
  test("always includes the frontmatter example block", () => {
    const result = buildOutputFormatInstruction({});
    expect(result).toContain("---");
-    expect(result).toContain("status: done");
-    expect(result).toContain("confidence:");
-    expect(result).toContain("scope: role");
+    expect(result).not.toContain("status: done");
+    expect(result).not.toContain("confidence:");
+    expect(result).not.toContain("scope: role");
  });

  test("always marks frontmatter as the primary deliverable", () => {
@@ -16,17 +35,36 @@ describe("buildOutputFormatInstruction", () => {
    expect(result).toContain("primary deliverable");
  });

-  test("lists fields from a flat object schema", () => {
+  test("generates planner-specific YAML example from schema", () => {
+    const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
+    expect(result).toContain("status: ready  # required | ready | insufficient_info");
+    expect(result).toContain("plan: <string>");
+    expect(result).not.toContain("status: done");
+    expect(result).not.toContain("confidence:");
+    expect(result).not.toContain("artifacts:");
+  });
+
+  test("generates reviewer-specific YAML example from schema", () => {
+    const result = buildOutputFormatInstruction(REVIEWER_SCHEMA);
+    expect(result).toContain("approved: true  # required | true | false");
+    expect(result).not.toContain("status:");
+  });
+
+  test("lists fields from a flat object schema with required marker", () => {
    const schema = {
      type: "object",
      properties: {
        status: { type: "string" },
        confidence: { type: "number" },
      },
+      required: ["status"],
    };
    const result = buildOutputFormatInstruction(schema);
-    expect(result).toContain("`status`");
+    expect(result).toContain("`status` (required)");
    expect(result).toContain("`confidence`");
+    expect(result).not.toContain("`confidence` (required)");
+    expect(result).toContain("status: <string>  # required");
+    expect(result).toContain("confidence: <number>");
  });

  test("lists union of fields from an anyOf schema", () => {
@@ -45,6 +83,8 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    expect(result).toContain("`alpha`");
    expect(result).toContain("`beta`");
+    expect(result).toContain("alpha: <string>");
+    expect(result).toContain("beta: <number>");
  });

  test("lists union of fields from a oneOf schema", () => {
@@ -63,6 +103,8 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    expect(result).toContain("`foo`");
    expect(result).toContain("`bar`");
+    expect(result).toContain("foo: <string>");
+    expect(result).toContain("bar: true  # true | false");
  });

  test("falls back gracefully for a non-object schema with no properties", () => {
@@ -80,6 +122,45 @@ describe("buildOutputFormatInstruction", () => {
    const result = buildOutputFormatInstruction(schema);
    const matches = [...result.matchAll(/`shared`/g)];
    expect(matches.length).toBe(1);
+    expect(result).toContain("shared: <string>");
+  });
+
+  test("marks required when any union variant requires the field", () => {
+    const schema = {
+      anyOf: [
+        {
+          type: "object",
+          properties: { shared: { type: "string" } },
+          required: ["shared"],
+        },
+        { type: "object", properties: { shared: { type: "number" } } },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("`shared` (required)");
+    expect(result).toContain("shared: <string>  # required");
+  });
+
+  test("explicitly forbids extra frontmatter fields", () => {
+    const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
+    expect(result).toMatch(/\b(only|exclusively)\b.*fields/i);
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
+  });
+
+  test("forbids extra fields even for empty schema", () => {
+    const result = buildOutputFormatInstruction({});
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
+  });
+
+  test("forbids extra fields for anyOf/oneOf schemas", () => {
+    const schema = {
+      anyOf: [
+        { type: "object", properties: { alpha: { type: "string" } } },
+        { type: "object", properties: { beta: { type: "number" } } },
+      ],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toMatch(/do not add (extra|additional|other) fields/i);
  });

  test("includes focus reminder about role scope", () => {
@@ -29,6 +29,27 @@ const STRICT_SCHEMA = {
  additionalProperties: false,
 };

+/** Role-specific schema (reviewer) — only approved, no standard agent fields. */
+const REVIEWER_SCHEMA = {
+  type: "object",
+  properties: {
+    approved: { type: "boolean" },
+  },
+  required: ["approved"],
+  additionalProperties: false,
+};
+
+/** Role-specific schema (planner) — custom status enum + plan hash. */
+const PLANNER_SCHEMA = {
+  type: "object",
+  properties: {
+    status: { type: "string", enum: ["ready", "insufficient_info"] },
+    plan: { type: "string" },
+  },
+  required: ["status"],
+  additionalProperties: false,
+};
+
 async function makeStoreWithSchema(schema: Record<string, unknown>) {
  const store = createMemoryStore();
  const schemaHash = await putSchema(store, schema);
@@ -134,3 +155,48 @@ describe("tryFrontmatterFastPath — fallback: schema mismatch", () => {
    expect(result).toBeNull();
  });
 });
+
+// ── Role-specific schema fields ───────────────────────────────────────────────
+
+describe("tryFrontmatterFastPath — role-specific fields", () => {
+  test("extracts approved only for reviewer schema (no extra standard fields)", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
+
+    const raw = "---\napproved: true\n---\n\nReview passed.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).not.toBeNull();
+
+    const node = store.get(result!.outputHash);
+    expect(node).not.toBeNull();
+    const payload = node!.payload as Record<string, unknown>;
+    expect(payload).toEqual({ approved: true });
+    expect(payload.status).toBeUndefined();
+    expect(payload.scope).toBeUndefined();
+  });
+
+  test("extracts plan and role-specific status for planner schema", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(PLANNER_SCHEMA);
+
+    const raw = "---\nstatus: ready\nplan: 01HASHPLANNER0001\n---\n\nSpec summary.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).not.toBeNull();
+
+    const node = store.get(result!.outputHash);
+    expect(node).not.toBeNull();
+    const payload = node!.payload as Record<string, unknown>;
+    expect(payload.status).toBe("ready");
+    expect(payload.plan).toBe("01HASHPLANNER0001");
+    expect(payload.scope).toBeUndefined();
+  });
+
+  test("returns null when required role-specific field is missing", async () => {
+    const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
+
+    const raw = "---\nstatus: done\nscope: role\n---\n\nBody.";
+
+    const result = await tryFrontmatterFastPath(raw, schemaHash, store);
+    expect(result).toBeNull();
+  });
+});
@@ -1,5 +1,11 @@
 import type { JSONSchema } from "@uncaged/json-cas";

+type SchemaProperty = {
+  name: string;
+  schema: JSONSchema;
+  required: boolean;
+};
+
 /**
 * Extract top-level property names from a JSON Schema object.
 *
@@ -9,9 +15,44 @@ import type { JSONSchema } from "@uncaged/json-cas";
 *
 * Returns an empty array for schemas with no inspectable property definitions.
 */
-function extractSchemaFields(schema: JSONSchema): string[] {
+export function extractSchemaFields(schema: JSONSchema): string[] {
+  return extractSchemaProperties(schema).map((p) => p.name);
+}
+
+function extractSchemaProperties(schema: JSONSchema): SchemaProperty[] {
+  const objectSchemas = collectObjectSchemas(schema);
+  if (objectSchemas.length === 0) {
+    return [];
+  }
+
+  const byName = new Map<string, SchemaProperty>();
+
+  for (const objectSchema of objectSchemas) {
+    const requiredSet = new Set(
+      Array.isArray(objectSchema.required) ? (objectSchema.required as string[]) : [],
+    );
+    const properties = objectSchema.properties as Record<string, JSONSchema> | null | undefined;
+    if (typeof properties !== "object" || properties === null) {
+      continue;
+    }
+
+    for (const [name, propSchema] of Object.entries(properties)) {
+      const required = requiredSet.has(name);
+      const existing = byName.get(name);
+      if (existing === undefined) {
+        byName.set(name, { name, schema: propSchema, required });
+      } else if (required) {
+        byName.set(name, { ...existing, required: true });
+      }
+    }
+  }
+
+  return [...byName.values()];
+}
+
+function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
  if (typeof schema.properties === "object" && schema.properties !== null) {
-    return Object.keys(schema.properties as Record<string, unknown>);
+    return [schema];
  }

  const unionKey = Array.isArray(schema.anyOf)
@@ -20,18 +61,109 @@ function extractSchemaFields(schema: JSONSchema): string[] {
      ? "oneOf"
      : null;

-  if (unionKey !== null) {
-    const variants = schema[unionKey] as JSONSchema[];
-    const fieldSet = new Set<string>();
-    for (const variant of variants) {
-      for (const field of extractSchemaFields(variant)) {
-        fieldSet.add(field);
-      }
-    }
-    return [...fieldSet];
+  if (unionKey === null) {
+    return [];
  }

-  return [];
+  const variants = schema[unionKey] as JSONSchema[];
+  const result: JSONSchema[] = [];
+  for (const variant of variants) {
+    result.push(...collectObjectSchemas(variant));
+  }
+  return result;
+}
+
+function resolvePropertySchema(prop: JSONSchema): JSONSchema {
+  if (Array.isArray(prop.enum) && prop.enum.length > 0) {
+    return prop;
+  }
+
+  const unionKey = Array.isArray(prop.anyOf) ? "anyOf" : Array.isArray(prop.oneOf) ? "oneOf" : null;
+
+  if (unionKey !== null) {
+    const variants = prop[unionKey] as JSONSchema[];
+    const nonNull = variants.filter((v) => v.type !== "null");
+    if (nonNull.length === 1) {
+      return nonNull[0];
+    }
+  }
+
+  return prop;
+}
+
+function formatYamlScalar(value: unknown): string {
+  if (typeof value === "boolean") {
+    return String(value);
+  }
+  if (typeof value === "number") {
+    return String(value);
+  }
+  return String(value);
+}
+
+function buildPropertyComment(parts: string[]): string {
+  const filtered = parts.filter((p) => p.length > 0);
+  return filtered.length > 0 ? `  # ${filtered.join(" | ")}` : "";
+}
+
+function buildPropertyExampleLine(prop: SchemaProperty): string {
+  const resolved = resolvePropertySchema(prop.schema);
+  const commentParts: string[] = [];
+  if (prop.required) {
+    commentParts.push("required");
+  }
+
+  if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
+    const enumValues = resolved.enum.map((v) => String(v));
+    commentParts.push(...enumValues);
+    const first = resolved.enum[0];
+    return `${prop.name}: ${formatYamlScalar(first)}${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "boolean") {
+    commentParts.push("true", "false");
+    return `${prop.name}: true${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "string") {
+    return `${prop.name}: <string>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "number" || resolved.type === "integer") {
+    return `${prop.name}: <number>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "array") {
+    return `${prop.name}:\n  - <item>${buildPropertyComment(commentParts)}`;
+  }
+
+  if (resolved.type === "object") {
+    return `${prop.name}: <object>${buildPropertyComment(commentParts)}`;
+  }
+
+  return `${prop.name}: <value>${buildPropertyComment(commentParts)}`;
+}
+
+function buildYamlExampleBlock(properties: SchemaProperty[]): string {
+  if (properties.length === 0) {
+    return "---\n\n... your markdown work here ...";
+  }
+
+  const lines = properties.map((p) => buildPropertyExampleLine(p));
+  return `---\n${lines.join("\n")}\n---\n\n... your markdown work here ...`;
+}
+
+function buildFieldList(properties: SchemaProperty[]): string {
+  if (properties.length === 0) {
+    return "  (schema fields will be extracted automatically)";
+  }
+
+  return properties
+    .map((p) => {
+      const suffix = p.required ? " (required)" : "";
+      return `  - \`${p.name}\`${suffix}`;
+    })
+    .join("\n");
 }

 /**
@@ -42,28 +174,16 @@ function extractSchemaFields(schema: JSONSchema): string[] {
 * system prompt so the deliverable format is the first thing the agent sees.
 */
 export function buildOutputFormatInstruction(schema: JSONSchema): string {
-  const fields = extractSchemaFields(schema);
-
-  const fieldList =
-    fields.length > 0
-      ? fields.map((f) => `  - \`${f}\``).join("\n")
-      : "  (schema fields will be extracted automatically)";
+  const properties = extractSchemaProperties(schema);
+  const yamlExample = buildYamlExampleBlock(properties);
+  const fieldList = buildFieldList(properties);

  return `## Deliverable Format

 Your response MUST begin with a YAML frontmatter block followed by your markdown work:

 \`\`\`
---
-status: done          # done | needs_input | in_progress | failed
-next: <role-name>     # suggested next role, or omit
-confidence: 0.9       # 0.0–1.0, your self-assessed confidence
-artifacts:            # list of file paths or CAS hashes you produced
-  - path/to/file.ts
-scope: role           # role | thread
---
-
-... your markdown work here ...
+${yamlExample}
 \`\`\`

 The frontmatter is the **primary deliverable** — the engine reads it directly.
@@ -71,5 +191,7 @@ Your meta output must satisfy these fields:

 ${fieldList}

+Output ONLY the fields listed above. Do not add extra fields that are not specified in the schema.
+
 Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope.`;
 }
@@ -1,13 +1,139 @@
 import type { Store } from "@uncaged/json-cas";
-import { validate } from "@uncaged/json-cas";
+import { getSchema, validate } from "@uncaged/json-cas";
 import type { CasRef } from "@uncaged/workflow-protocol";
-import { parseFrontmatterMarkdown, validateFrontmatter } from "@uncaged/workflow-util";
+import {
+  type AgentFrontmatter,
+  createLogger,
+  parseFrontmatterMarkdown,
+  validateFrontmatter,
+} from "@uncaged/workflow-util";
+import { parse as parseYaml } from "yaml";
+
+import { extractSchemaFields } from "./build-output-format-instruction.js";
+
+const log = createLogger({ sink: { kind: "stderr" } });
+
+const STANDARD_KEYS = ["status", "next", "confidence", "artifacts", "scope"] as const;
+
+type StandardKey = (typeof STANDARD_KEYS)[number];

 export type FrontmatterFastPathResult = {
  body: string;
  outputHash: CasRef;
 };

+function extractYamlBlock(raw: string): string | null {
+  const fence = "---";
+  if (!raw.startsWith(fence)) {
+    return null;
+  }
+
+  const rest = raw.slice(fence.length);
+  if (rest.length > 0 && rest[0] !== "\n" && rest[0] !== "\r") {
+    return null;
+  }
+
+  const afterOpen = rest.startsWith("\n") ? rest.slice(1) : rest;
+  const closeIndex = afterOpen.indexOf(`\n${fence}`);
+  if (closeIndex === -1) {
+    return null;
+  }
+
+  return afterOpen.slice(0, closeIndex);
+}
+
+function parseRawFrontmatterFields(raw: string): Record<string, unknown> {
+  const yamlText = extractYamlBlock(raw);
+  if (yamlText === null) {
+    return {};
+  }
+
+  try {
+    const parsed = parseYaml(yamlText);
+    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
+      return {};
+    }
+    return parsed as Record<string, unknown>;
+  } catch {
+    return {};
+  }
+}
+
+function defaultCandidate(frontmatter: AgentFrontmatter): Record<string, unknown> {
+  return {
+    status: frontmatter.status,
+    next: frontmatter.next,
+    confidence: frontmatter.confidence,
+    artifacts: [...frontmatter.artifacts],
+    scope: frontmatter.scope,
+  };
+}
+
+function pickStandardField(frontmatter: AgentFrontmatter, key: StandardKey): unknown {
+  switch (key) {
+    case "status":
+      return frontmatter.status;
+    case "next":
+      return frontmatter.next;
+    case "confidence":
+      return frontmatter.confidence;
+    case "artifacts":
+      return [...frontmatter.artifacts];
+    case "scope":
+      return frontmatter.scope;
+  }
+}
+
+function isStandardKey(key: string): key is StandardKey {
+  return (STANDARD_KEYS as readonly string[]).includes(key);
+}
+
+function pickFieldValue(
+  field: string,
+  frontmatter: AgentFrontmatter,
+  rawFields: Record<string, unknown>,
+): unknown | undefined {
+  if (!isStandardKey(field)) {
+    return Object.hasOwn(rawFields, field) ? rawFields[field] : undefined;
+  }
+
+  const coerced = pickStandardField(frontmatter, field);
+  if (field === "artifacts" || field === "scope") {
+    return coerced;
+  }
+  if (coerced !== null) {
+    return coerced;
+  }
+  return Object.hasOwn(rawFields, field) ? rawFields[field] : coerced;
+}
+
+/**
+ * Build a CAS candidate object from schema property keys and parsed frontmatter.
+ *
+ * When the schema has no inspectable properties, falls back to the five standard
+ * agent frontmatter fields for backward compatibility.
+ */
+function buildCandidate(
+  frontmatter: AgentFrontmatter,
+  rawFields: Record<string, unknown>,
+  schemaFields: string[],
+): Record<string, unknown> {
+  if (schemaFields.length === 0) {
+    return defaultCandidate(frontmatter);
+  }
+
+  const candidate: Record<string, unknown> = {};
+
+  for (const field of schemaFields) {
+    const value = pickFieldValue(field, frontmatter, rawFields);
+    if (value !== undefined) {
+      candidate[field] = value;
+    }
+  }
+
+  return candidate;
+}
+
 /**
 * Try to satisfy `outputSchema` from frontmatter fields alone.
 *
@@ -32,16 +158,22 @@ export async function tryFrontmatterFastPath(

  const validationErrors = validateFrontmatter(frontmatter);
  if (validationErrors.length > 0) {
+    log(
+      "9GNPS4WY",
+      `frontmatter validation errors: ${validationErrors.map((e) => e.message).join("; ")}`,
+    );
    return null;
  }

-  const candidate: Record<string, unknown> = {
-    status: frontmatter.status,
-    next: frontmatter.next,
-    confidence: frontmatter.confidence,
-    artifacts: [...frontmatter.artifacts],
-    scope: frontmatter.scope,
-  };
+  const schema = getSchema(store, outputSchema);
+  if (schema === null) {
+    log("8FHMR2QX", `output schema not found in CAS: ${outputSchema}`);
+    return null;
+  }
+
+  const schemaFields = extractSchemaFields(schema);
+  const rawFields = parseRawFrontmatterFields(raw);
+  const candidate = buildCandidate(frontmatter, rawFields, schemaFields);

  let outputHash: CasRef;
  let node: ReturnType<Store["get"]>;
@@ -50,10 +182,12 @@ export async function tryFrontmatterFastPath(
    outputHash = await store.put(outputSchema, candidate);
    node = store.get(outputHash);
  } catch {
+    log("2KMQT7NR", "failed to store frontmatter candidate in CAS");
    return null;
  }

  if (node === null || !validate(store, node)) {
+    log("2KMQT7NR", "stored frontmatter candidate failed schema validation");
    return null;
  }

@@ -12,4 +12,10 @@ export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
 export { createAgent } from "./run.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig } from "./storage.js";
-export type { AgentContext, AgentOptions, AgentRunFn, AgentRunResult } from "./types.js";
+export type {
+  AgentContext,
+  AgentContinueFn,
+  AgentOptions,
+  AgentRunFn,
+  AgentRunResult,
+} from "./types.js";
@@ -3,11 +3,12 @@ import type { CasRef, StepNodePayload, ThreadId } from "@uncaged/workflow-protoc
 import { config as loadDotenv } from "dotenv";
 import { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
 import { buildContextWithMeta } from "./context.js";
-import { extract } from "./extract.js";
 import { tryFrontmatterFastPath } from "./frontmatter.js";
 import type { AgentStore } from "./storage.js";
-import { getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
-import type { AgentContext, AgentOptions, AgentRunResult } from "./types.js";
+import { getEnvPath, resolveStorageRoot } from "./storage.js";
+import type { AgentOptions } from "./types.js";
+
+const MAX_FRONTMATTER_RETRIES = 2;

 function fail(message: string): never {
  process.stderr.write(`${message}\n`);
@@ -66,31 +67,16 @@ async function writeStepNode(options: {
  return hash;
 }

-async function runAgent(options: AgentOptions, ctx: AgentContext): Promise<AgentRunResult> {
-  return runWithMessage("agent run failed", () => options.run(ctx));
-}
-
-async function extractOutput(
+async function tryExtractOutput(
  rawOutput: string,
  outputSchema: CasRef,
-  storageRoot: string,
  ctx: Awaited<ReturnType<typeof buildContextWithMeta>>,
-): Promise<CasRef> {
-  const fastPath = await runWithMessage("frontmatter fast path", () =>
-    tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store),
-  ).catch(() => null);
-
+): Promise<CasRef | null> {
+  const fastPath = await tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store);
  if (fastPath !== null) {
    return fastPath.outputHash;
  }
-
-  const config = await runWithMessage("failed to load config", () =>
-    loadWorkflowConfig(storageRoot),
-  );
-  const extracted = await runWithMessage("extract failed", () =>
-    extract(rawOutput, outputSchema, config),
-  );
-  return extracted.hash;
+  return null;
 }

 async function persistStep(options: {
@@ -112,11 +98,6 @@ async function persistStep(options: {
  });
 }

-/**
- * Create an agent CLI entrypoint.
- * Parses argv (`<thread-id> <role>`), runs the agent, extracts structured output,
- * writes StepNode to CAS, and prints the new node hash to stdout.
- */
 export function createAgent(options: AgentOptions): () => Promise<void> {
  return async function main(): Promise<void> {
    const { threadId, role } = parseArgv(process.argv);
@@ -135,13 +116,31 @@ export function createAgent(options: AgentOptions): () => Promise<void> {
      ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
    }

-    const agentResult = await runAgent(options, ctx);
-    const outputHash = await extractOutput(
-      agentResult.output,
-      roleDef.frontmatter,
-      storageRoot,
-      ctx,
-    );
+    let agentResult = await runWithMessage("agent run failed", () => options.run(ctx));
+
+    // Try to extract frontmatter; retry via continue if it fails
+    let outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+
+    for (let retry = 0; retry < MAX_FRONTMATTER_RETRIES && outputHash === null; retry++) {
+      const correctionMessage =
+        "Your previous response did not contain valid YAML frontmatter matching the role schema.\n" +
+        "You MUST begin your response with a YAML frontmatter block (--- delimited).\n" +
+        "Please output ONLY the corrected frontmatter block followed by your work.";
+
+      agentResult = await runWithMessage("agent continue failed", () =>
+        options.continue(agentResult.sessionId, correctionMessage, ctx.meta.store),
+      );
+      outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+    }
+
+    if (outputHash === null) {
+      fail(
+        "Agent output does not contain valid YAML frontmatter matching the role schema " +
+          `after ${MAX_FRONTMATTER_RETRIES} retries.\n` +
+          `Raw output (first 500 chars): ${agentResult.output.slice(0, 500)}`,
+      );
+    }
+
    const stepHash = await persistStep({
      ctx,
      outputHash,
@@ -17,11 +17,19 @@ export type AgentContext = ModeratorContext & {
 export type AgentRunResult = {
  output: string;
  detailHash: string;
+  sessionId: string;
 };

+export type AgentContinueFn = (
+  sessionId: string,
+  message: string,
+  store: AgentContext["store"],
+) => Promise<AgentRunResult>;
+
 export type AgentRunFn = (ctx: AgentContext) => Promise<AgentRunResult>;

 export type AgentOptions = {
  name: string;
  run: AgentRunFn;
+  continue: AgentContinueFn;
 };
@@ -46,6 +46,8 @@ uwf cas get <hash>                # read a CAS node (type + payload)
            [--timestamp]         # include timestamp in output
 uwf cas put <type-hash> <data>    # store a node, print its hash
                                  # <data>: JSON file path or inline JSON string
+uwf cas put-text <text>           # store a plain text string, print its hash
+                                  # shortcut for put with the built-in text schema
 uwf cas has <hash>                # check if a hash exists
 uwf cas refs <hash>               # list direct CAS references from a node
 uwf cas walk <hash>               # recursive traversal from a node
Author	SHA1	Message	Date
xiaoju	76fab22827	fix: explicitly forbid extra frontmatter fields in output format instruction buildOutputFormatInstruction now includes explicit language telling agents to output ONLY schema-defined fields and to focus on their role's deliverable. Fixes #394	2026-05-22 10:49:04 +00:00
xiaomo	669875fb46	Merge pull request 'feat: validate model connectivity during uwf setup' (#392 ) from feat/335-setup-validate-model into main	2026-05-22 10:32:01 +00:00
xiaoju	6d94be34a9	feat: validate model connectivity during uwf setup Send a test completion request after configuration to verify the model is reachable. If validation fails, warn the user and suggest trying a different model or checking their settings. Fixes #335	2026-05-22 10:30:39 +00:00
xiaomo	d95fe45a3d	Merge pull request 'feat: add --count/-c flag to uwf thread step' (#390 ) from feat/373-thread-step-count into main	2026-05-22 10:11:13 +00:00
xiaoju	b9252b5ce2	fix: dynamic frontmatter instruction from role schema (closes #389 )	2026-05-22 10:03:56 +00:00
xiaoju	4d47effd39	fix: generate frontmatter instruction dynamically from role schema Replace hardcoded 5-field example with schema-driven generation. Now shows actual enum values, types, and required markers for each role's frontmatter schema. Fixes #389 小橘 <xiaoju@shazhou.work>	2026-05-22 10:03:45 +00:00
xiaoju	7b93ce8f3e	fix: dynamic frontmatter field extraction from role schema (closes #388 )	2026-05-22 09:57:45 +00:00
xiaoju	67870392ab	fix: dynamic frontmatter field extraction from role schema Replace hardcoded 5-field candidate with schema-driven extraction. Now reads outputSchema properties and picks matching fields from parsed frontmatter, supporting role-specific fields like plan, approved, success. Falls back to standard 5 fields when schema has no properties. Fixes #388 小橘 <xiaoju@shazhou.work>	2026-05-22 09:57:30 +00:00
xiaomo	6b9ff9781d	Merge pull request 'fix: revert unnecessary output protocol changes from #385 ' (#386 ) from fix/385-revert-output-protocol into main	2026-05-22 09:40:33 +00:00
xiaoju	487c48effa	fix: revert output protocol changes from #385 Agent CLI outputs plain CAS hash (not JSON), engine parses plain hash. StepOutput no longer carries sessionId — session info is already in CAS detail. Keeps the valuable parts of #385: sessionId in AgentRunResult (process-internal), continue support, and frontmatter retry loop.	2026-05-22 09:39:36 +00:00
xiaomo	4eca2d533c	Merge pull request 'feat: agent session protocol — sessionId, continue, frontmatter retry' (#385 ) from feat/384-agent-session-protocol into main	2026-05-22 09:20:35 +00:00
xiaoju	f0f840e6e0	fix: StepOutput.sessionId → string \| null, legacy fallback → null	2026-05-22 09:16:13 +00:00
xiaoju	7ff90cef4f	feat: agent session protocol — sessionId in result, continue support, frontmatter retry Breaking changes: - AgentRunResult now requires sessionId field - AgentOptions now requires continue function - Agent CLI outputs JSON {stepHash, sessionId} instead of plain CAS hash - Engine parses JSON output (with legacy CAS hash fallback) New features: - Frontmatter validation retry: if agent output lacks valid frontmatter, engine calls agent.continue() up to 2 times with correction message - Session tracking: sessionId flows from agent → engine → StepOutput - Hermes agent: session parse failure is now a hard error (no raw text fallback) - Hermes agent: supports --resume for continue sessions Closes #384	2026-05-22 09:13:05 +00:00
xiaoju	e62d51d845	Merge remote-tracking branch 'origin/feat/remove-llm-extract' into feat/384-agent-session-protocol	2026-05-22 09:06:24 +00:00
xiaoju	a803fcb4fc	fix: solve-issue.yaml meta.plan → frontmatter.plan Follows #375 rename.	2026-05-22 09:04:34 +00:00
xiaomo	d00c93fc19	Merge pull request 'feat: uwf cas put-text for storing plain text in CAS' (#382 ) from feat/cas-put-text into main	2026-05-22 09:02:09 +00:00
xiaoju	99a2890be2	feat: remove LLM extract fallback, require YAML frontmatter Agent output must contain valid YAML frontmatter matching the role schema. If frontmatter parsing fails, the step fails immediately with a clear error instead of falling back to an LLM extraction that can fabricate values. The extract module remains as a public API export but is no longer used in the agent run loop. Breaking change: agents that relied on LLM extraction to produce valid output will now fail. They must output proper frontmatter.	2026-05-22 08:58:01 +00:00
xiaoju	3b7d0564bb	feat: uwf cas put-text for storing plain text in CAS - Register built-in text schema ({type: 'string'}) alongside workflow schemas - Add cmdCasPutText command: uwf cas put-text <text> - Update CLI reference in workflow-util - Update solve-issue.yaml procedure to use put-text Refs #380	2026-05-22 08:53:27 +00:00
xiaoju	45dacf540b	feat: thread step --count/-c <number> to run multiple steps Add --count/-c flag to 'uwf thread step' for running N steps in one invocation, stopping early if $END is reached. - cmdThreadStep now loops up to count times, delegates to cmdThreadStepOnce - CLI parses -c/--count, defaults to 1 (backward compatible single output) - Validation rejects 0, negative, and non-integer counts - 7 new tests covering CLI parsing and count validation Fixes #373 Co-authored-by: uwf-hermes (solve-issue workflow)	2026-05-22 08:06:26 +00:00
xiaomo	2eb5ee0666	Merge pull request 'fix: accept omitted condition in fallback transitions' (#378 ) from fix/fallback-transition-validation into main	2026-05-22 07:56:18 +00:00