fix(builtin): split prompt into system/user messages

System message = agent identity (role prompt + output format instruction)
User message = moderator speech (task + edge prompt + history)

This reflects the workflow's core model: moderator speaks to agent via the graph's edge prompt.

Previously all content was in a single system message with no user message, causing Claude API 400 errors.

- buildBuiltinPrompt now returns { system, user } instead of string
- agent.ts sends system + user as separate messages
- Tests updated accordingly
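
The split described in this commit message can be sketched roughly as below. This is a simplified illustration, not the code in the diff: `PromptInput` and `buildMessages` are hypothetical names, and the real `AgentContext` carries more fields than are shown here.

```typescript
// Hypothetical, flattened input; the real builder reads these from AgentContext.
type ChatMessage = { role: "system" | "user"; content: string };

type PromptInput = {
  rolePrompt: string; // agent identity
  outputFormatInstruction: string; // may be empty
  task: string; // the original task prompt
  edgePrompt: string; // moderator instruction for the current step, may be empty
  history: string; // pre-rendered summary of previous steps, may be empty
};

function buildMessages(input: PromptInput): ChatMessage[] {
  // System message: who the agent is and how it must format its output.
  const systemParts: string[] = [];
  if (input.outputFormatInstruction !== "") {
    systemParts.push(input.outputFormatInstruction, "");
  }
  systemParts.push(input.rolePrompt);

  // User message: what the moderator is asking for right now.
  const userParts: string[] = ["## Task", input.task];
  if (input.edgePrompt !== "") {
    userParts.push("", "## Current Step Instruction", input.edgePrompt);
  }
  if (input.history !== "") {
    userParts.push("", input.history);
  }

  return [
    { role: "system", content: systemParts.join("\n") },
    { role: "user", content: userParts.join("\n") },
  ];
}

const messages = buildMessages({
  rolePrompt: "## Goal\nShip the fix",
  outputFormatInstruction: "---\nstatus: done\n---",
  task: "Fix the bug",
  edgePrompt: "Implement the fix described in the plan.",
  history: "",
});
```

Because the task and edge prompt always land in the user message, the request never consists of a system message alone, which is the condition the commit message ties to the 400 errors.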
chore(debate): remove round limit, let step control drive pacing
2026-05-23 17:15:23 +08:00 · 2026-05-23 08:31:07 +00:00 · 2026-05-23 08:19:20 +00:00 · 2026-05-23 08:16:47 +00:00 · 2026-05-23 07:59:53 +00:00 · 2026-05-23 07:57:44 +00:00
15 changed files with 748 additions and 163 deletions
@@ -0,0 +1,73 @@
+# Issue #418: ACP session/resume returns empty text
+
+## Investigation date: 2026-05-23
+
+## Root cause
+
+On the restore path of `session/resume`, `_make_agent()` fails and the exception is silently swallowed.
+
+### Full call chain
+
+```
+resume_session(sid)
+  → update_cwd(sid)
+    → get_session(sid) → _restore(sid)
+      → _make_agent()
+        → resolve_runtime_provider("custom") fails (line 548-561)
+        → AIAgent() raises "No LLM provider configured" (line 564)
+      → except Exception silently swallows it (line 482-484) → return None
+    → return None
+  → state is None → fallback: create_session() (new sid, no history)
+```
+
+### Key code locations (acp_adapter/session.py)
+
+- `_restore()` line 426-498: restores the session from the DB, but the except clause is too broad
+- `_make_agent()` line 520-568: provider resolution is incomplete on the restore path
+- Line 548-561: after `resolve_runtime_provider("custom")` fails, `base_url` has been read from the DB but is never passed to AIAgent
+
+### Observed behavior
+
+1. Phase 1: `session/new` + `prompt` → works, `agent_message_chunk` is emitted
+2. Phase 2: `session/resume` + `prompt`
+   - resume reports success, but the sessionId in `available_commands_update` is a new one (create_session fallback)
+   - prompt sent with the original sid → `stopReason: "refusal"` (the session is not in memory)
+   - prompt sent with the new sid → runs, but without history (the agent answers that it does not know the "secret code")
+
+### Verification script
+
+```bash
+# Call _restore directly to verify
+cd ~/.hermes/hermes-agent
+python3 -c "
+import sys; sys.path.insert(0, '.')
+from acp_adapter.session import SessionManager
+sm = SessionManager()
+result = sm._restore('SESSION_ID_HERE')
+print(result)  # None: the exception raised by _make_agent is swallowed
+"
+```
+
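+### Two bugs
+
+1. **`_make_agent` provider fallback is incomplete**: on restore the DB holds `base_url` and `api_mode`, but after `resolve_runtime_provider` fails these values are not passed on to AIAgent
+2. **The except clause in `_restore` is too broad**: it silently swallows every exception and logs the warning only at debug level, so a failed resume goes completely unnoticed
+
+### Hermes versions
+
+- v0.10.0 (2026.4.16): initial test
+- v0.14.0 (2026.5.16): retested after updating, bug still present
+- Code path: ~/.hermes/hermes-agent/acp_adapter/session.py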
+
+### v0.14.0 test results (2026-05-23)
+
+- `_restore` still returns None because `custom` provider resolution fails
+- Logging is clearer now: `WARNING: Failed to recreate agent for ACP session ...`
+- The resume fallback creates a new session (new sid), yet the agent was unexpectedly still able to answer the earlier question (possibly via memory/session search)
+- The core problem is unchanged: the sessionId changes, so a client that sends a prompt with the old sid gets a refusal
+
+### Upstream issues
+
+- https://github.com/NousResearch/hermes-agent/issues/13489 (root-cause analysis posted as a comment)
+- https://github.com/NousResearch/hermes-agent/issues/8083 (resume silently creates a new session)
+- https://github.com/NousResearch/hermes-agent/issues/18452 (_make_agent fallback is incomplete)
@@ -0,0 +1,77 @@
+name: "debate"
+description: "Structured debate between two sides. Tests cross-process session resume."
+roles:
+  against:
+    description: "Argues against the proposition"
+    goal: |
+      You are a skilled debater arguing AGAINST the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. If this is the opening, present your strongest argument against the proposition.
+      2. If responding to the other side, directly counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+  for:
+    description: "Argues for the proposition"
+    goal: |
+      You are a skilled debater arguing FOR the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. Read the opposing side's latest argument carefully.
+      2. Counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+conditions:
+  againstConceded:
+    description: "The against side conceded"
+    expression: "$last('against').conceded = true"
+  forConceded:
+    description: "The for side conceded"
+    expression: "$last('for').conceded = true"
+graph:
+  $START:
+    - role: "against"
+      condition: null
+      prompt: "Present your opening argument against the proposition."
+  against:
+    - role: "$END"
+      condition: "againstConceded"
+      prompt: "The against side conceded. Debate over."
+    - role: "for"
+      condition: null
+      prompt: "Counter the opposing argument. Address their points directly."
+  for:
+    - role: "$END"
+      condition: "forConceded"
+      prompt: "The for side conceded. Debate over."
+    - role: "against"
+      condition: null
+      prompt: "Counter the opposing argument. Address their points directly."
@@ -26,22 +26,30 @@ function minimalContext(overrides: Partial<AgentContext> = {}): AgentContext {
    start: { workflow: "wf-hash", prompt: "Fix the bug" },
    steps: [],
    outputFormatInstruction: "---\nstatus: done\n---",
+    edgePrompt: "Implement the fix described in the plan.",
+    isFirstVisit: true,
    ...overrides,
  };
 }

 describe("buildBuiltinPrompt", () => {
-  test("includes output format, task, and role goal", () => {
-    const prompt = buildBuiltinPrompt(minimalContext());
-    expect(prompt).toContain("status: done");
-    expect(prompt).toContain("## Goal");
-    expect(prompt).toContain("Ship the fix");
-    expect(prompt).toContain("## Task");
-    expect(prompt).toContain("Fix the bug");
+  test("system includes output format and role goal", () => {
+    const { system } = buildBuiltinPrompt(minimalContext());
+    expect(system).toContain("status: done");
+    expect(system).toContain("## Goal");
+    expect(system).toContain("Ship the fix");
  });

-  test("includes history when steps exist", () => {
-    const prompt = buildBuiltinPrompt(
+  test("user includes task and edge prompt", () => {
+    const { user } = buildBuiltinPrompt(minimalContext());
+    expect(user).toContain("## Task");
+    expect(user).toContain("Fix the bug");
+    expect(user).toContain("## Current Step Instruction");
+    expect(user).toContain("Implement the fix");
+  });
+
+  test("user includes history when steps exist", () => {
+    const { user } = buildBuiltinPrompt(
      minimalContext({
        steps: [
          {
@@ -53,7 +61,7 @@ describe("buildBuiltinPrompt", () => {
        ],
      }),
    );
-    expect(prompt).toContain("## Previous Steps");
-    expect(prompt).toContain("planner");
+    expect(user).toContain("## Previous Steps");
+    expect(user).toContain("planner");
  });
 });
@@ -69,8 +69,11 @@ async function runBuiltin(ctx: AgentContext): Promise<AgentRunResult> {
  const provider = resolveModel(config, config.defaultModel);

  const sessionId = generateUlid(Date.now());
-  const systemPrompt = buildBuiltinPrompt(ctx);
-  const messages: ChatMessage[] = [{ role: "system", content: systemPrompt }];
+  const promptParts = buildBuiltinPrompt(ctx);
+  const messages: ChatMessage[] = [
+    { role: "system", content: promptParts.system },
+    { role: "user", content: promptParts.user },
+  ];

  const session: BuiltinSessionState = {
    sessionId,
@@ -19,18 +19,32 @@ function buildHistorySummary(steps: AgentContext["steps"]): string {
  return lines.join("\n");
 }

-/** Assemble output format, role prompt, task, and history (aligned with buildHermesPrompt). */
-export function buildBuiltinPrompt(ctx: AgentContext): string {
+export type BuiltinPromptParts = {
+  system: string;
+  user: string;
+};
+
+/** Assemble system prompt (role + format) and user prompt (task + edge + history). */
+export function buildBuiltinPrompt(ctx: AgentContext): BuiltinPromptParts {
  const roleDef = ctx.workflow.roles[ctx.role];
  const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
-  const parts: string[] = [];
+  const systemParts: string[] = [];
  if (ctx.outputFormatInstruction !== "") {
-    parts.push(ctx.outputFormatInstruction, "");
+    systemParts.push(ctx.outputFormatInstruction, "");
+  }
+  systemParts.push(rolePrompt);
+
+  const userParts: string[] = ["## Task", ctx.start.prompt];
+  if (ctx.edgePrompt !== "") {
+    userParts.push("", "## Current Step Instruction", ctx.edgePrompt);
  }
-  parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
  const historyBlock = buildHistorySummary(ctx.steps);
  if (historyBlock !== "") {
-    parts.push("", historyBlock);
+    userParts.push("", historyBlock);
  }
-  return parts.join("\n");
+
+  return {
+    system: systemParts.join("\n"),
+    user: userParts.join("\n"),
+  };
 }
@@ -2,6 +2,7 @@ import { describe, expect, test } from "bun:test";
 import { createMemoryStore, walk } from "@uncaged/json-cas";
 import {
  parseClaudeCodeJsonOutput,
+  parseClaudeCodeStreamOutput,
  storeClaudeCodeDetail,
  storeClaudeCodeRawOutput,
 } from "../src/session-detail.js";
@@ -17,6 +18,8 @@ describe("parseClaudeCodeJsonOutput", () => {
      num_turns: 3,
      total_cost_usd: 0.08,
      duration_ms: 10276,
+      stop_reason: "end_turn",
+      usage: { input_tokens: 100, output_tokens: 50 },
    });
    const parsed = parseClaudeCodeJsonOutput(stdout);
    expect(parsed).not.toBeNull();
@@ -27,22 +30,10 @@ describe("parseClaudeCodeJsonOutput", () => {
    expect(parsed!.numTurns).toBe(3);
    expect(parsed!.totalCostUsd).toBe(0.08);
    expect(parsed!.durationMs).toBe(10276);
-  });
-
-  test("parses error_max_turns result", () => {
-    const stdout = JSON.stringify({
-      type: "result",
-      subtype: "error_max_turns",
-      result: "Ran out of turns",
-      session_id: "abc-def",
-      num_turns: 90,
-      total_cost_usd: 1.5,
-      duration_ms: 50000,
-    });
-    const parsed = parseClaudeCodeJsonOutput(stdout);
-    expect(parsed).not.toBeNull();
-    expect(parsed!.subtype).toBe("error_max_turns");
-    expect(parsed!.result).toBe("Ran out of turns");
+    expect(parsed!.stopReason).toBe("end_turn");
+    expect(parsed!.usage.inputTokens).toBe(100);
+    expect(parsed!.usage.outputTokens).toBe(50);
+    expect(parsed!.turns).toEqual([]);
  });

  test("returns null for non-JSON output", () => {
@@ -57,45 +48,157 @@ describe("parseClaudeCodeJsonOutput", () => {
  });
 });

-describe("storeClaudeCodeDetail", () => {
-  test("stores claude-code-detail CAS node and returns output + detailHash", async () => {
-    const store = createMemoryStore();
-    const parsed: ClaudeCodeParsedResult = {
-      type: "result",
-      subtype: "success",
-      result: "The answer",
-      sessionId: "abc-123",
-      numTurns: 5,
-      totalCostUsd: 0.12,
-      durationMs: 15000,
-    };
+describe("parseClaudeCodeStreamOutput", () => {
+  test("parses stream-json output with turns", () => {
+    const lines = [
+      JSON.stringify({
+        type: "system",
+        subtype: "init",
+        session_id: "sess-123",
+        model: "claude-sonnet-4.5",
+        tools: ["Bash", "Read"],
+      }),
+      JSON.stringify({
+        type: "assistant",
+        message: {
+          role: "assistant",
+          content: [
+            { type: "text", text: "I'll list the files." },
+            { type: "tool_use", id: "tool_1", name: "Bash", input: { command: "ls" } },
+          ],
+        },
+        session_id: "sess-123",
+      }),
+      JSON.stringify({
+        type: "user",
+        message: {
+          role: "user",
+          content: [
+            { type: "tool_result", tool_use_id: "tool_1", content: "file1.ts\nfile2.ts" },
+          ],
+        },
+        session_id: "sess-123",
+      }),
+      JSON.stringify({
+        type: "assistant",
+        message: {
+          role: "assistant",
+          content: [{ type: "text", text: "There are 2 files." }],
+        },
+        session_id: "sess-123",
+      }),
+      JSON.stringify({
+        type: "result",
+        subtype: "success",
+        result: "There are 2 files.",
+        session_id: "sess-123",
+        num_turns: 2,
+        total_cost_usd: 0.05,
+        duration_ms: 5000,
+        stop_reason: "end_turn",
+        usage: {
+          input_tokens: 200,
+          output_tokens: 30,
+          cache_read_input_tokens: 100,
+          cache_creation_input_tokens: 0,
+        },
+      }),
+    ];
+    const stdout = lines.join("\n");
+    const parsed = parseClaudeCodeStreamOutput(stdout);
+
+    expect(parsed).not.toBeNull();
+    expect(parsed!.model).toBe("claude-sonnet-4.5");
+    expect(parsed!.sessionId).toBe("sess-123");
+    expect(parsed!.result).toBe("There are 2 files.");
+    expect(parsed!.stopReason).toBe("end_turn");
+    expect(parsed!.usage.inputTokens).toBe(200);
+    expect(parsed!.usage.outputTokens).toBe(30);
+    expect(parsed!.usage.cacheReadInputTokens).toBe(100);
+
+    // Turns: assistant(text+tool), tool_result, assistant(text)
+    expect(parsed!.turns).toHaveLength(3);
+    expect(parsed!.turns[0]!.role).toBe("assistant");
+    expect(parsed!.turns[0]!.content).toBe("I'll list the files.");
+    expect(parsed!.turns[0]!.toolCalls).toHaveLength(1);
+    expect(parsed!.turns[0]!.toolCalls![0]!.name).toBe("Bash");
+    expect(parsed!.turns[1]!.role).toBe("tool_result");
+    expect(parsed!.turns[1]!.content).toBe("file1.ts\nfile2.ts");
+    expect(parsed!.turns[2]!.role).toBe("assistant");
+    expect(parsed!.turns[2]!.content).toBe("There are 2 files.");
+    expect(parsed!.turns[2]!.toolCalls).toBeNull();
+  });
+
+  test("returns null when no result line", () => {
+    const stdout = JSON.stringify({ type: "system", model: "test" });
+    expect(parseClaudeCodeStreamOutput(stdout)).toBeNull();
+  });
+
+  test("skips invalid JSON lines gracefully", () => {
+    const lines = [
+      "not json",
+      JSON.stringify({
+        type: "result",
+        subtype: "success",
+        result: "ok",
+        session_id: "s1",
+        num_turns: 1,
+        total_cost_usd: 0.01,
+        duration_ms: 1000,
+        stop_reason: "end_turn",
+        usage: {},
+      }),
+    ];
+    const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
+    expect(parsed).not.toBeNull();
+    expect(parsed!.result).toBe("ok");
+    expect(parsed!.turns).toHaveLength(0);
+  });
+});
+
+describe("storeClaudeCodeDetail", () => {
+  const baseParsed: ClaudeCodeParsedResult = {
+    type: "result",
+    subtype: "success",
+    result: "The answer",
+    sessionId: "abc-123",
+    numTurns: 5,
+    totalCostUsd: 0.12,
+    durationMs: 15000,
+    model: "claude-sonnet-4.5",
+    stopReason: "end_turn",
+    usage: { inputTokens: 100, outputTokens: 50, cacheReadInputTokens: 0, cacheCreationInputTokens: 0 },
+    turns: [
+      { index: 0, role: "assistant", content: "hello", toolCalls: null },
+      { index: 1, role: "tool_result", content: "world", toolCalls: null },
+    ],
+  };
+
+  test("stores detail with per-turn CAS nodes", async () => {
+    const store = createMemoryStore();
+    const { detailHash, output, sessionId } = await storeClaudeCodeDetail(store, baseParsed);

-    const { detailHash, output, sessionId } = await storeClaudeCodeDetail(store, parsed);
    expect(detailHash).toHaveLength(13);
    expect(output).toBe("The answer");
    expect(sessionId).toBe("abc-123");

    const node = await store.get(detailHash);
    expect(node).not.toBeNull();
-    expect(node!.payload.sessionId).toBe("abc-123");
-    expect(node!.payload.numTurns).toBe(5);
-    expect(node!.payload.totalCostUsd).toBe(0.12);
-    expect(node!.payload.durationMs).toBe(15000);
+    expect(node!.payload.model).toBe("claude-sonnet-4.5");
+    expect(node!.payload.stopReason).toBe("end_turn");
+    expect(node!.payload.usage.inputTokens).toBe(100);
+    expect(node!.payload.turns).toHaveLength(2);
+
+    // Verify turn CAS nodes
+    const turn0 = await store.get(node!.payload.turns[0]);
+    expect(turn0).not.toBeNull();
+    expect(turn0!.payload.role).toBe("assistant");
+    expect(turn0!.payload.content).toBe("hello");
  });

  test("detail node is walkable from root", async () => {
    const store = createMemoryStore();
-    const parsed: ClaudeCodeParsedResult = {
-      type: "result",
-      subtype: "success",
-      result: "walkable test",
-      sessionId: "walk-123",
-      numTurns: 1,
-      totalCostUsd: 0.01,
-      durationMs: 1000,
-    };
-
-    const { detailHash } = await storeClaudeCodeDetail(store, parsed);
+    const { detailHash } = await storeClaudeCodeDetail(store, baseParsed);
    const visited: string[] = [];
    walk(store, detailHash, (hash) => visited.push(hash));
    expect(visited.length).toBeGreaterThan(0);
@@ -1,14 +1,20 @@
 import { spawn } from "node:child_process";
 import type { Store } from "@uncaged/json-cas";

+import { createLogger } from "@uncaged/workflow-util";
+
 import {
  type AgentContext,
  type AgentRunResult,
  buildRolePrompt,
  createAgent,
+  getCachedSessionId,
+  setCachedSessionId,
 } from "@uncaged/workflow-agent-kit";

-import { parseClaudeCodeJsonOutput, storeClaudeCodeDetail } from "./session-detail.js";
+import { parseClaudeCodeStreamOutput, storeClaudeCodeDetail } from "./session-detail.js";
+
+const log = createLogger({ sink: { kind: "stderr" } });

 const CLAUDE_COMMAND = "claude";
 const CLAUDE_MAX_TURNS = 90;
@@ -86,7 +92,8 @@ function spawnClaudeRun(prompt: string): Promise<{ stdout: string; stderr: strin
    "-p",
    prompt,
    "--output-format",
-    "json",
+    "stream-json",
+    "--verbose",
    "--dangerously-skip-permissions",
    "--max-turns",
    String(CLAUDE_MAX_TURNS),
@@ -103,7 +110,8 @@ function spawnClaudeResume(
    "--resume",
    sessionId,
    "--output-format",
-    "json",
+    "stream-json",
+    "--verbose",
    "--dangerously-skip-permissions",
    "--max-turns",
    String(CLAUDE_MAX_TURNS),
@@ -111,7 +119,7 @@ function spawnClaudeResume(
 }

 async function processClaudeOutput(stdout: string, store: Store): Promise<AgentRunResult> {
-  const parsed = parseClaudeCodeJsonOutput(stdout);
+  const parsed = parseClaudeCodeStreamOutput(stdout);

  if (parsed !== null) {
    const { detailHash, output, sessionId } = await storeClaudeCodeDetail(store, parsed);
@@ -119,14 +127,36 @@ async function processClaudeOutput(stdout: string, store: Store): Promise<AgentR
  }

  throw new Error(
-    `Claude Code returned non-JSON output (first 200 chars): ${stdout.slice(0, 200)}`,
+    `Claude Code returned unparseable output (first 200 chars): ${stdout.slice(0, 200)}`,
  );
 }

 async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
  const fullPrompt = buildClaudeCodePrompt(ctx);
+
+  // Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
+  if (!ctx.isFirstVisit) {
+    const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
+    if (cachedSessionId !== null) {
+      try {
+        const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
+        const result = await processClaudeOutput(stdout, ctx.store);
+        if (result.sessionId !== undefined && result.sessionId !== "") {
+          await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+        }
+        return result;
+      } catch (err) {
+        log("5VKR8N3Q", "resume failed for session %s, falling back to fresh run: %s", cachedSessionId, err);
+      }
+    }
+  }
+
  const { stdout } = await spawnClaudeRun(fullPrompt);
-  return processClaudeOutput(stdout, ctx.store);
+  const result = await processClaudeOutput(stdout, ctx.store);
+  if (result.sessionId !== undefined && result.sessionId !== "") {
+    await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+  }
+  return result;
 }

 async function continueClaudeCode(
@@ -1,6 +1,7 @@
 export { buildClaudeCodePrompt, createClaudeCodeAgent } from "./claude-code.js";
 export {
  parseClaudeCodeJsonOutput,
+  parseClaudeCodeStreamOutput,
  storeClaudeCodeDetail,
  storeClaudeCodeRawOutput,
 } from "./session-detail.js";
@@ -3,13 +3,52 @@ import type { JSONSchema } from "@uncaged/json-cas";
 export const CLAUDE_CODE_DETAIL_SCHEMA: JSONSchema = {
  title: "claude-code-detail",
  type: "object",
-  required: ["sessionId", "numTurns", "totalCostUsd", "durationMs", "subtype"],
+  required: [
+    "sessionId",
+    "model",
+    "subtype",
+    "durationMs",
+    "numTurns",
+    "totalCostUsd",
+    "stopReason",
+    "usage",
+    "turns",
+  ],
  properties: {
    sessionId: { type: "string" },
+    model: { type: "string" },
+    subtype: { type: "string" },
+    durationMs: { type: "integer" },
    numTurns: { type: "integer" },
    totalCostUsd: { type: "number" },
-    durationMs: { type: "integer" },
-    subtype: { type: "string" },
+    stopReason: { type: "string" },
+    usage: {
+      type: "object",
+      properties: {
+        inputTokens: { type: "integer" },
+        outputTokens: { type: "integer" },
+        cacheReadInputTokens: { type: "integer" },
+        cacheCreationInputTokens: { type: "integer" },
+      },
+      required: ["inputTokens", "outputTokens", "cacheReadInputTokens", "cacheCreationInputTokens"],
+    },
+    turns: {
+      type: "array",
+      items: { type: "string" },
+    },
+  },
+  additionalProperties: false,
+};
+
+export const CLAUDE_CODE_TURN_SCHEMA: JSONSchema = {
+  title: "claude-code-turn",
+  type: "object",
+  required: ["index", "role", "content", "toolCalls"],
+  properties: {
+    index: { type: "integer" },
+    role: { type: "string" },
+    content: { type: "string" },
+    toolCalls: {},
  },
  additionalProperties: false,
 };
@@ -1,13 +1,171 @@
 import { bootstrap, putSchema, type Store } from "@uncaged/json-cas";

-import { CLAUDE_CODE_DETAIL_SCHEMA, CLAUDE_CODE_RAW_OUTPUT_SCHEMA } from "./schemas.js";
-import type { ClaudeCodeDetailPayload, ClaudeCodeParsedResult } from "./types.js";
+import {
+  CLAUDE_CODE_DETAIL_SCHEMA,
+  CLAUDE_CODE_RAW_OUTPUT_SCHEMA,
+  CLAUDE_CODE_TURN_SCHEMA,
+} from "./schemas.js";
+import type {
+  ClaudeCodeDetailPayload,
+  ClaudeCodeParsedResult,
+  ClaudeCodeToolCall,
+  ClaudeCodeTurnPayload,
+} from "./types.js";

 function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
 }

-/** Parse Claude Code JSON stdout (`claude -p --output-format json`). */
+function safeNumber(v: unknown, fallback = 0): number {
+  return typeof v === "number" ? v : fallback;
+}
+
+function safeString(v: unknown, fallback = ""): string {
+  return typeof v === "string" ? v : fallback;
+}
+
+/**
+ * Extract tool calls from an assistant message content array.
+ */
+function extractToolCalls(content: unknown[]): ClaudeCodeToolCall[] {
+  const calls: ClaudeCodeToolCall[] = [];
+  for (const item of content) {
+    if (isRecord(item) && item.type === "tool_use" && typeof item.name === "string") {
+      calls.push({
+        name: item.name,
+        input: typeof item.input === "string" ? item.input : JSON.stringify(item.input ?? {}),
+      });
+    }
+  }
+  return calls;
+}
+
+/**
+ * Extract text content from a message content array.
+ */
+function extractTextContent(content: unknown[]): string {
+  const texts: string[] = [];
+  for (const item of content) {
+    if (isRecord(item) && item.type === "text" && typeof item.text === "string") {
+      texts.push(item.text);
+    }
+  }
+  return texts.join("\n");
+}
+
+/**
+ * Extract tool result content from a user message content array.
+ */
+function extractToolResultContent(content: unknown[]): string {
+  const results: string[] = [];
+  for (const item of content) {
+    if (isRecord(item) && item.type === "tool_result") {
+      const text = typeof item.content === "string" ? item.content : "";
+      results.push(text);
+    }
+  }
+  return results.join("\n");
+}
+
+/**
+ * Parse Claude Code stream-json (NDJSON) output.
+ * Each line is a JSON object with type: "system" | "assistant" | "user" | "result".
+ */
+export function parseClaudeCodeStreamOutput(stdout: string): ClaudeCodeParsedResult | null {
+  const lines = stdout.trim().split("\n");
+  const turns: ClaudeCodeTurnPayload[] = [];
+  let resultLine: Record<string, unknown> | null = null;
+  let model = "";
+  let turnIndex = 0;
+
+  for (const line of lines) {
+    let parsed: unknown;
+    try {
+      parsed = JSON.parse(line);
+    } catch {
+      continue;
+    }
+    if (!isRecord(parsed)) continue;
+
+    const type = parsed.type;
+
+    if (type === "system" && typeof parsed.model === "string") {
+      model = parsed.model;
+    }
+
+    if (type === "assistant" && isRecord(parsed.message)) {
+      const msg = parsed.message;
+      const content = Array.isArray(msg.content) ? msg.content : [];
+      const textContent = extractTextContent(content as unknown[]);
+      const toolCalls = extractToolCalls(content as unknown[]);
+
+      // Only record turns that have actual content
+      if (textContent !== "" || toolCalls.length > 0) {
+        turns.push({
+          index: turnIndex++,
+          role: "assistant",
+          content: textContent,
+          toolCalls: toolCalls.length > 0 ? toolCalls : null,
+        });
+      }
+    }
+
+    if (type === "user" && isRecord(parsed.message)) {
+      const msg = parsed.message;
+      const content = Array.isArray(msg.content) ? msg.content : [];
+      const resultContent = extractToolResultContent(content as unknown[]);
+
+      if (resultContent !== "") {
+        turns.push({
+          index: turnIndex++,
+          role: "tool_result",
+          content: resultContent,
+          toolCalls: null,
+        });
+      }
+    }
+
+    if (type === "result") {
+      resultLine = parsed;
+    }
+  }
+
+  if (resultLine === null) return null;
+
+  const sessionId = resultLine.session_id;
+  const result = resultLine.result;
+  const subtype = resultLine.subtype;
+
+  if (typeof sessionId !== "string" || typeof result !== "string" || typeof subtype !== "string") {
+    return null;
+  }
+
+  const usage = isRecord(resultLine.usage) ? resultLine.usage : {};
+
+  return {
+    type: safeString(resultLine.type, "result"),
+    subtype: subtype as ClaudeCodeParsedResult["subtype"],
+    result,
+    sessionId,
+    numTurns: safeNumber(resultLine.num_turns),
+    totalCostUsd: safeNumber(resultLine.total_cost_usd),
+    durationMs: safeNumber(resultLine.duration_ms),
+    model,
+    stopReason: safeString(resultLine.stop_reason),
+    usage: {
+      inputTokens: safeNumber(usage.input_tokens),
+      outputTokens: safeNumber(usage.output_tokens),
+      cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
+      cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
+    },
+    turns,
+  };
+}
+
+/**
+ * Legacy: parse Claude Code plain JSON output (non-streaming).
+ * Falls back when stream-json is not available.
+ */
 export function parseClaudeCodeJsonOutput(stdout: string): ClaudeCodeParsedResult | null {
  let parsed: unknown;
  try {
@@ -16,9 +174,7 @@ export function parseClaudeCodeJsonOutput(stdout: string): ClaudeCodeParsedResul
    return null;
  }

-  if (!isRecord(parsed)) {
-    return null;
-  }
+  if (!isRecord(parsed)) return null;

  const sessionId = parsed.session_id;
  const result = parsed.result;
@@ -28,44 +184,68 @@ export function parseClaudeCodeJsonOutput(stdout: string): ClaudeCodeParsedResul
    return null;
  }

+  const usage = isRecord(parsed.usage) ? parsed.usage : {};
+
  return {
-    type: typeof parsed.type === "string" ? parsed.type : "result",
+    type: safeString(parsed.type, "result"),
    subtype: subtype as ClaudeCodeParsedResult["subtype"],
    result,
    sessionId,
-    numTurns: typeof parsed.num_turns === "number" ? parsed.num_turns : 0,
-    totalCostUsd: typeof parsed.total_cost_usd === "number" ? parsed.total_cost_usd : 0,
-    durationMs: typeof parsed.duration_ms === "number" ? parsed.duration_ms : 0,
+    numTurns: safeNumber(parsed.num_turns),
+    totalCostUsd: safeNumber(parsed.total_cost_usd),
+    durationMs: safeNumber(parsed.duration_ms),
+    model: "",
+    stopReason: safeString(parsed.stop_reason),
+    usage: {
+      inputTokens: safeNumber(usage.input_tokens),
+      outputTokens: safeNumber(usage.output_tokens),
+      cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
+      cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
+    },
+    turns: [],
  };
 }

 type ClaudeCodeSchemaHashes = {
  detail: string;
+  turn: string;
  rawOutput: string;
 };

 async function registerSchemas(store: Store): Promise<ClaudeCodeSchemaHashes> {
  await bootstrap(store);
-  const [detail, rawOutput] = await Promise.all([
+  const [detail, turn, rawOutput] = await Promise.all([
    putSchema(store, CLAUDE_CODE_DETAIL_SCHEMA),
+    putSchema(store, CLAUDE_CODE_TURN_SCHEMA),
    putSchema(store, CLAUDE_CODE_RAW_OUTPUT_SCHEMA),
  ]);
-  return { detail, rawOutput };
+  return { detail, turn, rawOutput };
 }

-/** Store parsed Claude Code result as a CAS detail node. */
+/** Store parsed Claude Code result with per-turn breakdown as CAS detail nodes. */
 export async function storeClaudeCodeDetail(
  store: Store,
  parsed: ClaudeCodeParsedResult,
 ): Promise<{ detailHash: string; output: string; sessionId: string }> {
  const schemas = await registerSchemas(store);

+  // Store each turn as an individual CAS node
+  const turnHashes: string[] = [];
+  for (const turn of parsed.turns) {
+    const hash = await store.put(schemas.turn, turn);
+    turnHashes.push(hash);
+  }
+
  const detail: ClaudeCodeDetailPayload = {
    sessionId: parsed.sessionId,
+    model: parsed.model,
+    subtype: parsed.subtype,
+    durationMs: parsed.durationMs,
    numTurns: parsed.numTurns,
    totalCostUsd: parsed.totalCostUsd,
-    durationMs: parsed.durationMs,
-    subtype: parsed.subtype,
+    stopReason: parsed.stopReason,
+    usage: parsed.usage,
+    turns: turnHashes,
  };

  const detailHash = await store.put(schemas.detail, detail);
@@ -1,5 +1,38 @@
 export type ClaudeCodeResultSubtype = "success" | "error_max_turns" | "error_budget";

+/** A single tool call within an assistant turn. */
+export type ClaudeCodeToolCall = {
+  name: string;
+  input: string;
+};
+
+/** A single turn (assistant text, tool use, or tool result). */
+export type ClaudeCodeTurnPayload = {
+  index: number;
+  role: "assistant" | "tool_result";
+  content: string;
+  toolCalls: ClaudeCodeToolCall[] | null;
+};
+
+/** Top-level detail stored as CAS node. */
+export type ClaudeCodeDetailPayload = {
+  sessionId: string;
+  model: string;
+  subtype: string;
+  durationMs: number;
+  numTurns: number;
+  totalCostUsd: number;
+  stopReason: string;
+  usage: {
+    inputTokens: number;
+    outputTokens: number;
+    cacheReadInputTokens: number;
+    cacheCreationInputTokens: number;
+  };
+  turns: string[]; // CAS hashes of ClaudeCodeTurnPayload
+};
+
+/** Intermediate parsed result from stream-json output. */
 export type ClaudeCodeParsedResult = {
  type: string;
  subtype: ClaudeCodeResultSubtype;
@@ -8,12 +41,13 @@ export type ClaudeCodeParsedResult = {
  numTurns: number;
  totalCostUsd: number;
  durationMs: number;
-};
-
-export type ClaudeCodeDetailPayload = {
-  sessionId: string;
-  numTurns: number;
-  totalCostUsd: number;
-  durationMs: number;
-  subtype: string;
+  model: string;
+  stopReason: string;
+  usage: {
+    inputTokens: number;
+    outputTokens: number;
+    cacheReadInputTokens: number;
+    cacheCreationInputTokens: number;
+  };
+  turns: ClaudeCodeTurnPayload[];
 };
@@ -1,70 +1,17 @@
-import { mkdir, readFile, writeFile } from "node:fs/promises";
-import { dirname, join } from "node:path";
-
-import { resolveStorageRoot } from "@uncaged/workflow-agent-kit";
-import type { ThreadId } from "@uncaged/workflow-protocol";
-
-type HermesSessionCache = Record<string, string>;
-
-function getCachePath(): string {
-  return join(resolveStorageRoot(), "cache", "hermes-sessions.json");
-}
-
-function cacheKey(threadId: ThreadId, role: string): string {
-  return `${threadId}:${role}`;
-}
-
-function isRecord(value: unknown): value is Record<string, unknown> {
-  return typeof value === "object" && value !== null && !Array.isArray(value);
-}
-
-async function readCache(): Promise<HermesSessionCache> {
-  const path = getCachePath();
-  try {
-    const text = await readFile(path, "utf8");
-    const raw = JSON.parse(text) as unknown;
-    if (!isRecord(raw)) {
-      return {};
-    }
-    const cache: HermesSessionCache = {};
-    for (const [key, value] of Object.entries(raw)) {
-      if (typeof value === "string" && value !== "") {
-        cache[key] = value;
-      }
-    }
-    return cache;
-  } catch (e) {
-    const err = e as NodeJS.ErrnoException;
-    if (err.code === "ENOENT") {
-      return {};
-    }
-    throw e;
-  }
-}
-
-async function writeCache(cache: HermesSessionCache): Promise<void> {
-  const path = getCachePath();
-  await mkdir(dirname(path), { recursive: true });
-  await writeFile(path, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
-}
+// Re-export session cache from the shared agent-kit package.
+export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";

 export function isResumeDisabled(): boolean {
-  const flag = process.env.UWF_NO_RESUME;
-  return flag !== undefined && flag !== "";
-}
-
-export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
-  const cache = await readCache();
-  const sessionId = cache[cacheKey(threadId, role)];
-  return sessionId ?? null;
-}
-
-export async function setCachedSessionId(
-  threadId: ThreadId,
-  role: string,
-  sessionId: string,
-): Promise<void> {
-  const cache = await readCache();
-  cache[cacheKey(threadId, role)] = sessionId;
-  await writeCache(cache);
+  // Hermes ACP session/resume is broken: _restore fails for custom providers
+  // because resolve_runtime_provider("custom") throws and base_url/api_mode
+  // are lost in the fallback path.  Resume silently creates a new session
+  // (different sessionId, no history), causing empty-text responses.
+  // See: https://github.com/NousResearch/hermes-agent/issues/13489
+  // Disable by default until upstream fixes the bug.  Set UWF_HERMES_RESUME=1
+  // to opt back in.
+  const enableFlag = process.env.UWF_HERMES_RESUME;
+  if (enableFlag === "1" || enableFlag === "true") {
+    return false;
+  }
+  return true;
 }
@@ -13,6 +13,7 @@ export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
 export { createAgent } from "./run.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
+export { getCachedSessionId, setCachedSessionId } from "./session-cache.js";
 export type {
  AgentContext,
  AgentContinueFn,
@@ -0,0 +1,75 @@
+import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
+import { randomBytes } from "node:crypto";
+import { dirname, join } from "node:path";
+
+import type { ThreadId } from "@uncaged/workflow-protocol";
+
+import { resolveStorageRoot } from "./storage.js";
+
+type SessionCache = Record<string, string>;
+
+function getCachePath(): string {
+  return join(resolveStorageRoot(), "cache", "agent-sessions.json");
+}
+
+function cacheKey(threadId: ThreadId, role: string): string {
+  return `${threadId}:${role}`;
+}
+
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return typeof value === "object" && value !== null && !Array.isArray(value);
+}
+
+async function readCache(): Promise<SessionCache> {
+  const path = getCachePath();
+  try {
+    const text = await readFile(path, "utf8");
+    const raw = JSON.parse(text) as unknown;
+    if (!isRecord(raw)) {
+      return {};
+    }
+    const cache: SessionCache = {};
+    for (const [key, value] of Object.entries(raw)) {
+      if (typeof value === "string" && value !== "") {
+        cache[key] = value;
+      }
+    }
+    return cache;
+  } catch (e) {
+    const err = e as NodeJS.ErrnoException;
+    if (err.code === "ENOENT") {
+      return {};
+    }
+    throw e;
+  }
+}
+
+async function writeCache(cache: SessionCache): Promise<void> {
+  const path = getCachePath();
+  const dir = dirname(path);
+  await mkdir(dir, { recursive: true });
+  // Atomic write: write to temp file then rename to avoid partial reads on concurrent access.
+  // NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur.
+  // This is a safety net for future parallel execution.
+  const tmpPath = join(dir, `.agent-sessions.${randomBytes(4).toString("hex")}.tmp`);
+  await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
+  await rename(tmpPath, path);
+}
+
+/** Read the cached session ID for a thread+role pair. */
+export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
+  const cache = await readCache();
+  const sessionId = cache[cacheKey(threadId, role)];
+  return sessionId ?? null;
+}
+
+/** Write the session ID for a thread+role pair into the cache. */
+export async function setCachedSessionId(
+  threadId: ThreadId,
+  role: string,
+  sessionId: string,
+): Promise<void> {
+  const cache = await readCache();
+  cache[cacheKey(threadId, role)] = sessionId;
+  await writeCache(cache);
+}
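Note on the `writeCache` change above: the temp-file-plus-rename pattern can be tried in isolation with the sketch below. It is a minimal illustration, not the package's code. `writeJsonAtomic`, `readJsonOrEmpty`, and `demo` are hypothetical names, and the demo writes under a fresh directory from `mkdtemp` rather than the real storage root.

```typescript
import { randomBytes } from "node:crypto";
import { mkdir, mkdtemp, readFile, rename, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

type SessionCache = Record<string, string>;

// Hypothetical helper: write the temp file in the SAME directory as the
// target, then rename it over the target, so readers never see a
// half-written file.
async function writeJsonAtomic(path: string, cache: SessionCache): Promise<void> {
  const dir = dirname(path);
  await mkdir(dir, { recursive: true });
  const tmpPath = join(dir, `.${randomBytes(4).toString("hex")}.tmp`);
  await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
  await rename(tmpPath, path);
}

// Hypothetical helper: a missing file is treated as an empty cache.
async function readJsonOrEmpty(path: string): Promise<SessionCache> {
  try {
    return JSON.parse(await readFile(path, "utf8")) as SessionCache;
  } catch (e) {
    if ((e as { code?: string }).code === "ENOENT") {
      return {};
    }
    throw e;
  }
}

// Read-modify-write round trip against a throwaway directory.
async function demo(): Promise<SessionCache> {
  const root = await mkdtemp(join(tmpdir(), "session-cache-demo-"));
  const path = join(root, "cache", "agent-sessions.json");
  const before = await readJsonOrEmpty(path);
  await writeJsonAtomic(path, { ...before, "thread-1:reviewer": "session-abc" });
  return readJsonOrEmpty(path);
}
```

The rename only protects readers from torn files. Two writers that both read, modify, and write can still overwrite each other's entries, so if execution ever becomes parallel the read-modify-write in `setCachedSessionId` would likely need a lock as well.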
Author	SHA1	Message	Date
xingyue	44147da419	fix(builtin): split prompt into system/user messages System message = agent identity (role prompt + output format instruction) User message = moderator speech (task + edge prompt + history) This reflects the workflow's core model: moderator speaks to agent via the graph's edge prompt. Previously all content was in a single system message with no user message, causing Claude API 400 errors. - buildBuiltinPrompt now returns { system, user } instead of string - agent.ts sends system + user as separate messages - Tests updated accordingly	2026-05-23 17:15:23 +08:00
xiaoju	0e5b494e12	chore(debate): remove round limit, let step control drive pacing	2026-05-23 08:31:07 +00:00
xiaomo	747b318cc5	Merge pull request 'feat(claude-code): enrich step details with per-turn breakdown' (#423) from feat/422-claude-code-detail-enrichment into main	2026-05-23 08:19:20 +00:00
xiaoju	d16ce44bc3	feat(claude-code): enrich step details with per-turn breakdown Switch from --output-format json to stream-json --verbose to capture per-turn data. Detail now includes: - model name - usage (input/output/cache tokens) - stopReason - turns[] as individual CAS nodes with role, content, tool calls Also addresses PR #421 review fixes: - sessionId guard: skip cache write when sessionId is empty/undefined - silent catch: log resume failures with debug tag 5VKR8N3Q - atomic write: session cache uses temp+rename for crash safety Closes #422	2026-05-23 08:16:47 +00:00
xiaomo	45122bc458	Merge pull request 'fix: disable hermes resume, add claude-code resume support, debate workflow' (#421) from test/418-resume-e2e-repro into main	2026-05-23 07:59:53 +00:00
xiaomo	3183b4c879	Merge pull request 'feat: add @uncaged/workflow-agent-builtin package' (#420) from feat/builtin-agent into main	2026-05-23 07:57:44 +00:00
xiaoju	03eacbabb2	feat: add debate workflow for resume integration testing Two-role debate (against/for) with up to 3 rounds per side. Each role re-enters with session resume, making this an ideal integration test for cross-process session continuity. Supports early termination via concession (conceded=true in frontmatter). Refs #418	2026-05-23 07:50:38 +00:00
xiaoju	1afaeacd57	feat: extract session cache to agent-kit, add resume to claude-code agent Move getCachedSessionId/setCachedSessionId from workflow-agent-hermes into workflow-agent-kit so all agent adapters can share the same session cache logic. Add cross-process session resume to workflow-agent-claude-code: on re-entry (isFirstVisit=false), look up the cached sessionId and use 'claude --resume' to continue with full conversation history. Cache file renamed from hermes-sessions.json to agent-sessions.json to reflect its shared nature. Refs #418	2026-05-23 07:44:02 +00:00
xiaoju	aad2792754	fix(hermes): disable ACP session/resume by default Hermes ACP _restore fails for custom providers — resolve_runtime_provider throws and base_url/api_mode are lost, causing resume to silently create a new session with no history. Prompt then returns empty text or refusal. Disable resume by default. Set UWF_HERMES_RESUME=1 to opt back in. Includes investigation notes in docs/investigations/. Refs #418	2026-05-23 07:23:14 +00:00
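
The opt-in behaviour described in commit aad2792754 can be checked with a small sketch. This is an assumption-laden variant, not the shipped function: `resumeDisabled` is a hypothetical name, and it takes the environment as a parameter (instead of reading `process.env`) so the cases are easy to exercise.

```typescript
// Hypothetical, parameterised variant of the flag check in the diff.
type Env = Record<string, string | undefined>;

function resumeDisabled(env: Env): boolean {
  const flag = env.UWF_HERMES_RESUME;
  // Resume stays disabled unless the flag is exactly "1" or "true".
  return !(flag === "1" || flag === "true");
}

// Example inputs covering the default and both opt-in spellings.
const cases: Array<[Env, boolean]> = [
  [{}, true],
  [{ UWF_HERMES_RESUME: "1" }, false],
  [{ UWF_HERMES_RESUME: "true" }, false],
  [{ UWF_HERMES_RESUME: "yes" }, true],
  [{ UWF_HERMES_RESUME: "" }, true],
];

const allMatch = cases.every(([env, expected]) => resumeDisabled(env) === expected);
```

Since the diff removes the earlier `UWF_NO_RESUME` check, that variable is no longer read by this function. Any value other than the two accepted spellings leaves resume off.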