fix: address PR review — sessionId guard, resume error logging, atomic cache write

1. Guard against undefined sessionId before writing to cache
2. Log resume failures instead of silent catch
3. Atomic write (temp + rename) for session cache file
4. Add @uncaged/workflow-util dependency to claude-code agent

Refs #418
9 changed files with 337 additions and 83 deletions
@@ -0,0 +1,73 @@
+# Issue #418: ACP session/resume returns empty text
+
+## Investigation date: 2026-05-23
+
+## Root cause
+
+On the restore path of `session/resume`, `_make_agent()` fails and the exception is silently swallowed.
+
+### Full call chain
+
+```
+resume_session(sid)
+  → update_cwd(sid)
+    → get_session(sid) → _restore(sid)
+      → _make_agent()
+        → resolve_runtime_provider("custom") fails (lines 548-561)
+        → AIAgent() raises "No LLM provider configured" (line 564)
+      → except Exception silently swallows it (lines 482-484) → return None
+    → return None
+  → state is None → fallback: create_session() (new sid, no history)
+```
+
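+### Key code locations (acp_adapter/session.py)
+
+- `_restore()`, lines 426-498: restores the session from the DB, but its `except` clause is too broad
+- `_make_agent()`, lines 520-568: provider resolution is incomplete on the restore path
+- Lines 548-561: after `resolve_runtime_provider("custom")` fails, `base_url` has been read from the DB but is not passed to AIAgent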
+
+### Observed behavior
+
+1. Phase 1: `session/new` + `prompt` → works, `agent_message_chunk` is emitted
+2. Phase 2: `session/resume` + `prompt`
+   - resume reports success, but the sessionId in `available_commands_update` is a new one (the create_session fallback)
+   - prompt with the original sid → `stopReason: "refusal"` (the session is not in memory)
+   - prompt with the new sid → runs, but without history (the agent answers that it does not know the secret code)
+
+### Verification script
+
+```bash
+# Call _restore directly to verify
+cd ~/.hermes/hermes-agent
+python3 -c "
+import sys; sys.path.insert(0, '.')
+from acp_adapter.session import SessionManager
+sm = SessionManager()
+result = sm._restore('SESSION_ID_HERE')
+print(result)  # None: _make_agent raises and the exception is swallowed
+"
+```
+
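+### Two bugs
+
+1. **Incomplete provider fallback in `_make_agent`**: on restore, the DB has `base_url` and `api_mode`, but after `resolve_runtime_provider` fails these values are not passed on to AIAgent correctly
+2. **Overly broad `except` in `_restore`**: it silently swallows every exception and logs the warning only at debug level, so a failed resume goes completely unnoticed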
+
+### Hermes versions
+
+- v0.10.0 (2026.4.16): initial test
+- v0.14.0 (2026.5.16): retested after updating, bug still present
+- Code path: ~/.hermes/hermes-agent/acp_adapter/session.py
+
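+### v0.14.0 test results (2026-05-23)
+
+- `_restore` still returns None because `custom` provider resolution fails
+- Logging is clearer now: `WARNING: Failed to recreate agent for ACP session ...`
+- The resume fallback creates a new session (new sid), yet the agent was surprisingly able to answer the earlier question (possibly via memory/session search)
+- The core problem is unchanged: the sessionId changes, so a client that prompts with the old sid gets a refusal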
+
+### Upstream issues
+
+- https://github.com/NousResearch/hermes-agent/issues/13489 (root-cause analysis posted as a comment)
+- https://github.com/NousResearch/hermes-agent/issues/8083 (resume silently creates a new session)
+- https://github.com/NousResearch/hermes-agent/issues/18452 (incomplete `_make_agent` fallback)
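
Review note: the investigation's core finding is that resume can report success while handing back a different sessionId. A client-side guard for that case might look like the sketch below. `ResumeOutcome` and `classifyResume` are hypothetical names introduced here for illustration; they are not part of Hermes or `HermesAcpClient`, and the sketch assumes only that the client can observe the sessionId reported after `session/resume`.

```typescript
// Hypothetical helper, not part of Hermes or HermesAcpClient.
type ResumeOutcome =
  | { kind: "resumed"; sessionId: string }
  | { kind: "silent-fallback"; requested: string; actual: string }
  | { kind: "missing" };

// Compare the sessionId we asked to resume with the one the server reported.
function classifyResume(requested: string, reported: string | undefined): ResumeOutcome {
  if (reported === undefined || reported === "") {
    return { kind: "missing" };
  }
  if (reported !== requested) {
    // The server created a fresh session instead of restoring the old one.
    return { kind: "silent-fallback", requested, actual: reported };
  }
  return { kind: "resumed", sessionId: reported };
}
```

Treating `silent-fallback` as a failed resume would let the caller start a new session deliberately, rather than prompting with a stale sid and getting a refusal.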
@@ -0,0 +1,83 @@
+name: "debate"
+description: "Structured debate between two sides. Tests cross-process session resume."
+roles:
+  against:
+    description: "Argues against the proposition"
+    goal: |
+      You are a skilled debater arguing AGAINST the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. If this is the opening, present your strongest argument against the proposition.
+      2. If responding to the other side, directly counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+  for:
+    description: "Argues for the proposition"
+    goal: |
+      You are a skilled debater arguing FOR the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. Read the opposing side's latest argument carefully.
+      2. Counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+conditions:
+  againstConceded:
+    description: "The against side conceded"
+    expression: "$last('against').conceded = true"
+  forConceded:
+    description: "The for side conceded"
+    expression: "$last('for').conceded = true"
+  moreRounds:
+    description: "Fewer than 3 rounds completed per side"
+    expression: "$count(steps[role = 'against']) < 3"
+graph:
+  $START:
+    - role: "against"
+      condition: null
+      prompt: "Present your opening argument against the proposition."
+  against:
+    - role: "$END"
+      condition: "againstConceded"
+      prompt: "The against side conceded. Debate over."
+    - role: "for"
+      condition: null
+      prompt: "Counter the opposing argument. Address their points directly."
+  for:
+    - role: "$END"
+      condition: "forConceded"
+      prompt: "The for side conceded. Debate over."
+    - role: "against"
+      condition: "moreRounds"
+      prompt: "Counter the opposing argument. Address their points directly."
+    - role: "$END"
+      condition: null
+      prompt: "Maximum rounds reached. Debate over."
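
Review note: the routing in the `debate` graph can be checked with a small simulation. This is only a sketch that mirrors the three conditions as plain TypeScript; the real moderator evaluates the YAML expressions, and `Step`, `nextRole`, and `simulate` are hypothetical names that do not exist in the workflow engine.

```typescript
// Hypothetical simulation of the debate graph; it mirrors the YAML
// conditions but does not use the real moderator or expression engine.
type Role = "against" | "for";
interface Step {
  role: Role;
  conceded: boolean;
}

function nextRole(steps: Step[]): Role | "$END" {
  const last = steps[steps.length - 1];
  if (last === undefined) return "against"; // $START edge
  if (last.conceded) return "$END"; // againstConceded / forConceded
  if (last.role === "against") return "for"; // unconditional edge
  // After "for": loop back while fewer than 3 "against" steps exist (moreRounds).
  const againstCount = steps.filter((s) => s.role === "against").length;
  return againstCount < 3 ? "against" : "$END";
}

// Run the graph to completion; concedeAt is the 1-based step that concedes, or null.
function simulate(concedeAt: number | null): Role[] {
  const steps: Step[] = [];
  while (true) {
    const next = nextRole(steps);
    if (next === "$END") break;
    steps.push({ role: next, conceded: steps.length + 1 === concedeAt });
  }
  return steps.map((s) => s.role);
}
```

With no concession, `simulate(null)` visits each role three times, so each role re-enters twice, which is the re-entry pattern the resume test relies on.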
@@ -22,7 +22,8 @@
  },
  "dependencies": {
    "@uncaged/json-cas": "^0.4.0",
-    "@uncaged/workflow-agent-kit": "workspace:^"
+    "@uncaged/workflow-agent-kit": "workspace:^",
+    "@uncaged/workflow-util": "workspace:^"
  },
  "devDependencies": {
    "typescript": "^5.8.3"
@@ -6,13 +6,18 @@ import {
  type AgentRunResult,
  buildRolePrompt,
  createAgent,
+  getCachedSessionId,
+  setCachedSessionId,
 } from "@uncaged/workflow-agent-kit";
+import { createLogger } from "@uncaged/workflow-util";

 import { parseClaudeCodeJsonOutput, storeClaudeCodeDetail } from "./session-detail.js";

 const CLAUDE_COMMAND = "claude";
 const CLAUDE_MAX_TURNS = 90;

+const log = createLogger({ sink: { kind: "stderr" } });
+
 function buildHistorySummary(steps: AgentContext["steps"]): string {
  if (steps.length === 0) {
    return "";
@@ -125,8 +130,31 @@ async function processClaudeOutput(stdout: string, store: Store): Promise<AgentR

 async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
  const fullPrompt = buildClaudeCodePrompt(ctx);
+
+  // Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
+  if (!ctx.isFirstVisit) {
+    const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
+    if (cachedSessionId !== null) {
+      try {
+        const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
+        const result = await processClaudeOutput(stdout, ctx.store);
+        if (result.sessionId !== "") {
+          await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+        }
+        return result;
+      } catch (error) {
+        const message = error instanceof Error ? error.message : String(error);
+        log("5VKR8N3Q", `session resume failed, falling back to new session: ${message}`);
+      }
+    }
+  }
+
  const { stdout } = await spawnClaudeRun(fullPrompt);
-  return processClaudeOutput(stdout, ctx.store);
+  const result = await processClaudeOutput(stdout, ctx.store);
+  if (result.sessionId !== "") {
+    await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+  }
+  return result;
 }

 async function continueClaudeCode(
@@ -0,0 +1,56 @@
+import { afterEach, describe, expect, it } from "bun:test";
+
+import { HermesAcpClient } from "../src/acp-client.js";
+
+/**
+ * E2E test for cross-process session resume.
+ *
+ * Simulates the workflow re-entry scenario:
+ * 1. Client A: connect → prompt → close (developer first run)
+ * 2. Client B: resume(sessionId) → prompt (developer re-entry after reviewer reject)
+ *
+ * This is what happens when uwf thread step spawns uwf-hermes twice for the same role.
+ */
+describe("HermesAcpClient cross-process resume", () => {
+  const clients: HermesAcpClient[] = [];
+
+  afterEach(async () => {
+    for (const c of clients) {
+      await c.close();
+    }
+    clients.length = 0;
+  });
+
+  it(
+    "resume() after close — second prompt returns non-empty text",
+    async () => {
+      // --- Client A: first run ---
+      const clientA = new HermesAcpClient();
+      clients.push(clientA);
+
+      await clientA.connect(process.cwd());
+      const first = await clientA.prompt(
+        "Remember the secret code: WATERMELON. Reply with exactly: ACKNOWLEDGED",
+      );
+      expect(first.text.length).toBeGreaterThan(0);
+      const sessionId = first.sessionId;
+
+      // Close client A (simulates uwf-hermes process exit)
+      await clientA.close();
+
+      // --- Client B: resume (simulates re-entry) ---
+      const clientB = new HermesAcpClient();
+      clients.push(clientB);
+
+      await clientB.resume(sessionId, process.cwd());
+      const second = await clientB.prompt(
+        "What was the secret code I told you earlier? Reply with just the code word.",
+      );
+
+      // The critical assertion: resumed session produces non-empty output
+      expect(second.text.length).toBeGreaterThan(0);
+      expect(second.sessionId).toBe(sessionId);
+    },
+    { timeout: 3 * 60 * 1000 },
+  );
+});
@@ -1,70 +1,17 @@
-import { mkdir, readFile, writeFile } from "node:fs/promises";
-import { dirname, join } from "node:path";
-
-import { resolveStorageRoot } from "@uncaged/workflow-agent-kit";
-import type { ThreadId } from "@uncaged/workflow-protocol";
-
-type HermesSessionCache = Record<string, string>;
-
-function getCachePath(): string {
-  return join(resolveStorageRoot(), "cache", "hermes-sessions.json");
-}
-
-function cacheKey(threadId: ThreadId, role: string): string {
-  return `${threadId}:${role}`;
-}
-
-function isRecord(value: unknown): value is Record<string, unknown> {
-  return typeof value === "object" && value !== null && !Array.isArray(value);
-}
-
-async function readCache(): Promise<HermesSessionCache> {
-  const path = getCachePath();
-  try {
-    const text = await readFile(path, "utf8");
-    const raw = JSON.parse(text) as unknown;
-    if (!isRecord(raw)) {
-      return {};
-    }
-    const cache: HermesSessionCache = {};
-    for (const [key, value] of Object.entries(raw)) {
-      if (typeof value === "string" && value !== "") {
-        cache[key] = value;
-      }
-    }
-    return cache;
-  } catch (e) {
-    const err = e as NodeJS.ErrnoException;
-    if (err.code === "ENOENT") {
-      return {};
-    }
-    throw e;
-  }
-}
-
-async function writeCache(cache: HermesSessionCache): Promise<void> {
-  const path = getCachePath();
-  await mkdir(dirname(path), { recursive: true });
-  await writeFile(path, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
-}
+// Re-export session cache from the shared agent-kit package.
+export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";

 export function isResumeDisabled(): boolean {
-  const flag = process.env.UWF_NO_RESUME;
-  return flag !== undefined && flag !== "";
-}
-
-export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
-  const cache = await readCache();
-  const sessionId = cache[cacheKey(threadId, role)];
-  return sessionId ?? null;
-}
-
-export async function setCachedSessionId(
-  threadId: ThreadId,
-  role: string,
-  sessionId: string,
-): Promise<void> {
-  const cache = await readCache();
-  cache[cacheKey(threadId, role)] = sessionId;
-  await writeCache(cache);
+  // Hermes ACP session/resume is broken: _restore fails for custom providers
+  // because resolve_runtime_provider("custom") throws and base_url/api_mode
+  // are lost in the fallback path.  Resume silently creates a new session
+  // (different sessionId, no history), causing empty-text responses.
+  // See: https://github.com/NousResearch/hermes-agent/issues/13489
+  // Disable by default until upstream fixes the bug.  Set UWF_HERMES_RESUME=1
+  // to opt back in.
+  const enableFlag = process.env.UWF_HERMES_RESUME;
+  if (enableFlag === "1" || enableFlag === "true") {
+    return false;
+  }
+  return true;
 }
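
Review note: the opt-in check is easier to unit-test if the environment is passed in. The sketch below is a hypothetical pure variant (`isResumeDisabledFor` is not in the PR) that keeps the same rule: only the exact values `"1"` and `"true"` enable resume.

```typescript
// Hypothetical pure variant of isResumeDisabled(); the environment is an argument.
function isResumeDisabledFor(env: Record<string, string | undefined>): boolean {
  const flag = env.UWF_HERMES_RESUME;
  // Resume stays disabled unless the flag is exactly "1" or "true".
  return !(flag === "1" || flag === "true");
}
```

Calling it with `process.env` should match the function in the diff. Note that the previous `UWF_NO_RESUME` variable is no longer read after this change, and values such as `"TRUE"` or `"yes"` do not enable resume.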
@@ -13,6 +13,7 @@ export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
 export { createAgent } from "./run.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
+export { getCachedSessionId, setCachedSessionId } from "./session-cache.js";
 export type {
  AgentContext,
  AgentContinueFn,
@@ -0,0 +1,78 @@
+import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
+import { dirname, join } from "node:path";
+
+import type { ThreadId } from "@uncaged/workflow-protocol";
+
+import { resolveStorageRoot } from "./storage.js";
+
+type SessionCache = Record<string, string>;
+
+function getCachePath(): string {
+  return join(resolveStorageRoot(), "cache", "agent-sessions.json");
+}
+
+function cacheKey(threadId: ThreadId, role: string): string {
+  return `${threadId}:${role}`;
+}
+
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return typeof value === "object" && value !== null && !Array.isArray(value);
+}
+
+async function readCache(): Promise<SessionCache> {
+  const path = getCachePath();
+  try {
+    const text = await readFile(path, "utf8");
+    const raw = JSON.parse(text) as unknown;
+    if (!isRecord(raw)) {
+      return {};
+    }
+    const cache: SessionCache = {};
+    for (const [key, value] of Object.entries(raw)) {
+      if (typeof value === "string" && value !== "") {
+        cache[key] = value;
+      }
+    }
+    return cache;
+  } catch (e) {
+    const err = e as NodeJS.ErrnoException;
+    if (err.code === "ENOENT") {
+      return {};
+    }
+    throw e;
+  }
+}
+
+/**
+ * Atomic write: write to a temp file, then rename.
+ * Prevents partial reads if another process reads mid-write.
+ * Note: read-modify-write is still not concurrency-safe across processes;
+ * the current workflow engine runs agent steps sequentially (execFileSync),
+ * so this is sufficient.  If parallel execution is added later, a proper
+ * lockfile (e.g. proper-lockfile) will be needed.
+ */
+async function writeCache(cache: SessionCache): Promise<void> {
+  const path = getCachePath();
+  const tmpPath = `${path}.${process.pid}.tmp`;
+  await mkdir(dirname(path), { recursive: true });
+  await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
+  await rename(tmpPath, path);
+}
+
+/** Read the cached session ID for a thread+role pair. */
+export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
+  const cache = await readCache();
+  const sessionId = cache[cacheKey(threadId, role)];
+  return sessionId ?? null;
+}
+
+/** Write the session ID for a thread+role pair into the cache. */
+export async function setCachedSessionId(
+  threadId: ThreadId,
+  role: string,
+  sessionId: string,
+): Promise<void> {
+  const cache = await readCache();
+  cache[cacheKey(threadId, role)] = sessionId;
+  await writeCache(cache);
+}
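
Review note: one gap in the temp-file-plus-rename write is that a failed `writeFile` or `rename` leaves the `.tmp` file behind. The sketch below shows one possible cleanup, using the synchronous `node:fs` calls so it can run top to bottom as a script. `writeJsonAtomic` and the demo paths are hypothetical and not part of the PR; the rename is atomic only when both paths are on the same filesystem.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, readdirSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical synchronous variant of the temp-file-plus-rename pattern.
function writeJsonAtomic(path: string, value: unknown): void {
  const tmpPath = `${path}.${process.pid}.tmp`;
  mkdirSync(dirname(path), { recursive: true });
  try {
    writeFileSync(tmpPath, `${JSON.stringify(value, null, 2)}\n`, "utf8");
    renameSync(tmpPath, path); // replaces the target in one step on the same filesystem
  } catch (error) {
    rmSync(tmpPath, { force: true }); // do not leave a stale temp file behind
    throw error;
  }
}

// Demo against a throwaway directory (paths are illustrative only).
const demoDir = mkdtempSync(join(tmpdir(), "session-cache-demo-"));
const demoPath = join(demoDir, "cache", "agent-sessions.json");
writeJsonAtomic(demoPath, { "thread-1:developer": "session-abc" });
const roundTrip = JSON.parse(readFileSync(demoPath, "utf8")) as Record<string, string>;
const leftovers = readdirSync(dirname(demoPath));
```

As the doc comment in the diff already says, this protects readers from partial files but does not make the read-modify-write cycle safe under concurrent writers.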
@@ -26,7 +26,6 @@ uwf workflow list                 # list all registered workflows
 uwf thread start <workflow> -p <prompt>           # create a thread (no execution)
 uwf thread step <thread-id>                       # execute one moderator→agent→extract cycle
               [--agent <cmd>]                    # override agent command
-               [-c, --count <number>]             # run multiple steps (default: 1)
 uwf thread show <thread-id>                       # show thread head pointer
 uwf thread list                                   # list active threads
               [--all]                            # include archived threads
@@ -57,17 +56,6 @@ uwf cas schema list               # list all registered schemas
 uwf cas schema get <hash>         # show a schema by its type hash
 \`\`\`

-## Log Commands
-
-\`\`\`
-uwf log list                      # list log files with sizes
-uwf log show                      # show all log entries
-           [--thread <thread-id>] # filter by thread ID
-           [--process <pid>]      # filter by process ID
-           [--date <YYYY-MM-DD>]  # filter by date
-uwf log clean --before <date>     # delete log files before given date
-\`\`\`
-
 ## Global Options

 \`\`\`
@@ -81,7 +69,6 @@ uwf -V, --version                 # print version
 - **Thread**: A single workflow execution (ULID). State is an immutable CAS chain; active threads are indexed in \`threads.yaml\`.
 - **Step**: One moderator→agent→extract cycle. Run \`uwf thread step\` repeatedly until \`$END\`.
 - **CAS**: Content-Addressed Storage — all nodes are immutable and identified by hash.
-- **Role**: Named actor with goal, capabilities, procedure, output, and frontmatter schema; the moderator routes between roles.
-- **Edge Prompt**: Required instruction on each graph edge — the moderator's dispatch message to the agent.
+- **Role**: Named actor with goal, capabilities, procedure, output, and meta; the moderator routes between roles.
 `;
 }
Author	SHA1	Message	Date
xiaoju	24802f51db	fix: address PR review — sessionId guard, resume error logging, atomic cache write 1. Guard against undefined sessionId before writing to cache 2. Log resume failures instead of silent catch 3. Atomic write (temp + rename) for session cache file 4. Add @uncaged/workflow-util dependency to claude-code agent Refs #418	2026-05-23 08:03:39 +00:00
xiaoju	03eacbabb2	feat: add debate workflow for resume integration testing Two-role debate (against/for) with up to 3 rounds per side. Each role re-enters with session resume, making this an ideal integration test for cross-process session continuity. Supports early termination via concession (conceded=true in frontmatter). Refs #418	2026-05-23 07:50:38 +00:00
xiaoju	1afaeacd57	feat: extract session cache to agent-kit, add resume to claude-code agent Move getCachedSessionId/setCachedSessionId from workflow-agent-hermes into workflow-agent-kit so all agent adapters can share the same session cache logic. Add cross-process session resume to workflow-agent-claude-code: on re-entry (isFirstVisit=false), look up the cached sessionId and use 'claude --resume' to continue with full conversation history. Cache file renamed from hermes-sessions.json to agent-sessions.json to reflect its shared nature. Refs #418	2026-05-23 07:44:02 +00:00
xiaoju	aad2792754	fix(hermes): disable ACP session/resume by default Hermes ACP _restore fails for custom providers — resolve_runtime_provider throws and base_url/api_mode are lost, causing resume to silently create a new session with no history. Prompt then returns empty text or refusal. Disable resume by default. Set UWF_HERMES_RESUME=1 to opt back in. Includes investigation notes in docs/investigations/. Refs #418	2026-05-23 07:23:14 +00:00
xiaoju	3b6aa6525f	test: add failing e2e test for session resume bug (#418 ) Cross-process resume returns empty text on subsequent prompt. This test documents the bug — expected to fail until #418 is fixed.	2026-05-23 06:43:47 +00:00