refactor: address review feedback for CLI restructure

- Extract shared module (shared.ts) — walkChain, expandDeep, etc. deduplicated - Hide step read command (half-baked, not ready for users) - Remove cmdThreadKill dead code - Revert unrelated protocol type change - Revert unrelated package.json change - Fix unused imports (biome) Refs #463
feat(step): expand detail CAS refs by default in step list
2026-05-24 11:32:47 +00:00 · 2026-05-24 11:12:22 +00:00 · 2026-05-24 11:04:02 +00:00 · 2026-05-24 10:50:49 +00:00 · 2026-05-24 10:46:31 +00:00 · 2026-05-24 10:40:32 +00:00
18 changed files with 2045 additions and 512 deletions
@@ -38,19 +38,26 @@ roles:
    capabilities:
      - coding
    procedure: |
-      Before starting any work, ensure a clean worktree:
+      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
-      1. `git checkout main && git pull` to get the latest code
+
-      2. `git checkout -b fix/<issue-number>-<short-description>` to create a fresh branch
+      Before starting any work, set up an isolated worktree:
-         - If bounced back from reviewer or tester, reuse the existing branch and rebase onto latest main:
+      1. `cd ~/repos/workflow && git fetch origin` to get latest refs
-           `git checkout main && git pull && git checkout <branch> && git rebase main`
+      2. First time (no existing branch):
         - `git worktree add ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> && bun install`
      3. If bounced back from reviewer or tester (branch already exists):
         - The worktree should already exist at `~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
         - `git fetch origin && git rebase origin/main`
      4. ALL subsequent work must happen inside the worktree directory.
      Then implement TDD:
-      3. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
+      5. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
-      4. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
+      6. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
-      5. Write tests first based on the spec
+      7. Write tests first based on the spec
-      6. Implement the code to make tests pass
+      8. Implement the code to make tests pass
-      7. Ensure `bun run build` passes with no errors
+      9. Ensure `bun run build` passes with no errors
-      8. Run `bun test` to verify all tests pass
+      10. Run `bun test` to verify all tests pass
    output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
    frontmatter:
      type: object
@@ -66,6 +73,8 @@ roles:
      - code-review
      - static-analysis
    procedure: |
      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
      Before reviewing, verify the git branch:
      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
      2. If the branch doesn't correspond to the issue, flag it in your output and reject
@@ -99,6 +108,8 @@ roles:
    capabilities:
      - testing
    procedure: |
      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
      1. Run `bun test` for automated test verification
      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
      3. Verify each scenario in the spec is covered and passing
@@ -119,6 +130,8 @@ roles:
    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
    capabilities: []
    procedure: |
      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
      1. Stage all changes: `git add -A`
      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
@@ -126,6 +139,9 @@ roles:
         - If push hook fails: capture the error log in your output, mark hook_failed
      4. On push success: create a PR via `tea pr create --title "..." --description "..."`
         - PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
      5. After PR creation, clean up the worktree:
         - `cd ~/repos/workflow`
         - `git worktree remove ~/repos/workflow-worktrees/fix/<issue-number>-<slug>`
    output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
    frontmatter:
      type: object
@@ -6,6 +6,18 @@
 Layer 4 entry point for the workflow engine. The `uwf` binary orchestrates one step per invocation: load thread head from `threads.yaml`, run the moderator, spawn the configured agent CLI, run extract, append a CAS step node, and update the head pointer (or archive when `$END`).
 ### Four-Layer Architecture
 ```
 workflow → thread → step → turn
 模板定义   执行实例   单步结果   agent内部交互
 ```
 - **Workflow** (layer 1): YAML template with roles and routing graph
 - **Thread** (layer 2): Single workflow execution instance
 - **Step** (layer 3): One moderator→agent→extract cycle
 - **Turn** (layer 4): Agent-internal interactions (use `step read` or CAS to inspect)
 This package has no library `src/index.ts` — it is consumed as a CLI binary only.
 **Dependencies:** `@uncaged/json-cas`, `@uncaged/json-cas-fs`, `@uncaged/workflow-agent-kit`, `@uncaged/workflow-moderator`, `@uncaged/workflow-protocol`, `@uncaged/workflow-util`, `commander`, `dotenv`, `yaml`
@@ -30,34 +42,53 @@ bun link packages/cli-workflow
 -h, --help             Show help
 ```
-### Thread
+### Thread (Layer 2: Execution Instances)
 | Command | Description |
 |---------|-------------|
 | `uwf thread start <workflow> -p <prompt>` | Create a thread without executing |
-| `uwf thread step <thread-id> [--agent <cmd>] [-c <count>]` | Execute one or more moderator→agent→extract cycles |
+| `uwf thread exec <thread-id> [--agent <cmd>] [-c <count>] [--background]` | Execute one or more moderator→agent→extract cycles |
 | `uwf thread show <thread-id>` | Show thread head pointer |
-| `uwf thread list [--all]` | List active threads (`--all` includes archived) |
+| `uwf thread list [--status <idle\|running\|completed>]` | List threads, optionally filtered by status |
 | `uwf thread steps <thread-id>` | List all steps chronologically |
 | `uwf thread read <thread-id> [--quota N] [--before <hash>] [--start]` | Render thread as readable markdown |
-| `uwf thread fork <step-hash>` | Fork from a specific step |
+| `uwf thread stop <thread-id>` | Stop background execution (keep thread active) |
-| `uwf thread step-details <step-hash>` | Dump full detail node as YAML |
+| `uwf thread cancel <thread-id>` | Cancel thread (stop + archive to history) |
 | `uwf thread kill <thread-id>` | Terminate and archive |
 Examples:
 ```bash
 uwf thread start solve-issue -p "Fix the login redirect bug"
-uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV
+uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV
-uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin
+uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin
 uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV --background
 uwf thread list --status running
 uwf thread read 01ARZ3NDEKTSV4RRFFQ69G5FAV --quota 8000
 uwf thread stop 01ARZ3NDEKTSV4RRFFQ69G5FAV
 ```
-### Workflow
+### Step (Layer 3: Single Cycle Results)
 | Command | Description |
 |---------|-------------|
-| `uwf workflow put <file.yaml>` | Register a workflow from YAML |
+| `uwf step list <thread-id>` | List all steps in a thread chronologically |
 | `uwf step show <step-hash>` | Show step metadata and frontmatter |
 | `uwf step read <step-hash> [--before N]` | Read step output as markdown |
 | `uwf step fork <step-hash>` | Fork a thread from a specific step |
 Examples:
 ```bash
 uwf step list 01ARZ3NDEKTSV4RRFFQ69G5FAV
 uwf step show 32GCDE899RRQ3
 uwf step read 32GCDE899RRQ3 --before 3
 uwf step fork 32GCDE899RRQ3
 ```
 ### Workflow (Layer 1: Templates)
 | Command | Description |
 |---------|-------------|
 | `uwf workflow add <file.yaml>` | Register a workflow from YAML |
 | `uwf workflow show <name-or-hash>` | Show workflow definition |
 | `uwf workflow list` | List registered workflows |
@@ -99,6 +130,52 @@ Config: `~/.uncaged/workflow/config.yaml`. API keys: `~/.uncaged/workflow/.env`.
 | `uwf log show [--thread <id>] [--process <pid>] [--date YYYY-MM-DD]` | Show filtered log entries |
 | `uwf log clean [--before YYYY-MM-DD]` | Delete old log files |
 ## Migration Guide
 ### Breaking Changes (v0.x → v1.x)
 The CLI was reorganized to clarify the four-layer architecture. **No backward compatibility** — old commands have been removed.
 #### Renamed Commands
 | Old Command | New Command | Notes |
 |------------|-------------|-------|
 | `workflow put` | `workflow add` | More intuitive verb |
 | `thread step` | `thread exec` | Eliminates ambiguity with "step" noun |
 | `thread list --all` | `thread list --status completed` | Unified status filtering |
 #### Removed Commands (Merged)
 | Old Command | New Command | Notes |
 |------------|-------------|-------|
 | `thread running` | `thread list --status running` | Merged into unified list |
 #### Removed Commands (Split)
 | Old Command | New Commands | Notes |
 |------------|-------------|-------|
 | `thread kill` | `thread stop` or `thread cancel` | `stop` keeps thread active, `cancel` archives it |
 #### Moved Commands
 | Old Command | New Command | Notes |
 |------------|-------------|-------|
 | `thread steps` | `step list` | Moved to step layer |
 | `thread step-details` | `step show` | Moved to step layer |
 | `thread fork` | `step fork` | Moved to step layer (forks are step-based) |
 #### Deprecation Errors
 Old commands now show helpful error messages:
 ```bash
 $ uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV
 Error: Command 'thread step' has been removed.
 Use 'thread exec' instead.
 For more information, see: uwf help thread exec
 ```
 ## Internal Structure
 ```
@@ -109,8 +186,9 @@ src/
 ├── validate.ts         Workflow YAML validation
 ├── schemas.ts          CLI-local schema registration
 └── commands/
-    ├── thread.ts       Thread lifecycle and step execution
+    ├── thread.ts       Thread lifecycle and exec
-    ├── workflow.ts     Workflow registry (put/show/list)
+    ├── step.ts         Step operations (list/show/read/fork)
    ├── workflow.ts     Workflow registry (add/show/list)
    ├── cas.ts          CAS inspection and schema ops
    ├── setup.ts        Interactive/non-interactive setup
    ├── skill.ts        Built-in skill references
@@ -0,0 +1,683 @@
 import { mkdir, mkdtemp, rm } from "node:fs/promises";
 import { tmpdir } from "node:os";
 import { join } from "node:path";
 import { bootstrap, putSchema } from "@uncaged/json-cas";
 import { createFsStore } from "@uncaged/json-cas-fs";
 import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
 import { afterEach, beforeEach, describe, expect, test } from "vitest";
 import { cmdThreadRead, THREAD_READ_DEFAULT_QUOTA } from "../commands/thread.js";
 import { registerUwfSchemas } from "../schemas.js";
 import type { UwfStore } from "../store.js";
 import { saveThreadsIndex } from "../store.js";
 // ── schemas used in tests ────────────────────────────────────────────────────
 const TURN_SCHEMA = {
  title: "hermes-turn",
  type: "object" as const,
  required: ["index", "role", "content"],
  properties: {
    index: { type: "integer" as const },
    role: { type: "string" as const },
    content: { type: "string" as const },
    toolCalls: {
      anyOf: [
        { type: "array" as const, items: { type: "object" as const } },
        { type: "null" as const },
      ],
    },
    reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
  },
  additionalProperties: false,
 };
 const DETAIL_SCHEMA = {
  title: "hermes-detail",
  type: "object" as const,
  required: ["sessionId", "model", "duration", "turnCount", "turns"],
  properties: {
    sessionId: { type: "string" as const },
    model: { type: "string" as const },
    duration: { type: "integer" as const },
    turnCount: { type: "integer" as const },
    turns: {
      type: "array" as const,
      items: { type: "string" as const, format: "cas_ref" },
    },
  },
  additionalProperties: false,
 };
 // ── helpers ───────────────────────────────────────────────────────────────────
 async function makeUwfStore(storageRoot: string): Promise<UwfStore> {
  const casDir = join(storageRoot, "cas");
  await mkdir(casDir, { recursive: true });
  const store = createFsStore(casDir);
  const schemas = await registerUwfSchemas(store);
  return { storageRoot, store, schemas };
 }
 async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
  await bootstrap(store);
  const [turn, detail] = await Promise.all([
    putSchema(store, TURN_SCHEMA),
    putSchema(store, DETAIL_SCHEMA),
  ]);
  return { turn, detail };
 }
 // ── fixture ───────────────────────────────────────────────────────────────────
 let tmpDir: string;
 beforeEach(async () => {
  tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-test-"));
 });
 afterEach(async () => {
  await rm(tmpDir, { recursive: true, force: true });
 });
 // ── thread read XML tag isolation ─────────────────────────────────────────────
 describe("thread read XML tag isolation", () => {
  test("scenario 1: wraps output in XML tags instead of heading", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const detailSchemas = await registerDetailSchemas(uwf.store);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        planner: {
          description: "Planner",
          goal: "You are a planning agent. Your task is to...",
          capabilities: [],
          procedure: "Plan the work.",
          output: "Summarize the plan.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Fix issue #459",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const turnHash = await uwf.store.put(detailSchemas.turn, {
      index: 0,
      role: "assistant",
      content:
        "---\nstatus: ready\nplan: CMWGHQKT58RY4\n---\n\n# Analysis Complete\n## Issue Summary\nThe issue requires XML tag isolation.",
      toolCalls: null,
      reasoning: null,
    });
    const detailHash = await uwf.store.put(detailSchemas.detail, {
      sessionId: "sx",
      model: "mx",
      duration: 500,
      turnCount: 1,
      turns: [turnHash],
    });
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "planner",
      output: outputHash,
      detail: detailHash,
      agent: "uwf-claude-code",
    });
    const threadId = "01JTEST0000000000000001" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Should wrap output in XML tags
    expect(markdown).toContain("<output>");
    expect(markdown).toContain("</output>");
    // Should not have ### Content heading
    expect(markdown).not.toContain("### Content");
    // Should preserve markdown headings inside output tags
    expect(markdown).toContain("# Analysis Complete");
    expect(markdown).toContain("## Issue Summary");
  });
  test("scenario 2: wraps prompt in XML tags", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const detailSchemas = await registerDetailSchemas(uwf.store);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        planner: {
          description: "Planner",
          goal: "You are a planning agent. Your task is to analyze and plan.",
          capabilities: [],
          procedure: "Plan the work.",
          output: "Summarize the plan.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Fix issue",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const turnHash = await uwf.store.put(detailSchemas.turn, {
      index: 0,
      role: "assistant",
      content: "---\nstatus: ready\n---\n\nContent here...",
      toolCalls: null,
      reasoning: null,
    });
    const detailHash = await uwf.store.put(detailSchemas.detail, {
      sessionId: "sx",
      model: "mx",
      duration: 500,
      turnCount: 1,
      turns: [turnHash],
    });
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "planner",
      output: outputHash,
      detail: detailHash,
      agent: "uwf-claude-code",
    });
    const threadId = "01JTEST0000000000000002" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Should wrap prompt in XML tags
    expect(markdown).toContain("<prompt>");
    expect(markdown).toContain("</prompt>");
    expect(markdown).toContain("You are a planning agent. Your task is to analyze and plan.");
    // Should not have ### Prompt heading
    expect(markdown).not.toContain("### Prompt");
    // Should wrap output in XML tags
    expect(markdown).toContain("<output>");
    expect(markdown).toContain("</output>");
  });
  test("scenario 3: same role repeated does not show prompt twice", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        writer: {
          description: "Writer",
          goal: "You are a writer agent.",
          capabilities: [],
          procedure: "Write content.",
          output: "Summarize writing.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Write something",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const step1 = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "writer",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const step2 = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: step1 as CasRef,
      role: "writer",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000003" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: step2 });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Should only show prompt tags once
    const promptCount = (markdown.match(/<prompt>/g) ?? []).length;
    expect(promptCount).toBe(1);
  });
  test("scenario 4: step with no detail shows no output tags", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        worker: {
          description: "Worker",
          goal: "You are a worker agent.",
          capabilities: [],
          procedure: "Do work.",
          output: "Summarize work.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Do stuff",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "worker",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000004" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Should not have output tags
    expect(markdown).not.toContain("<output>");
    expect(markdown).not.toContain("</output>");
    // Step header should still be displayed
    expect(markdown).toContain("## Step 1: worker");
    // Prompt should still be shown
    expect(markdown).toContain("<prompt>");
  });
  test("scenario 5: empty content shows no output tags", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {},
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Do stuff",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    // A detail ref that doesn't exist → extractLastAssistantContent returns null
    const missingDetailRef = "missingdetail0" as CasRef;
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "worker",
      output: outputHash,
      detail: missingDetailRef,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000005" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Should not have output tags
    expect(markdown).not.toContain("<output>");
    expect(markdown).not.toContain("</output>");
  });
  test("scenario 6: thread read with --start flag shows task section", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        roleA: {
          description: "Role A",
          goal: "Goal for roleA",
          capabilities: [],
          procedure: "Do stuff.",
          output: "Output.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Initial prompt",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "roleA",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000006" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, true);
    // Should include task section
    expect(markdown).toContain("# Thread");
    expect(markdown).toContain("## Task");
    expect(markdown).toContain("Initial prompt");
    // Prompts should use XML tags
    expect(markdown).toContain("<prompt>");
  });
  test("scenario 7: thread read with --before parameter", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        roleA: {
          description: "Role A",
          goal: "Goal for roleA",
          capabilities: [],
          procedure: "Do stuff.",
          output: "Output.",
          meta: "placeholder00" as CasRef,
        },
        roleB: {
          description: "Role B",
          goal: "Goal for roleB",
          capabilities: [],
          procedure: "Do stuff.",
          output: "Output.",
          meta: "placeholder00" as CasRef,
        },
        roleC: {
          description: "Role C",
          goal: "Goal for roleC",
          capabilities: [],
          procedure: "Do stuff.",
          output: "Output.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Initial prompt",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const step1 = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "roleA",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const step2 = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: step1 as CasRef,
      role: "roleB",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const step3 = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: step2 as CasRef,
      role: "roleC",
      output: outputHash,
      detail: null,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000007" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: step3 });
    const markdown = await cmdThreadRead(
      tmpDir,
      threadId,
      THREAD_READ_DEFAULT_QUOTA,
      step2 as CasRef,
      false,
    );
    // Should only show roleA
    expect(markdown).toContain("roleA");
    expect(markdown).not.toContain("roleB");
    expect(markdown).not.toContain("roleC");
    // Should use XML tags
    expect(markdown).toContain("<prompt>");
  });
  test("scenario 9: special characters in content are preserved", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const detailSchemas = await registerDetailSchemas(uwf.store);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        writer: {
          description: "Writer",
          goal: "You are a writer.",
          capabilities: [],
          procedure: "Write content.",
          output: "Summarize.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Write something",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const turnHash = await uwf.store.put(detailSchemas.turn, {
      index: 0,
      role: "assistant",
      content: "Content with <special> & characters > like <this>",
      toolCalls: null,
      reasoning: null,
    });
    const detailHash = await uwf.store.put(detailSchemas.detail, {
      sessionId: "sx",
      model: "mx",
      duration: 500,
      turnCount: 1,
      turns: [turnHash],
    });
    const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
      start: startHash,
      prev: null,
      role: "writer",
      output: outputHash,
      detail: detailHash,
      agent: "uwf-test",
    });
    const threadId = "01JTEST0000000000000008" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    // Special characters should be preserved as-is
    expect(markdown).toContain("Content with <special> & characters > like <this>");
  });
  test("scenario 10: quota limit with XML tags", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "test-wf",
      description: "desc",
      roles: {
        roleA: {
          description: "Role A",
          goal: "Goal for roleA",
          capabilities: [],
          procedure: "Do stuff.",
          output: "Output.",
          meta: "placeholder00" as CasRef,
        },
      },
      conditions: {},
      graph: {},
    });
    const startHash = await uwf.store.put(uwf.schemas.startNode, {
      workflow: workflowHash,
      prompt: "Initial prompt",
    });
    const outputHash = await uwf.store.put(uwf.schemas.workflow, {
      name: "out",
      description: "",
      roles: {},
      conditions: {},
      graph: {},
    });
    const steps: CasRef[] = [];
    let prev: CasRef | null = null;
    for (let i = 0; i < 5; i++) {
      const step = (await uwf.store.put(uwf.schemas.stepNode, {
        start: startHash,
        prev,
        role: "roleA",
        output: outputHash,
        detail: null,
        agent: "uwf-test",
      })) as CasRef;
      steps.push(step);
      prev = step;
    }
    const threadId = "01JTEST0000000000000009" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: steps[steps.length - 1]! });
    // Use very small quota
    const markdown = await cmdThreadRead(tmpDir, threadId, 1, null, false);
    // Should have skip hint
    expect(markdown).toContain("earlier step");
    // Should have XML tags for displayed steps
    if (markdown.includes("<prompt>")) {
      expect(markdown).toContain("</prompt>");
    }
  });
 });
@@ -22,48 +22,48 @@ function runCli(args: string[]): { stdout: string; stderr: string; exitCode: num
  }
 }
-describe("thread step --count CLI parsing", () => {
+describe("thread exec --count CLI parsing", () => {
  test("--help shows -c/--count option", () => {
-    const result = runCli(["thread", "step", "--help"]);
+    const result = runCli(["thread", "exec", "--help"]);
    expect(result.stdout).toContain("--count");
    expect(result.stdout).toContain("-c");
  });
  test("description says 'one or more steps'", () => {
-    const result = runCli(["thread", "step", "--help"]);
+    const result = runCli(["thread", "exec", "--help"]);
    expect(result.stdout).toContain("one or more steps");
  });
 });
-describe("cmdThreadStep count logic", () => {
+describe("cmdThreadExec count logic", () => {
  test("count=0 fails with validation error", () => {
-    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]);
+    const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "0"]);
    expect(result.exitCode).not.toBe(0);
    expect(result.stderr).toContain("positive integer");
  });
  test("negative count fails with validation error", () => {
-    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]);
+    const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "-1"]);
    expect(result.exitCode).not.toBe(0);
    expect(result.stderr).toContain("positive integer");
  });
  test("non-integer count fails with validation error", () => {
-    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]);
+    const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "1.5"]);
    expect(result.exitCode).not.toBe(0);
    expect(result.stderr).toContain("positive integer");
  });
  test("count=1 is the default (no -c flag)", () => {
    // Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
-    const result = runCli(["thread", "step", "FAKE_THREAD_ID"]);
+    const result = runCli(["thread", "exec", "FAKE_THREAD_ID"]);
    expect(result.exitCode).not.toBe(0);
    // Should NOT contain "positive integer" error — should fail on thread lookup instead
    expect(result.stderr).not.toContain("positive integer");
  });
  test("count=3 passes validation (fails on thread lookup)", () => {
-    const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]);
+    const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "3"]);
    expect(result.exitCode).not.toBe(0);
    // Should NOT contain "positive integer" error — should fail on thread/storage lookup
    expect(result.stderr).not.toContain("positive integer");
@@ -5,9 +5,9 @@ import { bootstrap, putSchema } from "@uncaged/json-cas";
 import { createFsStore } from "@uncaged/json-cas-fs";
 import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
 import { afterEach, beforeEach, describe, expect, test } from "vitest";
 import { cmdStepShow } from "../commands/step.js";
 import {
  cmdThreadRead,
  cmdThreadStepDetails,
  extractLastAssistantContent,
  THREAD_READ_DEFAULT_QUOTA,
 } from "../commands/thread.js";
@@ -198,10 +198,10 @@ describe("extractLastAssistantContent", () => {
  });
 });
-// ── cmdThreadRead: ### Content section ───────────────────────────────────────
+// ── cmdThreadRead: <output> section ──────────────────────────────────────────
-describe("cmdThreadRead ### Content section", () => {
+describe("cmdThreadRead <output> section", () => {
-  test("includes ### Content before ### Output when detail has assistant turns", async () => {
+  test("includes <output> tags when detail has assistant turns", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -264,12 +264,13 @@ describe("cmdThreadRead ### Content section", () => {
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
-    expect(markdown).toContain("### Content");
+    expect(markdown).toContain("<output>");
    expect(markdown).toContain("</output>");
    expect(markdown).toContain("The assistant response text");
-    expect(markdown).not.toContain("### Output");
+    expect(markdown).not.toContain("### Content");
  });
-  test("omits ### Content when detail has no matching assistant turns", async () => {
+  test("omits <output> tags when detail has no matching assistant turns", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
@@ -308,14 +309,15 @@ describe("cmdThreadRead ### Content section", () => {
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
    expect(markdown).not.toContain("<output>");
    expect(markdown).not.toContain("</output>");
    expect(markdown).not.toContain("### Content");
    expect(markdown).not.toContain("### Output");
  });
 });
-// ── cmdThreadStepDetails ──────────────────────────────────────────────────────
+// ── cmdStepShow ───────────────────────────────────────────────────────────────
-describe("cmdThreadStepDetails", () => {
+describe("cmdStepShow", () => {
  test("returns expanded detail node with turns inlined", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -363,7 +365,7 @@ describe("cmdThreadStepDetails", () => {
      agent: "uwf-hermes",
    });
-    const result = await cmdThreadStepDetails(tmpDir, stepHash);
+    const result = await cmdStepShow(tmpDir, stepHash);
    expect(result).toMatchObject({
      sessionId: "sess42",
@@ -384,9 +386,9 @@ describe("cmdThreadStepDetails", () => {
  });
 });
-// ── cmdThreadRead: ### Prompt deduplication ───────────────────────────────────
+// ── cmdThreadRead: <prompt> deduplication ────────────────────────────────────
-describe("cmdThreadRead ### Prompt deduplication", () => {
+describe("cmdThreadRead <prompt> deduplication", () => {
  async function makeThreadWithRoles(uwf: UwfStore, roles: string[]): Promise<string> {
    const roleMap: Record<string, unknown> = {};
    for (const r of [...new Set(roles)]) {
@@ -434,36 +436,36 @@ describe("cmdThreadRead ### Prompt deduplication", () => {
    return stepHash;
  }
-  test("same consecutive role shows ### Prompt once", async () => {
+  test("same consecutive role shows <prompt> once", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const headHash = await makeThreadWithRoles(uwf, ["writer", "writer"]);
    const threadId = "01JTEST0000000000000003" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: headHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
-    const count = (markdown.match(/### Prompt/g) ?? []).length;
+    const count = (markdown.match(/<prompt>/g) ?? []).length;
    expect(count).toBe(1);
  });
-  test("different consecutive roles each show ### Prompt", async () => {
+  test("different consecutive roles each show <prompt>", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const headHash = await makeThreadWithRoles(uwf, ["planner", "coder"]);
    const threadId = "01JTEST0000000000000004" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: headHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
-    const count = (markdown.match(/### Prompt/g) ?? []).length;
+    const count = (markdown.match(/<prompt>/g) ?? []).length;
    expect(count).toBe(2);
  });
-  test("non-consecutive same role shows ### Prompt twice", async () => {
+  test("non-consecutive same role shows <prompt> twice", async () => {
    const uwf = await makeUwfStore(tmpDir);
    const headHash = await makeThreadWithRoles(uwf, ["roleA", "roleB", "roleA"]);
    const threadId = "01JTEST0000000000000005" as ThreadId;
    await saveThreadsIndex(tmpDir, { [threadId]: headHash });
    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
-    const count = (markdown.match(/### Prompt/g) ?? []).length;
+    const count = (markdown.match(/<prompt>/g) ?? []).length;
    expect(count).toBe(2);
  });
 });
@@ -584,9 +586,9 @@ describe("cmdThreadRead start section / before / quota", () => {
 // ── Tests that call process.exit must be last ─────────────────────────────────
-describe("cmdThreadStepDetails (process.exit tests - must be last)", () => {
+describe("cmdStepShow (process.exit tests - must be last)", () => {
  test("throws when step hash does not exist", async () => {
-    await expect(cmdThreadStepDetails(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow();
+    await expect(cmdStepShow(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow();
  });
  test("before with unknown hash rejects", async () => {
@@ -1,8 +1,7 @@
 #!/usr/bin/env bun
-import type { ThreadId } from "@uncaged/workflow-protocol";
+import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
 import { Command } from "commander";
 import { stringify as yamlStringify } from "yaml";
 import {
  cmdCasGet,
  cmdCasHas,
@@ -17,20 +16,19 @@ import {
 import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
 import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
 import { cmdSkillCli } from "./commands/skill.js";
 import { cmdStepFork, cmdStepList, cmdStepShow } from "./commands/step.js";
 import {
-  cmdThreadFork,
+  cmdThreadCancel,
-  cmdThreadKill,
+  cmdThreadExec,
  cmdThreadList,
  cmdThreadRead,
  cmdThreadRunning,
  cmdThreadShow,
  cmdThreadStart,
-  cmdThreadStep,
+  cmdThreadStop,
  cmdThreadStepDetails,
  cmdThreadSteps,
  THREAD_READ_DEFAULT_QUOTA,
  type ThreadStatus,
 } from "./commands/thread.js";
-import { cmdWorkflowList, cmdWorkflowPut, cmdWorkflowShow } from "./commands/workflow.js";
+import { cmdWorkflowAdd, cmdWorkflowList, cmdWorkflowShow } from "./commands/workflow.js";
 import { formatOutput, type OutputFormat } from "./format.js";
 import { resolveStorageRoot } from "./store.js";
@@ -53,20 +51,27 @@ const program = new Command();
 const pkg = await import("../package.json", { with: { type: "json" } });
 program
  .name("uwf")
-  .description("Stateless workflow CLI")
+  .description(
    "Stateless workflow CLI\n\n" +
      "Four-layer architecture:\n" +
      "  workflow → thread → step → turn\n" +
      "  模板定义   执行实例   单步结果   agent内部交互",
  )
  .version(pkg.default.version, "-V, --version");
 program.option("--format <fmt>", "Output format: json or yaml", "json");
-const workflow = program.command("workflow").description("Workflow registry and CAS");
+const workflow = program
  .command("workflow")
  .description("Workflow definitions (layer 1: templates)");
 workflow
-  .command("put")
+  .command("add")
  .description("Register a workflow from YAML")
  .argument("<file>", "Workflow YAML file")
  .action((file: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
-      const result = await cmdWorkflowPut(storageRoot, file);
+      const result = await cmdWorkflowAdd(storageRoot, file);
      writeOutput(result);
    });
  });
@@ -94,7 +99,7 @@ workflow
    });
  });
-const thread = program.command("thread").description("Thread lifecycle and execution");
+const thread = program.command("thread").description("Thread execution (layer 2: instances)");
 thread
  .command("start")
@@ -110,7 +115,7 @@ thread
  });
 thread
-  .command("step")
+  .command("exec")
  .description("Execute one or more steps")
  .argument("<thread-id>", "Thread ULID")
  .option("--agent <cmd>", "Override agent command")
@@ -134,7 +139,7 @@ thread
        const background = opts.background ?? false;
        const backgroundWorker = opts._backgroundWorker ?? false;
-        const results = await cmdThreadStep(
+        const results = await cmdThreadExec(
          storageRoot,
          threadId,
          agentOverride,
@@ -165,47 +170,49 @@ thread
 thread
  .command("list")
-  .description("List active threads")
+  .description("List threads")
-  .option("--all", "Include archived threads")
+  .option("--status <status>", "Filter by status: idle, running, or completed")
-  .action((opts: { all: boolean }) => {
+  .action((opts: { status: string | undefined }) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
-      const result = await cmdThreadList(storageRoot, opts.all);
+      const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
      let statusFilter: ThreadStatus | null = null;
      if (opts.status !== undefined) {
        if (!validStatuses.includes(opts.status as ThreadStatus)) {
          process.stderr.write(
            `Invalid status: ${opts.status}. Must be one of: idle, running, completed\n`,
          );
          process.exit(1);
        }
        statusFilter = opts.status as ThreadStatus;
      }
      const result = await cmdThreadList(storageRoot, statusFilter);
      writeOutput(result);
    });
  });
 thread
-  .command("running")
+  .command("stop")
-  .description("List threads currently executing in the background")
+  .description("Stop background execution of a thread (keep thread active)")
  .action(() => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
      const result = await cmdThreadRunning(storageRoot);
      writeOutput(result);
    });
  });
 thread
  .command("kill")
  .description("Terminate and archive a thread")
  .argument("<thread-id>", "Thread ULID")
  .action((threadId: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
-      const result = await cmdThreadKill(storageRoot, threadId);
+      const result = await cmdThreadStop(storageRoot, threadId);
      writeOutput(result);
    });
  });
 thread
-  .command("steps")
+  .command("cancel")
-  .description("List all steps in a thread")
+  .description("Cancel a thread (stop execution and move to history)")
  .argument("<thread-id>", "Thread ULID")
  .action((threadId: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
-      const result = await cmdThreadSteps(storageRoot, threadId);
+      const result = await cmdThreadCancel(storageRoot, threadId);
      writeOutput(result);
    });
  });
@@ -239,28 +246,141 @@ thread
    },
  );
-thread
+const step = program.command("step").description("Step results (layer 3: single cycle)");
 step
  .command("list")
  .description("List all steps in a thread")
  .argument("<thread-id>", "Thread ULID")
  .action((threadId: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
      const result = await cmdStepList(storageRoot, threadId);
      writeOutput(result);
    });
  });
 step
  .command("show")
  .description("Show details of a specific step")
  .argument("<step-hash>", "CAS hash of the StepNode")
  .action((stepHash: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
      const detail = await cmdStepShow(storageRoot, stepHash as CasRef);
      writeOutput(detail);
    });
  });
 // step read is not yet registered (half-baked, see step.ts cmdStepRead)
 step
  .command("fork")
  .description("Fork a thread from a specific step")
  .argument("<step-hash>", "CAS hash of the StartNode or StepNode to fork from")
  .action((stepHash: string) => {
    const storageRoot = resolveStorageRoot();
    runAction(async () => {
-      const result = await cmdThreadFork(storageRoot, stepHash);
+      const result = await cmdStepFork(storageRoot, stepHash as CasRef);
      writeOutput(result);
    });
  });
 // ── Deprecation Handlers ──────────────────────────────────────────────────────
 // These commands have been removed. Show helpful error messages.
 workflow
  .command("put")
  .description("[DEPRECATED] Use 'workflow add' instead")
  .argument("<file>", "Workflow YAML file")
  .action(() => {
    process.stderr.write(`Error: Command 'workflow put' has been removed.
 Use 'workflow add' instead.
 For more information, see: uwf help workflow add
 `);
    process.exit(1);
  });
 thread
  .command("step")
  .description("[DEPRECATED] Use 'thread exec' instead")
  .argument("<thread-id>", "Thread ULID")
  .allowUnknownOption()
  .action(() => {
    process.stderr.write(`Error: Command 'thread step' has been removed.
 Use 'thread exec' instead.
 For more information, see: uwf help thread exec
 `);
    process.exit(1);
  });
 thread
  .command("steps")
  .description("[DEPRECATED] Use 'step list' instead")
  .argument("<thread-id>", "Thread ULID")
  .action(() => {
    process.stderr.write(`Error: Command 'thread steps' has been removed.
 Use 'step list' instead.
 For more information, see: uwf help step list
 `);
    process.exit(1);
  });
 thread
  .command("step-details")
-  .description("Dump the full detail node of a step as YAML")
+  .description("[DEPRECATED] Use 'step show' instead")
-  .argument("<step-hash>", "CAS hash of the StepNode")
+  .argument("<step-hash>", "Step hash")
-  .action((stepHash: string) => {
+  .action(() => {
-    const storageRoot = resolveStorageRoot();
+    process.stderr.write(`Error: Command 'thread step-details' has been removed.
-    runAction(async () => {
+Use 'step show' instead.
-      const detail = await cmdThreadStepDetails(storageRoot, stepHash);
+
-      process.stdout.write(yamlStringify(detail));
+For more information, see: uwf help step show
-    });
+`);
    process.exit(1);
  });
 thread
  .command("fork")
  .description("[DEPRECATED] Use 'step fork' instead")
  .argument("<step-hash>", "Step hash")
  .action(() => {
    process.stderr.write(`Error: Command 'thread fork' has been removed.
 Use 'step fork' instead.
 For more information, see: uwf help step fork
 `);
    process.exit(1);
  });
 thread
  .command("kill")
  .description("[DEPRECATED] Use 'thread stop' or 'thread cancel' instead")
  .argument("<thread-id>", "Thread ULID")
  .action(() => {
    process.stderr.write(`Error: Command 'thread kill' has been removed.
 Use 'thread stop' to stop background execution (keep thread active),
 or 'thread cancel' to cancel and archive the thread.
 For more information, see:
  uwf help thread stop
  uwf help thread cancel
 `);
    process.exit(1);
  });
 thread
  .command("running")
  .description("[DEPRECATED] Use 'thread list --status running' instead")
  .action(() => {
    process.stderr.write(`Error: Command 'thread running' has been removed.
 Use 'thread list --status running' instead.
 For more information, see: uwf help thread list
 `);
    process.exit(1);
  });
 const skill = program.command("skill").description("Built-in skill references for agents");
@@ -0,0 +1,227 @@
 import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
 import { getSchema } from "@uncaged/json-cas";
 import type {
  CasRef,
  StartNodePayload,
  StepNodePayload,
  ThreadId,
 } from "@uncaged/workflow-protocol";
 import { loadThreadsIndex, type UwfStore } from "../store.js";
 type ChainState = {
  startHash: CasRef;
  start: StartNodePayload;
  stepsNewestFirst: StepNodePayload[];
  headIsStart: boolean;
 };
 type OrderedStepItem = {
  hash: CasRef;
  payload: StepNodePayload;
  timestamp: number;
 };
 function fail(message: string): never {
  process.stderr.write(`${message}\n`);
  process.exit(1);
 }
 function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
  const headNode = uwf.store.get(headHash);
  if (headNode === null) {
    fail(`CAS node not found: ${headHash}`);
  }
  if (headNode.type === uwf.schemas.startNode) {
    return {
      startHash: headHash,
      start: headNode.payload as StartNodePayload,
      stepsNewestFirst: [],
      headIsStart: true,
    };
  }
  if (headNode.type !== uwf.schemas.stepNode) {
    fail(`head ${headHash} is not a StartNode or StepNode`);
  }
  const stepsNewestFirst: StepNodePayload[] = [];
  let hash: CasRef | null = headHash;
  while (hash !== null) {
    const node = uwf.store.get(hash);
    if (node === null) {
      fail(`CAS node not found while walking chain: ${hash}`);
    }
    if (node.type !== uwf.schemas.stepNode) {
      break;
    }
    const payload = node.payload as StepNodePayload;
    stepsNewestFirst.push(payload);
    hash = payload.prev;
  }
  const newest = stepsNewestFirst[0];
  if (newest === undefined) {
    fail(`empty step chain at head ${headHash}`);
  }
  const startNode = uwf.store.get(newest.start);
  if (startNode === null || startNode.type !== uwf.schemas.startNode) {
    fail(`StartNode not found: ${newest.start}`);
  }
  return {
    startHash: newest.start,
    start: startNode.payload as StartNodePayload,
    stepsNewestFirst,
    headIsStart: false,
  };
 }
 function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
  const node = uwf.store.get(outputRef);
  if (node === null) {
    return {};
  }
  return node.payload;
 }
 /**
 * Recursively expand all cas_ref fields in a CAS node's payload,
 * replacing hash strings with the referenced node's expanded payload.
 */
 function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
  const seen = visited ?? new Set<string>();
  if (seen.has(hash)) return hash; // cycle guard
  seen.add(hash);
  const node = store.get(hash);
  if (node === null) return hash;
  const schema = getSchema(store, node.type);
  if (schema === null) return node.payload;
  return expandValue(store, schema, node.payload, seen);
 }
 function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
  if (typeof value === "string") {
    return expandDeep(store, value as CasRef, visited);
  }
  return value;
 }
 function expandAnyOfField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (!Array.isArray(schema.anyOf)) return value;
  for (const sub of schema.anyOf as JSONSchema[]) {
    if (sub.format === "cas_ref" && typeof value === "string") {
      return expandDeep(store, value as CasRef, visited);
    }
  }
  return value;
 }
 function expandArrayField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (!schema.items || !Array.isArray(value)) return value;
  const itemSchema = schema.items as JSONSchema;
  return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
 }
 function expandObjectField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
    return value;
  }
  const props = schema.properties as Record<string, JSONSchema>;
  const obj = value as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  for (const [key, val] of Object.entries(obj)) {
    const propSchema = props[key];
    result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
  }
  return result;
 }
 function expandValue(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
  if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
  if (schema.type === "array") return expandArrayField(store, schema, value, visited);
  return expandObjectField(store, schema, value, visited);
 }
 function collectOrderedSteps(
  uwf: UwfStore,
  headHash: CasRef,
  chain: ChainState,
 ): OrderedStepItem[] {
  let hash: CasRef | null = headHash;
  const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
  while (hash !== null) {
    const node = uwf.store.get(hash);
    if (node === null || node.type !== uwf.schemas.stepNode) {
      break;
    }
    const payload = node.payload as StepNodePayload;
    hashToNode.set(hash, { payload, timestamp: node.timestamp });
    hash = payload.prev;
  }
  let cur: CasRef | null = chain.headIsStart ? null : headHash;
  const ordered: OrderedStepItem[] = [];
  while (cur !== null) {
    const entry = hashToNode.get(cur);
    if (entry === undefined) {
      break;
    }
    ordered.push({ hash: cur, ...entry });
    cur = entry.payload.prev;
  }
  ordered.reverse();
  return ordered;
 }
 async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise<CasRef> {
  const index = await loadThreadsIndex(storageRoot);
  const head = index[threadId];
  if (head === undefined) {
    fail(`thread not active: ${threadId}`);
  }
  return head;
 }
 export {
  type ChainState,
  collectOrderedSteps,
  expandAnyOfField,
  expandArrayField,
  expandCasRefField,
  expandDeep,
  expandObjectField,
  expandOutput,
  expandValue,
  fail,
  type OrderedStepItem,
  resolveHeadHash,
  walkChain,
 };
@@ -0,0 +1,145 @@
 import type {
  CasRef,
  StartEntry,
  StepEntry,
  StepNodePayload,
  ThreadForkOutput,
  ThreadId,
  ThreadStepsOutput,
 } from "@uncaged/workflow-protocol";
 import { generateUlid } from "@uncaged/workflow-util";
 import { createUwfStore, loadThreadsIndex, saveThreadsIndex } from "../store.js";
 import {
  collectOrderedSteps,
  expandDeep,
  expandOutput,
  fail,
  resolveHeadHash,
  walkChain,
 } from "./shared.js";
 /**
 * List all steps in a thread (previously: thread steps)
 */
 export async function cmdStepList(
  storageRoot: string,
  threadId: ThreadId,
 ): Promise<ThreadStepsOutput> {
  const headHash = await resolveHeadHash(storageRoot, threadId);
  const uwf = await createUwfStore(storageRoot);
  const chain = walkChain(uwf, headHash);
  const startNode = uwf.store.get(chain.startHash);
  if (startNode === null) {
    fail(`StartNode not found: ${chain.startHash}`);
  }
  const startEntry: StartEntry = {
    hash: chain.startHash,
    workflow: chain.start.workflow,
    prompt: chain.start.prompt,
    timestamp: startNode.timestamp,
  };
  const stepEntries: StepEntry[] = [];
  const ordered = collectOrderedSteps(uwf, headHash, chain);
  for (const item of ordered) {
    stepEntries.push({
      hash: item.hash,
      role: item.payload.role,
      output: expandOutput(uwf, item.payload.output),
      detail: item.payload.detail ?? null,
      agent: item.payload.agent,
      timestamp: item.timestamp,
    });
  }
  return {
    thread: threadId,
    workflow: chain.start.workflow,
    steps: [startEntry, ...stepEntries],
  };
 }
 /**
 * Show details of a specific step (previously: thread step-details)
 */
 export async function cmdStepShow(storageRoot: string, stepHash: CasRef): Promise<unknown> {
  const uwf = await createUwfStore(storageRoot);
  const node = uwf.store.get(stepHash);
  if (node === null) {
    fail(`CAS node not found: ${stepHash}`);
  }
  if (node.type !== uwf.schemas.stepNode) {
    fail(`node ${stepHash} is not a StepNode`);
  }
  const payload = node.payload as StepNodePayload;
  if (!payload.detail) {
    fail(`step ${stepHash} has no detail`);
  }
  return expandDeep(uwf.store, payload.detail);
 }
 /**
 * Fork a thread from a specific step (previously: thread fork)
 */
 export async function cmdStepFork(
  storageRoot: string,
  stepHash: CasRef,
 ): Promise<ThreadForkOutput> {
  const uwf = await createUwfStore(storageRoot);
  const node = uwf.store.get(stepHash);
  if (node === null) {
    fail(`CAS node not found: ${stepHash}`);
  }
  if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
    fail(`node ${stepHash} is not a StartNode or StepNode`);
  }
  const newThreadId = generateUlid(Date.now()) as ThreadId;
  const index = await loadThreadsIndex(storageRoot);
  index[newThreadId] = stepHash;
  await saveThreadsIndex(storageRoot, index);
  return {
    thread: newThreadId,
    forkedFrom: {
      step: stepHash,
    },
  };
 }
 /**
 * Read a step's agent output as markdown (new command - requires #462)
 * TODO: Implement once unified agent detail/turn schema is available
 */
 export async function cmdStepRead(
  storageRoot: string,
  stepHash: CasRef,
  _before: number | null = null,
 ): Promise<string> {
  const uwf = await createUwfStore(storageRoot);
  const node = uwf.store.get(stepHash);
  if (node === null) {
    fail(`CAS node not found: ${stepHash}`);
  }
  if (node.type !== uwf.schemas.stepNode) {
    fail(`node ${stepHash} is not a StepNode`);
  }
  const payload = node.payload as StepNodePayload;
  if (!payload.output) {
    fail(`step ${stepHash} has no output`);
  }
  // TODO: Implement progressive turn reading with --before N
  // For now, return a placeholder
  const outputNode = uwf.store.get(payload.output);
  if (outputNode === null) {
    fail(`output node not found: ${payload.output}`);
  }
  // Return the output as JSON for now
  // Once #462 is implemented, this will properly format frontmatter + markdown
  return JSON.stringify(outputNode.payload, null, 2);
 }
@@ -1,8 +1,7 @@
 import { execFileSync, spawn } from "node:child_process";
 import { access, readFile } from "node:fs/promises";
 import { dirname, isAbsolute, resolve as resolvePath } from "node:path";
-import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
+import { validate } from "@uncaged/json-cas";
 import { getSchema, validate } from "@uncaged/json-cas";
 import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-agent-kit";
 import { evaluate } from "@uncaged/workflow-moderator";
 import type {
@@ -11,17 +10,13 @@ import type {
  CasRef,
  ModeratorContext,
  RunningThreadsOutput,
  StartEntry,
  StartNodePayload,
  StartOutput,
  StepContext,
  StepEntry,
  StepNodePayload,
  StepOutput,
  ThreadForkOutput,
  ThreadId,
  ThreadListItem,
  ThreadStepsOutput,
  WorkflowConfig,
  WorkflowPayload,
 } from "@uncaged/workflow-protocol";
@@ -47,6 +42,14 @@ import {
  type UwfStore,
 } from "../store.js";
 import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js";
 import {
  type ChainState,
  collectOrderedSteps,
  expandOutput,
  fail,
  type OrderedStepItem,
  walkChain,
 } from "./shared.js";
 import { materializeWorkflowPayload } from "./workflow.js";
 const END_ROLE = "$END";
@@ -65,29 +68,6 @@ function failStep(plog: ProcessLogger, message: string): never {
  fail(message);
 }
 type ChainState = {
  startHash: CasRef;
  start: StartNodePayload;
  stepsNewestFirst: StepNodePayload[];
  headIsStart: boolean;
 };
 type OrderedStepItem = {
  hash: CasRef;
  payload: StepNodePayload;
  timestamp: number;
 };
 export type KillOutput = {
  thread: ThreadId;
  archived: boolean;
 };
 function fail(message: string): never {
  process.stderr.write(`${message}\n`);
  process.exit(1);
 }
 /**
 * Check if a string looks like a file path (contains path separators or has .yaml/.yml extension).
 */
@@ -346,226 +326,70 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
  fail(`thread not found: ${threadId}`);
 }
 export type ThreadStatus = "idle" | "running" | "completed";
 export type ThreadListItemWithStatus = ThreadListItem & {
  status: ThreadStatus;
 };
 async function threadListItemFromActive(
  storageRoot: string,
  uwf: UwfStore,
  threadId: ThreadId,
  head: CasRef,
-): Promise<ThreadListItem | null> {
+): Promise<ThreadListItemWithStatus | null> {
  const workflow = resolveWorkflowFromHead(uwf, head);
  if (workflow === null) {
    return null;
  }
-  return { thread: threadId, workflow, head };
+
  // Check if thread is currently running in background
  const runningMarker = await isThreadRunning(storageRoot, threadId);
  const status: ThreadStatus = runningMarker !== null ? "running" : "idle";
  return { thread: threadId, workflow, head, status };
 }
 export async function cmdThreadList(
  storageRoot: string,
-  includeAll: boolean,
+  statusFilter: ThreadStatus | null,
-): Promise<ThreadListItem[]> {
+): Promise<ThreadListItemWithStatus[]> {
  const uwf = await createUwfStore(storageRoot);
  const index = await loadThreadsIndex(storageRoot);
-  const items: ThreadListItem[] = [];
+  const items: ThreadListItemWithStatus[] = [];
  // Add active threads
  for (const [threadId, head] of Object.entries(index)) {
-    const item = await threadListItemFromActive(uwf, threadId as ThreadId, head);
+    const item = await threadListItemFromActive(storageRoot, uwf, threadId as ThreadId, head);
    if (item !== null) {
      items.push(item);
    }
  }
-  if (!includeAll) {
+  // Add completed threads if requested
-    return items;
+  if (statusFilter === "completed" || statusFilter === null) {
    const activeIds = new Set(items.map((i) => i.thread));
    const history = await loadThreadHistory(storageRoot);
    for (const entry of history) {
      if (!activeIds.has(entry.thread)) {
        items.push({
          thread: entry.thread,
          workflow: entry.workflow,
          head: entry.head,
          status: "completed",
        });
      }
    }
  }
-  const activeIds = new Set(items.map((i) => i.thread));
+  // Apply status filter if provided
-  const history = await loadThreadHistory(storageRoot);
+  if (statusFilter !== null) {
-  for (const entry of history) {
+    return items.filter((item) => item.status === statusFilter);
    if (!activeIds.has(entry.thread)) {
      items.push({
        thread: entry.thread,
        workflow: entry.workflow,
        head: entry.head,
      });
    }
  }
  return items;
 }
 function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
  const headNode = uwf.store.get(headHash);
  if (headNode === null) {
    fail(`CAS node not found: ${headHash}`);
  }
  if (headNode.type === uwf.schemas.startNode) {
    return {
      startHash: headHash,
      start: headNode.payload as StartNodePayload,
      stepsNewestFirst: [],
      headIsStart: true,
    };
  }
  if (headNode.type !== uwf.schemas.stepNode) {
    fail(`head ${headHash} is not a StartNode or StepNode`);
  }
  const stepsNewestFirst: StepNodePayload[] = [];
  let hash: CasRef | null = headHash;
  while (hash !== null) {
    const node = uwf.store.get(hash);
    if (node === null) {
      fail(`CAS node not found while walking chain: ${hash}`);
    }
    if (node.type !== uwf.schemas.stepNode) {
      break;
    }
    const payload = node.payload as StepNodePayload;
    stepsNewestFirst.push(payload);
    hash = payload.prev;
  }
  const newest = stepsNewestFirst[0];
  if (newest === undefined) {
    fail(`empty step chain at head ${headHash}`);
  }
  const startNode = uwf.store.get(newest.start);
  if (startNode === null || startNode.type !== uwf.schemas.startNode) {
    fail(`StartNode not found: ${newest.start}`);
  }
  return {
    startHash: newest.start,
    start: startNode.payload as StartNodePayload,
    stepsNewestFirst,
    headIsStart: false,
  };
 }
 function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
  const node = uwf.store.get(outputRef);
  if (node === null) {
    return {};
  }
  return node.payload;
 }
 /**
 * Recursively expand all cas_ref fields in a CAS node's payload,
 * replacing hash strings with the referenced node's expanded payload.
 */
 function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
  const seen = visited ?? new Set<string>();
  if (seen.has(hash)) return hash; // cycle guard
  seen.add(hash);
  const node = store.get(hash);
  if (node === null) return hash;
  const schema = getSchema(store, node.type);
  if (schema === null) return node.payload;
  return expandValue(store, schema, node.payload, seen);
 }
 function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
  if (typeof value === "string") {
    return expandDeep(store, value as CasRef, visited);
  }
  return value;
 }
 function expandAnyOfField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (!Array.isArray(schema.anyOf)) return value;
  for (const sub of schema.anyOf as JSONSchema[]) {
    if (sub.format === "cas_ref" && typeof value === "string") {
      return expandDeep(store, value as CasRef, visited);
    }
  }
  return value;
 }
 function expandArrayField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (!schema.items || !Array.isArray(value)) return value;
  const itemSchema = schema.items as JSONSchema;
  return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
 }
 function expandObjectField(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
    return value;
  }
  const props = schema.properties as Record<string, JSONSchema>;
  const obj = value as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  for (const [key, val] of Object.entries(obj)) {
    const propSchema = props[key];
    result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
  }
  return result;
 }
 function expandValue(
  store: CasStore,
  schema: JSONSchema,
  value: unknown,
  visited: Set<string>,
 ): unknown {
  if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
  if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
  if (schema.type === "array") return expandArrayField(store, schema, value, visited);
  return expandObjectField(store, schema, value, visited);
 }
 function collectOrderedSteps(
  uwf: UwfStore,
  headHash: CasRef,
  chain: ChainState,
 ): OrderedStepItem[] {
  let hash: CasRef | null = headHash;
  const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
  while (hash !== null) {
    const node = uwf.store.get(hash);
    if (node === null || node.type !== uwf.schemas.stepNode) {
      break;
    }
    const payload = node.payload as StepNodePayload;
    hashToNode.set(hash, { payload, timestamp: node.timestamp });
    hash = payload.prev;
  }
  let cur: CasRef | null = chain.headIsStart ? null : headHash;
  const ordered: OrderedStepItem[] = [];
  while (cur !== null) {
    const entry = hashToNode.get(cur);
    if (entry === undefined) {
      break;
    }
    ordered.push({ hash: cur, ...entry });
    cur = entry.payload.prev;
  }
  ordered.reverse();
  return ordered;
 }
 function formatYaml(value: unknown): string {
  return stringify(value, { aliasDuplicateObjects: false }).trimEnd();
 }
@@ -665,14 +489,14 @@ function formatStepPrompt(
 ): string {
  if (!roleDef || shownPromptRoles.has(role)) return "";
  shownPromptRoles.add(role);
-  return ["", "", "### Prompt", "", roleDef.goal].join("\n");
+  return ["", "", "<prompt>", roleDef.goal, "</prompt>"].join("\n");
 }
 function formatStepContent(uwf: UwfStore, item: OrderedStepItem): string {
  if (!item.payload.detail) return "";
  const content = extractLastAssistantContent(uwf, item.payload.detail);
  if (content === null) return "";
-  return ["", "", "### Content", "", content].join("\n");
+  return ["", "", "<output>", content, "</output>"].join("\n");
 }
 function formatStartSection(options: {
@@ -857,7 +681,7 @@ async function archiveThread(
  });
 }
-export async function cmdThreadStep(
+export async function cmdThreadExec(
  storageRoot: string,
  threadId: ThreadId,
  agentOverride: string | null,
@@ -953,7 +777,7 @@ async function cmdThreadStepBackground(
    failStep(plog, "unable to determine script path for background execution");
  }
-  const args = ["thread", "step", threadId, "--count", String(count)];
+  const args = ["thread", "exec", threadId, "--count", String(count)];
  if (agentOverride !== null) {
    args.push("--agent", agentOverride);
@@ -1085,47 +909,6 @@ async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise
  fail(`thread not found: ${threadId}`);
 }
 export async function cmdThreadSteps(
  storageRoot: string,
  threadId: ThreadId,
 ): Promise<ThreadStepsOutput> {
  const headHash = await resolveHeadHash(storageRoot, threadId);
  const uwf = await createUwfStore(storageRoot);
  const chain = walkChain(uwf, headHash);
  const startNode = uwf.store.get(chain.startHash);
  if (startNode === null) {
    fail(`StartNode not found: ${chain.startHash}`);
  }
  const startEntry: StartEntry = {
    hash: chain.startHash,
    workflow: chain.start.workflow,
    prompt: chain.start.prompt,
    timestamp: startNode.timestamp,
  };
  const stepEntries: StepEntry[] = [];
  const ordered = collectOrderedSteps(uwf, headHash, chain);
  for (const item of ordered) {
    stepEntries.push({
      hash: item.hash,
      role: item.payload.role,
      output: expandOutput(uwf, item.payload.output),
      detail: item.payload.detail,
      agent: item.payload.agent,
      timestamp: item.timestamp,
    });
  }
  return {
    thread: threadId,
    workflow: chain.start.workflow,
    steps: [startEntry, ...stepEntries],
  };
 }
 export async function cmdThreadRead(
  storageRoot: string,
  threadId: ThreadId,
@@ -1153,52 +936,50 @@ export async function cmdThreadRead(
  });
 }
-export async function cmdThreadFork(
+export type StopOutput = {
-  storageRoot: string,
+  thread: ThreadId;
-  stepHash: CasRef,
+  stopped: boolean;
-): Promise<ThreadForkOutput> {
+};
  const uwf = await createUwfStore(storageRoot);
  const node = uwf.store.get(stepHash);
  if (node === null) {
    fail(`CAS node not found: ${stepHash}`);
  }
  if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
    fail(`node ${stepHash} is not a StartNode or StepNode`);
  }
-  const newThreadId = generateUlid(Date.now()) as ThreadId;
+export type CancelOutput = {
  thread: ThreadId;
  cancelled: boolean;
 };
 /**
 * Stop background execution of a thread (but keep thread active)
 */
 export async function cmdThreadStop(storageRoot: string, threadId: ThreadId): Promise<StopOutput> {
  const index = await loadThreadsIndex(storageRoot);
-  index[newThreadId] = stepHash;
+  const head = index[threadId];
-  await saveThreadsIndex(storageRoot, index);
+  if (head === undefined) {
    fail(`thread not active: ${threadId}`);
  }
-  return {
+  // Check if thread is running in background and terminate it
-    thread: newThreadId,
+  const runningMarker = await isThreadRunning(storageRoot, threadId);
-    forkedFrom: {
+  if (runningMarker === null) {
-      step: stepHash,
+    process.stderr.write(`Warning: thread ${threadId} is not currently running\n`);
-    },
+    return { thread: threadId, stopped: false };
-  };
+  }
  try {
    process.kill(runningMarker.pid, "SIGTERM");
  } catch {
    // Process may have already exited, ignore error
  }
  await deleteMarker(storageRoot, threadId);
  return { thread: threadId, stopped: true };
 }
-export async function cmdThreadStepDetails(
+/**
 * Cancel a thread (stop execution + move to history)
 */
 export async function cmdThreadCancel(
  storageRoot: string,
-  stepHash: CasRef,
+  threadId: ThreadId,
-): Promise<unknown> {
+): Promise<CancelOutput> {
  const uwf = await createUwfStore(storageRoot);
  const node = uwf.store.get(stepHash);
  if (node === null) {
    fail(`CAS node not found: ${stepHash}`);
  }
  if (node.type !== uwf.schemas.stepNode) {
    fail(`node ${stepHash} is not a StepNode`);
  }
  const payload = node.payload as StepNodePayload;
  if (!payload.detail) {
    fail(`step ${stepHash} has no detail`);
  }
  return expandDeep(uwf.store, payload.detail);
 }
 export async function cmdThreadKill(storageRoot: string, threadId: ThreadId): Promise<KillOutput> {
  const index = await loadThreadsIndex(storageRoot);
  const head = index[threadId];
  if (head === undefined) {
@@ -1233,7 +1014,7 @@ export async function cmdThreadKill(storageRoot: string, threadId: ThreadId): Pr
  };
  await appendThreadHistory(storageRoot, historyEntry);
-  return { thread: threadId, archived: true };
+  return { thread: threadId, cancelled: true };
 }
 export async function cmdThreadRunning(storageRoot: string): Promise<RunningThreadsOutput> {
@@ -29,7 +29,7 @@ export type WorkflowListEntry = {
  origin: WorkflowOrigin;
 };
-export type WorkflowPutOutput = {
+export type WorkflowAddOutput = {
  name: string;
  hash: CasRef;
 };
@@ -111,10 +111,10 @@ export async function materializeWorkflowPayload(
  };
 }
-export async function cmdWorkflowPut(
+export async function cmdWorkflowAdd(
  storageRoot: string,
  filePath: string,
-): Promise<WorkflowPutOutput> {
+): Promise<WorkflowAddOutput> {
  let text: string;
  try {
    text = await readFile(filePath, "utf8");
@@ -19,7 +19,14 @@ mock.module("../src/tools/index.js", () => ({
  getBuiltinTools: () => [],
 }));
-import { executeTurnTools, runBuiltinLoop, shouldNudge } from "../src/loop.js";
+import {
  executeTurnTools,
  extractFinalText,
  runBuiltinLoop,
  shouldInjectDeadlineWarning,
  shouldNudge,
  shouldProcessToolCalls,
 } from "../src/loop.js";
 const fakeProvider = {} as any;
 const fakeToolCtx = {} as any;
@@ -154,3 +161,96 @@ describe("runBuiltinLoop integration", () => {
    expect(original.length).toBe(1);
  });
 });
 describe("shouldInjectDeadlineWarning", () => {
  test("5.1 returns true when turn count reaches warning threshold and not yet warned", () => {
    expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
  });
  test("5.2 returns false when already warned", () => {
    expect(shouldInjectDeadlineWarning(7, 10, true, false)).toBe(false);
  });
  test("5.3 returns false when noTools is true", () => {
    expect(shouldInjectDeadlineWarning(7, 10, false, true)).toBe(false);
  });
  test("5.4 returns false when turns remaining > DEADLINE_WARNING_TURNS", () => {
    expect(shouldInjectDeadlineWarning(5, 10, false, false)).toBe(false);
  });
  test("5.5 returns true when exactly at warning threshold", () => {
    expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
  });
  test("5.6 returns false when turns remaining is 0", () => {
    expect(shouldInjectDeadlineWarning(10, 10, false, false)).toBe(false);
  });
 });
 describe("shouldProcessToolCalls", () => {
  test("6.1 returns true when toolCalls present and noTools=false", () => {
    expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], false)).toBe(true);
  });
  test("6.2 returns false when toolCalls is null", () => {
    expect(shouldProcessToolCalls(null, false)).toBe(false);
  });
  test("6.3 returns false when toolCalls is empty array", () => {
    expect(shouldProcessToolCalls([], false)).toBe(false);
  });
  test("6.4 returns false when noTools=true", () => {
    expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], true)).toBe(false);
  });
  test("6.5 returns true when multiple tool calls present", () => {
    expect(
      shouldProcessToolCalls(
        [
          { id: "x1", name: "read", arguments: "{}" },
          { id: "x2", name: "write", arguments: "{}" },
        ],
        false,
      ),
    ).toBe(true);
  });
 });
 describe("extractFinalText", () => {
  test("7.1 returns last assistant message content", () => {
    const messages = [
      { role: "system" as const, content: "sys", tool_calls: null },
      { role: "assistant" as const, content: "first", tool_calls: null },
      { role: "assistant" as const, content: "last", tool_calls: null },
    ];
    expect(extractFinalText(messages)).toBe("last");
  });
  test("7.2 returns empty string when no assistant messages", () => {
    expect(extractFinalText([{ role: "system" as const, content: "sys", tool_calls: null }])).toBe(
      "",
    );
  });
  test("7.3 skips assistant messages with null content", () => {
    const messages = [
      { role: "assistant" as const, content: "first", tool_calls: null },
      {
        role: "assistant" as const,
        content: null,
        tool_calls: [{ id: "x", name: "t", arguments: "{}" }],
      },
      { role: "assistant" as const, content: "second", tool_calls: null },
    ];
    expect(extractFinalText(messages)).toBe("second");
  });
  test("7.4 skips assistant messages with empty content", () => {
    const messages = [
      { role: "assistant" as const, content: "first", tool_calls: null },
      { role: "assistant" as const, content: "", tool_calls: null },
      { role: "user" as const, content: "nudge", tool_calls: null },
    ];
    expect(extractFinalText(messages)).toBe("first");
  });
  test("7.5 handles empty messages array", () => {
    expect(extractFinalText([])).toBe("");
  });
  test("7.6 handles messages with only user and system roles", () => {
    const messages = [
      { role: "system" as const, content: "sys", tool_calls: null },
      { role: "user" as const, content: "query", tool_calls: null },
    ];
    expect(extractFinalText(messages)).toBe("");
  });
 });
@@ -1,7 +1,12 @@
 import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit";
 import { createLogger } from "@uncaged/workflow-util";
-import { type ChatMessage, chatCompletionWithTools, type LlmToolCall } from "./llm/index.js";
+import {
  type ChatMessage,
  chatCompletionWithTools,
  type LlmToolCall,
  type OpenAiToolDefinition,
 } from "./llm/index.js";
 import { appendSessionTurn } from "./session.js";
 import {
  builtinToolsToOpenAi,
@@ -80,10 +85,184 @@ export type ShouldNudgeOptions = {
 const MAX_NUDGES = 3;
 const DEADLINE_WARNING_TURNS = 3;
 export function shouldInjectDeadlineWarning(
  turn: number,
  maxTurns: number,
  alreadyWarned: boolean,
  noTools: boolean,
 ): boolean {
  const turnsRemaining = maxTurns - turn;
  return (
    !noTools && !alreadyWarned && turnsRemaining > 0 && turnsRemaining <= DEADLINE_WARNING_TURNS
  );
 }
 export function shouldProcessToolCalls(toolCalls: LlmToolCall[] | null, noTools: boolean): boolean {
  return !noTools && toolCalls !== null && toolCalls.length > 0;
 }
 export function extractFinalText(messages: ChatMessage[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (
      msg !== undefined &&
      msg.role === "assistant" &&
      msg.content !== null &&
      msg.content.trim() !== ""
    ) {
      return msg.content;
    }
  }
  return "";
 }
 function injectDeadlineWarning(messages: ChatMessage[], turnsRemaining: number): void {
  log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`);
  messages.push({
    role: "user",
    content:
      `⚠️ You have ${turnsRemaining} turns remaining. ` +
      "Wrap up your work and output the YAML frontmatter starting with `---`. " +
      "If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
  });
 }
 type HandleTextOnlyTurnResult = {
  shouldBreak: boolean;
  finalText: string;
  turnCount: number;
  nudgeCount: number;
  turnAdjustment: number;
 };
 async function handleTextOnlyTurn(
  text: string,
  messages: ChatMessage[],
  storageRoot: string,
  sessionId: string,
  noTools: boolean,
  turn: number,
  maxTurns: number,
  currentNudgeCount: number,
 ): Promise<HandleTextOnlyTurnResult> {
  await appendTurn(storageRoot, sessionId, {
    role: "assistant",
    content: text,
    toolCalls: null,
    reasoning: null,
  });
  const turnCount = 1;
  let nudgeCount = currentNudgeCount;
  let turnAdjustment = 0;
  if (shouldNudge({ noTools, text, turn, maxTurns })) {
    nudgeCount += 1;
    log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
    const nudge =
      "You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
      "Either continue using tools to complete your work, or output your final response starting with `---`.";
    messages.push({ role: "user", content: nudge });
    // Nudge doesn't consume turn budget (up to MAX_NUDGES)
    if (nudgeCount <= MAX_NUDGES) {
      turnAdjustment = -1;
    }
    return { shouldBreak: false, finalText: "", turnCount, nudgeCount, turnAdjustment };
  }
  return { shouldBreak: true, finalText: text, turnCount, nudgeCount, turnAdjustment };
 }
 async function handleToolCallTurn(
  content: string,
  toolCalls: LlmToolCall[],
  messages: ChatMessage[],
  storageRoot: string,
  sessionId: string,
  toolCtx: ToolContext,
 ): Promise<number> {
  await appendTurn(storageRoot, sessionId, {
    role: "assistant",
    content,
    toolCalls: mapToolCallsForPayload(toolCalls),
    reasoning: null,
  });
  let turnCount = 1;
  // Execute tools
  turnCount += await executeTurnTools(toolCalls, toolCtx, messages, storageRoot, sessionId);
  return turnCount;
 }
 export function shouldNudge({ noTools, text, turn, maxTurns }: ShouldNudgeOptions): boolean {
  return !noTools && !text.trimStart().startsWith("---") && turn < maxTurns - 1;
 }
 type ProcessLoopIterationResult = {
  shouldBreak: boolean;
  finalText: string;
  turnCount: number;
  nudgeCount: number;
  turnAdjustment: number;
 };
 async function processLoopIteration(
  options: RunBuiltinLoopOptions,
  messages: ChatMessage[],
  openAiTools: OpenAiToolDefinition[],
  turn: number,
  nudgeCount: number,
 ): Promise<ProcessLoopIterationResult> {
  const response = await chatCompletionWithTools(
    options.provider,
    messages,
    openAiTools.length > 0 ? openAiTools : null,
  );
  // When noTools is set, ignore any tool_calls the LLM might still return
  const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null);
  const assistantMessage: ChatMessage = {
    role: "assistant",
    content: response.content,
    tool_calls: effectiveToolCalls,
  };
  messages.push(assistantMessage);
  if (!shouldProcessToolCalls(effectiveToolCalls, options.noTools)) {
    const text = response.content ?? "";
    const result = await handleTextOnlyTurn(
      text,
      messages,
      options.storageRoot,
      options.sessionId,
      options.noTools,
      turn,
      options.maxTurns,
      nudgeCount,
    );
    return result;
  }
  // At this point, effectiveToolCalls is guaranteed to be non-null and non-empty
  const turnCount = await handleToolCallTurn(
    response.content ?? "",
    effectiveToolCalls as LlmToolCall[],
    messages,
    options.storageRoot,
    options.sessionId,
    options.toolCtx,
  );
  return {
    shouldBreak: false,
    finalText: "",
    turnCount,
    nudgeCount,
    turnAdjustment: 0,
  };
 }
 /** Agent run loop: LLM ↔ tools until no tool_calls or maxTurns. */
 export async function runBuiltinLoop(
  options: RunBuiltinLoopOptions,
@@ -99,95 +278,25 @@ export async function runBuiltinLoop(
    log("8K2M4N7P", `builtin loop turn ${turn + 1}/${options.maxTurns}`);
    // Warn agent when approaching turn limit
-    const turnsRemaining = options.maxTurns - turn;
+    if (shouldInjectDeadlineWarning(turn, options.maxTurns, deadlineWarned, options.noTools)) {
    if (!options.noTools && !deadlineWarned && turnsRemaining <= DEADLINE_WARNING_TURNS) {
      deadlineWarned = true;
-      log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`);
+      const turnsRemaining = options.maxTurns - turn;
-      messages.push({
+      injectDeadlineWarning(messages, turnsRemaining);
        role: "user",
        content:
          `⚠️ You have ${turnsRemaining} turns remaining. ` +
          "Wrap up your work and output the YAML frontmatter starting with `---`. " +
          "If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
      });
    }
-    const response = await chatCompletionWithTools(
+    const result = await processLoopIteration(options, messages, openAiTools, turn, nudgeCount);
-      options.provider,
+    turnCount += result.turnCount;
-      messages,
+    nudgeCount = result.nudgeCount;
-      openAiTools.length > 0 ? openAiTools : null,
+    turn += result.turnAdjustment;
    );
-    // When noTools is set, ignore any tool_calls the LLM might still return
+    if (result.shouldBreak) {
-    const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null);
+      finalText = result.finalText;
    const assistantMessage: ChatMessage = {
      role: "assistant",
      content: response.content,
      tool_calls: effectiveToolCalls,
    };
    messages.push(assistantMessage);
    if (effectiveToolCalls === null || effectiveToolCalls.length === 0) {
      const text = response.content ?? "";
      await appendTurn(options.storageRoot, options.sessionId, {
        role: "assistant",
        content: text,
        toolCalls: null,
        reasoning: null,
      });
      turnCount += 1;
      if (shouldNudge({ noTools: options.noTools, text, turn, maxTurns: options.maxTurns })) {
        nudgeCount += 1;
        log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
        const nudge =
          "You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
          "Either continue using tools to complete your work, or output your final response starting with `---`.";
        messages.push({ role: "user", content: nudge });
        // Nudge doesn't consume turn budget (up to MAX_NUDGES)
        if (nudgeCount <= MAX_NUDGES) {
          turn -= 1;
        }
        continue;
      }
      finalText = text;
      break;
    }
    // Assistant turn with tool calls
    await appendTurn(options.storageRoot, options.sessionId, {
      role: "assistant",
      content: response.content ?? "",
      toolCalls: mapToolCallsForPayload(effectiveToolCalls),
      reasoning: null,
    });
    turnCount += 1;
    // Execute tools
    turnCount += await executeTurnTools(
      effectiveToolCalls,
      options.toolCtx,
      messages,
      options.storageRoot,
      options.sessionId,
    );
  }
-  if (finalText === "" && messages.length > 0) {
+  if (finalText === "") {
-    for (let i = messages.length - 1; i >= 0; i--) {
+    finalText = extractFinalText(messages);
      const msg = messages[i];
      if (
        msg !== undefined &&
        msg.role === "assistant" &&
        msg.content !== null &&
        msg.content.trim() !== ""
      ) {
        finalText = msg.content;
        break;
      }
    }
  }
  return { finalText, messages, turnCount };
@@ -146,13 +146,13 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
  // Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
  if (!ctx.isFirstVisit) {
-    const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
+    const cachedSessionId = await getCachedSessionId("claude-code", ctx.threadId, ctx.role);
    if (cachedSessionId !== null) {
      try {
        const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
        const result = await processClaudeOutput(stdout, ctx.store);
        if (result.sessionId !== undefined && result.sessionId !== "") {
-          await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+          await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
        }
        return result;
      } catch (err) {
@@ -169,7 +169,7 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
  const { stdout } = await spawnClaudeRun(fullPrompt);
  const result = await processClaudeOutput(stdout, ctx.store);
  if (result.sessionId !== undefined && result.sessionId !== "") {
-    await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
+    await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
  }
  return result;
 }
@@ -1,5 +1,22 @@
-// Re-export session cache from the shared agent-kit package.
+// Re-export session cache from the shared agent-kit package with agent name injected.
-export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";
+
 import {
  getCachedSessionId as getCachedSessionIdBase,
  setCachedSessionId as setCachedSessionIdBase,
 } from "@uncaged/workflow-agent-kit";
 import type { ThreadId } from "@uncaged/workflow-protocol";
 export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
  return getCachedSessionIdBase("hermes", threadId, role);
 }
 export async function setCachedSessionId(
  threadId: ThreadId,
  role: string,
  sessionId: string,
 ): Promise<void> {
  return setCachedSessionIdBase("hermes", threadId, role, sessionId);
 }
 export function isResumeDisabled(): boolean {
  // Hermes ACP session/resume is broken: _restore fails for custom providers
@@ -0,0 +1,247 @@
 import { mkdir, readdir, readFile, rm, stat, writeFile } from "node:fs/promises";
 import { dirname, join } from "node:path";
 import type { ThreadId } from "@uncaged/workflow-protocol";
 import { afterEach, beforeEach, describe, expect, test } from "vitest";
 import { getCachedSessionId, getCachePath, setCachedSessionId } from "../src/session-cache.js";
 import { resolveStorageRoot } from "../src/storage.js";
 describe("session-cache", () => {
  let originalStorageRoot: string;
  let testStorageRoot: string;
  beforeEach(async () => {
    // Create a temporary test storage root
    originalStorageRoot = resolveStorageRoot();
    testStorageRoot = join(originalStorageRoot, "test-cache", `test-${Date.now()}`);
    await mkdir(testStorageRoot, { recursive: true });
    // Override the storage root for testing
    process.env.WORKFLOW_STORAGE_ROOT = testStorageRoot;
  });
  afterEach(async () => {
    // Clean up test storage root
    await rm(testStorageRoot, { recursive: true, force: true });
    delete process.env.WORKFLOW_STORAGE_ROOT;
  });
  describe("getCachePath", () => {
    test("returns agent-specific file path", () => {
      const path = getCachePath("claude-code");
      expect(path).toMatch(/\/cache\/claude-code-sessions\.json$/);
    });
    test("returns different paths for different agents", () => {
      const pathClaudeCode = getCachePath("claude-code");
      const pathHermes = getCachePath("hermes");
      expect(pathClaudeCode).not.toBe(pathHermes);
      expect(pathClaudeCode).toMatch(/claude-code-sessions\.json$/);
      expect(pathHermes).toMatch(/hermes-sessions\.json$/);
    });
    test("handles agent names with special characters", () => {
      const path1 = getCachePath("my-agent");
      const path2 = getCachePath("my_agent");
      expect(path1).toMatch(/my-agent-sessions\.json$/);
      expect(path2).toMatch(/my_agent-sessions\.json$/);
    });
  });
  describe("session isolation", () => {
    const threadId = "01234567890123456789012345" as ThreadId;
    const role = "developer";
    test("sessions are isolated per agent", async () => {
      // Cache different session IDs for each agent
      await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
      await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
      // Each agent should retrieve its own session ID
      const sessionCC = await getCachedSessionId("claude-code", threadId, role);
      const sessionHermes = await getCachedSessionId("hermes", threadId, role);
      expect(sessionCC).toBe("session-cc-001");
      expect(sessionHermes).toBe("session-hermes-001");
    });
    test("updating one agent's cache does not affect another", async () => {
      // Set initial sessions for both agents
      await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
      await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
      // Update claude-code's session
      await setCachedSessionId("claude-code", threadId, role, "session-cc-002");
      // Hermes's session should remain unchanged
      const sessionHermes = await getCachedSessionId("hermes", threadId, role);
      expect(sessionHermes).toBe("session-hermes-001");
      // Claude-code should have the new session
      const sessionCC = await getCachedSessionId("claude-code", threadId, role);
      expect(sessionCC).toBe("session-cc-002");
    });
    test("missing session returns null for specific agent", async () => {
      const session = await getCachedSessionId("claude-code", threadId, role);
      expect(session).toBeNull();
    });
    test("empty session ID is treated as missing", async () => {
      await setCachedSessionId("claude-code", threadId, role, "");
      const session = await getCachedSessionId("claude-code", threadId, role);
      expect(session).toBeNull();
    });
  });
  describe("file system operations", () => {
    const threadId = "01234567890123456789012345" as ThreadId;
    const role = "developer";
    test("cache directory is created if missing", async () => {
      const cachePath = getCachePath("claude-code");
      const cacheDir = dirname(cachePath);
      // Ensure cache dir doesn't exist
      await rm(cacheDir, { recursive: true, force: true });
      // Write a session
      await setCachedSessionId("claude-code", threadId, role, "session-001");
      // Cache directory should be created
      const stats = await stat(cacheDir);
      expect(stats.isDirectory()).toBe(true);
    });
    test("multiple agents create separate cache files", async () => {
      // Cache sessions for multiple agents
      await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
      await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
      // Separate cache files should exist
      const pathCC = getCachePath("claude-code");
      const pathHermes = getCachePath("hermes");
      const contentCC = JSON.parse(await readFile(pathCC, "utf8")) as Record<string, string>;
      const contentHermes = JSON.parse(await readFile(pathHermes, "utf8")) as Record<
        string,
        string
      >;
      expect(contentCC).toHaveProperty(`${threadId}:${role}`, "session-cc-001");
      expect(contentHermes).toHaveProperty(`${threadId}:${role}`, "session-hermes-001");
    });
    test("atomic writes prevent partial reads", async () => {
      // Write a session
      await setCachedSessionId("claude-code", threadId, role, "session-001");
      // The final file should exist (no .tmp files left behind)
      const cachePath = getCachePath("claude-code");
      const dir = dirname(cachePath);
      const files = await readdir(dir);
      expect(files).toContain("claude-code-sessions.json");
      expect(files.every((f) => !f.endsWith(".tmp"))).toBe(true);
    });
  });
  describe("legacy migration", () => {
    const threadId = "01234567890123456789012345" as ThreadId;
    const role = "developer";
    test("old agent-sessions.json is ignored", async () => {
      // Create old agent-sessions.json file
      const oldCachePath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
      await mkdir(dirname(oldCachePath), { recursive: true });
      await writeFile(
        oldCachePath,
        JSON.stringify({
          "01234567890123456789012345:developer": "old-session-001",
        }),
        "utf8",
      );
      // Query with the new per-agent cache
      const session = await getCachedSessionId("claude-code", threadId, role);
      // Should return null (old cache is ignored)
      expect(session).toBeNull();
    });
    test("new per-agent cache takes precedence", async () => {
      // Create both old and new cache files
      const oldPath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
      await mkdir(dirname(oldPath), { recursive: true });
      await writeFile(
        oldPath,
        JSON.stringify({
          [`${threadId}:${role}`]: "old-session",
        }),
        "utf8",
      );
      await setCachedSessionId("claude-code", threadId, role, "new-session");
      // The new per-agent cache value should be returned
      const session = await getCachedSessionId("claude-code", threadId, role);
      expect(session).toBe("new-session");
    });
  });
  describe("error handling", () => {
    const threadId = "01234567890123456789012345" as ThreadId;
    const role = "developer";
    test("invalid JSON in cache file returns empty cache", async () => {
      // Create a corrupted cache file
      const cachePath = getCachePath("claude-code");
      await mkdir(dirname(cachePath), { recursive: true });
      await writeFile(cachePath, "{ invalid json }", "utf8");
      // Should return null (treating corrupted cache as empty)
      const session = await getCachedSessionId("claude-code", threadId, role);
      expect(session).toBeNull();
    });
    test("non-object JSON in cache file returns empty cache", async () => {
      // Create a cache file with non-object JSON
      const cachePath = getCachePath("claude-code");
      await mkdir(dirname(cachePath), { recursive: true });
      await writeFile(cachePath, JSON.stringify(["not", "an", "object"]), "utf8");
      // Should return null
      const session = await getCachedSessionId("claude-code", threadId, role);
      expect(session).toBeNull();
    });
    test("cache entries with non-string values are ignored", async () => {
      // Create a cache file with mixed types
      const cachePath = getCachePath("claude-code");
      const cacheData = {
        "thread1:role1": "valid-session",
        "thread2:role2": 12345, // number
        "thread3:role3": null, // null
        "thread4:role4": "", // empty string
      };
      await mkdir(dirname(cachePath), { recursive: true });
      await writeFile(cachePath, JSON.stringify(cacheData), "utf8");
      // Valid string entries should be returned
      const session1 = await getCachedSessionId("claude-code", "thread1" as ThreadId, "role1");
      expect(session1).toBe("valid-session");
      // Invalid entries should return null
      const session2 = await getCachedSessionId("claude-code", "thread2" as ThreadId, "role2");
      const session3 = await getCachedSessionId("claude-code", "thread3" as ThreadId, "role3");
      const session4 = await getCachedSessionId("claude-code", "thread4" as ThreadId, "role4");
      expect(session2).toBeNull();
      expect(session3).toBeNull();
      expect(session4).toBeNull(); // empty string is treated as missing
    });
  });
 });
@@ -12,7 +12,7 @@ export {
 export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
 export { createAgent } from "./run.js";
-export { getCachedSessionId, setCachedSessionId } from "./session-cache.js";
+export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
 export type {
  AgentContext,
@@ -8,8 +8,8 @@ import { resolveStorageRoot } from "./storage.js";
 type SessionCache = Record<string, string>;
-function getCachePath(): string {
+export function getCachePath(agentName: string): string {
-  return join(resolveStorageRoot(), "cache", "agent-sessions.json");
+  return join(resolveStorageRoot(), "cache", `${agentName}-sessions.json`);
 }
 function cacheKey(threadId: ThreadId, role: string): string {
@@ -20,8 +20,8 @@ function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
 }
-async function readCache(): Promise<SessionCache> {
+async function readCache(agentName: string): Promise<SessionCache> {
-  const path = getCachePath();
+  const path = getCachePath(agentName);
  try {
    const text = await readFile(path, "utf8");
    const raw = JSON.parse(text) as unknown;
@@ -40,36 +40,45 @@ async function readCache(): Promise<SessionCache> {
    if (err.code === "ENOENT") {
      return {};
    }
    // Treat JSON parse errors as empty cache
    if (err.name === "SyntaxError") {
      return {};
    }
    throw e;
  }
 }
-async function writeCache(cache: SessionCache): Promise<void> {
+async function writeCache(agentName: string, cache: SessionCache): Promise<void> {
-  const path = getCachePath();
+  const path = getCachePath(agentName);
  const dir = dirname(path);
  await mkdir(dir, { recursive: true });
  // Atomic write: write to temp file then rename to avoid partial reads on concurrent access.
  // NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur.
  // This is a safety net for future parallel execution.
-  const tmpPath = join(dir, `.agent-sessions.${randomBytes(4).toString("hex")}.tmp`);
+  const tmpPath = join(dir, `.${agentName}-sessions.${randomBytes(4).toString("hex")}.tmp`);
  await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
  await rename(tmpPath, path);
 }
 /** Read the cached session ID for a thread+role pair. */
-export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
+export async function getCachedSessionId(
-  const cache = await readCache();
+  agentName: string,
  threadId: ThreadId,
  role: string,
 ): Promise<string | null> {
  const cache = await readCache(agentName);
  const sessionId = cache[cacheKey(threadId, role)];
  return sessionId ?? null;
 }
 /** Write the session ID for a thread+role pair into the cache. */
 export async function setCachedSessionId(
  agentName: string,
  threadId: ThreadId,
  role: string,
  sessionId: string,
 ): Promise<void> {
-  const cache = await readCache();
+  const cache = await readCache(agentName);
  cache[cacheKey(threadId, role)] = sessionId;
-  await writeCache(cache);
+  await writeCache(agentName, cache);
 }
@@ -1,7 +1,6 @@
 import path from "node:path";
 import { defineConfig } from "vitest/config";
 // biome-ignore lint/style/noDefaultExport: Vitest loads config from default export.
 export default defineConfig({
  test: {
    environment: "node",
Author	SHA1	Message	Date
xiaoju	669af841e1	refactor: address review feedback for CLI restructure - Extract shared module (shared.ts) — walkChain, expandDeep, etc. deduplicated - Hide step read command (half-baked, not ready for users) - Remove cmdThreadKill dead code - Revert unrelated protocol type change - Revert unrelated package.json change - Fix unused imports (biome) Refs #463	2026-05-24 11:32:47 +00:00
xiaoju	650313b1c2	feat(step): expand detail CAS refs by default in step list Previously step list showed raw CAS refs for detail fields. Now detail is recursively expanded (like output already was), since every turn is individually hashed and walkable. Refs #463	2026-05-24 11:12:22 +00:00
xiaoju	c40007eeaf	fix(agent-claude-code): add missing workflow-util dependency The claude-code agent imports createLogger from @uncaged/workflow-util but was missing the dependency declaration, causing test failures.	2026-05-24 11:04:02 +00:00
xiaoju	1f13b1e79c	fix(cli): resolve lint errors and unused imports (#463 ) Fix all lint errors flagged by biome check to ensure clean codebase. ## Changes ### Removed Unused Imports - `packages/cli-workflow/src/commands/thread.ts`: - Removed `StartEntry` (moved to step.ts) - Removed `StepEntry` (moved to step.ts) - Removed `ThreadForkOutput` (moved to step.ts) - Removed `ThreadStepsOutput` (moved to step.ts) - `packages/cli-workflow/src/cli.ts`: - Removed unused `yamlStringify` import from yaml package ### Fixed Unused Parameter - `packages/cli-workflow/src/commands/step.ts`: - Prefixed unused `before` parameter with underscore in `cmdStepRead` - Parameter is part of the function signature for future use (awaiting #462) ### Fixed Import Order - `packages/cli-workflow/src/__tests__/thread.test.ts`: - Reordered imports to follow biome's organization rules - Moved cmdStepShow import before cmdThreadRead imports ## Test Results - ✅ `bun run check` passes (typecheck + lint + log tags) - ✅ All 124 tests passing - ✅ Build completes successfully Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-24 10:50:49 +00:00
xiaoju	031c3aa632	docs(cli): add deprecation handlers and update documentation (#463 ) Complete the CLI refactoring with deprecation error handlers, updated help text, and comprehensive migration guide. ## Changes ### Deprecation Handlers Add error handlers for all removed commands with helpful migration messages: - `workflow put` → suggests `workflow add` - `thread step` → suggests `thread exec` - `thread steps` → suggests `step list` - `thread step-details` → suggests `step show` - `thread fork` → suggests `step fork` - `thread kill` → suggests `thread stop` or `thread cancel` - `thread running` → suggests `thread list --status running` Error messages follow the format: ``` Error: Command 'X' has been removed. Use 'Y' instead. For more information, see: uwf help Y ``` ### Help Documentation Updated CLI help text to explain four-layer architecture: - Main help shows architecture diagram with Chinese labels - Command group descriptions reference layers: - `workflow` → "Workflow definitions (layer 1: templates)" - `thread` → "Thread execution (layer 2: instances)" - `step` → "Step results (layer 3: single cycle)" - Deprecated commands appear in help with [DEPRECATED] tag ### README Updates Comprehensive documentation updates: - Added "Four-Layer Architecture" section with diagram - Updated all command tables with new command names - Added complete migration guide with: - Renamed commands table - Merged commands table - Split commands table - Moved commands table - Example deprecation error output - Updated "Internal Structure" to show new step.ts module ## Testing - ✅ All 124 tests pass - ✅ Build completes successfully - ✅ Deprecation handlers tested manually - ✅ Help output verified for main, thread, and step commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-24 10:46:31 +00:00
xiaoju	7b50969307	refactor(cli): reorganize CLI commands into four-layer model (#463 ) Implement comprehensive CLI refactoring to clarify the four-layer model: workflow → thread → step → turn ## Breaking Changes ### Renamed Commands - `uwf workflow put` → `uwf workflow add` - `uwf thread step` → `uwf thread exec` ### Removed Commands - `uwf thread running` (merged into `thread list --status running`) - `uwf thread kill` (split into `thread stop` and `thread cancel`) ### Moved Commands - `uwf thread steps` → `uwf step list` - `uwf thread step-details` → `uwf step show` - `uwf thread fork` → `uwf step fork` ## New Commands ### Thread Commands - `uwf thread list --status <idle\|running\|completed>` - Filter threads by status - `uwf thread stop <thread-id>` - Stop background execution (keep thread active) - `uwf thread cancel <thread-id>` - Cancel thread (stop + archive to history) ### Step Command Group (New) - `uwf step list <thread-id>` - List all steps in a thread - `uwf step show <step-hash>` - Show step details - `uwf step read <step-hash> [--before N]` - Read step output as markdown - `uwf step fork <step-hash>` - Fork thread from a step ## Implementation Details ### Files Modified - `packages/cli-workflow/src/commands/workflow.ts` - Renamed cmdWorkflowPut → cmdWorkflowAdd - `packages/cli-workflow/src/commands/thread.ts`: - Renamed cmdThreadStep → cmdThreadExec - Added cmdThreadStop and cmdThreadCancel (split from cmdThreadKill) - Updated cmdThreadList to support --status filter with idle/running/completed - Removed cmdThreadSteps, cmdThreadStepDetails, cmdThreadFork - `packages/cli-workflow/src/commands/step.ts` - New module with: - cmdStepList (moved from cmdThreadSteps) - cmdStepShow (moved from cmdThreadStepDetails) - cmdStepFork (moved from cmdThreadFork) - cmdStepRead (new, stub implementation pending #462) - `packages/cli-workflow/src/cli.ts` - Updated all CLI command registrations ### Tests Updated - `packages/cli-workflow/src/__tests__/thread-step-count.test.ts` - Updated references from "thread step" to "thread exec" - `packages/cli-workflow/src/__tests__/thread.test.ts` - Updated imports to use cmdStepShow from step.ts ## Test Results All 124 tests pass in cli-workflow package. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-24 10:40:32 +00:00
xiaoju	fc6072c28c	Merge pull request 'feat: use git worktree for isolated development in solve-issue workflow' (#465 ) from fix/464-worktree-isolation into main	2026-05-24 10:27:51 +00:00
xiaoju	b0e3f4a363	feat: use git worktree for isolated development in solve-issue workflow All roles (developer, reviewer, tester, committer) now work in ~/repos/workflow-worktrees/fix/<issue>-<slug> instead of modifying the main working directory. Prevents self-destructive edits. Fixes #464	2026-05-24 10:22:25 +00:00
xiaoju	38112053a0	Merge pull request 'fix(agent-kit): separate session cache per agent' (#462 ) from fix/461-per-agent-session-cache into main	2026-05-24 09:19:50 +00:00
xiaoju	1d174ee5c9	fix(agent-kit): separate session cache per agent Each agent now maintains its own session cache file instead of sharing a single agent-sessions.json. This prevents session ID conflicts when multiple agents operate on the same thread+role pair. Changes: - getCachePath() now takes agentName parameter - getCachedSessionId/setCachedSessionId require agentName as first param - Cache files named <agent>-sessions.json (e.g., hermes-sessions.json) - Agent wrappers inject their agent name into cache calls - Add comprehensive tests for session cache isolation - Handle malformed JSON gracefully (treat as empty cache) Fixes #461	2026-05-24 09:16:06 +00:00
xiaoju	6e3b32ca34	Merge pull request 'fix(cli): replace markdown headings with XML tags in thread read output' (#460 ) from fix/459-xml-tag-isolation into main	2026-05-24 08:44:47 +00:00
xiaoju	932bbe5c41	fix(cli): replace markdown headings with XML tags in thread read output Changed uwf thread read to wrap role prompts and agent outputs in XML tags (<prompt> and <output>) instead of markdown headings (### Prompt, ### Content). This prevents Claude Code from treating step outputs as structural headings. - Updated formatStepPrompt to use <prompt>...</prompt> tags - Updated formatStepContent to use <output>...</output> tags - Added comprehensive test suite in thread-read-xml-tags.test.ts - Updated existing tests to verify XML tag behavior Fixes #459 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-24 08:04:34 +00:00
xiaomo	9440b9af82	Merge pull request 'chore: fix biome noExcessiveCognitiveComplexity warnings' (#458 ) from fix/444-biome-complexity-warnings into main	2026-05-24 07:30:41 +00:00
xiaoju	f96d6eb7c4	refactor(agent-builtin): reduce cognitive complexity in loop.ts Refactored runBuiltinLoop function to reduce cognitive complexity from 30 to below 15 by extracting helper functions: - shouldInjectDeadlineWarning: checks if deadline warning should be shown - shouldProcessToolCalls: determines if tool calls should be processed - extractFinalText: extracts last assistant message content - injectDeadlineWarning: injects deadline warning message - handleTextOnlyTurn: handles text-only turn logic - handleToolCallTurn: handles tool call turn logic - processLoopIteration: processes a single loop iteration Added 24 new unit tests for the extracted helper functions, bringing total test count to 41 (all passing). All existing behavior is preserved. Fixes #444 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-24 05:53:55 +00:00
xiaomo	95102941f1	Merge pull request 'feat(cli): thread step --background + thread running' (#457 ) from fix/456-thread-step-background into main	2026-05-24 05:33:56 +00:00