feat: use git worktree for isolated development in solve-issue workflow

All roles (developer, reviewer, tester, committer) now work in ~/repos/workflow-worktrees/fix/<issue>-<slug> instead of modifying the main working directory. Prevents self-destructive edits. Fixes #464
Merge pull request 'fix(agent-kit): separate session cache per agent' (#462) from fix/461-per-agent-session-cache into main
2026-05-24 10:22:25 +00:00 · 2026-05-24 09:19:50 +00:00 · 2026-05-24 09:16:06 +00:00 · 2026-05-24 08:44:47 +00:00 · 2026-05-24 08:04:34 +00:00 · 2026-05-24 07:30:41 +00:00
602 changed files with 21850 additions and 4486 deletions
@@ -0,0 +1,67 @@
+# Sync README
+
+When updating README.md files in this monorepo, follow these conventions.
+
+## Scope
+
+- Root `README.md` — project overview and navigation hub
+- Per-package `packages/*/README.md` — each package self-contained
+
+## Root README Structure
+
+The root README should have these sections in order:
+
+1. **Title and one-liner** — stateless workflow engine driven by single-step CLI
+2. **Overview** — 2-3 paragraphs explaining what it does and key concepts
+3. **Architecture** — dependency layer diagram (text-based)
+4. **Packages** — table with ALL packages from packages/ directory, columns: Package, Description, Type (cli/lib/agent/app)
+5. **Quick Start** — install, build, register workflow, start thread, run step
+6. **CLI Reference** — brief command list, detailed usage in cli-workflow README
+7. **Development** — bun install / build / check / test
+
+## Per-Package README Structure
+
+Each package README should have:
+
+1. **Title** — package name
+2. **One-line description** — matching package.json
+3. **Overview** — what it does, where it sits in the architecture, dependencies
+4. **Installation** — bun add (for libs) or "included as binary" (for cli/agents)
+5. **API** (lib packages) — all exports from src/index.ts with type signatures, grouped by category, minimal usage examples
+6. **CLI Usage** (cli/agent packages) — command reference with examples
+7. **Internal Structure** — brief src/ file organization
+8. **Configuration** (if applicable)
+
+## Execution Steps
+
+### Step 1: Gather current state
+For each package read:
+- package.json (name, version, description, dependencies, bin)
+- src/index.ts (public API exports)
+- Existing README.md (preserve hand-written content worth keeping)
+
+### Step 2: Update root README
+- Ensure ALL packages in packages/ directory are listed in the table
+- Update CLI command reference from uwf --help output
+- Keep Quick Start examples valid
+
+### Step 3: Write/update each package README
+- Follow the per-package structure
+- API section MUST match actual src/index.ts exports — never invent
+- For agent packages: document CLI binary name, how it is invoked
+- For lib packages: document exported types and functions
+- Internal structure: list actual files in src/
+
+### Step 4: Verify
+- All relative links work
+- Package names match package.json
+- No references to removed/renamed packages
+- bun run build still passes
+
+## Guidelines
+
+- Only document what src/index.ts actually exports
+- Root README summarizes, package READMEs go into detail
+- Verify CLI examples against actual commands
+- Preserve existing good prose when updating
+- English for all README content
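The Step 2 rule above ("ALL packages in packages/ directory are listed in the table") can be checked mechanically. The following is a minimal sketch, not part of this change: `listedPackages` and `missingFromReadme` are hypothetical helper names, and the sketch assumes the Packages table puts the package name in the first column, optionally wrapped in backticks and prefixed with `@uncaged/`.

```typescript
// Hypothetical helper: collect the first cell of every Markdown table row.
// Header and separator cells end up in the set too, which is harmless here.
function listedPackages(readme: string): Set<string> {
  const found = new Set<string>();
  for (const line of readme.split("\n")) {
    if (!line.trim().startsWith("|")) continue; // not a table row
    const firstCell = line.split("|")[1] ?? "";
    const pkg = firstCell.replace(/`/g, "").trim().replace(/^@uncaged\//, "");
    if (pkg !== "") found.add(pkg);
  }
  return found;
}

// Hypothetical helper: directory names under packages/ that the table omits.
function missingFromReadme(packageDirs: string[], readme: string): string[] {
  const listed = listedPackages(readme);
  return packageDirs.filter((dir) => !listed.has(dir));
}

const sampleReadme = [
  "| Package | Description | Type |",
  "|---|---|---|",
  "| `@uncaged/cli-workflow` | uwf CLI binary | cli |",
].join("\n");

console.log(missingFromReadme(["cli-workflow", "workflow-util"], sampleReadme));
```

Matching on the first table cell rather than on any substring avoids a false pass when one package name is a prefix of another.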
@@ -1,40 +0,0 @@
-# ──────────────────────────────────────────────
-# Workflow Engine — Environment Variables
-# ──────────────────────────────────────────────
-# Copy this file to .env and fill in the values.
-
-# ── Cursor Agent ──
-
-# CLI command to invoke the Cursor agent (required for develop workflow)
-WORKFLOW_CURSOR_COMMAND=
-
-# Model override for Cursor agent
-WORKFLOW_CURSOR_MODEL=
-
-# Timeout in milliseconds for Cursor agent operations
-WORKFLOW_CURSOR_TIMEOUT=
-
-# ── Hermes Agent (used by develop tester/committer + solve-issue) ──
-
-# CLI command to invoke the Hermes agent (absolute path required)
-WORKFLOW_HERMES_COMMAND=
-
-# Model override for Hermes agent
-WORKFLOW_HERMES_MODEL=
-
-# Timeout in milliseconds for Hermes agent operations
-WORKFLOW_HERMES_TIMEOUT=
-
-# ── Storage ──
-
-# Override the workflow storage root directory
-# Default: ~/.uncaged/workflow
-WORKFLOW_STORAGE_ROOT=
-
-# Gateway secret for the serve command
-WORKFLOW_DASHBOARD_SECRET=
-
-# ── Display ──
-
-# Set to any value to disable colored output
-# NO_COLOR=1
@@ -10,3 +10,6 @@ xiaoju/
 solve-issue-entry.ts
 packages/workflow-template-develop/develop.esm.js
 .DS_Store
+*.py
+.claude
+tmp
@@ -0,0 +1,83 @@
+# Test Spec: uwf setup model connectivity validation (#335)
+
+## Context
+
+File: `packages/cli-workflow/src/commands/setup.ts`
+Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
+
+After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
+
+## Implementation Notes
+
+- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
+- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
+- Use `AbortSignal.timeout(15_000)` for the request
+- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
+- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
+- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
+- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
+
+## Test Cases (vitest)
+
+### 1. `validateModel` — success path
+- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: true, value: undefined }`
+- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
+
+### 2. `validateModel` — HTTP error (401 unauthorized)
+- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: false, error: <string containing "401"> }`
+
+### 3. `validateModel` — HTTP error (404 model not found)
+- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
+- Assert returns `{ ok: false, error: <string containing "404"> }`
+
+### 4. `validateModel` — network timeout
+- Mock `fetch` to throw `DOMException` with name `TimeoutError` (the error `AbortSignal.timeout` raises; `AbortError` is reserved for manual aborts)
+- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
+
+### 5. `validateModel` — network error (DNS failure, connection refused)
+- Mock `fetch` to throw `TypeError("fetch failed")`
+- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
+
+### 6. `cmdSetup` — includes validation result on success
+- Mock global `fetch` for `/chat/completions` to succeed
+- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
+- Assert returned object has `validation: { ok: true, value: undefined }`
+- Assert config files are still written (existing behavior preserved)
+
+### 7. `cmdSetup` — includes validation result on failure (config still saved)
+- Mock global `fetch` for `/chat/completions` to return 401
+- Call `cmdSetup({ ... })`
+- Assert returned object has `validation: { ok: false, error: ... }`
+- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
+
+### 8. `cmdSetupInteractive` — prints success message on validation pass
+- Mock `fetch` for both `/models` and `/chat/completions` to succeed
+- Mock stdin to provide valid selections
+- Capture console output
+- Assert output contains a success message like "Model verified" or "✓"
+
+### 9. `cmdSetupInteractive` — prints warning on validation failure
+- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
+- Mock stdin for valid selections
+- Capture console output
+- Assert output contains a warning about model not being reachable and suggests trying a different model
+
+### 10. `validateModel` — request body correctness
+- Mock `fetch` to capture the request body
+- Call `validateModel(baseUrl, apiKey, "test-model")`
+- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
+
+## Export Requirements
+
+- `validateModel` must be exported (for direct unit testing)
+- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
+- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
+
+## Files to Create/Modify
+
+- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
+- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
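The spec above pins down the request shape and the `Result` contract, so the core of `validateModel` can be sketched directly. This is an illustration under stated assumptions, not the implementation in `setup.ts`: `validateModelWith` and `FetchLike` are hypothetical names, the injected `fetchFn` parameter is added only to make mocking easy (the spec's signature takes three arguments and would use the global `fetch`), and the error strings are invented.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical minimal fetch shape, so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ ok: boolean; status: number; statusText: string }>;

async function validateModelWith(
  fetchFn: FetchLike,
  baseUrl: string,
  apiKey: string,
  model: string,
): Promise<Result<void, string>> {
  try {
    const res = await fetchFn(`${baseUrl}/chat/completions`, {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      // Minimal request from the spec: one user message, one token.
      body: JSON.stringify({ model, messages: [{ role: "user", content: "hi" }], max_tokens: 1 }),
      signal: AbortSignal.timeout(15_000),
    });
    if (res.ok) return { ok: true, value: undefined };
    return { ok: false, error: `HTTP ${res.status} ${res.statusText}` };
  } catch (err) {
    const errName = err instanceof Error ? err.name : "";
    // Treat both timeout-style rejections the same way.
    if (errName === "TimeoutError" || errName === "AbortError") {
      return { ok: false, error: "timeout: no response within 15s" };
    }
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `unreachable: ${detail}` };
  }
}
```

Returning a `Result` instead of throwing keeps the "warn, don't abort" behavior simple for callers: `cmdSetup` can attach the value to its return object unchanged.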
@@ -0,0 +1,213 @@
+name: "solve-issue"
+description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
+roles:
+  planner:
+    description: "Analyzes issue and outputs a TDD test spec"
+    goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
+    capabilities:
+      - issue-analysis
+      - planning
+    procedure: |
+      On first run (no previous steps):
+      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
+      2. Read CLAUDE.md (or equivalent project conventions file) to understand coding standards
+      3. Assess whether the issue has enough information to produce a test spec
+      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output status=insufficient_info and terminate
+      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
+
+      On subsequent runs (bounced back by tester with fix_spec):
+      1. Read the tester's output from the previous step to understand what's wrong with the spec
+      2. Revise the test spec accordingly
+
+      After producing the test spec:
+      1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
+      2. Put the hash in frontmatter.plan (required when status=ready)
+    output: "Output a brief summary of the test spec. Frontmatter must include: status (ready or insufficient_info) and plan (CAS hash of the test spec, required when status=ready)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [ready, insufficient_info]
+        plan:
+          type: string
+      required: [status]
+  developer:
+    description: "TDD implementation per test spec"
+    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
+    capabilities:
+      - coding
+    procedure: |
+      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
+
+      Before starting any work, set up an isolated worktree:
+      1. `cd ~/repos/workflow && git fetch origin` to get latest refs
+      2. First time (no existing branch):
+         - `git worktree add ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
+         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> && bun install`
+      3. If bounced back from reviewer or tester (branch already exists):
+         - The worktree should already exist at `~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
+         - `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
+         - `git fetch origin && git rebase origin/main`
+      4. ALL subsequent work must happen inside the worktree directory.
+
+      Then implement TDD:
+      5. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
+      6. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
+      7. Write tests first based on the spec
+      8. Implement the code to make tests pass
+      9. Ensure `bun run build` passes with no errors
+      10. Run `bun test` to verify all tests pass
+    output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [done, failed]
+      required: [status]
+  reviewer:
+    description: "Code standards compliance check"
+    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
+    capabilities:
+      - code-review
+      - static-analysis
+    procedure: |
+      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+
+      Before reviewing, verify the git branch:
+      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
+      2. If the branch doesn't correspond to the issue, flag it in your output and reject
+
+      Then perform code review:
+      Hard checks (must all pass):
+      3. `bun run build` — no build errors
+      4. `bunx biome check` — no lint violations
+      5. TypeScript strict mode — no type errors
+
+      Soft checks (review against CLAUDE.md conventions):
+      - Functional-first: `function` + `type`, not `class` + `interface`
+      - No optional properties (`?:`) — use `T | null`
+      - Naming conventions (kebab-case files, PascalCase types, camelCase functions)
+      - Module boundary discipline (folder exports via index.ts)
+      - No `console.log` (use structured logger)
+      - No dynamic imports in production code
+
+      Only review standards compliance. Do NOT test functionality.
+      If rejecting, you MUST explain the specific reason in your output.
+    output: "Explain your decision with specific file/line references. Frontmatter must include: approved (true or false)."
+    frontmatter:
+      type: object
+      properties:
+        approved:
+          type: boolean
+      required: [approved]
+  tester:
+    description: "Functional correctness verification"
+    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
+    capabilities:
+      - testing
+    procedure: |
+      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+
+      1. Run `bun test` for automated test verification
+      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
+      3. Verify each scenario in the spec is covered and passing
+      4. Determine outcome:
+         - passed: all scenarios verified, tests pass
+         - fix_code: tests fail or implementation doesn't match spec → send back to developer
+         - fix_spec: the spec itself is wrong or incomplete → send back to planner
+    output: "Report test results per scenario. Frontmatter must include: status (passed, fix_code, or fix_spec)."
+    frontmatter:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [passed, fix_code, fix_spec]
+      required: [status]
+  committer:
+    description: "Commits and creates PR"
+    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
+    capabilities: []
+    procedure: |
+      First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
+
+      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
+      1. Stage all changes: `git add -A`
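+      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description" -m "Fixes #N"` (each `-m` becomes its own paragraph; a literal `\n` inside double quotes is not expanded by the shell)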
+      3. Push the branch: `git push -u origin <branch-name>`
+         - If push hook fails: capture the error log in your output, mark hook_failed
+      4. On push success: create a PR via `tea pr create --title "..." --description "..."`
+         - PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
+      5. After PR creation, clean up the worktree:
+         - `cd ~/repos/workflow`
+         - `git worktree remove ~/repos/workflow-worktrees/fix/<issue-number>-<slug>`
+    output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
+    frontmatter:
+      type: object
+      properties:
+        success:
+          type: boolean
+      required: [success]
+conditions:
+  insufficientInfo:
+    description: "Planner determined there's not enough info to proceed"
+    expression: "$last('planner').status = 'insufficient_info'"
+  devFailed:
+    description: "Developer failed to implement"
+    expression: "$last('developer').status = 'failed'"
+  rejected:
+    description: "Reviewer rejected the implementation"
+    expression: "$last('reviewer').approved = false"
+  fixCode:
+    description: "Tester found code issues"
+    expression: "$last('tester').status = 'fix_code'"
+  fixSpec:
+    description: "Tester found spec issues"
+    expression: "$last('tester').status = 'fix_spec'"
+  hookFailed:
+    description: "Push hook failed"
+    expression: "$last('committer').success = false"
+graph:
+  $START:
+    - role: "planner"
+      condition: null
+      prompt: "Analyze the issue and produce an implementation plan."
+  planner:
+    - role: "$END"
+      condition: "insufficientInfo"
+      prompt: "Insufficient information to proceed; end the workflow."
+    - role: "developer"
+      condition: null
+      prompt: "Implement the plan from the planner."
+  developer:
+    - role: "$END"
+      condition: "devFailed"
+      prompt: "Development failed; end the workflow."
+    - role: "reviewer"
+      condition: null
+      prompt: "Send the implementation to the reviewer."
+  reviewer:
+    - role: "developer"
+      condition: "rejected"
+      prompt: "Reviewer rejected the implementation; fix the issues."
+    - role: "tester"
+      condition: null
+      prompt: "Review passed; run tests on the implementation."
+  tester:
+    - role: "developer"
+      condition: "fixCode"
+      prompt: "Tests found code issues; return to developer."
+    - role: "planner"
+      condition: "fixSpec"
+      prompt: "Tests found spec issues; return to planner."
+    - role: "committer"
+      condition: null
+      prompt: "Tests passed; commit and push the changes."
+  committer:
+    - role: "developer"
+      condition: "hookFailed"
+      prompt: "Push hook failed; return to developer to fix."
+    - role: "$END"
+      condition: null
+      prompt: "Commit succeeded; complete the workflow."
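The `graph` section above reads as an ordered edge list per role: conditional edges first, then an unconditional fallback (`condition: null`). A minimal sketch of that routing follows. It assumes first-match-wins evaluation in listed order, which the YAML suggests but does not state, and it replaces the JSONata expressions with plain predicate functions; `nextRole`, `Edge`, `Outputs`, and `Conditions` are hypothetical names, not the moderator package's API.

```typescript
type Edge = { role: string; condition: string | null };
type Outputs = Record<string, Record<string, unknown>>; // latest frontmatter per role
type Conditions = Record<string, (last: Outputs) => boolean>;

// Assumed semantics: walk the edges in order and take the first one that matches.
function nextRole(
  graph: Record<string, Edge[]>,
  conditions: Conditions,
  current: string,
  last: Outputs,
): string {
  for (const edge of graph[current] ?? []) {
    if (edge.condition === null) return edge.role; // unconditional fallback
    const check = conditions[edge.condition];
    if (check === undefined) throw new Error(`unknown condition: ${edge.condition}`);
    if (check(last)) return edge.role;
  }
  throw new Error(`no edge matched from ${current}`);
}

// The tester edges from the workflow above, with predicates standing in for JSONata.
const testerGraph: Record<string, Edge[]> = {
  tester: [
    { role: "developer", condition: "fixCode" },
    { role: "planner", condition: "fixSpec" },
    { role: "committer", condition: null },
  ],
};
const testerConditions: Conditions = {
  fixCode: (last) => last["tester"]?.["status"] === "fix_code",
  fixSpec: (last) => last["tester"]?.["status"] === "fix_spec",
};

console.log(nextRole(testerGraph, testerConditions, "tester", { tester: { status: "fix_spec" } }));
```

Because routing only inspects the last frontmatter of each role, a step needs no LLM call to decide where to go next, which matches the "zero LLM cost" description of the moderator.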
@@ -2,46 +2,41 @@

 ## Project Overview

-This monorepo implements a workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file with an XXH64 hash as its version identifier. Shared types live in `@uncaged/workflow-protocol`; bundle authors typically depend on `@uncaged/workflow-runtime`.
+This monorepo implements a stateless workflow engine driven by a single-step CLI (`uwf`). Workflows are **YAML definitions** stored as CAS nodes; threads are immutable chains of CAS-linked step nodes. No daemon — each `uwf thread step` invocation runs one moderator→agent→extract cycle and exits.

 ### Key Terms

 | Concept | What it is |
 |---------|-----------|
-| **Workflow** | A single-file ESM module that exports `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash (Crockford Base32). |
-| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
-| **Thread** | A single execution of a workflow, identified by a ULID. State lives in CAS (linked nodes); active threads indexed in `threads.json`; completed rows in `history/*.jsonl`. Debug logs use `.info.jsonl`. |
-| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. |
-| **Registry** | `workflow.yaml` — maps workflow names to current/historical bundle hashes. |
+| **Workflow** | A YAML definition (`WorkflowPayload`) with roles, conditions, and a routing graph. Stored as a CAS node, identified by its XXH64 hash. |
+| **Thread** | A single execution of a workflow, identified by a ULID. State is an immutable CAS chain; active threads indexed in `threads.yaml`; completed threads in `history.jsonl`. |
+| **Role** | A named actor within a workflow. Each role has a system prompt and a JSON Schema `outputSchema`. |
+| **Moderator** | JSONata-based graph evaluator — determines the next role (or `$END`) with zero LLM cost. |
+| **Agent** | An external CLI command (`uwf-hermes`, etc.) spawned by `uwf thread step`. Produces frontmatter markdown output. |
+| **CAS** | Content-Addressed Storage via `@uncaged/json-cas` — all workflow definitions, thread nodes, and outputs are immutable CAS nodes. |
+| **Registry** | `~/.uncaged/workflow/registry.yaml` — maps workflow names to current CAS hashes. |

 ### Monorepo Structure

 ```
 workflow/
  packages/
-    workflow-protocol/              # @uncaged/workflow-protocol — shared types + Result
-    workflow-runtime/               # @uncaged/workflow-runtime — createWorkflow, type re-exports
-    workflow-util/                  # @uncaged/workflow-util — Base32, ULID, logger, storage paths, refs helpers
-    workflow-reactor/               # @uncaged/workflow-reactor — LLM fn + thread reactor (tool calls)
-    workflow-cas/                   # @uncaged/workflow-cas — CAS store, hash, Merkle
-    workflow-register/              # @uncaged/workflow-register — bundle validation, registry YAML, model resolution
-    workflow-execute/               # @uncaged/workflow-execute — engine, extract, fork, GC, workflowAsAgent
-    cli-workflow/                   # @uncaged/cli-workflow — uncaged-workflow CLI
-    workflow-agent-cursor/          # @uncaged/workflow-agent-cursor
-    workflow-agent-hermes/          # @uncaged/workflow-agent-hermes
-    workflow-agent-llm/             # @uncaged/workflow-agent-llm
-    workflow-agent-react/             # @uncaged/workflow-agent-react
-    workflow-util-agent/            # @uncaged/workflow-util-agent — buildAgentPrompt, spawnCli
-    workflow-template-develop/      # @uncaged/workflow-template-develop
-    workflow-template-solve-issue/  # @uncaged/workflow-template-solve-issue
-    workflow-dashboard/             # @uncaged/workflow-dashboard — React dashboard (private app)
-  docs/             # RFCs, conventions
-  biome.json        # root Biome config
-  tsconfig.json     # root TypeScript config
+    workflow-protocol/    # @uncaged/workflow-protocol — shared types (WorkflowPayload, StepNodePayload, WorkflowConfig, etc.)
+    workflow-util/        # @uncaged/workflow-util — Crockford Base32, ULID, logger, frontmatter parsing/validation
+    workflow-moderator/   # @uncaged/workflow-moderator — JSONata graph evaluator
+    workflow-agent-kit/   # @uncaged/workflow-agent-kit — createAgent factory, context builder, extract pipeline
+    workflow-agent-hermes/ # @uncaged/workflow-agent-hermes — uwf-hermes CLI binary (spawns hermes chat)
+    cli-workflow/         # @uncaged/cli-workflow — uwf CLI binary
+  legacy-packages/       # Archived packages (preserved for reference, not active)
+  examples/              # Workflow YAML examples (solve-issue.yaml)
+  docs/                  # Architecture docs
+  biome.json             # root Biome config
+  tsconfig.json          # root TypeScript config
 ```

-- Execution stack layers: `workflow-protocol` → (`workflow-runtime`, `workflow-util`, `workflow-reactor`) → (`workflow-cas`, `workflow-register`) → `workflow-execute` → `cli-workflow`
+- Dependency layers: `workflow-protocol` → (`workflow-util`, `workflow-moderator`) → `workflow-agent-kit` → `workflow-agent-hermes` / `cli-workflow`
 - Packages use `workspace:^` protocol (resolves to `^x.y.z` on publish)
+- External CAS: `@uncaged/json-cas` (store API, hashing, schema validation) + `@uncaged/json-cas-fs` (filesystem backend)

 ## Language & Paradigm

@@ -109,8 +104,6 @@ type WorkflowEntry = {
 - Always named exports, never default exports
 - One module = one responsibility, filename = purpose

-Workflow bundles (`.esm.js`) follow the same rule: export `const run` and `const descriptor`, not `export default`.
-
 ### Folder Module Discipline

 Every folder under `src/` is a **module boundary**. Four rules:
@@ -136,10 +129,10 @@ export { createCasStore } from "../cas/cas.js";

 // ❌ Bad — types defined in index.ts
 // in cas/index.ts:
-export type CasStore = { ... };  // should be in cas/types.ts
+export type CasStore = { ... }; // should be in cas/types.ts
 ```

-**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`, `workflow-as-agent.ts`) are not inside a folder module and follow normal rules.
+**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`) are not inside a folder module and follow normal rules.

 ## Naming

@@ -160,7 +153,7 @@ Workflow names use **verb-first** kebab-case:
 ### ID Encoding

 All IDs use **Crockford Base32**:
-- Bundle hash: XXH64 → 13-char Crockford Base32
+- CAS hash: XXH64 → 13-char Crockford Base32
 - Thread ID: ULID → 26-char Crockford Base32 (10 timestamp + 16 random)
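The 13-character figure follows from the arithmetic: a 64-bit hash needs ceil(64 / 5) = 13 symbols at 5 bits each. The sketch below shows one way to produce such a string. It assumes the hash arrives as 16 hex digits and is left-padded with a single zero bit to reach 65 bits; `encodeHash64` is a hypothetical name, and the padding choice is an assumption rather than the project's confirmed encoding.

```typescript
// Crockford Base32 alphabet: digits plus letters, without I, L, O, U.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Hypothetical helper: 16 hex digits (64 bits) to 13 Crockford Base32 characters.
function encodeHash64(hex: string): string {
  if (!/^[0-9a-fA-F]{16}$/.test(hex)) {
    throw new RangeError("expected exactly 16 hex digits");
  }
  let bits = "0"; // one pad bit: 1 + 64 = 65 = 13 * 5
  for (let i = 0; i < hex.length; i++) {
    const nibble = parseInt(hex.charAt(i), 16).toString(2);
    bits += ("0000" + nibble).slice(-4);
  }
  let out = "";
  for (let i = 0; i < bits.length; i += 5) {
    out += CROCKFORD.charAt(parseInt(bits.slice(i, i + 5), 2));
  }
  return out;
}

console.log(encodeHash64("ffffffffffffffff")); // FZZZZZZZZZZZZ
```

With this padding the first character can only range over `0` to `F`, since it carries just the top four bits of the hash.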

 ## Error Handling
@@ -189,7 +182,7 @@ import { createLogger } from "@uncaged/workflow-util";
 const log = createLogger();

 // Each call site has a fixed 8-char Crockford Base32 tag
-log("4KNMR2PX", "Loading workflow bundle...");
+log("4KNMR2PX", "Loading workflow...");
 log("7BQST3VW", `Role ${role} started`);
 ```

@@ -204,7 +197,7 @@ log("7BQST3VW", `Role ${role} started`);

 ### Why fixed tags?

-- `grep "4KNMR2PX"` in `.info.jsonl` → instant code location
+- `grep "4KNMR2PX"` in logs → instant code location
 - No need for file/line info in the log — tag is the locator
 - Survives refactoring (tag stays the same when code moves)

@@ -221,74 +214,76 @@ console.log(result);

 Do NOT use `await import()` in production code. Always use static top-level `import`.

-**Exception**: The bundle loader and `extractBundleExports` dynamically import user workflow files at runtime.
-
-```ts
-// Dynamic import required: user bundle path resolved at runtime
-const mod = await import(bundlePath);
-```
-
 Test files (`__tests__/**`) are exempt.

 ## Toolchain

 | Tool | Purpose |
 |------|---------|
-| **bun** | Package manager + runtime + test runner |
+| **bun** | Package manager + runtime |
 | **TypeScript** | Type checking (strict mode) |
 | **Biome** | Lint + format (replaces ESLint + Prettier) |
+| **vitest** | Test runner (`cli-workflow` uses vitest; other packages use `bun test`) |

-### Commands
+### Development Workflow

 ```bash
-bun run check       # tsc --build + biome check
-bun run format      # biome format --write
-bun test            # run tests
+# ── Setup ──
+bun install                 # install all workspace dependencies
+
+# ── Daily development ──
+bun run build               # tsc --build (all packages, dependency order)
+bun run check               # tsc --build + biome check + lint-log-tags
+bun run format              # biome format --write
+bun test                    # run tests across all packages
+
+# ── Before committing ──
+bun run check               # must pass — typecheck + lint + log tag validation
+bun test                    # must pass — all package tests
 ```

-### Version Management & Publishing
+### Publishing

-All public `@uncaged/*` packages are published to **npmjs.org** via `@changesets/cli` with **fixed mode** (all packages share the same version number). `workflow-dashboard` is private and excluded.
+All public `@uncaged/*` packages are published to **npmjs.org** with **fixed mode** (all packages share the same version number).

 ```bash
-# 1. After making changes, add a changeset describing the change
+# 1. Add a changeset describing the change
 bun changeset

-# 2. Before release, bump all package versions + generate CHANGELOGs
+# 2. Bump all package versions + generate CHANGELOGs
 bun version

-# 3. Build, test, and publish to npmjs
+# 3. Build, test, and publish (runs scripts/publish-all.mjs)
 bun release
+
+# Or publish manually with a tag:
+node scripts/publish-all.mjs --tag alpha
+node scripts/publish-all.mjs --dry-run    # preview without publishing
 ```

 - `workspace:^` dependencies resolve to `^x.y.z` on publish
+- Publish order defined in `scripts/publish-all.mjs` (dependency order)
 - Changesets config: `.changeset/config.json` (fixed mode, public access)
-- Each package has auto-generated `CHANGELOG.md`

-### Consuming @uncaged/* Packages
-
-External workflow repos just `bun install` — packages come from npmjs like any other dependency. No special registry config needed.
-
-### End-to-end: Monorepo → Registry → Workspace → Bundle
+### End-to-end: Author → Register → Run

 ```
-workflow/ (monorepo)           — engine, runtime, templates, agents
-  │  bun release               — build + test + changeset publish
+examples/solve-issue.yaml       — write a workflow YAML definition
+  │  uwf workflow put
  ▼
-npmjs.org                      — @uncaged/* scoped packages (public)
-  │  bun install
+~/.uncaged/workflow/cas/        — Workflow stored as CAS node
+~/.uncaged/workflow/registry.yaml — name → hash mapping updated
+  │  uwf thread start <name> -p "..."
  ▼
-my-workflows/ (workspace)     — normal package.json
-  │  bun run build:develop     — bun build → single .esm.js
+~/.uncaged/workflow/threads.yaml — new thread head pointer
+  │  uwf thread step <thread-id>
  ▼
-uncaged-workflow workflow add  — register bundle locally
-uncaged-workflow run           — execute workflow
+moderator → agent → extract      — one step per invocation, repeat until $END
 ```

-1. **Monorepo changes** → `bun changeset` (describe change) → `bun version` (bump) → `bun release` (publish)
-2. **Workspace** → `bun install` fetches latest from npmjs
-3. **Build** → produces single-file ESM bundle with `@uncaged/*` as externals
-4. **Register & Run** → `uncaged-workflow workflow add <name> <bundle>` then `uncaged-workflow run <name>`
+1. **Author** — write a workflow YAML file with roles, conditions, and graph
+2. **Register** — `uwf workflow put <file.yaml>` parses YAML, registers output schemas, stores `WorkflowPayload` in CAS
+3. **Run** — `uwf thread start` creates a thread, `uwf thread step` executes one cycle per invocation

 ## Commit Convention

@@ -296,5 +291,5 @@ uncaged-workflow run           — execute workflow
 <type>(<scope>): <description>

 type: feat | fix | refactor | docs | chore | test
-scope: workflow | cli | rfc-001 | ...
+scope: workflow | cli | moderator | agent-kit | hermes | util | protocol | ...
 ```
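The commit convention above is regular enough to check with a single pattern. The sketch below is a hypothetical validator, not a hook that exists in this repository: `parseSubject` and `ParsedSubject` are made-up names, the scope is treated as optional because the commit at the top of this change (`feat: use git worktree ...`) omits it, and the allowed scope characters are an assumption.

```typescript
type ParsedSubject = { type: string; scope: string | null; description: string };

// <type>(<scope>): <description>, with the scope treated as optional (assumption).
const SUBJECT_PATTERN = /^(feat|fix|refactor|docs|chore|test)(?:\(([a-z0-9-]+)\))?: (.+)$/;

// Returns the parsed parts, or null when the subject line does not follow the convention.
function parseSubject(line: string): ParsedSubject | null {
  const match = SUBJECT_PATTERN.exec(line);
  if (match === null) return null;
  return { type: match[1], scope: match[2] ?? null, description: match[3] };
}

console.log(parseSubject("fix(agent-kit): separate session cache per agent"));
console.log(parseSubject("update stuff")); // null
```

Returning `null` for the missing scope, rather than leaving the property out, follows the project rule of `T | null` over optional properties.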
@@ -1,71 +1,103 @@
 # @uncaged/workflow

-A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32).
+A stateless workflow engine driven by a single-step CLI. Workflows are YAML definitions with roles, JSONata routing conditions, and a directed graph. Threads are immutable CAS-linked chains — each `uwf thread step` runs one moderator→agent→extract cycle and exits.

-## Core Concepts
+## Overview

-| Concept | Description |
-|---------|-------------|
-| **Workflow** | A single-file ESM module exporting `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash. |
-| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
-| **Thread** | A single execution of a workflow, identified by a ULID. CAS-backed chain plus `threads.json` / `history/*.jsonl`; `.info.jsonl` for debug logs. |
-| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. Roles live inside template packages (`src/roles/`). |
-| **Registry** | `workflow.yaml` — maps workflow names to current/historical bundle hashes. |
-| **CAS** | Content-Addressed Storage — bundles are immutable and addressed by hash. |
+This monorepo implements **uwf**, a workflow engine with no long-running daemon. You register YAML workflow definitions in a content-addressed store (CAS), start a thread with an initial prompt, then invoke `uwf thread step` repeatedly until the moderator routes to `$END`. Each step is a complete process: the moderator evaluates JSONata conditions to pick the next role, an external agent CLI produces frontmatter markdown output, and an extract pipeline validates or structures that output against the role's JSON Schema.

-## Monorepo Packages
+Workflow state lives entirely on disk under `~/.uncaged/workflow/`: CAS nodes for definitions and step payloads, `registry.yaml` for workflow name→hash mappings, and `threads.yaml` for active thread head pointers. Completed threads are archived to `history.jsonl`. Because there is no server process, workflows are easy to debug, fork, and inspect with ordinary CLI tools.
+
+Agents are pluggable CLI binaries (`uwf-hermes`, `uwf-builtin`, `uwf-claude-code`, or custom commands). The engine spawns the configured agent with `<thread-id>` and `<role>`, sets `UWF_EDGE_PROMPT` from the graph transition, and captures both the agent's markdown output and a detail CAS node for session replay.
+
+## Architecture
+
+Dependency layers (lower layers have no dependency on higher layers):

 ```
-packages/
-  workflow/                      # @uncaged/workflow — core lib (types, engine, hash, ULID, registry)
-  cli-workflow/                  # @uncaged/cli-workflow — CLI (`uncaged-workflow` command)
-  workflow-template-develop/     # @uncaged/workflow-template-develop — develop workflow template (includes roles)
-  workflow-template-solve-issue/ # @uncaged/workflow-template-solve-issue — solve-issue workflow template (includes roles)
-  workflow-agent-hermes/         # @uncaged/workflow-agent-hermes — Hermes agent adapter
-  workflow-agent-cursor/         # @uncaged/workflow-agent-cursor — Cursor agent adapter
-  workflow-agent-llm/            # @uncaged/workflow-agent-llm — LLM agent adapter
-  workflow-util-agent/           # @uncaged/workflow-util-agent — agent utilities (buildAgentPrompt, spawnCli)
+Layer 0 — Contract
+  workflow-protocol          Shared types and JSON Schema definitions
+
+Layer 1 — Shared infra
+  workflow-util              Encoding, IDs, logging, frontmatter, paths
+  workflow-moderator         JSONata graph evaluator
+
+Layer 2 — Agent framework
+  workflow-agent-kit         createAgent factory, context builder, extract pipeline
+
+Layer 3 — Agent implementations
+  workflow-agent-hermes      Hermes ACP agent (uwf-hermes)
+  workflow-agent-builtin     Built-in LLM + tools agent (uwf-builtin)
+  workflow-agent-claude-code Claude Code agent (uwf-claude-code)
+
+Layer 4 — CLI
+  cli-workflow               uwf binary — thread lifecycle, registry, CAS, setup
+
+App (uses protocol; not in the runtime engine stack)
+  workflow-dashboard         Web UI for visual workflow editing
 ```

-Managed with **bun workspace** using the `workspace:*` protocol.
+External CAS: [`@uncaged/json-cas`](https://www.npmjs.com/package/@uncaged/json-cas) (store API, hashing, schema validation) + `@uncaged/json-cas-fs` (filesystem backend).
+
+See [docs/architecture.md](docs/architecture.md) for the full design — three-phase engine loop, CAS node types, storage layout, agent CLI protocol, and design decisions.
+
+## Packages
+
+| Package | npm | Description | Type | README |
+|---------|-----|-------------|------|--------|
+| `cli-workflow` | `@uncaged/cli-workflow` | `uwf` CLI — thread lifecycle, workflow registry, CAS inspection, setup | cli | [README](packages/cli-workflow/README.md) |
+| `workflow-protocol` | `@uncaged/workflow-protocol` | Shared TypeScript types and JSON Schema constants | lib | [README](packages/workflow-protocol/README.md) |
+| `workflow-moderator` | `@uncaged/workflow-moderator` | JSONata graph evaluator — next role or `$END` | lib | [README](packages/workflow-moderator/README.md) |
+| `workflow-agent-kit` | `@uncaged/workflow-agent-kit` | `createAgent` factory, context builder, extract pipeline | lib | [README](packages/workflow-agent-kit/README.md) |
+| `workflow-util` | `@uncaged/workflow-util` | Crockford Base32, ULID, logger, frontmatter parsing, storage paths | lib | [README](packages/workflow-util/README.md) |
+| `workflow-agent-hermes` | `@uncaged/workflow-agent-hermes` | `uwf-hermes` — spawns Hermes chat via ACP | agent | [README](packages/workflow-agent-hermes/README.md) |
+| `workflow-agent-builtin` | `@uncaged/workflow-agent-builtin` | `uwf-builtin` — built-in LLM agent with file/shell tools | agent | [README](packages/workflow-agent-builtin/README.md) |
+| `workflow-agent-claude-code` | `@uncaged/workflow-agent-claude-code` | `uwf-claude-code` — spawns Claude Code CLI | agent | [README](packages/workflow-agent-claude-code/README.md) |
+| `workflow-dashboard` | `@uncaged/workflow-dashboard` | Web graph editor for workflow YAML (private, alpha) | app | [README](packages/workflow-dashboard/README.md) |

 ## Quick Start

 ```bash
-# Install dependencies
-bun install
+# 1. Configure provider, model, and default agent
+uwf setup

-# Build all packages
-bun run build
+# 2. Register a workflow from YAML
+uwf workflow put examples/solve-issue.yaml

-# Register a workflow bundle
-uncaged-workflow workflow add solve-issue dist/packages/workflow-template-solve-issue/solve-issue.esm.js
+# 3. Start a thread (creates head pointer; does not execute)
+uwf thread start solve-issue -p "Fix the login redirect bug"

-# Run a workflow
-uncaged-workflow run solve-issue --prompt "Fix bug #42"
+# 4. Execute steps (one at a time, until done)
+uwf thread step <thread-id>
 ```

-## CLI Usage
+Use `-c, --count <number>` on `thread step` to run multiple steps in one invocation. Override the agent with `--agent <cmd>`.

-```bash
-uncaged-workflow                   # Print full command usage (exits with status 1)
-uncaged-workflow workflow list     # List registered workflows
-uncaged-workflow run <name>        # Start a workflow thread
-uncaged-workflow thread list       # List all threads
-uncaged-workflow thread show <id>  # Inspect a thread
-uncaged-workflow skill             # Agent-consumable reference docs
-```
+## CLI Reference

-Run `uncaged-workflow` with no arguments to print usage, or `uncaged-workflow skill cli` for the full CLI skill reference.
+Global options: `-V, --version`, `--format <json|yaml>`, `-h, --help`.
+
+| Group | Commands |
+|-------|----------|
+| **thread** | `start`, `step`, `show`, `list`, `kill`, `steps`, `read`, `fork`, `step-details` |
+| **workflow** | `put`, `show`, `list` |
+| **cas** | `get`, `put`, `put-text`, `has`, `refs`, `walk`, `reindex`, `schema list`, `schema get` |
+| **setup** | Interactive or `--provider`, `--base-url`, `--api-key`, `--model`, `--agent` |
+| **skill** | `cli` — print markdown reference of all uwf commands |
+| **log** | `list`, `show`, `clean` — process-level debug logs |
+
+Config is stored in `~/.uncaged/workflow/config.yaml`. API keys go in `~/.uncaged/workflow/.env`.
+
+Detailed command usage, options, and examples: [packages/cli-workflow/README.md](packages/cli-workflow/README.md).

 ## Development

 ```bash
-bun run check    # Biome lint + format check
-bun run format   # Auto-format with Biome
-bun test         # Run tests
+bun install --no-cache     # Install dependencies
+bun run build              # tsc --build (all packages)
+bun run check              # tsc + biome + lint-log-tags
+bun run format             # Auto-format with Biome
+bun test                   # Run all tests
 ```

-## Architecture
-
-See [docs/architecture.md](docs/architecture.md) for the full design — three-phase engine loop, bundle contract, storage layout, and design decisions.
+Managed with **bun workspace**. See [CLAUDE.md](CLAUDE.md) for coding conventions.
@@ -5,6 +5,8 @@
      "**",
      "!**/dist",
      "!**/node_modules",
+      "!**/legacy-packages",
+      "!scripts",
      "!packages/workflow/workflow",
      "!xiaoju/scripts/bundle.ts"
    ]
@@ -15,6 +17,15 @@
    "indentWidth": 2,
    "lineWidth": 100
  },
+  "css": {
+    "parser": {
+      "cssModules": true,
+      "tailwindDirectives": true
+    },
+    "linter": {
+      "enabled": false
+    }
+  },
  "javascript": {
    "formatter": {
      "quoteStyle": "double",
@@ -36,7 +47,7 @@
      }
    },
    {
-      "includes": ["**/*.d.ts"],
+      "includes": ["**/*.d.ts", "**/vitest.config.*"],
      "linter": {
        "rules": {
          "style": {
@@ -44,6 +55,16 @@
          }
        }
      }
+    },
+    {
+      "includes": ["**/cli.ts", "**/setup.ts"],
+      "linter": {
+        "rules": {
+          "suspicious": {
+            "noConsole": "off"
+          }
+        }
+      }
    }
  ],
  "linter": {
@@ -1,271 +1,495 @@
-# Uncaged workflow — Architecture
+# Workflow Engine — Architecture

-**Last updated:** 2026-05-09
+**Last updated:** 2026-05-19

 ---

 ## Overview

-A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32). No daemon — processes start on demand and exit when done.
+A stateless workflow engine driven by a single-step CLI. Workflows are YAML definitions stored as CAS nodes; threads are immutable chains of CAS-linked step nodes. No daemon — each `uwf thread step` invocation runs one moderator→agent→extract cycle and exits.

-The implementation lives in **21** Bun workspace packages under `packages/`, using the `workspace:*` protocol.
+The implementation lives in **6** active packages under `packages/`, plus two external CAS packages (`@uncaged/json-cas`, `@uncaged/json-cas-fs`). Legacy packages reside in `legacy-packages/` and are not part of the active stack.

 ## Package map

-Grouped by responsibility (npm name → folder).
-
 | Layer | Package | One-line role |
-|-------|---------|----------------|
-| Contract | `@uncaged/workflow-protocol` → `workflow-protocol` | Shared TypeScript types and `Result` helpers; peer `zod` only — no other workspace deps. |
-| Author API | `@uncaged/workflow-runtime` → `workflow-runtime` | `createWorkflow` and re-exports of protocol workflow types for bundle authors. |
-| Shared infra | `@uncaged/workflow-util` → `workflow-util` | Base32/ULID, logger, storage root paths, global CAS dir, ref-field helpers. |
-| LLM plumbing | `@uncaged/workflow-reactor` → `workflow-reactor` | `createLlmFn`, `createThreadReactor`, and related tool-call types for threaded LLM invocation. |
-| CAS | `@uncaged/workflow-cas` → `workflow-cas` | `CasStore` implementation, XXH64 hashing, Merkle helpers over CAS payloads. |
-| Registry / bundles | `@uncaged/workflow-register` → `workflow-register` | Bundle validation & dynamic export extraction, `workflow.yaml` registry I/O, provider/model resolution. |
-| Engine | `@uncaged/workflow-execute` → `workflow-execute` | Thread execution, worker entry path, fork/GC, extract pipeline, `workflowAsAgent`. |
-| CLI | `@uncaged/cli-workflow` → `cli-workflow` | `uncaged-workflow` binary (depends on engine, registry, CAS, protocol, util, runtime). |
-| Agent adapters | `@uncaged/workflow-agent-cursor` → `workflow-agent-cursor` | `AgentFn` via `cursor-agent` CLI + workspace extraction. |
-| | `@uncaged/workflow-agent-hermes` → `workflow-agent-hermes` | `AgentFn` via `hermes chat` CLI. |
-| | `@uncaged/workflow-agent-office` → `workflow-agent-office` | `AdapterFn` via `office-agent` CLI; generates or edits Word documents, stores outputs per threadId. |
-| | `@uncaged/workflow-agent-docx-diff` → `workflow-agent-docx-diff` | `AdapterFn` via `docx-diff` CLI; produces Word-format diff reports for document edit workflows. |
-| | `@uncaged/workflow-agent-llm` → `workflow-agent-llm` | `AgentFn` via OpenAI-compatible HTTP (`LlmProvider` from runtime). |
-| Agent shared | `@uncaged/workflow-util-agent` → `workflow-util-agent` | `buildAgentPrompt`, `spawnCli` for CLI-backed agents. |
-| Templates | `@uncaged/workflow-template-develop` → `workflow-template-develop` | Develop workflow definition, roles, descriptor builder. |
-| | `@uncaged/workflow-template-solve-issue` → `workflow-template-solve-issue` | Solve-issue workflow definition, roles, descriptor builder. |
-| | `@uncaged/workflow-template-document` → `workflow-template-document` | Document generation/editing workflow definition (writer + differ roles, moderator table, descriptor). |
-| Dashboard | `@uncaged/workflow-dashboard` → `workflow-dashboard` | Private Vite + React app (`src/main.tsx`); only `react` / `react-dom` dependencies — no workspace packages. |
+|-------|---------|---------------|
+| Contract | `@uncaged/workflow-protocol` → `workflow-protocol` | Shared TypeScript types (`WorkflowPayload`, `StepNodePayload`, `ModeratorContext`, `WorkflowConfig`, etc.). No runtime deps beyond `@uncaged/json-cas-fs`. |
+| Shared infra | `@uncaged/workflow-util` → `workflow-util` | Crockford Base32, ULID generation, `createLogger`, frontmatter parsing/validation. |
+| Moderator | `@uncaged/workflow-moderator` → `workflow-moderator` | JSONata-based graph evaluator: given a `WorkflowPayload` and `ModeratorContext`, returns the next role or `$END`. |
+| Agent framework | `@uncaged/workflow-agent-kit` → `workflow-agent-kit` | `createAgent` entrypoint factory, context builder, frontmatter fast-path extractor, LLM extract fallback, output format instruction builder. |
+| Agent: Hermes | `@uncaged/workflow-agent-hermes` → `workflow-agent-hermes` | `uwf-hermes` CLI binary — spawns `hermes chat`, pipes prompt, captures session detail. |
+| CLI | `@uncaged/cli-workflow` → `cli-workflow` | `uwf` binary — thread lifecycle, workflow registry, CAS inspection, setup. |

-## Dependency graph (workspace packages)
+### External dependencies

-Bottom-up layering for the execution stack:
+| Package | Role |
+|---------|------|
+| `@uncaged/json-cas` | Content-addressed store API, XXH64 hashing, JSON Schema registration and validation. |
+| `@uncaged/json-cas-fs` | Filesystem backend for `json-cas`. |
+| `jsonata` | JSONata expression evaluator (used by `workflow-moderator`). |
+| `commander` | CLI argument parsing (used by `cli-workflow`). |
+| `dotenv` | Loads `.env` files for API keys. |
+| `yaml` | YAML parse/stringify. |
+
+## Dependency graph

 ```mermaid
 flowchart BT
+  subgraph External
+    jcas["@uncaged/json-cas"]
+    jcasfs["@uncaged/json-cas-fs"]
+  end
  subgraph L0["Layer 0 — contract"]
    protocol["@uncaged/workflow-protocol"]
  end
-  subgraph L1["Layer 1 — on protocol"]
-    runtime["@uncaged/workflow-runtime"]
+  subgraph L1["Layer 1 — shared"]
    util["@uncaged/workflow-util"]
-    reactor["@uncaged/workflow-reactor"]
+    moderator["@uncaged/workflow-moderator"]
  end
-  subgraph L2["Layer 2 — protocol + util"]
-    cas["@uncaged/workflow-cas"]
-    register["@uncaged/workflow-register"]
+  subgraph L2["Layer 2 — agent framework"]
+    kit["@uncaged/workflow-agent-kit"]
  end
-  subgraph L3["Layer 3 — engine"]
-    execute["@uncaged/workflow-execute"]
+  subgraph L3["Layer 3 — agent implementations"]
+    hermes["@uncaged/workflow-agent-hermes"]
  end
  subgraph L4["Layer 4 — CLI"]
    cli["@uncaged/cli-workflow"]
  end
-  runtime --> protocol
+  protocol --> jcasfs
  util --> protocol
-  reactor --> protocol
-  cas --> protocol
-  cas --> util
-  register --> protocol
-  register --> util
-  execute --> protocol
-  execute --> runtime
-  execute --> util
-  execute --> cas
-  execute --> reactor
-  execute --> register
+  moderator --> protocol
+  kit --> protocol
+  kit --> util
+  kit --> jcas
+  kit --> jcasfs
+  hermes --> kit
+  hermes --> jcas
  cli --> protocol
  cli --> util
-  cli --> cas
-  cli --> execute
-  cli --> register
-  cli --> runtime
+  cli --> kit
+  cli --> moderator
+  cli --> jcas
+  cli --> jcasfs
 ```

-**Adjacent consumers** (not in the main CLI stack):
+## Workflow definition

- `@uncaged/workflow-util-agent` → `@uncaged/workflow-runtime`
- `@uncaged/workflow-agent-llm` → `@uncaged/workflow-runtime`
- `@uncaged/workflow-agent-cursor` → `@uncaged/workflow-runtime`, `@uncaged/workflow-util-agent`, `zod`
- `@uncaged/workflow-agent-hermes` → `@uncaged/workflow-runtime`, `@uncaged/workflow-util-agent`
- `@uncaged/workflow-template-develop` → `@uncaged/workflow-register`, `@uncaged/workflow-runtime`, `zod`
- `@uncaged/workflow-template-solve-issue` → `@uncaged/workflow-register`, `@uncaged/workflow-runtime`, `zod` (dev-only workspace deps: `@uncaged/workflow-cas`, `@uncaged/workflow-execute` for tests/tooling per `package.json`)
+Workflows are **YAML files** (not ESM bundles). `uwf workflow put <file.yaml>` parses the YAML, registers output schemas as JSON Schema CAS nodes, and stores the `WorkflowPayload` as a CAS node.

-## Package roles (detail)
+Example (`examples/solve-issue.yaml`):

- **`workflow-protocol`** — Pure types (`WorkflowFn`, contexts, `CasStore` interface, descriptor shapes), `START` / `END`, `ok` / `err`. Depends only on peer `zod` for schema-related types in signatures.
- **`workflow-runtime`** — Workflow author surface: `createWorkflow` from `src/create-workflow.js`, re-exports protocol types/constants used when authoring bundles.
- **`workflow-util`** — Cross-cutting utilities: Crockford Base32, ULID, `createLogger`, `getDefaultWorkflowStorageRoot`, `getGlobalCasDir`, ref normalization; re-exports `ok`/`err` from protocol.
- **`workflow-cas`** — Filesystem CAS (`createCasStore`), `hashString` / `hashWorkflowBundleBytes`, Merkle node serialization and helpers (`merkle.js`).
- **`workflow-register`** — Bundle pipeline (`validateWorkflowBundle`, `extractBundleExports`, descriptor builders), registry YAML read/write, `resolveModel` / `splitProviderModelRef`.
- **`workflow-execute`** — `executeThread`, supervisor/worker wiring (`engine/`), fork/GC/pause gate, `createExtract` + LLM extract helpers (`extract/`), `workflowAsAgent`. Imports `@uncaged/workflow-reactor` for LLM-backed extract/supervisor paths (`extract-fn.ts`, `supervisor.ts`).
- **`workflow-reactor`** — `createLlmFn`, `createThreadReactor`, and thread tool-invocation types — consumed by `workflow-execute`.
- **`cli-workflow`** — CLI commands and HTTP/dashboard-related wiring (`hono`, `yaml`); composes register + execute + CAS + util.
- **`workflow-agent-*`** — Replaceable `AgentFn` implementations (Cursor / Hermes CLIs, or HTTP LLM).
- **`workflow-util-agent`** — Shared prompt assembly and subprocess spawning for CLI agents.
- **`workflow-template-*`** — Concrete `WorkflowDefinition` graphs + Zod role schemas + descriptor builders for publishing bundles.
- **`workflow-dashboard`** — Standalone React UI; no published library entry matching `src/index.ts`.
+```yaml
+name: "solve-issue"
+description: "End-to-end issue resolution"
+roles:
+  planner:
+    description: "Creates implementation plan"
+    goal: "You are a planning agent. Analyze the issue and create a step-by-step plan."
+    capabilities:
+      - issue-analysis
+      - planning
+    procedure: "Analyze the issue and create a detailed, actionable implementation plan."
+    output: "Output the plan summary and list of concrete steps."
+    meta:
+      type: object
+      properties:
+        plan: { type: string }
+        steps: { type: array, items: { type: string } }
+      required: [plan, steps]
+  developer:
+    description: "Implements code changes"
+    goal: "You are a developer agent. Implement the plan."
+    capabilities:
+      - file-edit
+      - shell
+    procedure: "Implement the plan. Write code, tests, and ensure existing tests pass."
+    output: "List all files changed and provide a summary of the implementation."
+    meta:
+      type: object
+      properties:
+        filesChanged: { type: array, items: { type: string } }
+        summary: { type: string }
+      required: [filesChanged, summary]
+  reviewer:
+    description: "Reviews code changes"
+    goal: "You are a code reviewer. Review the implementation."
+    capabilities:
+      - code-review
+    procedure: "Review the implementation against the plan."
+    output: "Approve or reject with detailed comments."
+    meta:
+      type: object
+      properties:
+        approved: { type: boolean }
+        comments: { type: string }
+      required: [approved, comments]
+conditions:
+  notApproved:
+    description: "Reviewer rejected the implementation"
+    expression: "steps[-1].output.approved = false"
+graph:
+  $START:
+    - role: "planner"
+      condition: null
+  planner:
+    - role: "developer"
+      condition: null
+  developer:
+    - role: "reviewer"
+      condition: null
+  reviewer:
+    - role: "developer"
+      condition: "notApproved"
+    - role: "$END"
+      condition: null
+```
+
+Key properties:
+
+- **`roles`** — inline role definitions; each `meta` is a JSON Schema (stored as its own CAS node on registration)
+- **`conditions`** — named JSONata expressions evaluated against the `ModeratorContext`
+- **`graph`** — `Record<Role | "$START", Transition[]>`; transitions are evaluated in order and the first match wins, so a transition with `condition: null` always matches and acts as the fallback
+- **No agent binding** — agent selection is a deployment concern, configured in `config.yaml`
+- **No Zod** — all schemas are JSON Schema, validated through `@uncaged/json-cas`
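
The "first matching transition wins" rule can be sketched as follows. This is a minimal illustration, not the `workflow-moderator` implementation: the `Transition`, `Graph`, and `Ctx` shapes are simplified assumptions, plain predicate functions stand in for compiled JSONata expressions, and the current role is assumed to be the role of the last step (or `$START` when there are no steps).

```typescript
// Hypothetical, simplified shapes; the real types live in workflow-protocol.
type Transition = { role: string; condition: string | null };
type Graph = Record<string, Transition[]>;
type Ctx = { steps: { role: string; output: unknown }[] };
// Predicates stand in for named JSONata expressions.
type Conditions = Record<string, (ctx: Ctx) => boolean>;

function nextRole(graph: Graph, conditions: Conditions, ctx: Ctx): string {
  const last = ctx.steps[ctx.steps.length - 1];
  const current = last ? last.role : "$START";
  for (const t of graph[current] ?? []) {
    // condition: null always matches, so it acts as the fallback.
    if (t.condition === null) return t.role;
    const predicate = conditions[t.condition];
    if (predicate !== undefined && predicate(ctx)) return t.role;
  }
  throw new Error(`no matching transition from ${current}`);
}

// The solve-issue graph from the example above.
const graph: Graph = {
  $START: [{ role: "planner", condition: null }],
  planner: [{ role: "developer", condition: null }],
  developer: [{ role: "reviewer", condition: null }],
  reviewer: [
    { role: "developer", condition: "notApproved" },
    { role: "$END", condition: null },
  ],
};

const conditions: Conditions = {
  // Mirrors "steps[-1].output.approved = false".
  notApproved: (ctx) => {
    const last = ctx.steps[ctx.steps.length - 1];
    return (last?.output as { approved?: boolean } | undefined)?.approved === false;
  },
};
```

Because the conditional transition is listed before the `null` fallback, a rejected review routes back to `developer`, while an approved one falls through to `$END`.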

 ## Three-phase engine loop

-Each role round is implemented in `packages/workflow-runtime/src/create-workflow.ts` (`advanceOneRound`): moderator → agent → extractor, with progressive context types from `@uncaged/workflow-protocol`.
+Each `uwf thread step` runs exactly one cycle: moderator → agent → extract. The CLI orchestrates this in `packages/cli-workflow/src/commands/thread.ts` (`cmdThreadStep`).

 ```
 ┌─→ Phase 1: MODERATOR
-│   Context: ModeratorContext { threadId, depth, start, steps }
-│   Action:  moderator(ctx) → role name | END
+│   Input:  WorkflowPayload + ModeratorContext { start, steps[] }
+│   Engine: JSONata conditions evaluated against the graph
+│   Output: next role name | $END
 │
 │   Phase 2: AGENT
-│   Context: AgentContext = ModeratorCtx + { currentRole: { name, systemPrompt } }
-│   Action:  agent(ctx) → raw string
+│   Input:  thread-id + role (via argv)
+│   Engine: agent-kit builds context from CAS chain, prepends
+│           output format instruction to system prompt, spawns agent
+│   Output: raw string (frontmatter markdown)
 │
-│   Phase 3: EXTRACTOR
-│   Context: ExtractContext = AgentCtx + { agentContent }
-│   Action:  runtime.extract(schema, extractPrompt, ctx) → typed meta
+│   Phase 3: EXTRACT
+│   Input:  raw agent output + role's meta schema
+│   Engine: two-layer extract (frontmatter fast path → LLM fallback)
+│   Output: CasRef to structured output node
 │
-│   Merge: RoleStep { role, contentHash, meta, refs, timestamp }
-│   Append to steps
-└─────────────────────────────────────────────────────┘
+│   Persist: StepNode { start, prev, role, output, detail, agent }
+│   Update:  threads.yaml head pointer
+└─────────────────────────────────────────────────────────────────┘
 ```

-### Context types (progressive)
+### Context types

 Defined in `packages/workflow-protocol/src/types.ts`:

 ```typescript
-type ModeratorContext<M> = ThreadContext<M>;
-type AgentContext<M> = ModeratorContext<M> & {
-  currentRole: { name: string; systemPrompt: string };
+type StepContext = {
+  role: string;
+  output: unknown;    // CAS node payload, expanded (not hash)
+  detail: CasRef;
+  agent: string;
+};
+
+type ModeratorContext = {
+  start: StartNodePayload;  // { workflow: CasRef, prompt: string }
+  steps: StepContext[];     // chronological, oldest first
+};
+
+type AgentContext = ModeratorContext & {
+  threadId: ThreadId;
+  role: string;
+  store: Store;
+  workflow: WorkflowPayload;
+  outputFormatInstruction: string;
 };
-type ExtractContext<M> = AgentContext<M> & { agentContent: string };
 ```

 ### Key properties

- **Moderator is synchronous and pure** — no I/O, no state mutation inside `createWorkflow`’s moderator call path.
- **Agent receives `AgentContext`** — reads `ctx.currentRole.systemPrompt`; raw output becomes `agentContent` for extract.
- **Extractor is `WorkflowRuntime.extract`** — supplied by the engine from registry-resolved LLM config (`workflow-execute`); stores agent body in CAS and yields `contentHash` + `refs` on each step (`create-workflow.ts`).
- **`extractPrompt` is a call parameter** on `RoleDefinition`, not implicit context state.
+- **Moderator** — pure JSONata evaluation; no LLM call, no I/O beyond CAS reads. Evaluates `workflow.graph[currentRole]` transitions in order, returns first match.
+- **Agent** — receives `AgentContext` with thread history + role system prompt + output format instruction. Raw output is frontmatter markdown.
+- **Extractor** — two-layer: tries frontmatter fast-path first (zero LLM cost), falls back to LLM extract if frontmatter is absent or invalid.
+- **Stateless** — each `uwf thread step` is an atomic, self-contained operation. No in-memory state between steps.

-## Agent information sources
+## Agent CLI protocol

-An agent has exactly three information sources:
+Each agent is an external command invoked by `uwf thread step`:

-1. **Prior knowledge** — LLM training, agent memory, agent skills
-2. **Thread context** — `AgentContext` (`start`, `steps`, `currentRole`)
-3. **Derived information** — from 1 & 2 (e.g. tool calls, shell commands)
-
-No hidden environment parameters. If an agent needs something (like a workspace path), it obtains it via `ExtractFn` (e.g. Cursor agent).
-
-## Bundle contract
-
-A workflow bundle is a single `.esm.js` file with two named exports (see `WorkflowFn` / `WorkflowDescriptor` in `packages/workflow-protocol/src/types.ts`):
-
-```typescript
-export const descriptor: WorkflowDescriptor;
-export const run: WorkflowFn;
-
-type WorkflowFn = (
-  thread: ThreadContext,
-  runtime: WorkflowRuntime,
-) => AsyncGenerator<RoleOutput, WorkflowCompletion>;
+```bash
+<agent-cmd> <thread-id> <role>
 ```

-`RoleOutput` carries `contentHash`, `meta`, and `refs` (agent text lives in CAS, addressed by hash).
+Contract:
+1. `uwf thread step` determines the next role via the moderator
+2. Agent CLI is spawned with `(thread-id, role)` as positional args
+3. `workflow-agent-kit` (`createAgent`) handles the boilerplate:
+   - Parses argv
+   - Loads `.env` from storage root
+   - Builds `AgentContext` by walking the CAS chain from `threads.yaml` head
+   - Resolves the role's `meta` schema and builds `outputFormatInstruction`
+   - Calls the agent's `run` function
+   - Runs two-layer extract on the raw output
+   - Writes `StepNode` to CAS (output + detail + prev link)
+   - Prints the new `StepNode` CAS hash to stdout
+4. `uwf thread step` reads stdout, updates `threads.yaml` head pointer, re-evaluates moderator for `done`
+5. Exit 0 = success, non-zero = failure

-### Constraints
+Agent resolution priority: `--agent` CLI override → `config.yaml` per-workflow/role override → `config.yaml` `defaultAgent`.
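
A rough sketch of that priority order, under stated assumptions: only `defaultAgent` is named in this document, so the `overrides` key and its workflow → role → command nesting are hypothetical, as is the `resolveAgent` function name.

```typescript
// Hypothetical config shape; only `defaultAgent` is confirmed by the docs.
type AgentConfig = {
  defaultAgent: string;
  // Assumed layout: workflow name -> role name -> agent command.
  overrides?: Record<string, Record<string, string>>;
};

function resolveAgent(
  cliOverride: string | undefined,
  config: AgentConfig,
  workflow: string,
  role: string,
): string {
  // 1. --agent on the command line wins.
  if (cliOverride !== undefined && cliOverride !== "") return cliOverride;
  // 2. Then a per-workflow/role override from config.yaml.
  const perRole = config.overrides?.[workflow]?.[role];
  if (perRole !== undefined) return perRole;
  // 3. Otherwise fall back to the configured default.
  return config.defaultAgent;
}
```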

- Single `.esm.js` file
- No dynamic `import()` in bundles (loader exempt in engine)
- Portable bundle static imports are constrained by validation in `@uncaged/workflow-register` (`validateWorkflowBundle`)
- XXH64 hash (Crockford Base32) = version ID
+## Agent output format: frontmatter markdown (RFC #351)

-### Why AsyncGenerator?
+Agents produce **frontmatter markdown** — YAML frontmatter for structured meta, followed by a markdown body for content:

- Each `yield` lets `workflow-execute` persist state, CAS rows, and enforce pause/abort
- `return` supplies `WorkflowCompletion`
- Fork replays historical steps into a new thread context
- Bundle does not import the engine — only protocol/runtime types at build time
+```markdown
+---
+status: done
+next: reviewer
+confidence: 0.9
+artifacts:
+  - src/auth.ts
+scope: role
+---
+
+## Implementation
+
+Fixed the login redirect by updating the auth middleware...
+```
+
+The `outputFormatInstruction` (built by `buildOutputFormatInstruction` in `workflow-agent-kit`) is prepended to the role's system prompt, so the deliverable format is the first thing the agent sees. It lists the expected frontmatter fields derived from the role's `meta` JSON Schema.
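
To make the format concrete, here is a minimal sketch of separating the frontmatter block from the markdown body. It is not the `parseFrontmatterMarkdown` helper from `workflow-util`; `splitFrontmatter` is a hypothetical name, and the sketch leaves the frontmatter as raw YAML text rather than parsing it.

```typescript
type FrontmatterSplit = { frontmatter: string; body: string };

// Returns null when the text does not start with a `---` delimited block.
function splitFrontmatter(raw: string): FrontmatterSplit | null {
  const lines = raw.split(/\r?\n/);
  if (lines.length === 0 || lines[0].trim() !== "---") return null;
  // Find the closing delimiter after the opening line.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return null;
  return {
    frontmatter: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n").trim(),
  };
}
```

The raw `frontmatter` string would then be handed to a YAML parser (the repo lists `yaml` as a dependency) before validation.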
+
+## Two-layer extract
+
+Structured output extraction uses a two-layer strategy (`workflow-agent-kit`):
+
+### Layer 1: frontmatter fast path (`frontmatter.ts`)
+
+1. Parse YAML frontmatter from raw agent output (`parseFrontmatterMarkdown`)
+2. Validate required fields (`validateFrontmatter`)
+3. Build a candidate object from frontmatter fields (`status`, `next`, `confidence`, `artifacts`, `scope`)
+4. `store.put()` the candidate against the role's `meta` schema
+5. Validate with `json-cas` schema validation
+6. If valid → return `outputHash` (zero LLM cost)
+
+### Layer 2: LLM extract fallback (`extract.ts`)
+
+If the fast path returns `null` (frontmatter is missing, is invalid, or does not satisfy the role's schema):
+
+1. Resolve extract model alias from config (`modelOverrides.extract` → `models.extract` → `defaultModel`)
+2. Call OpenAI-compatible chat completion with JSON mode
+3. System prompt: "Extract structured data matching this JSON Schema: ..."
+4. User message: the raw agent output
+5. Parse response, `store.put()`, validate
+6. Return `outputHash`
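
The control flow of the two layers can be sketched generically. This is an illustration only: `twoLayerExtract` and its parameter names are hypothetical, and both layers are injected as functions so the sketch makes no claim about how `workflow-agent-kit` wires them.

```typescript
type ExtractResult<T> = { value: T; layer: "frontmatter" | "llm" };

async function twoLayerExtract<T>(
  raw: string,
  // Layer 1: synchronous and free; returns null when it cannot produce a valid result.
  fastPath: (raw: string) => T | null,
  // Layer 2: only invoked when layer 1 returns null.
  llmFallback: (raw: string) => Promise<T>,
): Promise<ExtractResult<T>> {
  const fast = fastPath(raw);
  if (fast !== null) return { value: fast, layer: "frontmatter" };
  const value = await llmFallback(raw);
  return { value, layer: "llm" };
}
```

Returning which layer produced the value is a design choice for this sketch; it makes it easy to log how often the LLM fallback is actually paid for.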
+
+## Prompt injection
+
+`workflow-agent-kit` prepends two pieces of context to the agent's system prompt:
+
+1. **Deliverable format instruction** — generated from the role's `meta` schema, tells the agent exactly what frontmatter fields to produce and the expected format
+2. **Scope constraint** — "Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope."
+
+This ensures agents produce parseable frontmatter output without requiring per-agent format knowledge.
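
As a rough idea of what deriving an instruction from a `meta` schema could look like, consider the sketch below. It is not `buildOutputFormatInstruction`: the function name, the `MetaSchema` shape, and the instruction wording are assumptions, and only the scope sentence is quoted from this document.

```typescript
// Simplified view of a role's `meta` JSON Schema (flat object schemas only).
type MetaSchema = {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
};

function buildInstruction(meta: MetaSchema): string {
  const required = new Set(meta.required ?? []);
  // One bullet per schema property, marking required fields.
  const fields = Object.keys(meta.properties).map((name) => {
    const kind = required.has(name) ? "required" : "optional";
    return `- ${name} (${meta.properties[name].type}, ${kind})`;
  });
  return [
    "Respond with frontmatter markdown. The YAML frontmatter must contain:",
    ...fields,
    "Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope.",
  ].join("\n");
}
```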
+
+## CAS node types
+
+### Workflow
+
+```yaml
+type: <workflow-schema-hash>
+payload:
+  name: "solve-issue"
+  description: "End-to-end issue resolution"
+  roles:
+    planner:
+      description: "Creates implementation plan"
+      goal: "You are a planning agent..."
+      capabilities: [planning, issue-analysis]
+      procedure: "Analyze the issue and create a plan."
+      output: "Output the plan summary."
+      meta: "5GWKR8TN1V3JA"    # cas_ref → JSON Schema node
+  conditions:
+    notApproved:
+      description: "Reviewer rejected"
+      expression: "steps[-1].output.approved = false"
+  graph:
+    $START:
+      - role: "planner"
+        condition: null
+```
+
+### StartNode
+
+```yaml
+type: <start-node-schema-hash>
+payload:
+  workflow: "4KNM2PXR3B1QW"    # cas_ref → Workflow
+  prompt: "Fix the login bug..."
+```
+
+### StepNode
+
+```yaml
+type: <step-node-schema-hash>
+payload:
+  start: "4TNVW8KR2B3MA"      # cas_ref → StartNode
+  prev: "2MXBG6PN4A8JR"       # cas_ref → previous StepNode (null for first step)
+  role: "developer"
+  output: "9KRVW3TN5F1QA"     # cas_ref → structured output (validated against meta schema)
+  detail: "7BQST3VW9F2MA"     # cas_ref → execution detail (raw turns, session data)
+  agent: "uwf-hermes"         # agent command used (plain string)
+```
+
+### Chain structure
+
+```
+threads.yaml: { "01J7K9...4T": "8FWKR3TN5V1QA" }
+                                    │
+                                    ▼
+                            StepNode (step 3)
+                            ├── start ──→ StartNode
+                            │              ├── workflow → Workflow (CAS)
+                            │              └── prompt: "Fix..."
+                            ├── prev ──→ StepNode (step 2)
+                            │             ├── prev ──→ StepNode (step 1)
+                            │             │             └── prev: null
+                            │             └── ...
+                            ├── role: "reviewer"
+                            ├── output → CAS({ approved: true })
+                            ├── detail → CAS(session turns)
+                            └── agent: "uwf-hermes"
+```

 ## Storage layout

 ```
 ~/.uncaged/workflow/
-├── cas/                           # Global content-addressed blobs (see getGlobalCasDir)
-├── bundles/
-│   ├── C9NMV6V2TQT81.esm.js       # Crockford Base32 of XXH64
-│   ├── C9NMV6V2TQT81.yaml         # Role descriptor sidecar (when present)
-│   └── C9NMV6V2TQT81/             # Per-hash bundle dir (alongside or instead of loose files)
-│       ├── threads.json           # Active threads: threadId → { head, start, updatedAt }
-│       └── history/
-│           └── 2026-05-09.jsonl   # Completed threads (one JSON object per line)
-├── logs/                          # One folder per bundle hash
-│   └── C9NMV6V2TQT81/
-│       ├── 01KQXKW…YG.running     # Present while worker executes this thread (optional)
-│       └── 01KQXKW…YG.info.jsonl   # Debug log
-└── workflow.yaml                  # Registry
+├── cas/                          # json-cas filesystem store (all CAS nodes)
+├── config.yaml                   # Provider, model, agent configuration
+├── threads.yaml                  # Active thread head pointers: threadId → CasRef
+├── history.jsonl                 # Archived thread records
+├── registry.yaml                 # Workflow name → CAS hash mapping
+└── .env                          # API keys (loaded by dotenv)
 ```

+### Mutable state
+
+Only three files carry mutable state:
+
+| File | Contents |
+|------|----------|
+| `threads.yaml` | `Record<ThreadId, CasRef>` — maps active thread IDs to head node hash |
+| `history.jsonl` | Append-only log of completed threads (`thread`, `workflow`, `head`, `completedAt`) |
+| `registry.yaml` | Workflow name → current CAS hash |
+
+Everything else is immutable CAS content.
+
 ### ID encoding: Crockford Base32

 - Case-insensitive, filesystem-safe, no ambiguous chars (0/O, 1/I/L)
- Bundle hash: XXH64 → 13-char
- Thread ID: ULID → 26-char (10 timestamp + 16 random)
+- CAS hash: XXH64 → 13-char Crockford Base32
+- Thread ID: ULID → 26-char Crockford Base32 (10 timestamp + 16 random)
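
The 13-character figure follows from the bit widths: 13 symbols of 5 bits carry 65 bits, enough for one 64-bit hash. The sketch below illustrates that arithmetic only. The function name `encodeHash64Sketch` and the big-endian byte order are assumptions made for illustration, not taken from `json-cas`.

```typescript
// Crockford Base32 alphabet: digits plus letters, excluding I, L, O and U.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Hypothetical helper: encode 8 hash bytes (assumed big-endian) as 13 symbols.
function encodeHash64Sketch(bytes: Uint8Array): string {
  if (bytes.length !== 8) {
    throw new RangeError("expected exactly 8 bytes");
  }
  let out = "";
  // Treat the input as a 65-bit stream with one leading zero bit,
  // then read it 5 bits at a time (13 symbols x 5 bits = 65 bits).
  for (let symbol = 0; symbol < 13; symbol++) {
    let digit = 0;
    for (let bit = 0; bit < 5; bit++) {
      const index = symbol * 5 + bit - 1; // -1 accounts for the padding bit
      const value = index < 0 ? 0 : (bytes[index >> 3] >> (7 - (index & 7))) & 1;
      digit = (digit << 1) | value;
    }
    out += CROCKFORD[digit];
  }
  return out;
}
```

Because only 4 of the first symbol's 5 bits are ever used, the leading character of such an encoding always falls in the range `0` to `F`.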

-### Registry (`workflow.yaml`)
+### Config (`config.yaml`)

-Managed by `@uncaged/workflow-register` (`readWorkflowRegistry`, `writeWorkflowRegistry`, …). Shape includes workflow entries and a top-level `config` section used for extract/supervisor model resolution.
+```yaml
+providers:
+  openrouter:
+    baseUrl: "https://openrouter.ai/api/v1"
+    apiKeyEnv: "OPENROUTER_API_KEY"

-### Thread storage (CAS + index)
+models:
+  sonnet:
+    provider: "openrouter"
+    name: "anthropic/claude-sonnet-4"
+  gpt4o-mini:
+    provider: "openrouter"
+    name: "openai/gpt-4o-mini"

-Thread execution state is a chain of immutable CAS nodes (`StartNode`, `StateNode`, content Merkle blobs). Per bundle:
+agents:
+  hermes:
+    command: "uwf-hermes"
+    args: []
+  cursor:
+    command: "uwf-cursor"
+    args: []

- **`threads.json`** — only in-flight threads (`head`, `start`, `updatedAt`).
- **`history/{YYYY-MM-DD}.jsonl`** — completed threads (`threadId`, `head`, `start`, `completedAt`).
- **CAS (`cas/`)** — payloads and refs for replay, GC, and fork sharing.
+defaultAgent: "hermes"
+agentOverrides:
+  solve-issue:
+    developer: "cursor"

-**`.info.jsonl`** — Structured debug log via `@uncaged/workflow-util` `createLogger`:
-
-```jsonc
-{ "tag": "4KNMR2PX", "content": "Loading bundle...", "timestamp": ... }
+defaultModel: "sonnet"
+modelOverrides:
+  extract: "gpt4o-mini"
 ```

-Tags are 8-char Crockford Base32 (40-bit random), one per call site. `grep "4KNMR2PX"` → code location.
-
-## Execution model
-
- **No daemon.** `uncaged-workflow run <name>` starts a worker process (`workflow-execute` worker entry via `getWorkerHostScriptPath`)
- Threads share bundle-scoped workers as implemented in CLI/engine
- Pause/resume/abort via engine IPC and pause gate (`createThreadPauseGate`)
-
 ## CLI commands

-| Priority | Command | Description |
-|----------|---------|-------------|
-| P1 | `add <name> <file.esm.js>` | Register a bundle |
-| P1 | `list` | List registered workflows |
-| P1 | `show <name>` | Show workflow details |
-| P1 | `remove <name>` | Remove a workflow |
-| P1 | `run <name> [--prompt] [--max-rounds]` | Start a thread |
-| P1 | `threads [name]` | List threads |
-| P1 | `thread <id>` | Show thread state |
-| P1 | `thread rm <id>` | Delete a thread |
-| P1 | `ps` | List running threads |
-| P1 | `kill <thread-id>` | Terminate a running thread |
-| P2 | `history <name>` | Show version history |
-| P2 | `rollback <name> [hash]` | Switch to a previous version |
-| P2 | `pause <thread-id>` | Pause a running thread |
-| P2 | `resume <thread-id>` | Resume a paused thread |
-| P3 | `fork <thread-id> [--from-role <role>]` | Fork from historical state |
+Binary: `uwf`
+
+### Thread commands
+
+| Command | Description |
+|---------|-------------|
+| `uwf thread start <workflow> -p <prompt>` | Create a thread (StartNode → CAS, head → threads.yaml). No execution. |
+| `uwf thread step <thread-id> [--agent <cmd>]` | Execute one moderator→agent→extract cycle. |
+| `uwf thread show <thread-id>` | Show thread head pointer and done status. |
+| `uwf thread list [--all]` | List active threads (`--all` includes archived). |
+| `uwf thread steps <thread-id>` | List all steps in chronological order. |
+| `uwf thread read <thread-id> [--quota <chars>] [--before <hash>]` | Render thread as human-readable markdown. |
+| `uwf thread fork <step-hash>` | Fork a thread from a specific CAS node. |
+| `uwf thread step-details <step-hash>` | Dump full detail node as YAML. |
+| `uwf thread kill <thread-id>` | Terminate and archive a thread. |
+
+### Workflow commands
+
+| Command | Description |
+|---------|-------------|
+| `uwf workflow put <file.yaml>` | Register a workflow from YAML definition. |
+| `uwf workflow show <id>` | Show workflow by name or CAS hash. |
+| `uwf workflow list` | List registered workflows. |
+
+### CAS commands
+
+| Command | Description |
+|---------|-------------|
+| `uwf cas get <hash>` | Read a CAS node. |
+| `uwf cas put <type-hash> <data>` | Store a node, print its hash. |
+| `uwf cas has <hash>` | Check if a hash exists. |
+| `uwf cas refs <hash>` | List direct CAS references. |
+| `uwf cas walk <hash>` | Recursive traversal from a node. |
+| `uwf cas reindex` | Rebuild type index from all nodes. |
+| `uwf cas schema list` | List registered schemas. |
+| `uwf cas schema get <hash>` | Show a schema by type hash. |
+
+### Setup
+
+| Command | Description |
+|---------|-------------|
+| `uwf setup [--provider --base-url --api-key --model --agent]` | Configure provider/model/agent (interactive if no flags). |
+
+## Toolchain
+
+| Tool | Purpose |
+|------|---------|
+| **bun** | Package manager + runtime |
+| **TypeScript** | Type checking (strict mode) |
+| **Biome** | Lint + format |
+| **vitest** | Test runner |

 ## Design decisions

 | Decision | Rationale |
 |----------|-----------|
-| **Role = pure data** | Decouples definition from execution; same role with different agents |
-| **Agent bound at runtime** | `WorkflowDefinition` is reusable; agent choice is deployment concern |
-| **Three-phase context** | Each phase sees only what it needs; types live in `workflow-protocol` |
-| **`WorkflowRuntime.extract` + CAS `contentHash`** | Large agent bodies deduplicated globally; Merkle roots summarize threads |
-| **`workflow-reactor` split** | LLM tool-calling loop isolated from filesystem/registry concerns |
-| **Single-file ESM** | Hash = version, self-contained bundle |
-| **No daemon** | OS handles process lifecycle |
-| **Crockford Base32** | Filesystem-safe, readable, compact |
-| **21-package split** | Clear boundaries: protocol ↔ runtime author API ↔ util/CAS/register ↔ execute ↔ CLI ↔ agents/templates/UI |
+| **YAML workflow definitions** | Human-readable, versionable, no build step required. JSON Schema inline in YAML, registered as CAS nodes on `workflow put`. |
+| **Stateless single-step CLI** | Each `uwf thread step` is atomic — no in-memory state, no daemon, no long-running process. OS handles lifecycle. |
+| **CAS-backed thread state** | Immutable linked nodes enable fork, replay, and GC without copying data. Content-addressed deduplication across threads. |
+| **JSONata moderator** | Declarative condition expressions evaluated against thread history. No LLM cost for routing decisions. |
+| **Frontmatter markdown output** | Agents produce structured meta (YAML frontmatter) alongside free-form content (markdown body). Enables zero-cost extraction when frontmatter is well-formed. |
+| **Two-layer extract** | Fast path avoids LLM calls when agents follow the format; LLM fallback handles messy output gracefully. |
+| **Prompt injection for format** | Output format instruction prepended to system prompt ensures agents produce parseable output without per-agent configuration. |
+| **JSON Schema (not Zod)** | Schemas are CAS-native data — storable, hashable, validatable through `json-cas`. No code generation, no runtime library dependency. |
+| **Agent as external command** | Agents are independent CLI binaries (`uwf-hermes`, `uwf-cursor`). Swappable per workflow/role via config. No tight coupling to the engine. |
+| **No daemon** | Process starts, does one step, exits. Simpler failure model, no connection management. |
+| **Crockford Base32** | Filesystem-safe, case-insensitive, readable, compact. |
@@ -0,0 +1,779 @@
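+# Built-in Role Agent Research
+
+## Goal
+
+Implement a built-in role agent (tentatively named `uwf-builtin`) that does not depend on external agent processes such as hermes or openclaw.
+It uses the model configured in the workflow config directly and implements its own agent run loop and core toolkit.
+
+---
+
+## Key questions
+
+### Q1: Agent interface protocol
+
+How does the CLI invoke existing agents? What are the input (argv, environment variables) and output (stdout, CAS) formats?
+
+**Research points:**
+- The full implementation of `spawnAgent` in `cli-workflow`
+- The `AgentConfig` type definition
+- The exit code convention for agent processes
+- Environment variable passing (`UWF_STORAGE_ROOT` and others)
+
+**Answer:**
+
+#### Call chain
+
+`uwf thread step` → `cmdThreadStepOnce` → moderator evaluates the next role → `resolveAgentConfig` → `spawnAgent`.
+
+#### AgentConfig type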
+
+```146:149:packages/workflow-protocol/src/types.ts
+export type AgentConfig = {
+  command: string;
+  args: string[];
+};
+```
+
+Agents are registered under the `agents` section of `config.yaml`, for example `hermes: { command: "uwf-hermes", args: [] }`.
+
+#### spawnAgent behavior
+
+```627:653:packages/cli-workflow/src/commands/thread.ts
+function spawnAgent(agent: AgentConfig, threadId: ThreadId, role: string): CasRef {
+  const argv = [...agent.args, threadId, role];
+  let stdout: string;
+  try {
+    stdout = execFileSync(agent.command, argv, {
+      encoding: "utf8",
+      env: process.env,
+      stdio: ["ignore", "pipe", "pipe"],
+    });
+  } catch (e) {
+    // ... stderr is appended to the fail message
+  }
+
+  const line = stdout.trim().split("\n").pop()?.trim() ?? "";
+  if (!isCasRef(line)) {
+    fail(`agent stdout is not a valid CAS hash: ${line || "(empty)"}`);
+  }
+  return line;
+}
+```
+
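+| Item | Convention |
+|------|------|
+| **argv** | `[...agent.args, <thread-id>, <role>]`; with an empty `agent.args`, `process.argv[2]` is the threadId and `process.argv[3]` is the role (consistent with `parseArgv` in `createAgent`) |
+| **stdin** | Ignored |
+| **stdout** | Plain text; the **last line** must be the CAS hash of the new `StepNode` (13-character Crockford Base32) |
+| **stderr** | On failure the CLI includes stderr in its error message; no convention on success |
+| **exit code** | `0` = success; on a non-zero code `execFileSync` throws and the step fails |
+| **Environment variables** | Inherits the parent's `process.env` (including storage root, API keys, and so on) |
+| **Chain head update** | **Not the agent's responsibility**; the agent only writes the StepNode to CAS, and the CLI updates `threads.yaml` after reading the hash from stdout |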
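
The stdout rule in the table can be sketched as follows. This is a minimal illustration, not the project's code: `isCasRefSketch` and `parseAgentStdoutSketch` are hypothetical names, and the sketch assumes a CAS hash is exactly 13 uppercase Crockford Base32 characters (the real `isCasRef` may be more lenient, for example about case).

```typescript
// Assumption: 13 uppercase Crockford Base32 symbols (no I, L, O, U).
const CAS_REF_PATTERN = /^[0-9A-HJKMNP-TV-Z]{13}$/;

function isCasRefSketch(text: string): boolean {
  return CAS_REF_PATTERN.test(text);
}

// Take the last line of the agent's stdout and require it to be a CAS hash.
// Earlier lines (progress logs and the like) are ignored.
function parseAgentStdoutSketch(stdout: string): string {
  const lines = stdout.trim().split("\n");
  const last = (lines[lines.length - 1] ?? "").trim();
  if (!isCasRefSketch(last)) {
    throw new Error(`agent stdout is not a valid CAS hash: ${last || "(empty)"}`);
  }
  return last;
}
```

Reading only the last line lets an agent print diagnostics to stdout before the hash without breaking the protocol.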
+
+Agent resolution priority (`resolveAgentConfig`):
+
+1. CLI `--agent` override (the whole command + args string)
+2. `config.agentOverrides[workflow.name][role]`
+3. `config.defaultAgent`
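
The three-level priority above could look roughly like this. It is a sketch under stated assumptions: `resolveAgentSketch` and `ConfigSketch` are hypothetical names, and splitting the `--agent` string on whitespace is a guess at how the override is parsed, not confirmed behavior.

```typescript
type AgentConfigSketch = { command: string; args: string[] };

type ConfigSketch = {
  agents: Record<string, AgentConfigSketch>;
  defaultAgent: string;
  agentOverrides: Record<string, Record<string, string>> | null;
};

function resolveAgentSketch(
  config: ConfigSketch,
  workflowName: string,
  role: string,
  cliOverride: string | null,
): AgentConfigSketch {
  // 1. A CLI override wins outright (assumed to be "command arg1 arg2 ...").
  if (cliOverride !== null && cliOverride.trim() !== "") {
    const [command, ...args] = cliOverride.trim().split(/\s+/);
    return { command, args };
  }
  // 2. Per-workflow, per-role override; 3. otherwise the default agent alias.
  const alias = config.agentOverrides?.[workflowName]?.[role] ?? config.defaultAgent;
  const agent = config.agents[alias];
  if (agent === undefined) {
    throw new Error(`unknown agent alias: ${alias}`);
  }
  return agent;
}
```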
+
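+#### Environment variable: storage root
+
+The `UWF_STORAGE_ROOT` variable mentioned in the docs **does not exist in the current code**. The actual precedence (identical in `workflow-agent-kit` and `cli-workflow`) is: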
+
+```33:43:packages/workflow-agent-kit/src/storage.ts
+export function resolveStorageRoot(): string {
+  const internal = process.env.UNCAGED_WORKFLOW_STORAGE_ROOT;
+  if (internal !== undefined && internal !== "") {
+    return internal;
+  }
+  const userOverride = process.env.WORKFLOW_STORAGE_ROOT;
+  if (userOverride !== undefined && userOverride !== "") {
+    return userOverride;
+  }
+  return getDefaultStorageRoot();
+}
+```
+
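+The agent child process shares the parent CLI's storage root through the inherited `process.env`; `createAgent` additionally calls `loadDotenv({ path: getEnvPath(storageRoot) })` to load `~/.uncaged/workflow/.env`.
+
+#### Agent-side responsibilities (design doc + implementation)
+
+- Read the chain head from `threads.yaml`, build the context, and execute the role
+- Write the `StepNode` to CAS (`output` / `detail` / `agent` / `prev` / `start`)
+- Print the step hash to stdout
+- Do **not** update `threads.yaml`
+
+---
+
+### Q2: createAgent factory
+
+What does `createAgent` in workflow-agent-kit do, and what is its full lifecycle?
+
+**Research points:**
+- The `run` and `continue` callback signatures on `AgentOptions`
+- The full definition of `AgentRunResult`
+- Retry logic (the retry mechanism after frontmatter validation fails)
+- The StepNode structure that `persistStep` writes to CAS
+
+**Answer:**
+
+#### Type definitions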
+
+```4:35:packages/workflow-agent-kit/src/types.ts
+export type AgentContext = ModeratorContext & {
+  threadId: ThreadId;
+  role: string;
+  store: Store;
+  workflow: WorkflowPayload;
+  outputFormatInstruction: string;
+};
+
+export type AgentRunResult = {
+  output: string;
+  detailHash: CasRef;
+  sessionId: string;
+};
+
+export type AgentContinueFn = (
+  sessionId: string,
+  message: string,
+  store: AgentContext["store"],
+) => Promise<AgentRunResult>;
+
+export type AgentRunFn = (ctx: AgentContext) => Promise<AgentRunResult>;
+
+export type AgentOptions = {
+  name: string;
+  run: AgentRunFn;
+  continue: AgentContinueFn;
+};
+```
+
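+- **`run(ctx)`**: the first execution; returns the raw agent text `output`, a `detailHash` for auditing, and a `sessionId` for continuing the conversation.
+- **`continue(sessionId, message, store)`**: appends a user message to the same session (used for frontmatter correction) and returns another `AgentRunResult`.
+
+`createAgent(options)` returns `() => Promise<void>`, which serves as the agent CLI's `main` (see `cli.ts` in `uwf-hermes`).
+
+#### Lifecycle (in execution order)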
+
+```101:152:packages/workflow-agent-kit/src/run.ts
+export function createAgent(options: AgentOptions): () => Promise<void> {
+  return async function main(): Promise<void> {
+    const { threadId, role } = parseArgv(process.argv);
+    const storageRoot = resolveStorageRoot();
+    loadDotenv({ path: getEnvPath(storageRoot) });
+
+    const ctx = await buildContextWithMeta(threadId, role);
+    // 1. Verify that the role exists
+    // 2. Load the frontmatter JSON Schema from CAS → buildOutputFormatInstruction → ctx.outputFormatInstruction
+
+    let agentResult = await options.run(ctx);
+
+    let outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+
+    for (let retry = 0; retry < MAX_FRONTMATTER_RETRIES && outputHash === null; retry++) {
+      const correctionMessage = "Your previous response did not contain valid YAML frontmatter...";
+      agentResult = await options.continue(agentResult.sessionId, correctionMessage, ctx.meta.store);
+      outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
+    }
+
+    if (outputHash === null) { fail(...); }
+
+    const stepHash = await persistStep({ ctx, outputHash, detailHash: agentResult.detailHash, agentName });
+    process.stdout.write(`${stepHash}\n`);
+  };
+}
+```
+
+| Stage | Behavior |
+|------|------|
+| Parse argv | `argv[2]=threadId`, `argv[3]=role`; if either is missing, write to `stderr` and `exit(1)` |
+| Context | `buildContextWithMeta` + optional `outputFormatInstruction` |
+| Run | `options.run(ctx)` |
+| Extract | **Only** `tryFrontmatterFastPath` (see Q4); does **not** call the `extract()` LLM fallback |
+| Retry | Up to `MAX_FRONTMATTER_RETRIES = 2` rounds of `continue`, each followed by another fast-path attempt |
+| Persist | `persistStep` → `writeStepNode` |
+| Output | A single stdout line containing the step CAS hash |
+
+#### StepNode write structure
+
+```44:68:packages/workflow-agent-kit/src/run.ts
+async function writeStepNode(options: {
+  store: AgentStore["store"];
+  schemas: AgentStore["schemas"];
+  startHash: CasRef;
+  prevHash: CasRef | null;
+  role: string;
+  outputHash: CasRef;
+  detailHash: CasRef;
+  agentName: string;
+}): Promise<CasRef> {
+  const payload: StepNodePayload = {
+    start: options.startHash,
+    prev: options.prevHash,
+    role: options.role,
+    output: options.outputHash,
+    detail: options.detailHash,
+    agent: options.agentName,
+  };
+  // store.put(stepNode schema) + validate
+}
+```
+
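+`agentName` is normalized by `agentLabel(name)`: a name that already has the `uwf-` prefix is kept as-is, otherwise `uwf-` is prepended (for example `hermes` → `uwf-hermes`).
+
+`prevHash`: `null` if the chain head is still the `StartNode`, otherwise the hash of the current head step.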
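
Both rules are small enough to sketch directly. The names `agentLabelSketch`, `prevHashForSketch`, and the `ChainHeadSketch` shape are hypothetical and only mirror the behavior described above; the real code distinguishes node types by schema hash rather than a `kind` field.

```typescript
// Normalize an agent name so that it always carries the "uwf-" prefix.
function agentLabelSketch(name: string): string {
  return name.startsWith("uwf-") ? name : `uwf-${name}`;
}

// Assumed shape: the head is either the StartNode or a StepNode with a hash.
type ChainHeadSketch = { kind: "start" } | { kind: "step"; hash: string };

// The first step has no predecessor; later steps point at the current head.
function prevHashForSketch(head: ChainHeadSketch): string | null {
  return head.kind === "step" ? head.hash : null;
}
```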
+
+---
+
+### Q3: Context Builder
+
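+What context does `buildContextWithMeta` build for the agent?
+
+**Research points:**
+- The full `AgentContext` type definition (all fields)
+- The context construction process (CAS chain walk)
+- How `outputFormatInstruction` is generated
+- How the role definition is obtained (from the workflow YAML)
+
+**Answer:**
+
+#### AgentContext fields
+
+It extends `ModeratorContext`: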
+
+```60:68:packages/workflow-protocol/src/types.ts
+export type ModeratorContext = {
+  start: StartNodePayload;
+  steps: StepContext[];
+};
+```
+
+```48:51:packages/workflow-protocol/src/types.ts
+export type StartNodePayload = {
+  workflow: CasRef;
+  prompt: string;
+};
+```
+
+```61:63:packages/workflow-protocol/src/types.ts
+export type StepContext = Omit<StepRecord, "output"> & {
+  output: unknown;
+};
+```
+
+Additional `AgentContext` fields:
+
+| Field | Type | Meaning |
+|------|------|------|
+| `threadId` | `ThreadId` | The current thread |
+| `role` | `string` | Name of the role to execute in this step |
+| `store` | `Store` | CAS store (reads and writes nodes) |
+| `workflow` | `WorkflowPayload` | Workflow definition already loaded from CAS |
+| `outputFormatInstruction` | `string` | Generated by `createAgent` from the role's frontmatter schema; `buildContext*` initializes it to `""` |
+
+`buildContextWithMeta` also returns `meta`:
+
+```148:154:packages/workflow-agent-kit/src/context.ts
+export type BuildContextMeta = {
+  storageRoot: string;
+  store: Store;
+  schemas: AgentStore["schemas"];
+  headHash: CasRef;
+  chain: ChainState;
+};
+```
+
+#### CAS chain walk
+
+1. Read `headHash` from `threads.yaml[threadId]`
+2. `walkChain`: if the head is a `StartNode`, `stepsNewestFirst=[]`; otherwise follow `prev` to collect every `StepNode`, newest first
+3. `buildHistory`: reverse into chronological order; `expandOutput` expands each step's `output` CasRef into its JSON payload (for use in prompts and JSONata)
+4. `loadWorkflow`: load the `WorkflowPayload` from the `start.workflow` CasRef
+#### Role definition source
+
+- Authors write it under `roles.<name>` in the workflow YAML (`goal`, `capabilities`, `procedure`, `output`, `frontmatter`, and so on)
+- On `uwf workflow put`, the inline `frontmatter` JSON Schema is stored in CAS via `putSchema`, and the workflow stores a **CasRef** to it
+- At agent runtime: `ctx.workflow.roles[ctx.role]` → `RoleDefinition`
+
+#### outputFormatInstruction
+
+In `createAgent`, if `getSchema(store, roleDef.frontmatter)` is non-empty:
+
+```typescript
+ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
+```
+
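+`buildOutputFormatInstruction` uses the JSON Schema's `properties` to generate an instruction stating that the response must begin with a `---` YAML frontmatter block, plus a list of example fields (see `build-output-format-instruction.ts`).
+
+Each agent implementation (Hermes / Claude Code) places this block first when assembling the prompt, followed by `buildRolePrompt(roleDef)`.
+
+---
+
+### Q4: Extract Pipeline
+
+How is agent output processed into structured data?
+
+**Research points:**
+- The full logic of the frontmatter fast-path
+- The implementation of the LLM extract fallback (`extract.ts`)
+- Where the frontmatter schema comes from (the `frontmatter` field in the role definition)
+- What the correction prompt is when validation fails
+
+**Answer:**
+
+#### Schema source
+
+The `frontmatter:` section of each role in the workflow YAML is a JSON Schema object. At registration: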
+
+```66:76:packages/cli-workflow/src/commands/workflow.ts
+async function resolveFrontmatterRef(..., frontmatter: unknown): Promise<CasRef> {
+  // Validate as JSON Schema → putSchema → return CasRef
+}
+```
+
+At runtime `roleDef.frontmatter` is the CAS hash of that schema; the structured `output` node is written to CAS with the **same schema**.
+
+#### Frontmatter fast-path (the path createAgent actually uses)
+
+```148:195:packages/workflow-agent-kit/src/frontmatter.ts
+export async function tryFrontmatterFastPath(
+  raw: string,
+  outputSchema: CasRef,
+  store: Store,
+): Promise<FrontmatterFastPathResult | null>
+```
+
+Flow:
+
+1. `parseFrontmatterMarkdown(raw)` → standard agent fields (`status`, `next`, `confidence`, `artifacts`, `scope`) + body
+2. `validateFrontmatter` fails → `null`
+3. `getSchema(store, outputSchema)` + `extractSchemaFields` yield the property names the role requires
+4. `buildCandidate`: assemble a schema-conforming object from the standard frontmatter plus the raw YAML fields
+5. `store.put(outputSchema, candidate)` + `validate` → on success `{ body, outputHash }`
+
+**Never throws**; returns `null` on failure.
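
The first step of that flow, splitting a `---` delimited block from the markdown body, can be illustrated as follows. This is a deliberately simplified sketch: `splitFrontmatterSketch` is a hypothetical name, and it handles only flat `key: value` lines, whereas the real `parseFrontmatterMarkdown` presumably uses a full YAML parser. Like the fast-path, it reports failure by returning `null` instead of throwing.

```typescript
type SplitResultSketch = { fields: Record<string, string>; body: string };

function splitFrontmatterSketch(raw: string): SplitResultSketch | null {
  const lines = raw.trimStart().split("\n");
  // The response must open with a "---" delimiter line.
  if (lines.length === 0 || lines[0].trim() !== "---") return null;
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end < 0) return null; // no closing delimiter

  const fields: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    if (line.trim() === "") continue;
    const colon = line.indexOf(":");
    if (colon <= 0) return null; // not a flat "key: value" line
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  // Everything after the closing delimiter is the free-form markdown body.
  return { fields, body: lines.slice(end + 1).join("\n").trim() };
}
```

Values stay as raw strings here; validating them against the role's JSON Schema is the job of the later steps.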
+
+#### LLM extract fallback (implemented but not wired into createAgent)
+
+```135:181:packages/workflow-agent-kit/src/extract.ts
+export async function extract(
+  rawOutput: string,
+  outputSchema: CasRef,
+  config: WorkflowConfig,
+): Promise<ExtractResult>
+```
+
+- Model: `resolveExtractModelAlias(config)` → `modelOverrides.extract` → `models.extract` → `models.default` → `defaultModel`
+- HTTP: `POST {baseUrl}/chat/completions` with `response_format: { type: "json_object" }`
+- System: asks for a single JSON object extracted from the agent output according to the JSON Schema
+- After validation passes, `store.put(outputSchema, structured)`
+
+**Important: `createAgent` does not currently call `extract()`.** If the fast-path fails and still fails after 2 `continue` rounds, it calls `fail()` directly. If the builtin agent should also work without frontmatter, `extract()` must be wired in explicitly at the kit or builtin layer.
+
+#### Correction prompt（retry）
+
+```125:128:packages/workflow-agent-kit/src/run.ts
+const correctionMessage =
+  "Your previous response did not contain valid YAML frontmatter matching the role schema.\n" +
+  "You MUST begin your response with a YAML frontmatter block (--- delimited).\n" +
+  "Please output ONLY the corrected frontmatter block followed by your work.";
+```
+
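+The message is sent to the external agent via `options.continue(sessionId, correctionMessage, store)`; the builtin agent needs to append a user message with equivalent semantics to its own message history.
+
+---
+
+### Q5: Model configuration and LLM calls
+
+How does a workflow configure and use models?
+
+**Research points:**
+- The full definition of providers/models/defaultModel/modelOverrides in `WorkflowConfig`
+- The implementation of the `resolveModel` function
+- The implementation of `chatCompletionText` (OpenAI-compatible HTTP client)
+- Is there streaming support? Tool calling support?
+
+**Answer:**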
+
+#### WorkflowConfig
+
+```136:160:packages/workflow-protocol/src/types.ts
+export type ProviderConfig = {
+  baseUrl: string;
+  apiKeyEnv: string;
+};
+
+export type ModelConfig = {
+  provider: ProviderAlias;
+  name: string;
+};
+
+export type WorkflowConfig = {
+  providers: Record<ProviderAlias, ProviderConfig>;
+  models: Record<ModelAlias, ModelConfig>;
+  agents: Record<AgentAlias, AgentConfig>;
+  defaultAgent: AgentAlias;
+  agentOverrides: Record<WorkflowName, Record<RoleName, AgentAlias>> | null;
+  defaultModel: ModelAlias;
+  modelOverrides: Record<Scenario, ModelAlias> | null;
+};
+```
+
+For an example see `docs/architecture.md` (`providers` / `models` / `defaultModel` / `modelOverrides.extract`).
+
+#### resolveModel
+
+```32:50:packages/workflow-agent-kit/src/extract.ts
+export function resolveModel(config: WorkflowConfig, alias: ModelAlias): ResolvedLlmProvider {
+  const modelEntry = config.models[alias];
+  const providerEntry = config.providers[modelEntry.provider];
+  const apiKey = process.env[providerEntry.apiKeyEnv];
+  return { baseUrl: providerEntry.baseUrl, apiKey, model: modelEntry.name };
+}
+```
+
+`ResolvedLlmProvider = { baseUrl, apiKey, model }`。
+
+Extract-specific alias resolution:
+
+```18:30:packages/workflow-agent-kit/src/extract.ts
+export function resolveExtractModelAlias(config: WorkflowConfig): ModelAlias {
+  return config.modelOverrides?.extract ?? (config.models.extract ? "extract" : config.models.default ? "default" : config.defaultModel);
+}
+```
+
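+There is **not yet** a function that resolves the agent's main model per role or workflow via `modelOverrides`. The first builtin version can use `config.defaultModel`; a later extension could add `modelOverrides.agent` or a table symmetric to `agentOverrides`.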
+
+#### chatCompletionText
+
+```87:124:packages/workflow-agent-kit/src/extract.ts
+async function chatCompletionText(
+  provider: ResolvedLlmProvider,
+  messages: Array<{ role: "system" | "user"; content: string }>,
+): Promise<string>
+```
+
+| Capability | Current state |
+|------|------|
+| Protocol | OpenAI-compatible `POST /chat/completions` |
+| Streaming | **None** (one-shot `response.text()`) |
+| Tool calling | **None** (no `tools` / `tool_calls` fields) |
+| Multimodal | **None** (text `content` only) |
+| Extract-specific | `response_format: { type: "json_object" }` |
+
+The builtin agent's run loop needs a **newly written** completion client that supports `tools` (it could live in `workflow-agent-builtin` or in an `llm/` module added to `workflow-agent-kit`); the current `chatCompletionText` cannot be reused without modification.
+
+---
+
+### Q6: Hermes Agent reference implementation
+
+How does `uwf-hermes` implement `run` and `continue`?
+
+**Research points:**
+- How the prompt is assembled (outputFormatInstruction + rolePrompt + task + history)
+- The hermes CLI invocation arguments
+- Session management (resume)
+- How output is captured
+
+**Answer:**
+
+#### Prompt assembly
+
+```40:53:packages/workflow-agent-hermes/src/hermes.ts
+export function buildHermesPrompt(ctx: AgentContext): string {
+  const roleDef = ctx.workflow.roles[ctx.role];
+  const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
+  const parts: string[] = [];
+  if (ctx.outputFormatInstruction !== "") {
+    parts.push(ctx.outputFormatInstruction, "");
+  }
+  parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
+  const historyBlock = buildHistorySummary(ctx.steps);
+  if (historyBlock !== "") {
+    parts.push("", historyBlock);
+  }
+  return parts.join("\n");
+}
+```
+
+`buildRolePrompt` generates `## Goal` / `## Capabilities` / `## Prepare` (including `generateCliReference()`) / `## Procedure` / `## Output`.
+
+`buildHistorySummary`: for each step, `role`, `JSON.stringify(step.output)`, and `agent`.
+
+Hermes passes the **entire prompt as a single user message** to `hermes chat -q` (there is no separate system channel).
+
+#### Hermes CLI arguments
+
+First run:
+
+```88:97:packages/workflow-agent-hermes/src/hermes.ts
+spawnHermes(["chat", "-q", prompt, "--yolo", "--max-turns", "90", "--quiet"]);
+```
+
+Continuation:
+
+```100:114:packages/workflow-agent-hermes/src/hermes.ts
+spawnHermes(["chat", "--resume", sessionId, "-q", message, "--yolo", "--max-turns", "90", "--quiet"]);
+```
+
+#### Session
+
+- Parse `session_id: <id>` from stdout/stderr (`parseSessionIdFromStdout`)
+- Session file: `~/.hermes/sessions/session_<id>.json`
+- `loadHermesSession` → `storeHermesSessionDetail`: each assistant/tool message is written as a CAS turn node and aggregated into `detail`; the **output text** is the `content` of the last non-empty `assistant` message
+
+#### Integration with createAgent
+
+```157:164:packages/workflow-agent-hermes/src/hermes.ts
+export function createHermesAgent(): () => Promise<void> {
+  return createAgent({ name: "hermes", run: runHermes, continue: continueHermes });
+}
+```
+
+`uwf-hermes` entry point: `createHermesAgent()` is the main function.
+
+The Claude Code package (`workflow-agent-claude-code`) has the same structure: `buildClaudeCodePrompt` is isomorphic, and it uses `claude -p` + `--resume` + JSON stdout parsing.
+
+---
+
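+### Q7: Toolkit requirements analysis
+
+Which tools are the minimum needed for a self-sufficient agent?
+
+**Research points:**
+- What tasks the roles perform in the existing workflow example (solve-issue.yaml)
+- Which tools the hermes agent commonly uses in workflow scenarios
+- Which tools the agent loop requires (such as file read/write, shell exec, web fetch)
+
+**Answer:**
+
+#### Role capabilities in solve-issue.yaml
+
+| Role | capabilities | Implied needs |
+|------|----------------|----------|
+| planner | issue-analysis, planning | Read context/repository and summarize; usually no need to write code |
+| developer | file-edit, shell, testing | **Read files, write files, execute commands** |
+| reviewer | code-review, static-analysis | Read diffs/files, static analysis (read access plus optional shell) |
+
+#### Hermes side
+
+Hermes ships a complete agent runtime (`--yolo`, max-turns); its tool set is defined by the Hermes project and is not configured by the workflow. The session JSON shows `tool_calls` recorded in the detail, commonly including file and shell tools.
+
+#### Suggested minimal builtin toolkit
+
+| Priority | Tool | Purpose |
+|--------|------|------|
+| P0 | `read_file` | Read repository/config/issue context |
+| P0 | `write_file` / `edit_file` | developer edits code |
+| P0 | `run_command` | Tests, builds, git (needs cwd + timeout + output truncation) |
+| P1 | `list_dir` / `glob` | Navigate the codebase |
+| P1 | `grep` | Search symbols/references |
+| P2 | `fetch_url` | Look up documentation (occasionally needed by planner) |
+
+The builtin agent does **not** need to implement moderator / workflow routing tools; those remain the job of `uwf thread step` + JSONata.
+
+#### Required agent loop capabilities
+
+1. Multi-turn LLM calls plus parsing and execution of **OpenAI-style tool_calls**
+2. Append tool results back to messages
+3. Termination: the model stops requesting tools, or `maxTurns` is reached
+4. The final response must contain valid YAML frontmatter (satisfying Q4) for the `createAgent` fast-path
+
+---
+
+## Draft proposal
+
+(Written from the answers above once the research is complete)
+
+### Architecture design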
+
+```mermaid
+flowchart TB
+  subgraph cli ["cli-workflow"]
+    Step["uwf thread step"]
+    Spawn["spawnAgent(uwf-builtin, threadId, role)"]
+    Step --> Spawn
+  end
+
+  subgraph builtin_pkg ["@uncaged/workflow-agent-builtin"]
+    Main["createBuiltinAgent() = createAgent({...})"]
+    Prompt["buildBuiltinPrompt(ctx)"]
+    Loop["runBuiltinLoop(provider, messages, tools)"]
+    Tools["Toolkit: read/write/exec/..."]
+    Detail["storeBuiltinDetail(turns)"]
+    Main --> Prompt
+    Main --> Loop
+    Loop --> Tools
+    Loop --> Detail
+  end
+
+  subgraph kit ["workflow-agent-kit"]
+    Ctx["buildContextWithMeta"]
+    FM["tryFrontmatterFastPath"]
+    Persist["persistStep"]
+    Ctx --> Main
+    Main --> FM
+    FM --> Persist
+  end
+
+  subgraph cas ["CAS / config"]
+    Config["config.yaml models/providers"]
+    CAS["cas/ + threads.yaml"]
+  end
+
+  Spawn --> Main
+  Config --> Loop
+  CAS --> Ctx
+  Persist --> CAS
+  Spawn -->|"stdout: step hash"| Step
+```
+
+**New package**: `packages/workflow-agent-builtin`, bin `uwf-builtin`, depending only on `workflow-agent-kit`, `workflow-protocol`, and `workflow-util` (optionally `@uncaged/json-cas` to write the detail schema).
+
+**Layers**:
+
+| Layer | Responsibility |
+|----|------|
+| `createAgent` (kit) | argv, context, frontmatter extract, StepNode, stdout protocol; **unchanged** |
+| `builtin/agent.ts` | `run` / `continue` implementation |
+| `builtin/llm.ts` | OpenAI-compatible chat + tools (can be extracted into kit later) |
+| `builtin/tools/*.ts` | JSON Schema + handler for each tool |
+| `builtin/prompt.ts` | Reuse Hermes's prompt assembly logic (or extract it into a `buildAgentPrompt` in kit) |
+| `builtin/detail.ts` | Like Hermes: write each assistant/tool turn to the CAS detail |
+
+**Config integration**:
+
+```yaml
+agents:
+  builtin:
+    command: "uwf-builtin"
+    args: []
+defaultAgent: "builtin"   # or use agentOverrides to assign it per role
+```
+
+Model: the first version uses `resolveModel(config, config.defaultModel)`; `modelOverrides.agent` or a per-role mapping can be added later.
+
+---
+
+### Agent Run Loop
+
+Pseudocode (a single `run(ctx)`):
+
+```
+1. provider ← resolveModel(loadWorkflowConfig(), defaultModel)
+2. system ← buildBuiltinPrompt(ctx)   // outputFormatInstruction + buildRolePrompt + Task + History
+3. messages ← [{ role: "system", content: system }]
+4. sessionId ← newULID()              // kept in memory or a temp directory, for use by continue
+5. turns ← []
+
+6. for turn in 1..MAX_TURNS:
+     response ← chatCompletionWithTools(provider, messages, TOOL_DEFINITIONS)
+     record assistant message + tool_calls in turns
+
+     if response has no tool_calls:
+       finalText ← response.content
+       break
+
+     for each tool_call:
+       result ← executeTool(tool_call, { cwd: process.cwd() })
+       messages.push tool result
+       record in turns
+
+7. if no finalText with valid frontmatter after loop:
+     optionally one-shot "finalize" message without tools
+
+8. detailHash ← storeBuiltinDetail(store, sessionId, turns, metadata)
+9. return { output: finalText, detailHash, sessionId }
+```
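
Step 6 of the pseudocode can be sketched in TypeScript with the LLM call injected as a function, which also makes the loop testable with a fake provider. All names here (`runLoopSketch`, `ChatFn`, `ToolCallSketch`) are hypothetical, and the tool-call shape is flattened for brevity; the OpenAI wire format nests `name` and `arguments` under a `function` object.

```typescript
type ToolCallSketch = { id: string; name: string; arguments: string };
type LoopMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls: ToolCallSketch[] }
  | { role: "tool"; tool_call_id: string; content: string };
type ChatFn = (
  messages: LoopMessage[],
) => Promise<{ content: string | null; tool_calls: ToolCallSketch[] }>;
type ToolHandler = (args: unknown) => Promise<string>;

async function runLoopSketch(
  chat: ChatFn,
  tools: Record<string, ToolHandler>,
  messages: LoopMessage[],
  maxTurns: number,
): Promise<string | null> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await chat(messages);
    messages.push({ role: "assistant", content: reply.content, tool_calls: reply.tool_calls });
    // No tool requests means the model has produced its final answer.
    if (reply.tool_calls.length === 0) return reply.content;

    for (const call of reply.tool_calls) {
      const handler = tools[call.name];
      let result: string;
      try {
        result =
          handler === undefined
            ? `unknown tool: ${call.name}`
            : await handler(JSON.parse(call.arguments));
      } catch (error) {
        // Report failures back to the model instead of aborting the loop.
        result = `tool error: ${String(error)}`;
      }
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
  }
  return null; // turn budget exhausted without a final answer
}
```

Returning `null` on an exhausted budget leaves room for the optional "finalize" message in step 7.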
+
+**`continue(sessionId, message, store)`**：
+
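+- Restore `messages` + `turns` from memory/disk
+- `messages.push({ role: "user", content: message })` (correction or continuation)
+- Resume from step 6; the turn limit for this path can be set lower (for example 3)
+- Return a new `AgentRunResult`
+
+**Working with frontmatter**:
+
+- The system prompt already contains `outputFormatInstruction`; the last turn can force a user message: `Now output your final answer with YAML frontmatter only if you have not yet.`
+- Still relies on the `createAgent` fast-path plus up to 2 continue rounds
+
+**Security**:
+
+- `run_command`: use an allowlist, or require `UWF_BUILTIN_ALLOW_SHELL=1`; by default the working area is confined to `process.cwd()` or to a `workspace` field that may be added to `start` in the future
+- Paths: forbid `..` from escaping the workspace root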
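
One way to enforce the path rule is to resolve every requested path against the workspace root and reject anything that lands outside it. The sketch below uses only `node:path`; `resolveInsideWorkspaceSketch` is a hypothetical helper, and it does not follow symlinks, so a symlink inside the workspace could still point elsewhere.

```typescript
import * as path from "node:path";

// Resolve a tool-supplied path and refuse anything outside the workspace root.
function resolveInsideWorkspaceSketch(workspaceRoot: string, requested: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`path escapes workspace: ${requested}`);
  }
  return target;
}
```

Checking the relative path rather than scanning the input for `..` also catches absolute paths such as `/etc/passwd`, and it still allows harmless inputs like `src/../README.md`.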
+
+---
+
+### Toolkit design
+
+Unified registry:
+
+```typescript
+type BuiltinTool = {
+  name: string;
+  description: string;
+  parameters: JSONSchema; // object type
+  execute: (args: unknown, ctx: ToolContext) => Promise<string>;
+};
+
+type ToolContext = {
+  cwd: string;
+  storageRoot: string;
+};
+```
+
+| Tool name | OpenAI function | Behavior summary |
+|-----------|-----------------|------------------|
+| `read_file` | `read_file` | `{ path }` → UTF-8 text, with a size limit |
+| `write_file` | `write_file` | `{ path, content }` → writes to disk, returns a confirmation |
+| `edit_file` | optional | search/replace blocks, to reduce tokens |
+| `run_command` | `run_command` | `{ command, cwd? }` → truncated stdout/stderr |
+| `list_dir` | `list_dir` | `{ path }` → list of entries |
+| `grep` | `grep` | `{ pattern, path? }` → matching lines |
+
+**LLM request shape** (extends the extract client):
+
+```json
+{
+  "model": "...",
+  "messages": [...],
+  "tools": [{ "type": "function", "function": { "name", "description", "parameters" } }],
+  "tool_choice": "auto"
+}
+```
+
+Parse `choices[0].message.tool_calls`, execute each call, and send the result back as `{ role: "tool", tool_call_id, content }`.
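As a sketch of how the registry could map onto this wire format: the code below converts `BuiltinTool` entries into the `tools` array and turns a list of `tool_calls` into `role: "tool"` replies. `toWireTools`, `answerToolCalls`, `WireToolCall`, and `echoTool` are hypothetical names; the only API detail relied on is that `function.arguments` arrives as a JSON string in the Chat Completions format.

```typescript
type JSONSchema = Record<string, unknown>;
type ToolContext = { cwd: string; storageRoot: string };
type BuiltinTool = {
  name: string;
  description: string;
  parameters: JSONSchema;
  execute: (args: unknown, ctx: ToolContext) => Promise<string>;
};

// One entry of choices[0].message.tool_calls.
type WireToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};
type ToolReply = { role: "tool"; tool_call_id: string; content: string };

function toWireTools(tools: BuiltinTool[]) {
  return tools.map((t) => ({
    type: "function" as const,
    function: { name: t.name, description: t.description, parameters: t.parameters },
  }));
}

async function answerToolCalls(
  calls: WireToolCall[],
  tools: BuiltinTool[],
  ctx: ToolContext,
): Promise<ToolReply[]> {
  const byName = new Map(tools.map((t) => [t.name, t] as const));
  const replies: ToolReply[] = [];
  for (const call of calls) {
    let content: string;
    try {
      const tool = byName.get(call.function.name);
      if (!tool) throw new Error(`unknown tool: ${call.function.name}`);
      content = await tool.execute(JSON.parse(call.function.arguments), ctx);
    } catch (err) {
      // Report failures to the model instead of aborting the loop (a design assumption).
      content = `error: ${err instanceof Error ? err.message : String(err)}`;
    }
    replies.push({ role: "tool", tool_call_id: call.id, content });
  }
  return replies;
}

// Tiny tool used only for illustration.
const echoTool: BuiltinTool = {
  name: "echo",
  description: "Returns its text argument",
  parameters: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  execute: async (args) => String((args as { text: string }).text),
};
```

Every tool call gets exactly one reply carrying its `tool_call_id`, including failed ones, so the message list stays valid for the next request.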
+
+**No streaming** in the first version; the detail CAS records each turn's tool name, arguments, and result summary so that `uwf thread step-details` can be used for debugging.
+
+---
+
+### Integration with the existing architecture
+
+| Integration point | Approach |
+|-------------------|----------|
+| CLI protocol | Implement the standard agent CLI: `uwf-builtin <thread-id> <role>`; stdout is a single line with the step hash, exit code 0/1 |
+| Factory | `export function createBuiltinAgent()` → `createAgent({ name: "builtin", run, continue })` |
+| Context / Prompt | Reuse `buildContextWithMeta`, `buildRolePrompt`, `buildOutputFormatInstruction`; prompt layout aligned with `buildHermesPrompt` |
+| Structured output | Prefer the YAML frontmatter fast-path; optionally add an `extract()` fallback switch to `createAgent` later |
+| Configuration | Add `agents.builtin` to `config.yaml`; `uwf setup` can offer it as an optional default agent |
+| Storage | `resolveStorageRoot()` + `loadWorkflowConfig` + `getEnvPath`; same as Hermes, does **not** change who writes `threads.yaml` |
+| Tests | Unit tests: tool handlers, prompt assembly, mock LLM tool loop; integration tests: temporary storage root + fake provider |
+| Release | New package `@uncaged/workflow-agent-builtin`, bin `uwf-builtin`, added to `scripts/publish-all.mjs` |
+
+**Explicitly out of scope**:
+
+- Does not replace the moderator, and does not call `uwf thread step` from inside the agent
+- Does not depend on the Hermes/OpenClaw/Claude Code binaries
+- The first version implements neither streaming nor MCP
+
+**Suggested implementation order**:
+
+1. `llm.ts`: tool-calling HTTP client + unit tests
+2. P0 tools + `runBuiltinLoop`
+3. `createBuiltinAgent` + detail CAS
+4. `config` / docs / `examples`, optionally with an `agentOverrides` demo
+5. (Optional) wire an `extract()` fallback into `createAgent`
@@ -0,0 +1,73 @@
+# Issue #418: ACP session/resume returns empty text
+
+## Investigation date: 2026-05-23
+
+## Root cause
+
+On the restore path of `session/resume`, `_make_agent()` fails and the exception is silently swallowed.
+
+### Full call chain
+
+```
+resume_session(sid)
+  → update_cwd(sid)
+    → get_session(sid) → _restore(sid)
+      → _make_agent()
+        → resolve_runtime_provider("custom") fails (line 548-561)
+        → AIAgent() raises "No LLM provider configured" (line 564)
+      → except Exception silently swallows it (line 482-484) → return None
+    → return None
+  → state is None → fallback: create_session() (new sid, no history)
+```
+
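+### Key code locations (acp_adapter/session.py)
+
+- `_restore()` line 426-498: restores the session from the DB, but its `except` is too broad
+- `_make_agent()` line 520-568: provider resolution is incomplete on the restore path
+- Line 548-561: after `resolve_runtime_provider("custom")` fails, `base_url` has been read from the DB but is never passed to AIAgent
+
+### Observed behavior
+
+1. Phase 1: `session/new` + `prompt` → works, `agent_message_chunk` is emitted
+2. Phase 2: `session/resume` + `prompt`
+   - resume reports success, but the sessionId in `available_commands_update` is a new one (the create_session fallback)
+   - sending prompt with the original sid → `stopReason: "refusal"` (the session is not in memory)
+   - sending prompt with the new sid → runs, but without history (the agent answers that it does not know the "secret code")
+
+### Verification script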
+
+```bash
+# Call _restore directly to verify
+cd ~/.hermes/hermes-agent
+python3 -c "
+import sys; sys.path.insert(0, '.')
+from acp_adapter.session import SessionManager
+sm = SessionManager()
+result = sm._restore('SESSION_ID_HERE')
+print(result)  # None: the exception raised by _make_agent is swallowed
+"
+```
+
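+### Two bugs
+
+1. **Incomplete provider fallback in `_make_agent`**: on restore the DB holds `base_url` and `api_mode`, but once `resolve_runtime_provider` fails these values are not passed on to AIAgent correctly
+2. **`except` in `_restore` is too broad**: it silently swallows every exception, and even the warning is logged only at debug level, so a failed resume goes completely unnoticed
+
+### Hermes versions
+
+- v0.10.0 (2026.4.16): initial test
+- v0.14.0 (2026.5.16): retested after updating, bug still present
+- Code path: ~/.hermes/hermes-agent/acp_adapter/session.py
+
+### v0.14.0 test results (2026-05-23)
+
+- `_restore` still returns None because resolving the `custom` provider fails
+- Logging is clearer now: `WARNING: Failed to recreate agent for ACP session ...`
+- The resume fallback creates a new session (new sid), yet the agent is, surprisingly, able to answer the earlier question (possibly via memory/session search)
+- The core problem is unchanged: the sessionId changes, so a client that sends prompt with the old sid gets a refusal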
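Until the upstream fix lands, a client could at least detect the silent fallback by comparing the sid it asked to resume with the sessionId carried by the notifications that follow. The sketch below assumes a simplified update shape (`SessionUpdate`) and a hypothetical `checkResume` helper; neither is part of the ACP adapter.

```typescript
// Hypothetical, simplified notification shape: only the sessionId matters here.
type SessionUpdate = { sessionId: string; kind: string };

type ResumeCheck =
  | { ok: true; sessionId: string }
  | { ok: false; requested: string; actual: string };

// Flags a resume whose follow-up notifications carry a different sessionId,
// which is the symptom described above (create_session fallback).
function checkResume(requestedSid: string, updates: SessionUpdate[]): ResumeCheck {
  const foreign = updates.find((u) => u.sessionId !== requestedSid);
  return foreign
    ? { ok: false, requested: requestedSid, actual: foreign.sessionId }
    : { ok: true, sessionId: requestedSid };
}
```

A caller that gets `ok: false` can surface the mismatch instead of sending the next prompt to the old sid and receiving a refusal.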
+
+### Upstream issues
+
+- https://github.com/NousResearch/hermes-agent/issues/13489 (root-cause analysis posted as a comment)
+- https://github.com/NousResearch/hermes-agent/issues/8083 (resume silently creates a new session)
+- https://github.com/NousResearch/hermes-agent/issues/18452 (incomplete _make_agent fallback)
@@ -112,8 +112,8 @@ uwf-hermes <thread-id> <role>

 **Conventions:**
 - `uwf step` handles the moderator decision and passes the role to the agent CLI
- agent-kit reads systemPrompt / outputSchema from the CAS based on thread + role
- agent-kit assembles the full prompt (role systemPrompt + thread context + user prompt from StartNode)
+- agent-kit reads goal / capabilities / procedure / output / meta from the CAS based on thread + role
+- agent-kit assembles the full prompt (role goal/capabilities/procedure/output + thread context + user prompt from StartNode)
 - the agent runs the actual logic; agent-kit handles extract
 - the agent writes the StepNode to the CAS (with output, detail, agent, prev) but does **not move the chain head pointer**
 - stdout emits the CAS hash of the new StepNode (plain text, one line)
@@ -143,7 +143,7 @@ uwf-hermes <thread-id> <role>

 #### `Workflow`

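-Roles and the moderator are inlined in the Workflow; only outputSchema is a separate CAS node (so that json-cas can validate it).
+Roles and the moderator are inlined in the Workflow; only meta is a separate CAS node (so that json-cas can validate it).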

 ```yaml
 type: <workflow-schema-hash>
@@ -153,16 +153,25 @@ payload:
  roles:
    planner:
      description: "Creates implementation plan"
-      systemPrompt: "You are a planning agent..."
-      outputSchema: "5GWKR8TN1V3JA"    # cas_ref → JSON Schema node (built into json-cas)
+      goal: "You are a planning agent..."
+      capabilities: [planning, issue-analysis]
+      procedure: "Analyze the issue and create a plan."
+      output: "Output the plan summary."
+      meta: "5GWKR8TN1V3JA"    # cas_ref → JSON Schema node (built into json-cas)
    developer:
      description: "Implements code changes"
-      systemPrompt: "You are a developer agent..."
-      outputSchema: "8CNWT4KR6D1HV"    # cas_ref → JSON Schema node
+      goal: "You are a developer agent..."
+      capabilities: [file-edit, shell]
+      procedure: "Implement the plan."
+      output: "List all files changed."
+      meta: "8CNWT4KR6D1HV"    # cas_ref → JSON Schema node
    reviewer:
      description: "Reviews code changes"
-      systemPrompt: "You are a code reviewer..."
-      outputSchema: "1VPBG9SM5E7WK"    # cas_ref → JSON Schema node
+      goal: "You are a code reviewer..."
+      capabilities: [code-review]
+      procedure: "Review the implementation."
+      output: "Approve or reject with comments."
+      meta: "1VPBG9SM5E7WK"    # cas_ref → JSON Schema node
  conditions:
    needsClarification:
      description: "Planner requests clarification from user"
@@ -189,7 +198,7 @@ payload:
        condition: null
 ```

- `roles`: defined inline; each role's `outputSchema` is a separate cas_ref (pointing to a JSON Schema node built into json-cas)
+- `roles`: defined inline; each role's `meta` is a separate cas_ref (pointing to a JSON Schema node built into json-cas)
 - `conditions`: `Record<Name, JSONata>`, named conditions, which makes the graph easy to draw and describe
 - `graph`: `Record<Role | "$START", Transition[]>`, where each Transition = `{ role, condition }`
 - `condition` references a key in conditions; `null` = fallback
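The graph semantics above can be sketched as a small transition picker. Two assumptions are made here and are not stated in the document: transitions are tried in listed order with the first match winning, and named conditions are modelled as plain predicates instead of JSONata expressions. `nextRole` and `ConditionTable` are hypothetical names.

```typescript
type Transition = { role: string; condition: string | null };
type Graph = Record<string, Transition[]>;
// Stand-in for JSONata evaluation: each named condition is a predicate.
type ConditionTable = Record<string, () => boolean>;

function nextRole(graph: Graph, conditions: ConditionTable, current: string): string {
  const transitions = graph[current] ?? [];
  for (const t of transitions) {
    // condition: null is the fallback and always matches.
    if (t.condition === null) return t.role;
    const predicate = conditions[t.condition];
    if (predicate === undefined) throw new Error(`unknown condition: ${t.condition}`);
    if (predicate()) return t.role;
  }
  throw new Error(`no transition matched for ${current}`);
}

// Example graph: a reviewer either sends work back or ends the workflow.
const exampleGraph: Graph = {
  $START: [{ role: "planner", condition: null }],
  reviewer: [
    { role: "developer", condition: "notApproved" },
    { role: "$END", condition: null },
  ],
};
```

Listing the conditional transition before the `null` fallback is what makes the fallback behave as an "otherwise" branch under the ordering assumption.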
@@ -234,14 +243,14 @@ payload:
  start: "4TNVW8KR2B3MA"          # cas_ref → StartNode (referenced by every step)
  prev: "2MXBG6PN4A8JR"           # cas_ref → previous StepNode, null for the first step
  role: "developer"
-  output: "9KRVW3TN5F1QA"         # cas_ref → structured output node (conforms to the role's outputSchema)
+  output: "9KRVW3TN5F1QA"         # cas_ref → structured output node (conforms to the role's meta schema)
  detail: "7BQST3VW9F2MA"         # cas_ref → execution detail (content node / terminal StepNode of a sub-workflow / ...)
  agent: "uwf-cursor"              # the agent command actually used (plain string)
 ```

 - `start`: every StepNode references the StartNode directly, for easy random access
 - `prev`: cas_ref of the previous StepNode; `null` for the first step (it does not point to the StartNode)
- `output`: cas_ref pointing to a CAS node that conforms to the role's outputSchema; can be validated with json-cas
+- `output`: cas_ref pointing to a CAS node that conforms to the role's meta schema; can be validated with json-cas
 - `detail`: cas_ref pointing to the execution detail. It can be the raw agent output (content node) or the terminal StepNode of a sub-workflow thread (the workflowAsAgent scenario)
 - `agent`: plain string, not a CAS node
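Because each StepNode links to its predecessor through `prev`, the most recent step of a given role can be found by walking back from the chain head. The sketch below uses an in-memory `Map` as a stand-in for the CAS; `lastStepOfRole` is a hypothetical helper, not an API of the workflow packages.

```typescript
type CasRef = string;
type StepNode = {
  start: CasRef;
  prev: CasRef | null; // null for the first step
  role: string;
  output: CasRef;
  detail: CasRef;
  agent: string;
};

// Walks the prev links from head and returns the newest step with the given role.
function lastStepOfRole(
  store: Map<CasRef, StepNode>,
  head: CasRef | null,
  role: string,
): StepNode | undefined {
  let cursor = head;
  while (cursor !== null) {
    const node = store.get(cursor);
    if (node === undefined) throw new Error(`missing CAS node: ${cursor}`);
    if (node.role === role) return node;
    cursor = node.prev;
  }
  return undefined;
}
```

The walk stops at the first step because its `prev` is `null`; the StartNode is reached through `start`, not through the chain.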

@@ -340,12 +349,12 @@ OPENROUTER_API_KEY=sk-or-...

 ```
 packages/
-├── cli-uwf/              # @uncaged/cli-uwf: uwf CLI (thread/workflow commands)
-├── uwf-moderator/        # @uncaged/uwf-moderator: JSONata moderator engine
-├── uwf-agent-kit/        # @uncaged/uwf-agent-kit: Agent CLI framework (includes extractor)
-├── uwf-agent-hermes/     # @uncaged/uwf-agent-hermes: uwf-hermes CLI
-├── uwf-agent-cursor/     # @uncaged/uwf-agent-cursor: uwf-cursor CLI
-└── uwf-protocol/         # @uncaged/uwf-protocol: shared type definitions
+├── cli-workflow/              # @uncaged/cli-workflow: uwf CLI (thread/workflow commands)
+├── workflow-moderator/        # @uncaged/workflow-moderator: JSONata moderator engine
+├── workflow-agent-kit/        # @uncaged/workflow-agent-kit: Agent CLI framework (includes extractor)
+├── workflow-agent-hermes/     # @uncaged/workflow-agent-hermes: uwf-hermes CLI
+├── workflow-agent-cursor/     # @uncaged/workflow-agent-cursor: uwf-cursor CLI
+└── workflow-protocol/         # @uncaged/workflow-protocol: shared type definitions
 ```

 **External dependencies:**
@@ -372,7 +381,7 @@ type ThreadId = string;
 /** Core data of one step, shared by the StepNode payload and the JSONata context */
 type StepRecord = {
   role: string;
-  output: CasRef;                    // cas_ref → structured output node (conforms to the role's outputSchema)
+  output: CasRef;                    // cas_ref → structured output node (conforms to the role's meta schema)
   detail: CasRef;                    // cas_ref → execution detail (content node / terminal StepNode of a sub-workflow)
   agent: string;                     // the agent command actually used (plain string)
 };
@@ -383,8 +392,11 @@ type StepRecord = {
 ```typescript
 type RoleDefinition = {
  description: string;
-  systemPrompt: string;
-  outputSchema: CasRef;              // cas_ref → JSON Schema node built into json-cas
+  goal: string;
+  capabilities: string[];
+  procedure: string;
+  output: string;
+  meta: CasRef;                      // cas_ref → JSON Schema node built into json-cas
 };

 type Transition = {
@@ -0,0 +1,43 @@
+name: "analyze-topic"
+description: "Single-role topic analysis using four-phase role description"
+roles:
+  analyst:
+    description: "Analyzes a given topic and produces a structured summary"
+    goal: |
+      You are a research analyst with expertise in breaking down complex topics
+      into clear, structured summaries. You think critically and cite key points.
+    capabilities:
+      - research
+      - critical-thinking
+      - structured-writing
+    procedure: |
+      Analyze the topic by:
+      1. Identifying the main thesis or question
+      2. Listing 3-5 key points with brief explanations
+      3. Noting any counterarguments or caveats
+      Keep your analysis concise (under 500 words).
+    output: |
+      Provide your analysis as markdown under the frontmatter.
+      The frontmatter must include your structured findings.
+    frontmatter:
+      type: object
+      properties:
+        thesis:
+          type: string
+        keyPoints:
+          type: array
+          items:
+            type: string
+        caveats:
+          type: string
+      required: [thesis, keyPoints]
+conditions: {}
+graph:
+  $START:
+    - role: "analyst"
+      condition: null
+      prompt: "Analyze the topic in the task and produce a structured summary with key points."
+  analyst:
+    - role: "$END"
+      condition: null
+      prompt: "Analysis complete. Finish the workflow."
@@ -0,0 +1,77 @@
+name: "debate"
+description: "Structured debate between two sides. Tests cross-process session resume."
+roles:
+  against:
+    description: "Argues against the proposition"
+    goal: |
+      You are a skilled debater arguing AGAINST the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. If this is the opening, present your strongest argument against the proposition.
+      2. If responding to the other side, directly counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+  for:
+    description: "Argues for the proposition"
+    goal: |
+      You are a skilled debater arguing FOR the proposition.
+      Be logical, cite evidence, and directly address your opponent's points.
+      Keep each argument concise (under 200 words).
+    capabilities:
+      - argumentation
+      - critical-thinking
+    procedure: |
+      1. Read the opposing side's latest argument carefully.
+      2. Counter their points with evidence and logic.
+      3. If you find yourself genuinely convinced by the other side, you may concede.
+    output: |
+      Provide your argument in the frontmatter.
+      Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
+    frontmatter:
+      type: object
+      properties:
+        argument:
+          type: string
+        conceded:
+          type: boolean
+      required: [argument, conceded]
+conditions:
+  againstConceded:
+    description: "The against side conceded"
+    expression: "$last('against').conceded = true"
+  forConceded:
+    description: "The for side conceded"
+    expression: "$last('for').conceded = true"
+graph:
+  $START:
+    - role: "against"
+      condition: null
+      prompt: "Present your opening argument against the proposition."
+  against:
+    - role: "$END"
+      condition: "againstConceded"
+      prompt: "The against side conceded. Debate over."
+    - role: "for"
+      condition: null
+      prompt: "Counter the opposing argument. Address their points directly."
+  for:
+    - role: "$END"
+      condition: "forConceded"
+      prompt: "The for side conceded. Debate over."
+    - role: "against"
+      condition: null
+      prompt: "Counter the opposing argument. Address their points directly."
@@ -0,0 +1,98 @@
+name: "solve-issue"
+description: "End-to-end issue resolution"
+roles:
+  planner:
+    description: "Creates implementation plan"
+    goal: "You are a planning agent. You analyze issues and create implementation plans grounded in the actual codebase."
+    capabilities:
+      - issue-analysis
+      - planning
+      - file-read
+      - shell
+    procedure: |
+      1. Locate the code repository:
+         - Check if the current working directory is the repo (look for package.json, .git, etc.)
+         - If the task mentions a repo URL, clone it first.
+         - If this is a new project, create the repo and note the path.
+      2. Explore the codebase — read the relevant source files mentioned in the issue. Understand the current architecture, types, and conventions (check CLAUDE.md, CONTRIBUTING.md, .cursor/rules/).
+      3. Identify which files need changes and what the changes should be, with specific code references.
+      4. Output the plan with:
+         - `repoPath`: absolute path to the repository root
+         - `plan`: detailed implementation plan with file paths and code references
+         - `steps`: concrete action items for the developer
+    output: |
+      Provide repoPath, plan summary, and steps in the frontmatter.
+      The plan MUST reference actual file paths and code structures you found by reading the source.
+      Do NOT guess — if you haven't read a file, read it before referencing it.
+    frontmatter:
+      type: object
+      properties:
+        repoPath:
+          type: string
+        plan:
+          type: string
+      required: [repoPath, plan]
+  developer:
+    description: "Implements code changes"
+    goal: "You are a developer agent. You implement code changes according to plans."
+    capabilities:
+      - file-edit
+      - shell
+      - testing
+    procedure: |
+      1. Read the planner's output to get the repoPath and implementation plan.
+      2. cd to the repoPath before making any changes.
+      3. Create a feature branch from the default branch.
+      4. Implement the plan — write code, tests, and ensure existing tests pass.
+      5. Commit your changes with a descriptive message referencing the issue.
+    output: "List all files changed and provide a summary of the implementation."
+    frontmatter:
+      type: object
+      properties:
+        filesChanged:
+          type: array
+          items:
+            type: string
+        summary:
+          type: string
+      required: [filesChanged, summary]
+  reviewer:
+    description: "Reviews code changes"
+    goal: "You are a code reviewer. You review implementations for correctness and quality."
+    capabilities:
+      - code-review
+      - static-analysis
+    procedure: "Review the implementation against the plan. Check for bugs, edge cases, and style."
+    output: "Approve or reject with detailed comments explaining your decision."
+    frontmatter:
+      type: object
+      properties:
+        approved:
+          type: boolean
+        comments:
+          type: string
+      required: [approved, comments]
+conditions:
+  notApproved:
+    description: "Reviewer rejected the implementation"
+    expression: "$last('reviewer').approved = false"
+graph:
+  $START:
+    - role: "planner"
+      condition: null
+      prompt: "Analyze the issue described in the task and produce a detailed implementation plan."
+  planner:
+    - role: "developer"
+      condition: null
+      prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass."
+  developer:
+    - role: "reviewer"
+      condition: null
+      prompt: "Review the developer's implementation against the plan for correctness and quality."
+  reviewer:
+    - role: "developer"
+      condition: "notApproved"
+      prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues."
+    - role: "$END"
+      condition: null
+      prompt: "The review passed. Complete the workflow."
@@ -0,0 +1,76 @@
+# @uncaged/cli-workflow
+
+Command-line interface for the Uncaged workflow engine (`uncaged-workflow`).
+
+The CLI reads and writes the workflow registry, starts and inspects threads, manages CAS blobs, and prints agent-oriented reference docs via `skill`. It uses the same storage layout as `@uncaged/workflow` (default `~/.uncaged/workflow`).
+
+## Install
+
+```bash
+bun add @uncaged/cli-workflow
+```
+
+In this monorepo: `"@uncaged/cli-workflow": "workspace:*"`. Depends on `"@uncaged/workflow": "workspace:*"`.
+
+## Usage
+
+```bash
+uncaged-workflow workflow list
+uncaged-workflow run <name> --prompt "Your task"
+uncaged-workflow thread show <id>
+uncaged-workflow skill
+```
+
+Invoking the CLI with no command (or from this repo: `bun packages/cli-workflow/src/cli.ts`) prints:
+
+```
+uncaged-workflow — workflow engine CLI
+
+Workflow registry:
+  workflow add <name> <file.esm.js> [--types <path>]  Register a workflow bundle in the registry
+  workflow list                                       List all registered workflows
+  workflow show <name>                                Show details of a registered workflow
+  workflow rm <name>                                  Remove a workflow from the registry
+  workflow history <name>                             Show version history of a workflow
+  workflow rollback <name> [hash]                     Rollback a workflow to a previous version
+
+Thread execution:
+  thread run <name> [--prompt <text>] [--max-rounds N]          Start a new thread executing a workflow
+  thread list [name]                                            List threads, optionally filtered by workflow name
+  thread show <id>                                              Show thread details and state
+  thread rm <id>                                                Remove a thread
+  thread fork <thread-id> [--from-role <role>]                  Fork a thread, optionally from a specific role
+  thread ps                                                     List running threads
+  thread kill <thread-id>                                       Kill a running thread
+  thread live <thread-id> | --latest [--debug] [--role <name>]  Attach to a thread and stream output live
+  thread pause <thread-id>                                      Pause a running thread
+  thread resume <thread-id>                                     Resume a paused thread
+
+Content-addressable storage:
+  cas get <hash>     Retrieve content by hash from CAS
+  cas put <content>  Store content in CAS, prints hash
+  cas list           List all hashes in CAS
+  cas rm <hash>      Remove a CAS entry by hash
+  cas gc             Garbage-collect unreferenced CAS entries
+
+Development:
+  init workspace <name>  Initialize a new workflow workspace
+  init template <name>   Initialize a new workflow template
+
+Shortcuts:
+  run <name> [...]  → thread run
+  live <id> [...]   → thread live
+
+Reference:
+  skill [topic]  Agent-consumable docs (cli, develop, author)
+
+Use <command> --help for subcommand details.
+
+Environment variables:
+  WORKFLOW_STORAGE_ROOT              Override storage directory (default: ~/.uncaged/workflow)
+  UNCAGED_WORKFLOW_STORAGE_ROOT      Internal override (takes priority over WORKFLOW_STORAGE_ROOT)
+```
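The precedence of the two environment variables listed in the help text can be sketched as below. This is an illustration of the documented order only; `resolveStorageRootSketch` is a hypothetical name and is not claimed to match `resolveWorkflowStorageRoot` in `storage-env.js`. The environment and home directory are passed in as parameters to keep the function pure.

```typescript
type Env = Record<string, string | undefined>;

// Documented order: UNCAGED_WORKFLOW_STORAGE_ROOT, then WORKFLOW_STORAGE_ROOT,
// then the default ~/.uncaged/workflow.
function resolveStorageRootSketch(env: Env, homeDir: string): string {
  const internal = env["UNCAGED_WORKFLOW_STORAGE_ROOT"];
  if (internal) return internal;
  const override = env["WORKFLOW_STORAGE_ROOT"];
  if (override) return override;
  return `${homeDir}/.uncaged/workflow`;
}
```

In a CLI entry point this would typically be called with `process.env` and the user's home directory; empty strings are treated as unset here, which is an assumption.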
+
+## API overview
+
+This package is bin-only; programmatic use is via `@uncaged/workflow`. Entry: `src/cli.ts` → `runCli` in `src/cli-dispatch.js`.
@@ -0,0 +1,30 @@
+{
+  "name": "@uncaged/cli-workflow",
+  "version": "0.5.0-alpha.4",
+  "files": [
+    "src",
+    "dist",
+    "package.json"
+  ],
+  "type": "module",
+  "bin": {
+    "uncaged-workflow": "src/cli.ts"
+  },
+  "dependencies": {
+    "@uncaged/workflow-gateway": "workspace:^",
+    "@uncaged/workflow-protocol": "workspace:^",
+    "@uncaged/workflow-util": "workspace:^",
+    "@uncaged/workflow-cas": "workspace:^",
+    "@uncaged/workflow-execute": "workspace:^",
+    "@uncaged/workflow-register": "workspace:^",
+    "@uncaged/workflow-runtime": "workspace:^",
+    "hono": "^4.12.18",
+    "yaml": "^2.8.4"
+  },
+  "scripts": {
+    "test": "bun test"
+  },
+  "publishConfig": {
+    "access": "public"
+  }
+}
@@ -0,0 +1,9 @@
+#!/usr/bin/env bun
+
+import { runCli } from "./cli-dispatch.js";
+import { resolveWorkflowStorageRoot } from "./storage-env.js";
+
+const argv = process.argv.slice(2);
+const storageRoot = resolveWorkflowStorageRoot();
+const code = await runCli(storageRoot, argv);
+process.exit(code);