improve: committer — check git status before staging (from retrospect PR #578 )

Developer already commits changes, so committer's git add -A is redundant. Now checks git status first and skips to push if tree is clean.
feat: retrospect-workflow — add Phase 0 validation
2026-05-30 15:45:17 +08:00 · 2026-05-30 15:32:33 +08:00 · 2026-05-30 15:28:24 +08:00 · 2026-05-30 14:24:33 +08:00 · 2026-05-30 14:23:37 +08:00
53 changed files with 511 additions and 3078 deletions
@@ -0,0 +1,5 @@
+---
+"@uncaged/workflow-util": patch
+---
+
+Replace optionalEnv/requireEnv with unified env(name, fallback) API
@@ -0,0 +1,5 @@
+---
+"@uncaged/workflow-protocol": patch
+---
+
+fix: correct internal dependency versions for prerelease
@@ -0,0 +1,5 @@
+---
+"@uncaged/workflow-util-agent": patch
+---
+
+fix: include create-agent-adapter.ts in published src
@@ -0,0 +1,5 @@
+---
+"@uncaged/workflow-protocol": patch
+---
+
+fix: use npm publish with pinned deps instead of bun publish (workspace:^ resolution bug)
@@ -1,5 +1,5 @@
 {
-  "mode": "exit",
+  "mode": "pre",
  "tag": "alpha",
  "initialVersions": {
    "@uncaged/cli-workflow": "0.4.5",
@@ -0,0 +1,5 @@
+---
+"@uncaged/workflow-protocol": minor
+---
+
+feat: AgentFn<Opt> type boundary and createAgentAdapter bridging function (RFC #252)
@@ -18,8 +18,11 @@ jobs:
      - name: Install dependencies
        run: bun install

-      - name: Check
-        run: bun run check
+      - name: Lint
+        run: bun run lint
+
+      - name: Type check
+        run: bun run typecheck

      - name: Test
-        run: bun run test:ci
+        run: bun test
@@ -12,5 +12,4 @@ packages/workflow-template-develop/develop.esm.js
 .DS_Store
 *.py
 .claude
-tmp.worktrees/
-.worktrees/
+tmp
@@ -0,0 +1,83 @@
+# Test Spec: uwf setup model connectivity validation (#335)
+
+## Context
+
+File: `packages/cli-workflow/src/commands/setup.ts`
+Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
+
+After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
+
+## Implementation Notes
+
+- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
+- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
+- Use `AbortSignal.timeout(15_000)` for the request
+- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
+- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
+- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
+- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
+
+## Test Cases (vitest)
+
+### 1. `validateModel` — success path
+- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: true, value: undefined }`
+- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
+
+### 2. `validateModel` — HTTP error (401 unauthorized)
+- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
+- Call `validateModel(baseUrl, apiKey, model)`
+- Assert returns `{ ok: false, error: <string containing "401"> }`
+
+### 3. `validateModel` — HTTP error (404 model not found)
+- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
+- Assert returns `{ ok: false, error: <string containing "404"> }`
+
+### 4. `validateModel` — network timeout
+- Mock `fetch` to throw `DOMException` with name `AbortError`
+- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
+
+### 5. `validateModel` — network error (DNS failure, connection refused)
+- Mock `fetch` to throw `TypeError("fetch failed")`
+- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
+
+### 6. `cmdSetup` — includes validation result on success
+- Mock global `fetch` for `/chat/completions` to succeed
+- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
+- Assert returned object has `validation: { ok: true, value: undefined }`
+- Assert config files are still written (existing behavior preserved)
+
+### 7. `cmdSetup` — includes validation result on failure (config still saved)
+- Mock global `fetch` for `/chat/completions` to return 401
+- Call `cmdSetup({ ... })`
+- Assert returned object has `validation: { ok: false, error: ... }`
+- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
+
+### 8. `cmdSetupInteractive` — prints success message on validation pass
+- Mock `fetch` for both `/models` and `/chat/completions` to succeed
+- Mock stdin to provide valid selections
+- Capture console output
+- Assert output contains a success message like "Model verified" or "✓"
+
+### 9. `cmdSetupInteractive` — prints warning on validation failure
+- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
+- Mock stdin for valid selections
+- Capture console output
+- Assert output contains a warning about model not being reachable and suggests trying a different model
+
+### 10. `validateModel` — request body correctness
+- Mock `fetch` to capture the request body
+- Call `validateModel(baseUrl, apiKey, "test-model")`
+- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
+
+## Export Requirements
+
+- `validateModel` must be exported (for direct unit testing)
+- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
+- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
+
+## Files to Create/Modify
+
+- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
+- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
@@ -1,269 +0,0 @@
-name: "e2e-walkthrough"
-description: "End-to-end walkthrough of uwf CLI. Dogfooding: uwf tests uwf. Each role validates a phase of the CLI surface inside an isolated Docker container."
-roles:
-  bootstrap:
-    description: "Start Docker container with isolated storage, verify uwf is runnable"
-    goal: "You are an E2E test runner. Set up an isolated Docker environment and verify basic uwf functionality."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      1. Start a Docker container with isolated storage:
-         ```
-         docker run -d --name uwf-e2e-$$ \
-           -v $HOME:$HOME \
-           -e HOME=$HOME \
-           -e UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage \
-           -w ~/repos/workflow \
-           node:22-bookworm \
-           sleep infinity
-         ```
-      2. Inside the container, install bun, install deps, then `bun link` all packages
-         so that `uwf`, `uwf-hermes`, `uwf-builtin` are on PATH (from source):
-         ```
-         docker exec uwf-e2e-$$ bash -c '
-           # Install bun
-           curl -fsSL https://bun.sh/install | bash
-           export PATH="$HOME/.bun/bin:$PATH"
-
-           # Isolated storage
-           mkdir -p $UNCAGED_WORKFLOW_STORAGE_ROOT
-
-           # Install workspace deps
-           cd ~/repos/workflow && bun install --frozen-lockfile
-
-           # bun link each package that has a bin entry
-           cd packages/cli-workflow && bun link && cd ../..
-           cd packages/workflow-agent-hermes && bun link && cd ../..
-           cd packages/workflow-agent-builtin && bun link && cd ../..
-         '
-         ```
-      3. Verify all three commands are available inside the container:
-         ```
-         docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf --version'
-         docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf-hermes --help'
-         docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf-builtin --help'
-         ```
-      4. Copy host config if it exists:
-         ```
-         docker exec uwf-e2e-$$ bash -c '
-           if [ -f $HOME/.uncaged/workflow/config.yaml ]; then
-             cp $HOME/.uncaged/workflow/config.yaml $UNCAGED_WORKFLOW_STORAGE_ROOT/config.yaml
-           fi
-         '
-         ```
-
-      Report the container name and confirm uwf + agents are working.
-      Set containerName to the Docker container name for subsequent roles.
-    output: "Report uwf version and container readiness. Set $status to pass with containerName, or fail with error."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            containerName: { type: string }
-          required: [$status, containerName]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-          required: [$status, error]
-
-  config-and-registry:
-    description: "Validate uwf config commands and workflow registration"
-    goal: "You are an E2E test runner. Validate uwf config operations and workflow registration inside the Docker container."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      Use the container from the previous step (containerName is in your prompt).
-      All commands run via: `docker exec <containerName> bash -c '...'`
-      All commands use `uwf` (installed via `bun link` inside the container).
-      Remember to set env vars in each exec:
-        export PATH="$HOME/.bun/bin:$PATH"
-        export UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
-
-      Config tests:
-      1. `uwf config list` — verify it returns valid JSON
-      2. `uwf config set models.test.name test-model` — set a test key
-      3. `uwf config get models.test.name` — verify it returns "test-model"
-
-      Workflow registration tests:
-      4. `uwf workflow add ~/repos/workflow/examples/solve-issue.yaml` — register workflow
-      5. Verify the output contains a hash
-      6. `uwf workflow list` — verify non-empty array
-      7. Capture the workflow name from the list
-      8. `uwf workflow show <name>` — verify it returns roles
-
-      Report all test results with pass/fail counts.
-    output: "Report test results. Set $status to pass (with workflowName and containerName) or fail."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            workflowName: { type: string }
-            containerName: { type: string }
-          required: [$status, workflowName, containerName]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-            containerName: { type: string }
-          required: [$status, error, containerName]
-
-  thread-ops:
-    description: "Test thread start, list, show, and exec"
-    goal: "You are an E2E test runner. Validate thread creation and execution inside the Docker container."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      Use the container (containerName) and workflow (workflowName) from your prompt.
-      All commands via: `docker exec <containerName> bash -c '...'`
-      Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
-
-      1. `uwf thread start <workflowName> -p 'E2E test: what is 2+2?'` — capture thread ID from JSON output
-      2. `uwf thread list` — verify the thread appears in the list
-      3. `uwf thread show <threadId>` — verify head pointer exists
-      4. `uwf thread exec <threadId> --agent uwf-builtin` — execute one step
-      5. Verify exec returns JSON with a head field
-
-      Report results. Pass threadId and containerName forward.
-    output: "Report test results. Set $status to pass (with threadId, workflowName, containerName) or fail."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            threadId: { type: string }
-            workflowName: { type: string }
-            containerName: { type: string }
-          required: [$status, threadId, workflowName, containerName]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-            containerName: { type: string }
-          required: [$status, error, containerName]
-
-  inspect:
-    description: "Test step list/show, thread read, and CAS operations"
-    goal: "You are an E2E test runner. Validate read and inspect operations inside the Docker container."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      Use the container (containerName) and threadId from your prompt.
-      All commands via: `docker exec <containerName> bash -c '...'`
-      Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
-
-      Step inspection:
-      1. `uwf step list <threadId>` — verify steps array has length > 1
-      2. Capture the last step hash from the output
-      3. `uwf step show <lastStepHash>` — verify it returns a role field
-
-      Thread read:
-      4. `uwf thread read <threadId>` — verify non-empty output
-
-      CAS operations:
-      5. `uwf cas get <lastStepHash>` — verify returns a type field
-      6. `uwf cas has <lastStepHash>` — verify exits 0
-      7. `uwf cas refs <lastStepHash>` — list refs (may be empty)
-      8. `uwf cas walk <lastStepHash>` — verify returns non-empty array
-
-      Report results. Pass threadId, lastStepHash, workflowName, containerName forward.
-    output: "Report test results. Set $status to pass (with threadId, lastStepHash, workflowName, containerName) or fail."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            threadId: { type: string }
-            lastStepHash: { type: string }
-            workflowName: { type: string }
-            containerName: { type: string }
-          required: [$status, threadId, lastStepHash, workflowName, containerName]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-            containerName: { type: string }
-          required: [$status, error, containerName]
-
-  cancel-and-fork:
-    description: "Test thread cancel, step fork, and log inspection"
-    goal: "You are an E2E test runner. Validate cancel, fork, and log operations inside the Docker container."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      Use containerName, threadId, lastStepHash, and workflowName from your prompt.
-      All commands via: `docker exec <containerName> bash -c '...'`
-      Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
-
-      Cancel:
-      1. Start a second thread: `uwf thread start <workflowName> -p 'E2E cancel test'`
-      2. Cancel it: `uwf thread cancel <secondThreadId>`
-      3. Verify it appears in completed list: `uwf thread list --status completed`
-
-      Fork:
-      4. Fork from the first thread's last step: `uwf step fork <lastStepHash>`
-      5. Verify fork creates a new thread with a different ID
-
-      Logs:
-      6. `uwf log list` — verify output (may be empty)
-      7. `uwf log show --thread <threadId>` — verify runs without error
-
-      Report results with summary.
-    output: "Report test results with summary. Set $status to pass or fail."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            containerName: { type: string }
-            summary: { type: string }
-          required: [$status, containerName, summary]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-            containerName: { type: string }
-          required: [$status, error, containerName]
-
-  cleanup:
-    description: "Remove Docker container"
-    goal: "You are an E2E test runner. Clean up the Docker container used for testing."
-    capabilities:
-      - docker
-      - shell
-    procedure: |
-      Remove the Docker container (containerName is in your prompt):
-      1. `docker rm -f <containerName>`
-      2. Verify the container is gone: `docker ps -a --filter name=<containerName> --format '{{.Names}}'` should return empty
-
-      Report cleanup result.
-    output: "Report cleanup result. Set $status to pass or fail."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "pass" }
-            summary: { type: string }
-          required: [$status, summary]
-        - properties:
-            $status: { const: "fail" }
-            error: { type: string }
-          required: [$status, error]
-
-graph:
-  $START:
-    _: { role: "bootstrap", prompt: "Set up the Docker container and verify uwf is runnable." }
-  bootstrap:
-    pass: { role: "config-and-registry", prompt: "Container {{{containerName}}} is ready. Validate config and workflow registration." }
-    fail: { role: "$END", prompt: "Bootstrap failed: {{{error}}}. No container was created." }
-  config-and-registry:
-    pass: { role: "thread-ops", prompt: "Config and registry OK. Workflow '{{{workflowName}}}' registered. Container: {{{containerName}}}. Now test thread operations." }
-    fail: { role: "cleanup", prompt: "Config/registry failed: {{{error}}}. Clean up container {{{containerName}}}." }
-  thread-ops:
-    pass: { role: "inspect", prompt: "Thread ops OK. threadId={{{threadId}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test inspect operations." }
-    fail: { role: "cleanup", prompt: "Thread ops failed: {{{error}}}. Clean up container {{{containerName}}}." }
-  inspect:
-    pass: { role: "cancel-and-fork", prompt: "Inspect OK. threadId={{{threadId}}}, lastStepHash={{{lastStepHash}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test cancel, fork, and logs." }
-    fail: { role: "cleanup", prompt: "Inspect failed: {{{error}}}. Clean up container {{{containerName}}}." }
-  cancel-and-fork:
-    pass: { role: "cleanup", prompt: "All tests passed! {{{summary}}}. Clean up container {{{containerName}}}." }
-    fail: { role: "cleanup", prompt: "Cancel/fork failed: {{{error}}}. Clean up container {{{containerName}}}." }
-  cleanup:
-    pass: { role: "$END", prompt: "E2E walkthrough complete. {{{summary}}}" }
-    fail: { role: "$END", prompt: "Cleanup failed: {{{error}}}. Manual cleanup may be needed." }
@@ -0,0 +1,220 @@
+name: "retrospect-workflow"
+description: "Post-execution retrospective: analyze a completed thread, find inefficiencies, and improve the workflow definition."
+roles:
+  analyst:
+    description: "Scans thread execution for anomalies and produces a findings report"
+    goal: "You are a workflow execution analyst. You review completed thread data to find inefficiencies, wasted effort, and procedure gaps."
+    capabilities:
+      - data-analysis
+    procedure: |
+      You receive a completed thread ID in your task prompt.
+
+      Phase 0 — Validation (must pass before any analysis):
+      1. Run `uwf step list <thread-id>` to get thread metadata including the workflow hash
+      2. Run `uwf workflow show <workflow-hash>` to get the workflow name
+      3. Verify the workflow exists locally: check `.workflows/<name>.yaml` in the current repo
+         - If NOT found: output $status=wrong_project with the workflow name. Do NOT proceed.
+      4. Compare the thread's workflow hash against the current registered version:
+         - Run `uwf workflow show <name>` to get the current hash
+         - If hashes differ: the thread ran on an older version. Note this — you will need to diff versions after analysis.
+
+      Phase 1 — Overview scan:
+      5. From the step list, compute a health signal for each step:
+         - Duration: flag if >2x the median of other steps
+         - Output tokens: flag if >2x the median
+         - Status flow: flag non-happy-path transitions (rejected, fix_code, fix_spec, hook_failed)
+         - Step count: flag if the same role appears more than expected (indicates loops)
+      6. If no anomalies found AND versions match: output $status=clean
+      7. If no anomalies found BUT versions differ:
+         - Diff the two workflow versions to check if any procedure changes are relevant
+         - If the current version already addresses potential concerns: output $status=clean with a note
+         - Otherwise: proceed to Phase 2
+
+      Phase 2 — Targeted deep-dive (only for flagged steps):
+      8. For each flagged step, run `uwf step show <hash>` to get the detail with turns
+      9. Analyze the turn sequence for:
+         - Repeated tool calls with the same or similar input (blind retries)
+         - Tool errors followed by no strategy change (same approach retried)
+         - Unnecessary exploration (reading files or running commands unrelated to the task)
+         - Hallucinated commands or flags (commands that don't exist or wrong syntax)
+         - Excessive turns before reaching the goal
+      10. For each finding, record:
+          - Which role and step hash
+          - What happened (specific turn indices and commands)
+          - Root cause hypothesis (procedure gap, missing pitfall, unclear instruction)
+          - Suggested fix (what to add/change in the procedure)
+      11. If versions differ: compare findings against the version diff.
+          Mark any finding that is already fixed in the current version as "resolved_in_current".
+          Only report findings that are NOT yet addressed.
+
+      Output a structured findings report. Set $status=clean if nothing actionable, $status=findings if unresolved issues exist, or $status=wrong_project if the workflow doesn't belong here.
+    output: "A findings report with per-issue root cause and suggested procedure fixes. Set $status to clean or findings (with report hash)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "clean" }
+            summary: { type: string }
+          required: [$status, summary]
+        - properties:
+            $status: { const: "findings" }
+            report: { type: string }
+            targetWorkflow: { type: string }
+          required: [$status, report, targetWorkflow]
+        - properties:
+            $status: { const: "wrong_project" }
+            workflowName: { type: string }
+          required: [$status, workflowName]
+  proposer:
+    description: "Translates findings into concrete workflow edits"
+    goal: "You are a workflow improvement proposer. You read the analyst's findings and produce specific, minimal edits to the workflow YAML."
+    capabilities:
+      - planning
+    procedure: |
+      1. Read the analyst's findings report from your task prompt
+      2. Locate the target workflow YAML:
+         - Workflow definitions live in the WORKFLOW ENGINE repo (where `uwf` is developed), NOT in the repo that was analyzed.
+         - Find it via: `uwf workflow show <targetWorkflow> --format yaml` to read the current definition
+         - The physical file is `.workflows/<targetWorkflow>.yaml` in the workflow engine repo
+         - Use `git rev-parse --show-toplevel` in the current directory to find the workflow engine repo root
+      3. Read the current workflow YAML to understand existing procedures
+      4. For each finding, draft a minimal edit:
+         - Prefer adding a pitfall note or clarifying instruction over restructuring
+         - If a procedure step is ambiguous, make it explicit
+         - If a tool usage pattern is wrong, add a "Do NOT" or "IMPORTANT" note
+         - Keep edits surgical — don't rewrite procedures that work fine
+      5. Check if existing tests need updating (search for test files referencing the workflow)
+      6. Produce a change plan as CAS text node via `uwf cas put-text "<plan>"`
+
+      The plan should list each edit with:
+      - File path
+      - What to change (old text → new text, or addition)
+      - Why (linked to which finding)
+      - Any test updates needed
+    output: "A change plan stored in CAS. Set $status to ready (with plan hash and repoPath) or no_action (if findings don't warrant changes)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "ready" }
+            plan: { type: string }
+            repoPath: { type: string }
+          required: [$status, plan, repoPath]
+        - properties:
+            $status: { const: "no_action" }
+            reason: { type: string }
+          required: [$status, reason]
+  developer:
+    description: "Applies the proposed workflow edits"
+    goal: "You are a developer agent. You apply workflow YAML edits and update related tests."
+    capabilities:
+      - coding
+    procedure: |
+      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
+      The workflow definitions live in THIS repo (the workflow engine), not the repo that was analyzed.
+
+      Before starting any work, set up an isolated worktree:
+      1. Use `git rev-parse --show-toplevel` to find the repo root (do NOT use repoPath from proposer — that's the analyzed repo)
+      2. `git fetch origin` to get latest refs
+      3. `git worktree add .worktrees/retrospect/<short-slug> -b retrospect/<short-slug> origin/main`
+      4. `cd .worktrees/retrospect/<short-slug> && bun install`
+      5. ALL subsequent work must happen inside the worktree directory.
+
+      Then apply changes:
+      6. Read the change plan from CAS: `uwf cas get <plan hash>`
+      7. Apply each edit from the plan to the workflow YAML
+      8. Update or add tests as specified in the plan
+      9. Run `bun run build` and `bun test` to verify
+      10. Run `bun run check` for lint
+      11. Commit with message: `improve: <workflow-name> — <brief summary>`
+    output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "done" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "failed" }
+            reason: { type: string }
+          required: [$status, reason]
+  reviewer:
+    description: "Reviews the workflow edits for correctness"
+    goal: "You are a reviewer. You verify that workflow edits are minimal, correct, and actually address the findings."
+    capabilities:
+      - code-review
+    procedure: |
+      The worktree path is provided in your task prompt. cd into it first.
+
+      Review criteria:
+      1. Each edit must trace back to a specific finding — no drive-by changes
+      2. Edits should be minimal — don't rewrite working procedures
+      3. New pitfall notes or instructions must be clear and actionable
+      4. Tests must be updated if assertions changed
+      5. `bun run build` and `bun test` must pass
+      6. `bunx biome check` must pass
+
+      IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files.
+    output: "Explain your decision. Set $status to approved (with branch/worktree) or rejected (with comments)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "approved" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "rejected" }
+            comments: { type: string }
+            worktree: { type: string }
+          required: [$status, comments, worktree]
+  committer:
+    description: "Commits and creates PR"
+    goal: "You are a committer agent. You create a clean commit and push a PR."
+    capabilities: []
+    procedure: |
+      The worktree path, branch name, and repo info are provided in your task prompt.
+      cd into the worktree first.
+
+      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
+      1. Stage all changes: `git add -A`
+      2. Commit with a descriptive message: `git commit -m "improve: <workflow> — <summary>"`
+      3. Push the branch: `git push -u origin <branch-name>`
+         - If push hook fails: capture the error log in your output, mark hook_failed
+      4. On push success: create a PR via `tea pr create --title "..." --description "..."`
+         - IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files. cd to the repo root first.
+         - Do NOT pass `--repo` — let tea auto-detect from the main repo's git remote.
+         - PR description must include: What / Why / Findings / Changes sections
+         - On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
+      5. After PR creation, clean up the worktree:
+         - cd to the repo root (parent of .worktrees)
+         - `git worktree remove <worktree-path>`
+    output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "committed" }
+            prUrl: { type: string }
+          required: [$status, prUrl]
+        - properties:
+            $status: { const: "hook_failed" }
+            error: { type: string }
+          required: [$status, error]
+graph:
+  $START:
+    _: { role: "analyst", prompt: "Analyze completed thread {{{threadId}}} for execution anomalies." }
+  analyst:
+    clean: { role: "$END", prompt: "No issues found. Thread executed cleanly." }
+    findings: { role: "proposer", prompt: "Findings report: {{{report}}}. Target workflow: {{{targetWorkflow}}}. Propose minimal edits." }
+    wrong_project: { role: "$END", prompt: "Thread uses workflow '{{{workflowName}}}' which does not exist in this project. Run retrospect from the correct repo." }
+  proposer:
+    no_action: { role: "$END", prompt: "No actionable changes needed: {{{reason}}}." }
+    ready: { role: "developer", prompt: "Apply the change plan (CAS hash: {{{plan}}}) to the workflow definitions in this repo." }
+  developer:
+    done: { role: "reviewer", prompt: "Review workflow edits on branch {{{branch}}} at {{{worktree}}}." }
+    failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
+  reviewer:
+    rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in {{{worktree}}}." }
+    approved: { role: "committer", prompt: "Approved. Commit and push branch {{{branch}}} from {{{worktree}}}." }
+  committer:
+    hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit." }
+    committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow improved." }
@@ -155,12 +155,13 @@ roles:
      cd into the worktree first.

      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
-      1. Stage all changes: `git add -A`
-      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
+      1. Check `git status` — if working tree is clean and branch is ahead of origin, skip to step 3 (push).
+      2. If there are unstaged/uncommitted changes: `git add -A` then `git commit -m "type: description\n\nFixes #N"`
      3. Push the branch: `git push -u origin <branch-name>`
         - If push hook fails: capture the error log in your output, mark hook_failed
-      4. On push success: create a PR via `tea pr create --repo <owner/repo> --title "..." --description "..."`
-         - Extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
+      4. On push success: create a PR via `tea pr create --title "..." --description "..."`
+         - IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files. cd to the repo root first.
+         - Do NOT pass `--repo` — let tea auto-detect from the main repo's git remote.
         - PR description must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
         - On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
      5. After PR creation, clean up the worktree:
@@ -4,7 +4,6 @@
    "includes": [
      "**",
      "!**/dist",
-      "!.worktrees",
      "!**/node_modules",
      "!**/legacy-packages",
      "!scripts",
@@ -391,7 +391,7 @@ Everything else is immutable CAS content.
 providers:
  openrouter:
    baseUrl: "https://openrouter.ai/api/v1"
-    apiKey: "sk-..."
+    apiKeyEnv: "OPENROUTER_API_KEY"

 models:
  sonnet:
@@ -402,7 +402,7 @@ workflow 怎么配置和使用 model？
 ```136:160:packages/workflow-protocol/src/types.ts
 export type ProviderConfig = {
  baseUrl: string;
-  apiKey: string;
+  apiKeyEnv: string;
 };

 export type ModelConfig = {
@@ -429,7 +429,7 @@ export type WorkflowConfig = {
 export function resolveModel(config: WorkflowConfig, alias: ModelAlias): ResolvedLlmProvider {
  const modelEntry = config.models[alias];
  const providerEntry = config.providers[modelEntry.provider];
-  const apiKey = providerEntry.apiKey;
+  const apiKey = process.env[providerEntry.apiKeyEnv];
  return { baseUrl: providerEntry.baseUrl, apiKey, model: modelEntry.name };
 }
 ```
@@ -280,13 +280,13 @@ threads.yaml: { "01J7K9M2XNPQR5VWBCDF8G3H4T": "8FWKR3TN5V1QA" }
 providers:
  openai:
    baseUrl: "https://api.openai.com/v1"
-    apiKey: "sk-..."
+    apiKeyEnv: "OPENAI_API_KEY"
  anthropic:
    baseUrl: "https://api.anthropic.com/v1"
-    apiKey: "sk-ant-..."
+    apiKeyEnv: "ANTHROPIC_API_KEY"
  openrouter:
    baseUrl: "https://openrouter.ai/api/v1"
-    apiKey: "sk-or-..."
+    apiKeyEnv: "OPENROUTER_API_KEY"

 models:
  sonnet:
@@ -465,7 +465,7 @@ type Scenario = string;              // e.g. "extract"

 type ProviderConfig = {
  baseUrl: string;
-  apiKey: string;                    // API key stored directly
+  apiKeyEnv: string;                 // env var name to read API key from
 };

 type ModelConfig = {
@@ -14,7 +14,7 @@
    "test:ci": "bun run --filter './packages/*' test:ci",
    "changeset": "bunx changeset",
    "version": "bunx changeset version",
-    "release": "bun run build && bun run test && node scripts/publish-all.mjs"
+    "release": "bun run build && bun test && node scripts/publish-all.mjs"
  },
  "devDependencies": {
    "@agentclientprotocol/sdk": "^0.22.1",
@@ -119,7 +119,7 @@ uwf setup --provider openai --base-url https://api.openai.com/v1 \
  --api-key sk-... --model gpt-4o --agent hermes
 ```

-Config: `~/.uncaged/workflow/config.yaml` (includes API keys).
+Config: `~/.uncaged/workflow/config.yaml`. API keys: `~/.uncaged/workflow/.env`.

 ### Skill

@@ -8,7 +8,7 @@
  ],
  "type": "module",
  "bin": {
-    "uwf": "./dist/cli.js"
+    "uwf": "./src/cli.ts"
  },
  "dependencies": {
    "@uncaged/json-cas": "^0.5.3",
@@ -1,622 +0,0 @@
-import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
-import { tmpdir } from "node:os";
-import { join } from "node:path";
-import { describe, expect, test } from "vitest";
-import {
-  cmdConfigGet,
-  cmdConfigList,
-  cmdConfigSet,
-  getConfigPath,
-  getNestedValue,
-  maskApiKeys,
-  parseDotPath,
-  setNestedValue,
-} from "../commands/config.js";
-
-describe("config command", () => {
-  // Helper function to create a test config
-  function createTestConfig(tempDir: string, content: string): string {
-    const configPath = getConfigPath(tempDir);
-    writeFileSync(configPath, content, "utf8");
-    return configPath;
-  }
-
-  // Sample test config
-  const sampleConfig = `providers:
-  dashscope:
-    baseUrl: https://dashscope.aliyuncs.com/compatible-mode/v1
-    apiKey: sk-test-dashscope-key
-  openai:
-    baseUrl: https://api.openai.com/v1
-    apiKey: sk-test-openai-key
-models:
-  default:
-    provider: dashscope
-    name: qwen-max
-  gpt4:
-    provider: openai
-    name: gpt-4
-agents:
-  hermes:
-    command: uwf-hermes
-    args:
-      - --provider
-      - dashscope
-  claude-code:
-    command: claude-code
-    args:
-      - --profile
-      - work
-defaultAgent: hermes
-defaultModel: default
-`;
-
-  describe("helper functions", () => {
-    describe("parseDotPath", () => {
-      test("splits dot notation correctly", () => {
-        expect(parseDotPath("a.b.c")).toEqual(["a", "b", "c"]);
-        expect(parseDotPath("defaultAgent")).toEqual(["defaultAgent"]);
-        expect(parseDotPath("providers.dashscope.baseUrl")).toEqual([
-          "providers",
-          "dashscope",
-          "baseUrl",
-        ]);
-      });
-    });
-
-    describe("getNestedValue", () => {
-      test("traverses nested objects", () => {
-        const obj = {
-          a: { b: { c: "value" } },
-          x: "simple",
-        };
-        expect(getNestedValue(obj, ["a", "b", "c"])).toBe("value");
-        expect(getNestedValue(obj, ["x"])).toBe("simple");
-      });
-
-      test("returns undefined for non-existent paths", () => {
-        const obj = { a: { b: "value" } };
-        expect(getNestedValue(obj, ["a", "c"])).toBeUndefined();
-        expect(getNestedValue(obj, ["x", "y"])).toBeUndefined();
-      });
-    });
-
-    describe("setNestedValue", () => {
-      test("creates intermediate objects and sets value", () => {
-        const obj: Record<string, unknown> = {};
-        setNestedValue(obj, ["a", "b", "c"], "value");
-        expect(obj).toEqual({ a: { b: { c: "value" } } });
-      });
-
-      test("preserves existing values", () => {
-        const obj: Record<string, unknown> = { a: { x: "keep" } };
-        setNestedValue(obj, ["a", "b"], "new");
-        expect(obj).toEqual({ a: { x: "keep", b: "new" } });
-      });
-
-      test("overwrites existing value at path", () => {
-        const obj: Record<string, unknown> = { a: { b: "old" } };
-        setNestedValue(obj, ["a", "b"], "new");
-        expect(obj).toEqual({ a: { b: "new" } });
-      });
-    });
-
-    describe("maskApiKeys", () => {
-      test("deep clones and masks all apiKey values in providers", () => {
-        const config = {
-          providers: {
-            dashscope: {
-              baseUrl: "https://example.com",
-              apiKey: "sk-test-key-12345",
-            },
-            openai: {
-              baseUrl: "https://api.openai.com",
-              apiKey: "sk-another-secret",
-            },
-          },
-          models: {
-            default: { provider: "dashscope" },
-          },
-        };
-        const masked = maskApiKeys(config);
-        expect(masked).toEqual({
-          providers: {
-            dashscope: {
-              baseUrl: "https://example.com",
-              apiKey: "***MASKED***",
-            },
-            openai: {
-              baseUrl: "https://api.openai.com",
-              apiKey: "***MASKED***",
-            },
-          },
-          models: {
-            default: { provider: "dashscope" },
-          },
-        });
-        // Ensure it's a deep clone
-        expect(masked).not.toBe(config);
-      });
-
-      test("handles config without providers", () => {
-        const config = { models: { default: { provider: "test" } } };
-        const masked = maskApiKeys(config);
-        expect(masked).toEqual(config);
-      });
-    });
-  });
-
-  describe("cmdConfigList", () => {
-    test("returns full config when file exists", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigList(tempDir);
-        expect(result).toBeDefined();
-        expect(typeof result).toBe("object");
-        expect(result).toHaveProperty("providers");
-        expect(result).toHaveProperty("models");
-        expect(result).toHaveProperty("agents");
-        expect(result).toHaveProperty("defaultAgent");
-        expect(result).toHaveProperty("defaultModel");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("masks all apiKey values in providers section", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = (await cmdConfigList(tempDir)) as Record<string, unknown>;
-        const providers = result.providers as Record<string, unknown>;
-        const dashscope = providers.dashscope as Record<string, unknown>;
-        const openai = providers.openai as Record<string, unknown>;
-        expect(dashscope.apiKey).toBe("***MASKED***");
-        expect(openai.apiKey).toBe("***MASKED***");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when config file doesn't exist", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        await expect(cmdConfigList(tempDir)).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("returns empty object when config file is empty", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, "");
-        const result = await cmdConfigList(tempDir);
-        expect(result).toEqual({});
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when config file is invalid YAML", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, "invalid: yaml: [broken");
-        await expect(cmdConfigList(tempDir)).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-  });
-
-  describe("cmdConfigGet", () => {
-    test("retrieves top-level string value (defaultAgent)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "defaultAgent");
-        expect(result).toBe("hermes");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("retrieves top-level string value (defaultModel)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "defaultModel");
-        expect(result).toBe("default");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("retrieves nested object (providers.dashscope)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "providers.dashscope");
-        expect(result).toEqual({
-          baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1",
-          apiKey: "sk-test-dashscope-key",
-        });
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("retrieves deeply nested string (providers.dashscope.baseUrl)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
-        expect(result).toBe("https://dashscope.aliyuncs.com/compatible-mode/v1");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("retrieves nested string in models (models.default.provider)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "models.default.provider");
-        expect(result).toBe("dashscope");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("retrieves array value (agents.hermes.args)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigGet(tempDir, "agents.hermes.args");
-        expect(result).toEqual(["--provider", "dashscope"]);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when key doesn't exist", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigGet(tempDir, "nonexistent.key")).rejects.toThrow(/Key not found/);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when config file doesn't exist", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        await expect(cmdConfigGet(tempDir, "defaultAgent")).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when accessing property on non-object", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigGet(tempDir, "defaultAgent.foo")).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-  });
-
-  describe("cmdConfigSet", () => {
-    test("sets top-level string value (defaultAgent)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigSet(tempDir, "defaultAgent", "claude-code");
-        expect(result).toEqual({ key: "defaultAgent", value: "claude-code" });
-        // Verify it was written
-        const updated = await cmdConfigGet(tempDir, "defaultAgent");
-        expect(updated).toBe("claude-code");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("sets nested string value (providers.dashscope.baseUrl)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const newUrl = "https://new-api.example.com/v1";
-        const result = await cmdConfigSet(tempDir, "providers.dashscope.baseUrl", newUrl);
-        expect(result).toEqual({
-          key: "providers.dashscope.baseUrl",
-          value: newUrl,
-        });
-        // Verify it was written
-        const updated = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
-        expect(updated).toBe(newUrl);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("creates new nested path (providers.newprovider.baseUrl)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const newUrl = "https://new-provider.com/v1";
-        const result = await cmdConfigSet(tempDir, "providers.newprovider.baseUrl", newUrl);
-        expect(result).toEqual({
-          key: "providers.newprovider.baseUrl",
-          value: newUrl,
-        });
-        // Verify it was created
-        const updated = await cmdConfigGet(tempDir, "providers.newprovider.baseUrl");
-        expect(updated).toBe(newUrl);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("sets array value for args key with valid JSON array", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const newArgs = '["--new", "--flags"]';
-        const result = await cmdConfigSet(tempDir, "agents.hermes.args", newArgs);
-        expect(result).toEqual({
-          key: "agents.hermes.args",
-          value: ["--new", "--flags"],
-        });
-        // Verify it was written
-        const updated = await cmdConfigGet(tempDir, "agents.hermes.args");
-        expect(updated).toEqual(["--new", "--flags"]);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("preserves existing config values when updating one key", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await cmdConfigSet(tempDir, "defaultAgent", "claude-code");
-        // Verify other values are preserved
-        const defaultModel = await cmdConfigGet(tempDir, "defaultModel");
-        expect(defaultModel).toBe("default");
-        const dashscopeUrl = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
-        expect(dashscopeUrl).toBe("https://dashscope.aliyuncs.com/compatible-mode/v1");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("creates config file if it doesn't exist", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        const result = await cmdConfigSet(tempDir, "defaultAgent", "hermes");
-        expect(result).toEqual({ key: "defaultAgent", value: "hermes" });
-        // Verify file was created
-        const configPath = getConfigPath(tempDir);
-        const content = readFileSync(configPath, "utf8");
-        expect(content).toContain("defaultAgent: hermes");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when setting property on non-object", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "defaultAgent.foo", "bar")).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("throws error when array value is invalid JSON for args key", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(
-          cmdConfigSet(tempDir, "agents.hermes.args", "[invalid json"),
-        ).rejects.toThrow();
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("sets deeply nested model config (models.gpt4.provider)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigSet(tempDir, "models.gpt4.provider", "new-provider");
-        expect(result).toEqual({
-          key: "models.gpt4.provider",
-          value: "new-provider",
-        });
-        // Verify it was written
-        const updated = await cmdConfigGet(tempDir, "models.gpt4.provider");
-        expect(updated).toBe("new-provider");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("sets agent command (agents.claude-code.command)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        const result = await cmdConfigSet(tempDir, "agents.claude-code.command", "new-command");
-        expect(result).toEqual({
-          key: "agents.claude-code.command",
-          value: "new-command",
-        });
-        // Verify it was written
-        const updated = await cmdConfigGet(tempDir, "agents.claude-code.command");
-        expect(updated).toBe("new-command");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-  });
-
-  describe("cmdConfigSet validation", () => {
-    test("rejects unknown top-level key", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "unknownKey", "value")).rejects.toThrow(
-          /Unknown config key.*unknownKey/,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects unknown nested key in providers", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(
-          cmdConfigSet(tempDir, "providers.myProvider.unknownField", "value"),
-        ).rejects.toThrow(/Unknown field.*unknownField.*providers/);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects unknown nested key in models", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "models.default.invalidField", "value")).rejects.toThrow(
-          /Unknown field.*invalidField.*models/,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects unknown nested key in agents", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "agents.hermes.badField", "value")).rejects.toThrow(
-          /Unknown field.*badField.*agents/,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects nested path on scalar key (defaultAgent)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "defaultAgent.foo", "value")).rejects.toThrow(
-          /defaultAgent.*scalar|Cannot set property/i,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects nested path on scalar key (defaultModel)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "defaultModel.bar", "value")).rejects.toThrow(
-          /defaultModel.*scalar|Cannot set property/i,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects incomplete nested path (providers without field)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "providers.myProvider", "value")).rejects.toThrow(
-          /incomplete path|must specify a field/i,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects incomplete nested path (models without field)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "models.myModel", "value")).rejects.toThrow(
-          /incomplete path|must specify a field/i,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("rejects incomplete nested path (agents without field)", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await expect(cmdConfigSet(tempDir, "agents.myAgent", "value")).rejects.toThrow(
-          /incomplete path|must specify a field/i,
-        );
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("allows valid nested keys in providers", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await cmdConfigSet(tempDir, "providers.newprovider.baseUrl", "https://example.com");
-        await cmdConfigSet(tempDir, "providers.newprovider.apiKey", "sk-test");
-        const baseUrl = await cmdConfigGet(tempDir, "providers.newprovider.baseUrl");
-        const apiKey = await cmdConfigGet(tempDir, "providers.newprovider.apiKey");
-        expect(baseUrl).toBe("https://example.com");
-        expect(apiKey).toBe("sk-test");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("allows valid nested keys in models", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await cmdConfigSet(tempDir, "models.gpt4.provider", "openai");
-        await cmdConfigSet(tempDir, "models.gpt4.name", "gpt-4o");
-        const provider = await cmdConfigGet(tempDir, "models.gpt4.provider");
-        const name = await cmdConfigGet(tempDir, "models.gpt4.name");
-        expect(provider).toBe("openai");
-        expect(name).toBe("gpt-4o");
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-
-    test("allows valid nested keys in agents", async () => {
-      const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
-      try {
-        createTestConfig(tempDir, sampleConfig);
-        await cmdConfigSet(tempDir, "agents.hermes.command", "uwf-hermes");
-        await cmdConfigSet(tempDir, "agents.hermes.args", '["--flag"]');
-        const command = await cmdConfigGet(tempDir, "agents.hermes.command");
-        const args = await cmdConfigGet(tempDir, "agents.hermes.args");
-        expect(command).toBe("uwf-hermes");
-        expect(args).toEqual(["--flag"]);
-      } finally {
-        rmSync(tempDir, { recursive: true, force: true });
-      }
-    });
-  });
-});
@@ -40,7 +40,6 @@ describe("resolveHeadHash", () => {
      workflow: workflowHash,
      head: headHash,
      completedAt: Date.now(),
-      reason: null,
    });

    const result = await resolveHeadHash(tmpDir, threadId);
@@ -65,7 +64,6 @@ describe("resolveHeadHash", () => {
      workflow: workflowHash,
      head: historicalHash,
      completedAt: Date.now(),
-      reason: null,
    });

    const result = await resolveHeadHash(tmpDir, threadId);
@@ -89,21 +87,18 @@ describe("resolveHeadHash", () => {
      workflow: workflowHash,
      head: hash1,
      completedAt: Date.now() - 2000,
-      reason: null,
    });
    await appendThreadHistory(tmpDir, {
      thread: threadId2,
      workflow: workflowHash,
      head: hash2,
      completedAt: Date.now() - 1000,
-      reason: null,
    });
    await appendThreadHistory(tmpDir, {
      thread: threadId3,
      workflow: workflowHash,
      head: hash3,
      completedAt: Date.now(),
-      reason: null,
    });

    const result = await resolveHeadHash(tmpDir, threadId2);
@@ -134,34 +134,4 @@ describe("cmdSetup agent configuration", () => {
    const config2 = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
    expect(config2.defaultAgent).toBe("builtin");
  });
-
-  test("normalizes agent name with uwf- prefix to bare name", async () => {
-    vi.spyOn(globalThis, "fetch").mockResolvedValue(
-      new Response(JSON.stringify({}), { status: 200 }),
-    );
-
-    const result = await cmdSetup({ ...baseArgs(), agent: "uwf-hermes" });
-
-    expect(result.defaultAgent).toBe("hermes");
-    const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
-    expect(config.agents.hermes).toEqual({ command: "uwf-hermes", args: [] });
-    expect(config.defaultAgent).toBe("hermes");
-    // Verify no duplicate uwf- prefix
-    expect(config.agents["uwf-hermes"]).toBeUndefined();
-  });
-
-  test("normalizes uwf-claude-code to claude-code", async () => {
-    vi.spyOn(globalThis, "fetch").mockResolvedValue(
-      new Response(JSON.stringify({}), { status: 200 }),
-    );
-
-    const result = await cmdSetup({ ...baseArgs(), agent: "uwf-claude-code" });
-
-    expect(result.defaultAgent).toBe("claude-code");
-    const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
-    expect(config.agents["claude-code"]).toEqual({ command: "uwf-claude-code", args: [] });
-    expect(config.defaultAgent).toBe("claude-code");
-    // Verify no duplicate uwf- prefix
-    expect(config.agents["uwf-claude-code"]).toBeUndefined();
-  });
 });
@@ -129,8 +129,9 @@ describe("cmdSetup with validation", () => {
    const result = await cmdSetup(setupArgs());

    expect(result.validation).toEqual({ ok: true, value: undefined });
-    // Config file should still be written
+    // Config files should still be written
    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
  });

  test("includes validation failure — config still saved", async () => {
@@ -142,7 +143,8 @@ describe("cmdSetup with validation", () => {

    expect(result.validation).toBeDefined();
    expect((result.validation as { ok: boolean }).ok).toBe(false);
-    // Config file should still be written despite validation failure
+    // Config files should still be written despite validation failure
    expect(result.configPath).toBeTruthy();
+    expect(result.envPath).toBeTruthy();
  });
 });
@@ -6,15 +6,10 @@ import { describe, expect, test } from "vitest";
 const __dirname = dirname(fileURLToPath(import.meta.url));

 import {
-  cmdSkillActor,
-  cmdSkillAdapter,
  cmdSkillArchitecture,
-  cmdSkillAuthor,
  cmdSkillCli,
-  cmdSkillDeveloper,
  cmdSkillList,
  cmdSkillModerator,
-  cmdSkillUser,
  cmdSkillYaml,
 } from "../commands/skill.js";

@@ -26,12 +21,8 @@ describe("skill commands", () => {
    expect(result).toContain("architecture");
    expect(result).toContain("yaml");
    expect(result).toContain("moderator");
-    expect(result).toContain("actor");
-    expect(result).toContain("user");
-    expect(result).toContain("author");
-    expect(result).toContain("developer");
-    expect(result).toContain("adapter");
    for (const name of result) {
+      expect(typeof name).toBe("string");
      expect(name).toMatch(/^\S+$/);
    }
  });
@@ -71,54 +62,6 @@ describe("skill commands", () => {
    expect(result).toContain("uwf");
  });

-  test("skill actor returns non-empty markdown string", () => {
-    const result = cmdSkillActor();
-    expect(typeof result).toBe("string");
-    expect(result).toContain("frontmatter");
-    expect(result).toContain("CAS");
-    expect(result).toContain("status");
-    expect(result.length).toBeGreaterThan(200);
-  });
-
-  test("skill user returns non-empty markdown string", () => {
-    const result = cmdSkillUser();
-    expect(typeof result).toBe("string");
-    expect(result).toContain("uwf");
-    expect(result).toContain("thread");
-    expect(result).toContain("workflow");
-    expect(result).toContain("Quick Start");
-    expect(result.length).toBeGreaterThan(500);
-  });
-
-  test("skill author returns non-empty markdown string", () => {
-    const result = cmdSkillAuthor();
-    expect(typeof result).toBe("string");
-    expect(result).toContain("frontmatter");
-    expect(result).toContain("graph");
-    expect(result).toContain("$START");
-    expect(result).toContain("$END");
-    expect(result).toContain("$status");
-    expect(result.length).toBeGreaterThan(500);
-  });
-
-  test("skill developer returns non-empty markdown string", () => {
-    const result = cmdSkillDeveloper();
-    expect(typeof result).toBe("string");
-    expect(result).toContain("Monorepo");
-    expect(result).toContain("CAS");
-    expect(result).toContain("Biome");
-    expect(result.length).toBeGreaterThan(500);
-  });
-
-  test("skill adapter returns non-empty markdown string", () => {
-    const result = cmdSkillAdapter();
-    expect(typeof result).toBe("string");
-    expect(result).toContain("createAgent");
-    expect(result).toContain("AgentContext");
-    expect(result).toContain("frontmatter");
-    expect(result.length).toBeGreaterThan(500);
-  });
-
  test("skill help subcommand is suppressed", () => {
    const output = execFileSync("bun", ["src/cli.ts", "skill", "--help"], {
      cwd: join(__dirname, "..", ".."),
@@ -130,11 +73,6 @@ describe("skill commands", () => {
    expect(output).toContain("architecture");
    expect(output).toContain("yaml");
    expect(output).toContain("moderator");
-    expect(output).toContain("actor");
-    expect(output).toContain("user");
-    expect(output).toContain("author");
-    expect(output).toContain("developer");
-    expect(output).toContain("adapter");
    expect(output).toContain("list");
  });
 });
@@ -24,7 +24,7 @@ describe("solve-issue workflow: tea pr create worktree fix", () => {
    "solve-issue.yaml",
  );

-  test("committer procedure should include --repo flag in tea pr create command", async () => {
+  test("committer procedure should require running tea pr create from main repo directory", async () => {
    const yamlContent = await readFile(workflowPath, "utf-8");
    const workflow = parse(yamlContent) as WorkflowPayload;

@@ -32,19 +32,12 @@ describe("solve-issue workflow: tea pr create worktree fix", () => {
    const committerProcedure = workflow.roles.committer?.procedure;
    expect(committerProcedure).toBeDefined();

-    // Verify the procedure includes tea pr create with --repo flag
+    // Verify the procedure includes tea pr create
    expect(committerProcedure).toContain("tea pr create");
-    expect(committerProcedure).toContain("--repo");

-    // Verify the --repo flag appears before or together with tea pr create
-    // This ensures the command is: tea pr create --repo <owner/repo> ...
-    const teaPrCreateMatch = committerProcedure?.match(/tea pr create[^\n]*/);
-    expect(teaPrCreateMatch).not.toBeNull();
-
-    if (teaPrCreateMatch) {
-      const teaCommandLine = teaPrCreateMatch[0];
-      expect(teaCommandLine).toContain("--repo");
-    }
+    // Verify the procedure warns about running from main repo dir (not worktree)
+    expect(committerProcedure).toMatch(/main repo directory/i);
+    expect(committerProcedure).toMatch(/not a worktree/i);
  });

  test("committer procedure should mention repo extraction from git remote", async () => {
@@ -1,85 +0,0 @@
-import { mkdtemp } from "node:fs/promises";
-import { tmpdir } from "node:os";
-import { join } from "node:path";
-import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
-import { describe, expect, test } from "vitest";
-import { appendThreadHistory, loadThreadHistory } from "../store.js";
-
-describe("thread cancel status", () => {
-  test("cancelled history entry has reason 'cancelled'", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
-    const threadId = "01JTEST000000000000CANCEL1" as ThreadId;
-
-    await appendThreadHistory(tmpDir, {
-      thread: threadId,
-      workflow: "test-workflow",
-      head: "test-head-hash" as CasRef,
-      completedAt: Date.now(),
-      reason: "cancelled",
-    });
-
-    const history = await loadThreadHistory(tmpDir);
-    expect(history).toHaveLength(1);
-    expect(history[0]?.reason).toBe("cancelled");
-  });
-
-  test("completed history entry has reason 'completed'", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
-    const threadId = "01JTEST000000000000CANCEL2" as ThreadId;
-
-    await appendThreadHistory(tmpDir, {
-      thread: threadId,
-      workflow: "test-workflow",
-      head: "test-head-hash" as CasRef,
-      completedAt: Date.now(),
-      reason: "completed",
-    });
-
-    const history = await loadThreadHistory(tmpDir);
-    expect(history).toHaveLength(1);
-    expect(history[0]?.reason).toBe("completed");
-  });
-
-  test("legacy history entry without reason parses as null", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
-    const threadId = "01JTEST000000000000CANCEL3" as ThreadId;
-
-    // Simulate legacy entry without reason field
-    await appendThreadHistory(tmpDir, {
-      thread: threadId,
-      workflow: "test-workflow",
-      head: "test-head-hash" as CasRef,
-      completedAt: Date.now(),
-      reason: null,
-    });
-
-    const history = await loadThreadHistory(tmpDir);
-    expect(history).toHaveLength(1);
-    expect(history[0]?.reason).toBeNull();
-  });
-
-  test("mixed completed and cancelled entries preserve distinct reasons", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
-
-    await appendThreadHistory(tmpDir, {
-      thread: "01JTEST000000000000CANCEL4" as ThreadId,
-      workflow: "test-workflow",
-      head: "head1" as CasRef,
-      completedAt: Date.now(),
-      reason: "completed",
-    });
-
-    await appendThreadHistory(tmpDir, {
-      thread: "01JTEST000000000000CANCEL5" as ThreadId,
-      workflow: "test-workflow",
-      head: "head2" as CasRef,
-      completedAt: Date.now(),
-      reason: "cancelled",
-    });
-
-    const history = await loadThreadHistory(tmpDir);
-    expect(history).toHaveLength(2);
-    expect(history[0]?.reason).toBe("completed");
-    expect(history[1]?.reason).toBe("cancelled");
-  });
-});
@@ -74,7 +74,6 @@ async function completeThread(
    workflow: workflowHash,
    head: headHash,
    completedAt: Date.now(),
-    reason: null,
  });
 }

@@ -758,7 +758,6 @@ describe("cmdStepList with completed threads", () => {
      workflow: workflowHash,
      head: step2Hash,
      completedAt: Date.now(),
-      reason: null,
    });

    const result = await cmdStepList(tmpDir, threadId);
@@ -887,7 +886,6 @@ describe("cmdStepShow with completed threads", () => {
      workflow: workflowHash,
      head: stepHash,
      completedAt: Date.now(),
-      reason: null,
    });

    const result = await cmdStepShow(tmpDir, stepHash);
@@ -951,7 +949,6 @@ describe("cmdThreadRead with completed threads", () => {
      workflow: workflowHash,
      head: stepHash,
      completedAt: Date.now(),
-      reason: null,
    });

    const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
@@ -1014,7 +1011,6 @@ describe("cmdThreadRead with completed threads", () => {
      workflow: workflowHash,
      head: step3Hash,
      completedAt: Date.now(),
-      reason: null,
    });

    const markdown = await cmdThreadRead(
@@ -1,4 +1,4 @@
-#!/usr/bin/env node
+#!/usr/bin/env bun

 import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
 import { Command } from "commander";
@@ -13,19 +13,13 @@ import {
  cmdCasSchemaList,
  cmdCasWalk,
 } from "./commands/cas.js";
-import { cmdConfigGet, cmdConfigList, cmdConfigSet } from "./commands/config.js";
 import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
 import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
 import {
-  cmdSkillActor,
-  cmdSkillAdapter,
  cmdSkillArchitecture,
-  cmdSkillAuthor,
  cmdSkillCli,
-  cmdSkillDeveloper,
  cmdSkillList,
  cmdSkillModerator,
-  cmdSkillUser,
  cmdSkillYaml,
 } from "./commands/skill.js";
 import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
@@ -187,11 +181,11 @@ function parseStatusFilter(status: string | undefined): ThreadStatus[] | null {
  if (raw === "active") return ["idle", "running"];

  const parts = raw.split(",").map((s) => s.trim());
-  const validStatuses: ThreadStatus[] = ["idle", "running", "completed", "cancelled"];
+  const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
  for (const part of parts) {
    if (!validStatuses.includes(part as ThreadStatus)) {
      process.stderr.write(
-        `Invalid status: ${part}. Must be one of: idle, running, completed, cancelled, active\n`,
+        `Invalid status: ${part}. Must be one of: idle, running, completed, active\n`,
      );
      process.exit(1);
    }
@@ -244,7 +238,7 @@ thread
  .description("List threads")
  .option(
    "--status <status>",
-    "Filter by status: idle, running, completed, cancelled, active (idle+running), or comma-separated values",
+    "Filter by status: idle, running, completed, active (idle+running), or comma-separated values",
  )
  .option("--after <date>", "Filter threads created after this date (ISO or relative like '7d')")
  .option("--before <date>", "Filter threads created before this date (ISO or relative like '7d')")
@@ -508,34 +502,6 @@ skill
    console.log(cmdSkillYaml());
  });

-skill
-  .command("actor")
-  .description("Print the actor reference (frontmatter protocol + CAS)")
-  .action(() => {
-    console.log(cmdSkillActor());
-  });
-
-skill
-  .command("adapter")
-  .description("Print the adapter reference (building agent adapters)")
-  .action(() => {
-    console.log(cmdSkillAdapter());
-  });
-
-skill
-  .command("author")
-  .description("Print the author reference (workflow YAML design guide)")
-  .action(() => {
-    console.log(cmdSkillAuthor());
-  });
-
-skill
-  .command("developer")
-  .description("Print the developer reference (coding conventions + architecture)")
-  .action(() => {
-    console.log(cmdSkillDeveloper());
-  });
-
 skill
  .command("moderator")
  .description("Print the moderator reference")
@@ -543,13 +509,6 @@ skill
    console.log(cmdSkillModerator());
  });

-skill
-  .command("user")
-  .description("Print the user reference (CLI guide + typical workflows)")
-  .action(() => {
-    console.log(cmdSkillUser());
-  });
-
 skill
  .command("list")
  .description("List all available skill names")
@@ -564,7 +523,7 @@ program
  .option("--base-url <url>", "OpenAI-compatible API base URL")
  .option("--api-key <key>", "API key")
  .option("--model <name>", "Default model name")
-  .option("--agent <name>", "Default agent adapter (e.g. hermes → uwf-hermes)")
+  .option("--agent <name>", "Default agent alias")
  .action(
    (opts: {
      provider?: string;
@@ -752,47 +711,6 @@ log
    });
  });

-const config = program.command("config").description("Configuration management");
-
-config
-  .command("list")
-  .description("Display all configuration values (masks API keys)")
-  .action(() => {
-    const storageRoot = resolveStorageRoot();
-    runAction(async () => {
-      const result = await cmdConfigList(storageRoot);
-      writeOutput(result);
-    });
-  });
-
-config
-  .command("get")
-  .description("Get a specific configuration value")
-  .argument(
-    "<key>",
-    "Dot-notation path to config value (e.g., defaultAgent, providers.dashscope.baseUrl)",
-  )
-  .action((key: string) => {
-    const storageRoot = resolveStorageRoot();
-    runAction(async () => {
-      const result = await cmdConfigGet(storageRoot, key);
-      writeOutput({ value: result });
-    });
-  });
-
-config
-  .command("set")
-  .description("Set a specific configuration value")
-  .argument("<key>", "Dot-notation path to config value")
-  .argument("<value>", "New value (use JSON array for 'args' key, e.g., '[\"--flag\"]')")
-  .action((key: string, value: string) => {
-    const storageRoot = resolveStorageRoot();
-    runAction(async () => {
-      const result = await cmdConfigSet(storageRoot, key, value);
-      writeOutput(result);
-    });
-  });
-
 program.parseAsync(process.argv).catch((e: unknown) => {
  const message = e instanceof Error ? e.message : String(e);
  process.stderr.write(`${message}\n`);
@@ -1,289 +0,0 @@
-import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
-import { join } from "node:path";
-import { parse, stringify } from "yaml";
-
-/**
- * Valid configuration key schema
- */
-const VALID_CONFIG_KEYS: Record<string, { nested: boolean; knownFields?: string[] }> = {
-  providers: {
-    nested: true,
-    knownFields: ["baseUrl", "apiKey"],
-  },
-  models: {
-    nested: true,
-    knownFields: ["provider", "name"],
-  },
-  agents: {
-    nested: true,
-    knownFields: ["command", "args"],
-  },
-  defaultAgent: { nested: false },
-  defaultModel: { nested: false },
-};
-
-/**
- * Validate a config key path against the known schema
- */
-function validateConfigKey(path: string[]): void {
-  if (path.length === 0) {
-    throw new Error("Path cannot be empty");
-  }
-
-  const topLevel = path[0];
-  const schema = VALID_CONFIG_KEYS[topLevel];
-
-  if (!schema) {
-    const validKeys = Object.keys(VALID_CONFIG_KEYS).join(", ");
-    throw new Error(`Unknown config key: ${topLevel}. Valid top-level keys are: ${validKeys}`);
-  }
-
-  // Scalar keys cannot have nested paths
-  if (!schema.nested && path.length > 1) {
-    throw new Error(`${topLevel} is a scalar key and cannot have nested properties`);
-  }
-
-  // Nested keys must have at least 3 segments (e.g., providers.myProvider.baseUrl)
-  if (schema.nested && path.length < 3) {
-    const fields = schema.knownFields?.join(", ") ?? "";
-    throw new Error(
-      `Incomplete path for ${topLevel}. Must specify a field (e.g., ${topLevel}.<name>.<field>). Valid fields: ${fields}`,
-    );
-  }
-
-  // Validate the field name for nested keys
-  if (schema.nested && path.length >= 3 && schema.knownFields) {
-    const field = path[path.length - 1];
-    if (!schema.knownFields.includes(field)) {
-      throw new Error(
-        `Unknown field '${field}' in ${topLevel}. Valid fields are: ${schema.knownFields.join(", ")}`,
-      );
-    }
-  }
-}
-
-/**
- * Returns the path to the config.yaml file
- */
-export function getConfigPath(storageRoot: string): string {
-  return join(storageRoot, "config.yaml");
-}
-
-/**
- * Load and parse YAML config file
- */
-export function loadConfig(configPath: string): Record<string, unknown> {
-  if (!existsSync(configPath)) {
-    throw new Error(`Config file not found: ${configPath}`);
-  }
-  const content = readFileSync(configPath, "utf8");
-  if (!content.trim()) {
-    return {};
-  }
-  try {
-    const parsed = parse(content);
-    return (parsed ?? {}) as Record<string, unknown>;
-  } catch (error) {
-    throw new Error(
-      `Invalid YAML in config file: ${error instanceof Error ? error.message : String(error)}`,
-    );
-  }
-}
-
-/**
- * Save config as YAML
- */
-export function saveConfig(configPath: string, config: Record<string, unknown>): void {
-  const dir = join(configPath, "..");
-  if (!existsSync(dir)) {
-    mkdirSync(dir, { recursive: true });
-  }
-  const yaml = stringify(config);
-  writeFileSync(configPath, yaml, "utf8");
-}
-
-/**
- * Parse dot-notation key into path segments
- */
-export function parseDotPath(key: string): string[] {
-  return key.split(".");
-}
-
-/**
- * Get nested value from object using path array
- */
-export function getNestedValue(obj: Record<string, unknown>, path: string[]): unknown {
-  let current: unknown = obj;
-  for (const segment of path) {
-    if (current === null || current === undefined || typeof current !== "object") {
-      return undefined;
-    }
-    current = (current as Record<string, unknown>)[segment];
-  }
-  return current;
-}
-
-/**
- * Set nested value in object using path array (mutates obj)
- */
-export function setNestedValue(obj: Record<string, unknown>, path: string[], value: unknown): void {
-  if (path.length === 0) {
-    throw new Error("Path cannot be empty");
-  }
-
-  let current: Record<string, unknown> = obj;
-
-  // Navigate/create to the parent of the target
-  for (let i = 0; i < path.length - 1; i++) {
-    const segment = path[i];
-    const next = current[segment];
-
-    if (next === null || next === undefined) {
-      // Create intermediate object
-      const newObj: Record<string, unknown> = {};
-      current[segment] = newObj;
-      current = newObj;
-    } else if (typeof next === "object" && !Array.isArray(next)) {
-      // Navigate into existing object
-      current = next as Record<string, unknown>;
-    } else {
-      // Cannot navigate into non-object
-      throw new Error(
-        `Cannot set property '${path[i + 1]}' on non-object at path '${path.slice(0, i + 1).join(".")}'`,
-      );
-    }
-  }
-
-  // Set the final value
-  const lastSegment = path[path.length - 1];
-  current[lastSegment] = value;
-}
-
-/**
- * Deep clone and mask all apiKey values in providers section
- */
-export function maskApiKeys(config: Record<string, unknown>): Record<string, unknown> {
-  // Deep clone
-  const cloned = JSON.parse(JSON.stringify(config)) as Record<string, unknown>;
-
-  // Mask apiKey values in providers
-  if (cloned.providers && typeof cloned.providers === "object") {
-    const providers = cloned.providers as Record<string, unknown>;
-    for (const providerName of Object.keys(providers)) {
-      const provider = providers[providerName];
-      if (provider && typeof provider === "object") {
-        const providerObj = provider as Record<string, unknown>;
-        if ("apiKey" in providerObj) {
-          providerObj.apiKey = "***MASKED***";
-        }
-      }
-    }
-  }
-
-  return cloned;
-}
-
-/**
- * List all configuration values (masks API keys)
- */
-export async function cmdConfigList(storageRoot: string): Promise<unknown> {
-  const configPath = getConfigPath(storageRoot);
-  const config = loadConfig(configPath);
-  const masked = maskApiKeys(config);
-  return masked;
-}
-
-/**
- * Get a specific configuration value
- */
-export async function cmdConfigGet(storageRoot: string, key: string): Promise<unknown> {
-  const configPath = getConfigPath(storageRoot);
-  const config = loadConfig(configPath);
-  const path = parseDotPath(key);
-  const value = getNestedValue(config, path);
-
-  if (value === undefined) {
-    throw new Error(`Key not found: ${key}`);
-  }
-
-  return value;
-}
-
-/**
- * Parse value for args key (must be JSON array)
- */
-function parseArgsValue(value: string): unknown {
-  if (value.startsWith("[")) {
-    try {
-      const parsed = JSON.parse(value);
-      if (!Array.isArray(parsed)) {
-        throw new Error("Value must be an array");
-      }
-      return parsed;
-    } catch (error) {
-      throw new Error(
-        `Invalid JSON array for args key: ${error instanceof Error ? error.message : String(error)}`,
-      );
-    }
-  }
-  throw new Error("Value for 'args' key must be a JSON array starting with '['");
-}
-
-/**
- * Validate that we're not setting a property on a non-object
- */
-function validateParentPath(
-  config: Record<string, unknown>,
-  path: string[],
-  lastSegment: string,
-): void {
-  if (path.length > 1) {
-    const parentPath = path.slice(0, -1);
-    const parent = getNestedValue(config, parentPath);
-    if (parent !== null && parent !== undefined && typeof parent !== "object") {
-      throw new Error(
-        `Cannot set property '${lastSegment}' on non-object at path '${parentPath.join(".")}'`,
-      );
-    }
-  }
-}
-
-/**
- * Set a specific configuration value
- */
-export async function cmdConfigSet(
-  storageRoot: string,
-  key: string,
-  value: string,
-): Promise<unknown> {
-  const configPath = getConfigPath(storageRoot);
-
-  // Load existing config or create empty one
-  let config: Record<string, unknown>;
-  if (existsSync(configPath)) {
-    config = loadConfig(configPath);
-  } else {
-    config = {};
-  }
-
-  const path = parseDotPath(key);
-
-  // Validate the key path
-  validateConfigKey(path);
-
-  const lastSegment = path[path.length - 1];
-
-  // Parse value if it's for an array key (args)
-  let parsedValue: unknown = value;
-  if (lastSegment === "args") {
-    parsedValue = parseArgsValue(value);
-  }
-
-  // Validate we're not setting a property on a non-object
-  validateParentPath(config, path, lastSegment);
-
-  setNestedValue(config, path, parsedValue);
-  saveConfig(configPath, config);
-
-  return { key, value: parsedValue };
-}
@@ -85,6 +85,10 @@ function getConfigPath(root: string): string {
  return join(root, "config.yaml");
 }

+function getEnvPath(root: string): string {
+  return join(root, ".env");
+}
+
 /**
 * Load existing config.yaml or return empty structure.
 */
@@ -102,6 +106,37 @@ function loadExistingConfig(configPath: string): Record<string, unknown> {
  return {};
 }

+/**
+ * Load existing .env as key=value map.
+ */
+function loadEnvFile(envPath: string): Record<string, string> {
+  const env: Record<string, string> = {};
+  try {
+    if (existsSync(envPath)) {
+      for (const line of readFileSync(envPath, "utf8").split("\n")) {
+        const trimmed = line.trim();
+        if (trimmed === "" || trimmed.startsWith("#")) continue;
+        const eq = trimmed.indexOf("=");
+        if (eq > 0) {
+          env[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
+        }
+      }
+    }
+  } catch {
+    // ignore
+  }
+  return env;
+}
+
+function saveEnvFile(envPath: string, env: Record<string, string>): void {
+  const lines = Object.entries(env).map(([k, v]) => `${k}=${v}`);
+  writeFileSync(envPath, `${lines.join("\n")}\n`, "utf8");
+}
+
+function apiKeyEnvName(providerName: string): string {
+  return `${providerName.toUpperCase().replace(/[^A-Z0-9]/g, "_")}_API_KEY`;
+}
+
 // ──────────────────────────────────────────────────────────────────────────────
 // Extracted helpers — _discoverAgents
 // ──────────────────────────────────────────────────────────────────────────────
@@ -362,7 +397,8 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
      : {}
  ) as Record<string, unknown>;

-  providers[args.provider] = { baseUrl: args.baseUrl, apiKey: args.apiKey };
+  const envName = apiKeyEnvName(args.provider);
+  providers[args.provider] = { baseUrl: args.baseUrl, apiKeyEnv: envName };

  const models = (
    typeof existing.models === "object" && existing.models !== null
@@ -377,7 +413,7 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
      : {}
  ) as Record<string, unknown>;

-  const agentName = _agentNameFromBinary(args.agent ?? "hermes");
+  const agentName = args.agent ?? "hermes";
  // Ensure the selected agent has an entry
  if (!agents[agentName]) {
    agents[agentName] = { command: `uwf-${agentName}`, args: [] };
@@ -401,17 +437,25 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
  mkdirSync(storageRoot, { recursive: true });

  const configPath = getConfigPath(storageRoot);
+  const envPath = getEnvPath(storageRoot);

  const existing = loadExistingConfig(configPath);
  const merged = mergeConfig(existing, args);

  writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8");

+  // Write API key to .env
+  const envName = apiKeyEnvName(args.provider);
+  const envData = loadEnvFile(envPath);
+  envData[envName] = args.apiKey;
+  saveEnvFile(envPath, envData);
+
  // Validate model connectivity
  const validation = await validateModel(args.baseUrl, args.apiKey, args.model);

  return {
    configPath,
+    envPath,
    provider: args.provider,
    model: args.model,
    defaultAgent: merged.defaultAgent,
@@ -1,26 +1,11 @@
 export {
-  generateActorReference as cmdSkillActor,
-  generateAdapterReference as cmdSkillAdapter,
  generateArchitectureReference as cmdSkillArchitecture,
-  generateAuthorReference as cmdSkillAuthor,
  generateCliReference as cmdSkillCli,
-  generateDeveloperReference as cmdSkillDeveloper,
  generateModeratorReference as cmdSkillModerator,
-  generateUserReference as cmdSkillUser,
  generateYamlReference as cmdSkillYaml,
 } from "@uncaged/workflow-util";

-const SKILL_NAMES = [
-  "cli",
-  "architecture",
-  "yaml",
-  "moderator",
-  "actor",
-  "user",
-  "author",
-  "developer",
-  "adapter",
-] as const;
+const SKILL_NAMES = ["cli", "architecture", "yaml", "moderator"] as const;

 export function cmdSkillList(): ReadonlyArray<string> {
  return [...SKILL_NAMES];
@@ -331,7 +331,7 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
  fail(`thread not found: ${threadId}`);
 }

-export type ThreadStatus = "idle" | "running" | "completed" | "cancelled";
+export type ThreadStatus = "idle" | "running" | "completed";

 export type ThreadListItemWithStatus = ThreadListItem & {
  status: ThreadStatus;
@@ -389,7 +389,7 @@ async function collectCompletedThreads(
        thread: entry.thread,
        workflow: entry.workflow,
        head: entry.head,
-        status: entry.reason === "cancelled" ? "cancelled" : "completed",
+        status: "completed",
      });
    }
  }
@@ -444,10 +444,7 @@ export async function cmdThreadList(
  let items = await collectActiveThreads(storageRoot, uwf, index);

  // Collect completed threads (if relevant for status filter)
-  const includeCompleted =
-    statusFilter === null ||
-    statusFilter.includes("completed") ||
-    statusFilter.includes("cancelled");
+  const includeCompleted = statusFilter === null || statusFilter.includes("completed");
  if (includeCompleted) {
    const activeIds = new Set(items.map((i) => i.thread));
    const completedItems = await collectCompletedThreads(storageRoot, activeIds);
@@ -814,7 +811,6 @@ async function archiveThread(
    workflow,
    head,
    completedAt: Date.now(),
-    reason: "completed",
  });
 }

@@ -1151,7 +1147,6 @@ export async function cmdThreadCancel(
    workflow,
    head,
    completedAt: Date.now(),
-    reason: "cancelled",
  };
  await appendThreadHistory(storageRoot, historyEntry);

@@ -88,7 +88,6 @@ export function getHistoryPath(storageRoot: string): string {

 export type ThreadHistoryLine = ThreadListItem & {
  completedAt: number;
-  reason: "completed" | "cancelled" | null;
 };

 export type UwfStore = {
@@ -229,15 +228,7 @@ export async function loadThreadHistory(storageRoot: string): Promise<ThreadHist
        typeof head === "string" &&
        typeof completedAt === "number"
      ) {
-        const reason = rec.reason;
-        const parsedReason = reason === "completed" || reason === "cancelled" ? reason : null;
-        lines.push({
-          thread: thread as ThreadId,
-          workflow,
-          head,
-          completedAt,
-          reason: parsedReason,
-        });
+        lines.push({ thread: thread as ThreadId, workflow, head, completedAt });
      }
    }
    return lines;
@@ -1,12 +1,10 @@
 # @uncaged/workflow-agent-hermes

-`uwf-hermes` — an **agent adapter** that bridges the `uwf` workflow engine and the Hermes CLI.
+`uwf-hermes` agent — spawns Hermes chat via ACP and captures session detail.

 ## Overview

-`uwf-hermes` is an adapter (not the Hermes CLI itself). The `uwf` engine speaks a generic agent protocol (stdin/stdout frontmatter contract); `uwf-hermes` translates that protocol into Hermes ACP (Agent Client Protocol) calls. Other adapters (e.g. `uwf-claude-code`, `uwf-cursor`) do the same for their respective CLIs.
-
-On first visit to a role it sends a composed prompt (role definition, task, history, edge prompt); on continuation it resumes the cached session. Session transcripts and raw output are stored as CAS detail nodes.
+Layer 3 agent implementation. Wraps the Hermes CLI using the Agent Client Protocol (ACP). On first visit to a role it sends a composed prompt (role definition, task, history, edge prompt); on continuation it resumes the cached session. Session transcripts and raw output are stored as CAS detail nodes.

 **Dependencies:** `@uncaged/json-cas`, `@uncaged/workflow-util-agent`, `@uncaged/workflow-protocol`, `@uncaged/workflow-util`

@@ -1,28 +0,0 @@
-import { describe, expect, test } from "bun:test";
-import { readFileSync } from "node:fs";
-import { join } from "node:path";
-
-const PKG_ROOT = join(import.meta.dir, "..");
-
-describe("Issue #551 — bin entry & engines", () => {
-  test("package.json declares bun in engines", () => {
-    const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8"));
-    expect(pkg.engines).toBeDefined();
-    expect(pkg.engines.bun).toBeDefined();
-    expect(pkg.engines.bun).toMatch(/^>=?\s*[\d.]+/);
-  });
-
-  test("bin entry file has bun shebang", () => {
-    const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8"));
-    const binPath = pkg.bin["uwf-hermes"];
-    const content = readFileSync(join(PKG_ROOT, binPath), "utf-8");
-    expect(content.startsWith("#!/usr/bin/env bun")).toBe(true);
-  });
-
-  test("README.md explains uwf-hermes is an adapter", () => {
-    const readme = readFileSync(join(PKG_ROOT, "README.md"), "utf-8");
-    expect(readme.toLowerCase()).toContain("adapter");
-    expect(readme).toMatch(/uwf-hermes/);
-    expect(readme).toMatch(/hermes/);
-  });
-});
@@ -1,15 +1,9 @@
-import { Database } from "bun:sqlite";
 import { describe, expect, test } from "bun:test";
-import { mkdtemp, rm, writeFile } from "node:fs/promises";
-import { tmpdir } from "node:os";
-import { join } from "node:path";
 import { createMemoryStore, refs, validate, walk } from "@uncaged/json-cas";

 import {
  computeDurationMs,
  extractLastAssistantContent,
-  getHermesDbPath,
-  loadHermesSessionFromDb,
  messageToTurnPayload,
  parseSessionIdFromStdout,
  storeHermesSessionDetail,
@@ -130,236 +124,3 @@ describe("storeHermesSessionDetail", () => {
    }
  });
 });
-
-// ── SQLite fallback tests ──────────────────────────────────────────
-
-function createTestDb(dbPath: string): Database {
-  const db = new Database(dbPath);
-  db.run(`CREATE TABLE sessions (
-    id TEXT PRIMARY KEY,
-    model TEXT NOT NULL,
-    started_at INTEGER NOT NULL
-  )`);
-  db.run(`CREATE TABLE messages (
-    id INTEGER PRIMARY KEY AUTOINCREMENT,
-    session_id TEXT NOT NULL,
-    role TEXT NOT NULL,
-    content TEXT,
-    reasoning TEXT,
-    tool_calls TEXT,
-    FOREIGN KEY (session_id) REFERENCES sessions(id)
-  )`);
-  return db;
-}
-
-describe("getHermesDbPath", () => {
-  test("returns correct path", () => {
-    const { homedir } = require("node:os");
-    const { join } = require("node:path");
-    expect(getHermesDbPath()).toBe(join(homedir(), ".hermes", "state.db"));
-  });
-});
-
-describe("loadHermesSessionFromDb", () => {
-  test("returns session data from SQLite", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-
-    const sessionId = "test-session-001";
-    const startedAt = 1748099519;
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "claude-opus-4.6",
-      startedAt,
-    ]);
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "user", "hello", null, null],
-    );
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "assistant", "hi there", "thinking...", null],
-    );
-    db.close();
-
-    const result = await loadHermesSessionFromDb(sessionId, dbPath);
-    expect(result).not.toBeNull();
-    expect(result!.session_id).toBe(sessionId);
-    expect(result!.model).toBe("claude-opus-4.6");
-    expect(result!.messages).toHaveLength(2);
-    expect(result!.messages[0]!.role).toBe("user");
-    expect(result!.messages[0]!.content).toBe("hello");
-    expect(result!.messages[1]!.role).toBe("assistant");
-    expect(result!.messages[1]!.content).toBe("hi there");
-    expect(result!.messages[1]!.reasoning).toBe("thinking...");
-
-    await rm(tmpDir, { recursive: true });
-  });
-
-  test("returns null when no session exists in DB", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-    db.close();
-
-    const result = await loadHermesSessionFromDb("nonexistent", dbPath);
-    expect(result).toBeNull();
-
-    await rm(tmpDir, { recursive: true });
-  });
-
-  test("returns null when DB file does not exist", async () => {
-    const result = await loadHermesSessionFromDb("any-id", "/tmp/nonexistent-hermes-db.db");
-    expect(result).toBeNull();
-  });
-
-  test("correctly parses tool_calls from DB JSON string", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-
-    const sessionId = "test-tool-calls";
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "gpt-4",
-      1748099519,
-    ]);
-    const toolCallsJson = JSON.stringify([
-      { function: { name: "read_file", arguments: '{"path":"x"}' } },
-    ]);
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "assistant", "", null, toolCallsJson],
-    );
-    db.close();
-
-    const result = await loadHermesSessionFromDb(sessionId, dbPath);
-    expect(result).not.toBeNull();
-    expect(result!.messages[0]!.tool_calls).toEqual([
-      { function: { name: "read_file", arguments: '{"path":"x"}' } },
-    ]);
-
-    await rm(tmpDir, { recursive: true });
-  });
-
-  test("handles null fields in DB messages gracefully", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-
-    const sessionId = "test-nulls";
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "model",
-      1748099519,
-    ]);
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "assistant", null, null, null],
-    );
-    db.close();
-
-    const result = await loadHermesSessionFromDb(sessionId, dbPath);
-    expect(result).not.toBeNull();
-    const msg = result!.messages[0]!;
-    expect(msg.content).toBeNull();
-    expect(msg.reasoning).toBeNull();
-    expect(msg.tool_calls).toBeNull();
-
-    await rm(tmpDir, { recursive: true });
-  });
-
-  test("messages ordered by insertion order", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-
-    const sessionId = "test-order";
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "model",
-      1748099519,
-    ]);
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "user", "first", null, null],
-    );
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "assistant", "second", null, null],
-    );
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "user", "third", null, null],
-    );
-    db.close();
-
-    const result = await loadHermesSessionFromDb(sessionId, dbPath);
-    expect(result).not.toBeNull();
-    expect(result!.messages.map((m) => m.content)).toEqual(["first", "second", "third"]);
-
-    await rm(tmpDir, { recursive: true });
-  });
-
-  test("converts unix timestamp to ISO string for session_start", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const db = createTestDb(dbPath);
-
-    const sessionId = "test-timestamp";
-    const startedAt = 1748099519;
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "model",
-      startedAt,
-    ]);
-    db.close();
-
-    const result = await loadHermesSessionFromDb(sessionId, dbPath);
-    expect(result).not.toBeNull();
-    expect(result!.session_start).toBe(new Date(startedAt * 1000).toISOString());
-
-    await rm(tmpDir, { recursive: true });
-  });
-});
-
-describe("loadHermesSession with SQLite fallback", () => {
-  test("JSON file takes priority over DB", async () => {
-    const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
-    const dbPath = join(tmpDir, "state.db");
-    const jsonPath = join(tmpDir, "session.json");
-
-    // Create DB with one model value
-    const db = createTestDb(dbPath);
-    const sessionId = "test-priority";
-    db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
-      sessionId,
-      "db-model",
-      1748099519,
-    ]);
-    db.run(
-      "INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
-      [sessionId, "user", "from db", null, null],
-    );
-    db.close();
-
-    // Create JSON file with a different model value
-    const jsonData: HermesSessionJson = {
-      session_id: sessionId,
-      model: "json-model",
-      session_start: "2026-05-24T12:00:00.000Z",
-      messages: [{ role: "user", content: "from json", reasoning: null, tool_calls: null }],
-    };
-    await writeFile(jsonPath, JSON.stringify(jsonData));
-
-    // loadHermesSession reads from JSON path, so we test the existing function directly
-    // The JSON-first priority is inherent in the implementation
-    const { readFile } = await import("node:fs/promises");
-    const text = await readFile(jsonPath, "utf8");
-    const parsed = JSON.parse(text);
-    expect(parsed.model).toBe("json-model");
-
-    await rm(tmpDir, { recursive: true });
-  });
-});
@@ -42,8 +42,5 @@
  "bugs": {
    "url": "https://github.com/shazhou-ww/uncaged-workflow/issues"
  },
-  "engines": {
-    "bun": ">= 1.0.0"
-  },
  "license": "MIT"
 }
@@ -1,4 +1,3 @@
-import { Database } from "bun:sqlite";
 import { readFile } from "node:fs/promises";
 import { homedir } from "node:os";
 import { join } from "node:path";
@@ -109,99 +108,15 @@ function parseSessionJson(raw: unknown): HermesSessionJson | null {
  return { session_id, model, session_start, messages };
 }

-export function getHermesDbPath(): string {
-  return join(homedir(), ".hermes", "state.db");
-}
-
-type DbSessionRow = {
-  id: string;
-  model: string;
-  started_at: number;
-};
-
-type DbMessageRow = {
-  role: string;
-  content: string | null;
-  reasoning: string | null;
-  tool_calls: string | null;
-};
-
-function parseDbToolCalls(raw: string | null): HermesSessionMessage["tool_calls"] {
-  if (raw === null) {
-    return null;
-  }
-  try {
-    const parsed = JSON.parse(raw) as unknown;
-    return parseToolCalls(parsed);
-  } catch {
-    return null;
-  }
-}
-
-function dbMessageToSessionMessage(row: DbMessageRow): HermesSessionMessage {
-  return {
-    role: row.role,
-    content: row.content ?? null,
-    reasoning: row.reasoning ?? null,
-    tool_calls: parseDbToolCalls(row.tool_calls),
-  };
-}
-
-export function loadHermesSessionFromDb(
-  sessionId: string,
-  dbPath: string | null = null,
-): HermesSessionJson | null {
-  const resolvedPath = dbPath ?? getHermesDbPath();
-  let db: InstanceType<typeof Database> | null = null;
-  try {
-    db = new Database(resolvedPath, { readonly: true });
-    const session = db
-      .query("SELECT id, model, started_at FROM sessions WHERE id = ?")
-      .get(sessionId) as DbSessionRow | null;
-    if (session === null) {
-      return null;
-    }
-    const rows = db
-      .query(
-        "SELECT role, content, reasoning, tool_calls FROM messages WHERE session_id = ? ORDER BY id",
-      )
-      .all(sessionId) as DbMessageRow[];
-
-    const messages: HermesSessionMessage[] = [];
-    for (const row of rows) {
-      const role = row.role;
-      if (role !== "user" && role !== "assistant" && role !== "tool") {
-        continue;
-      }
-      messages.push(dbMessageToSessionMessage(row));
-    }
-
-    return {
-      session_id: session.id,
-      model: session.model,
-      session_start: new Date(session.started_at * 1000).toISOString(),
-      messages,
-    };
-  } catch {
-    return null;
-  } finally {
-    db?.close();
-  }
-}
-
 export async function loadHermesSession(sessionId: string): Promise<HermesSessionJson | null> {
  const path = getHermesSessionPath(sessionId);
  try {
    const text = await readFile(path, "utf8");
    const raw = JSON.parse(text) as unknown;
-    const result = parseSessionJson(raw);
-    if (result !== null) {
-      return result;
-    }
+    return parseSessionJson(raw);
  } catch {
-    // JSON file not available, fall through to DB
+    return null;
  }
-  return loadHermesSessionFromDb(sessionId);
 }

 export function computeDurationMs(sessionStart: string, nowMs: number = Date.now()): number {
@@ -100,7 +100,7 @@ type ProviderAlias = string;
 type ModelAlias = string;
 type AgentAlias = string;

-type ProviderConfig = { baseUrl: string; apiKey: string };
+type ProviderConfig = { baseUrl: string; apiKeyEnv: string };
 type ModelConfig = {
  provider: ProviderAlias;
  name: string;
@@ -151,7 +151,7 @@ export type Scenario = string;

 export type ProviderConfig = {
  baseUrl: string;
-  apiKey: string;
+  apiKeyEnv: string;
 };

 export type ModelConfig = {
@@ -1,7 +1,8 @@
 import { getSchema, validate } from "@uncaged/json-cas";

 import type { CasRef, ModelAlias, WorkflowConfig } from "@uncaged/workflow-protocol";
-import { createAgentStore, resolveStorageRoot } from "./storage.js";
+import { config as loadDotenv } from "dotenv";
+import { createAgentStore, getEnvPath, resolveStorageRoot } from "./storage.js";

 export type ResolvedLlmProvider = {
  baseUrl: string;
@@ -37,9 +38,9 @@ export function resolveModel(config: WorkflowConfig, alias: ModelAlias): Resolve
  if (providerEntry === undefined) {
    throw new Error(`unknown provider "${modelEntry.provider}" for model "${alias}"`);
  }
-  const apiKey = providerEntry.apiKey;
+  const apiKey = process.env[providerEntry.apiKeyEnv];
  if (apiKey === undefined || apiKey === "") {
-    throw new Error(`missing API key for provider: ${modelEntry.provider}`);
+    throw new Error(`missing API key env var: ${providerEntry.apiKeyEnv}`);
  }
  return {
    baseUrl: providerEntry.baseUrl,
@@ -129,7 +130,7 @@ export type ExtractResult = {

 /**
 * Call an OpenAI-compatible LLM to extract structured output matching outputSchema.
- * Loads config.yaml from the workflow storage root.
+ * Loads config.yaml and .env from the workflow storage root.
 */
 export async function extract(
  rawOutput: string,
@@ -137,6 +138,7 @@ export async function extract(
  config: WorkflowConfig,
 ): Promise<ExtractResult> {
  const storageRoot = resolveStorageRoot();
+  loadDotenv({ path: getEnvPath(storageRoot) });

  const { store } = await createAgentStore(storageRoot);
  const schema = getSchema(store, outputSchema);
@@ -84,11 +84,11 @@ function normalizeProviders(raw: unknown): Record<ProviderAlias, ProviderConfig>
      throw new Error(`config.providers.${name} must be a mapping`);
    }
    const baseUrl = entry.baseUrl;
-    const apiKey = entry.apiKey;
-    if (typeof baseUrl !== "string" || typeof apiKey !== "string") {
-      throw new Error(`config.providers.${name} requires baseUrl and apiKey`);
+    const apiKeyEnv = entry.apiKeyEnv;
+    if (typeof baseUrl !== "string" || typeof apiKeyEnv !== "string") {
+      throw new Error(`config.providers.${name} requires baseUrl and apiKeyEnv`);
    }
-    providers[name] = { baseUrl, apiKey };
+    providers[name] = { baseUrl, apiKeyEnv };
  }
  return providers;
 }
@@ -0,0 +1,78 @@
+# @uncaged/workflow-util
+
+## 0.5.0-alpha.4
+
+### Patch Changes
+
+- Replace optionalEnv/requireEnv with unified env(name, fallback) API
+- Updated dependencies [f74b482]
+- Updated dependencies [f74b482]
+  - @uncaged/workflow-protocol@0.5.0-alpha.4
+
+## 0.5.0-alpha.3
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.5.0-alpha.3
+
+## 0.5.0-alpha.2
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.5.0-alpha.2
+
+## 0.5.0-alpha.1
+
+### Patch Changes
+
+- @uncaged/workflow-protocol@0.5.0-alpha.1
+
+## 0.5.0-alpha.0
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.5.0-alpha.0
+
+## 0.4.5
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.4.5
+
+## 0.4.4
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.4.4
+
+## 0.4.3
+
+### Patch Changes
+
+- Include src/ in published packages so bun runtime can resolve the 'bun' exports condition.
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.4.3
+
+## 0.4.2
+
+### Patch Changes
+
+- Fix workspace dependency resolution: use workspace:^ so published packages resolve to compatible versions instead of exact (non-existent) versions.
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.4.2
+
+## 0.4.0
+
+### Minor Changes
+
+- Fix package exports for published packages and adopt changesets for version management.
+
+### Patch Changes
+
+- Updated dependencies
+  - @uncaged/workflow-protocol@0.4.0
@@ -1,68 +0,0 @@
-export function generateActorReference(): string {
-  return `# Actor Reference
-
-You are executing a workflow role. Your system prompt defines your goal, procedure, and output requirements. This reference covers two things you need to know about the workflow engine.
-
-## 1. Frontmatter Output Protocol
-
-Your response **MUST** begin with a YAML frontmatter block at byte position 0 — no preamble text before it.
-
-\`\`\`
---
-status: done
-myField: some value
---
-
-... markdown body (your work, explanation, notes) ...
-\`\`\`
-
-### Standard Field
-
-| Field | Values | Default | Description |
-|-------|--------|---------|-------------|
-| \`status\` | \`done\`, \`needs_input\`, \`in_progress\`, \`failed\` | \`done\` | Completion signal — determines which graph edge the moderator follows next |
-
-### Schema-Defined Fields
-
-Your role's output schema (shown in the system prompt under "Deliverable Format") defines additional fields. Output **only** the fields listed there — do not invent extra fields.
-
-### Body
-
-Everything after the closing \`---\` fence is the markdown body. Use it for explanations, logs, or human-readable notes. The body is stored but not parsed by the engine.
-
-### Retry
-
-If the engine cannot parse your frontmatter, it will ask you to retry (up to 2 times). Just output the corrected frontmatter block — don't panic.
-
-## 2. CAS (Content-Addressable Store)
-
-Your frontmatter output is automatically stored in CAS. You can also **use CAS directly** to store intermediate artifacts, build merkle DAGs for large outputs, or reference data from previous steps.
-
-### Commands
-
-\`\`\`
-uwf cas put-text <text>           # store plain text, print hash
-uwf cas put <type-hash> <json>    # store typed JSON data, print hash
-uwf cas get <hash>                # read a CAS node (type + payload)
-uwf cas has <hash>                # check if a hash exists
-uwf cas refs <hash>               # list direct references from a node
-uwf cas walk <hash>               # recursive traversal from a node
-uwf cas schema list               # list registered schemas
-uwf cas schema get <hash>         # show a schema definition
-\`\`\`
-
-### Merkle DAG Pattern
-
-For large outputs, store parts individually and reference their hashes:
-
-\`\`\`bash
-# Store individual sections
-HASH1=$(uwf cas put-text "section 1 content")
-HASH2=$(uwf cas put-text "section 2 content")
-
-# Reference hashes in your frontmatter or in a parent node
-\`\`\`
-
-This enables progressive loading — consumers can fetch the root and resolve children on demand.
-`;
-}
@@ -1,163 +0,0 @@
-export function generateAdapterReference(): string {
-  return `# Adapter Reference
-
-Guide for building a new agent adapter (CLI binary) for the workflow engine.
-
-## What Is an Adapter
-
-An adapter is a CLI command (e.g. \`uwf-hermes\`, \`uwf-builtin\`) that the engine spawns to execute a role. It bridges the workflow engine and an LLM/agent backend. The engine calls it with:
-
-\`\`\`
-uwf-<name> --thread <id> --role <role> --prompt <text>
-\`\`\`
-
-The adapter must produce frontmatter markdown output. The engine handles argument parsing, context building, output extraction, and CAS persistence — you just implement the LLM interaction.
-
-## Quick Start
-
-\`\`\`typescript
-import { createAgent } from "@uncaged/workflow-util-agent";
-import type { AgentContext, AgentRunResult, AgentContinueFn, AgentRunFn } from "@uncaged/workflow-util-agent";
-
-const run: AgentRunFn = async (ctx: AgentContext): Promise<AgentRunResult> => {
-  // 1. Build your prompt from ctx
-  // 2. Call your LLM backend
-  // 3. Return the result
-  return { output: rawMarkdown, detailHash, sessionId };
-};
-
-const continue_: AgentContinueFn = async (sessionId, message, store) => {
-  // Resume an existing session with a correction message
-  return { output: correctedMarkdown, detailHash, sessionId };
-};
-
-const main = createAgent({ name: "my-agent", run, continue: continue_ });
-main();
-\`\`\`
-
-## The \`createAgent\` Factory
-
-\`createAgent(options)\` returns an async \`main()\` function that handles the full lifecycle:
-
-1. Parses CLI args (\`--thread\`, \`--role\`, \`--prompt\`)
-2. Loads \`.env\` from storage root
-3. Builds \`AgentContext\` (thread history, workflow definition, role prompt)
-4. Injects \`outputFormatInstruction\` from the role's frontmatter schema
-5. Calls your \`run(ctx)\` function
-6. Extracts frontmatter from your output via \`tryFrontmatterFastPath()\`
-7. If extraction fails, calls your \`continue(sessionId, correctionMessage, store)\` up to 2 times
-8. Persists the validated output as a CAS step node
-9. Prints the step hash to stdout
-
-You only implement \`run\` and \`continue\`.
-
-## AgentOptions
-
-\`\`\`typescript
-type AgentOptions = {
-  name: string;           // Adapter name (used in step records as "uwf-<name>")
-  run: AgentRunFn;        // Execute a role from scratch
-  continue: AgentContinueFn;  // Resume a session for frontmatter correction
-};
-\`\`\`
-
-## AgentContext
-
-The \`ctx\` object passed to your \`run\` function:
-
-| Field | Type | Description |
-|-------|------|-------------|
-| \`threadId\` | \`string\` | Thread ULID |
-| \`role\` | \`string\` | Role name being executed |
-| \`edgePrompt\` | \`string\` | Moderator's task instruction for this step |
-| \`workflow\` | \`WorkflowPayload\` | Full workflow definition (roles, graph) |
-| \`start\` | \`StartNodePayload\` | Thread start data (workflow hash, user prompt) |
-| \`steps\` | \`StepContext[]\` | Previous steps with expanded outputs |
-| \`store\` | \`Store\` | CAS store for reading/writing data |
-| \`outputFormatInstruction\` | \`string\` | Frontmatter format instruction (inject into system prompt) |
-| \`isFirstVisit\` | \`boolean\` | True if this role hasn't run before in this thread |
-
-## AgentRunResult
-
-Your \`run\` and \`continue\` functions must return:
-
-\`\`\`typescript
-type AgentRunResult = {
-  output: string;       // Raw markdown with frontmatter (must start with ---)
-  detailHash: string;   // CAS hash of session detail (turn history, metadata)
-  sessionId: string;    // Session ID for potential continue() calls
-};
-\`\`\`
-
-## Building the Prompt
-
-Use helpers from \`@uncaged/workflow-util-agent\`:
-
-| Helper | Purpose |
-|--------|---------|
-| \`buildRolePrompt(roleDef)\` | Assemble Goal/Capabilities/Prepare/Procedure/Output sections |
-| \`buildContinuationPrompt(steps, role, edgePrompt)\` | For re-entry: steps since last visit + edge prompt |
-| \`ctx.outputFormatInstruction\` | Pre-built frontmatter format block (inject into system prompt) |
-
-Typical system prompt structure:
-\`\`\`
-[outputFormatInstruction]
-[rolePrompt from buildRolePrompt()]
-[workflow metadata]
-\`\`\`
-
-## Storing Session Detail
-
-Store your turn history as a CAS merkle DAG for debugging and replay:
-
-\`\`\`typescript
-// Store each turn as a CAS text node
-const turnHash = await store.put(textSchema, { content: turnData });
-
-// Build a detail node referencing all turns
-const detailHash = await store.put(detailSchema, { turns: turnHashes });
-\`\`\`
-
-The \`detailHash\` is preserved from the first \`run()\` call — retry \`continue()\` calls don't overwrite it.
-
-## Registration
-
-Register your adapter in \`~/.uncaged/workflow/config.yaml\`:
-
-\`\`\`yaml
-agents:
-  my-agent:
-    command: uwf-my-agent
-    args: []
-\`\`\`
-
-Use it:
-\`\`\`bash
-uwf thread exec <thread-id> --agent my-agent
-\`\`\`
-
-Or set as default:
-\`\`\`yaml
-defaultAgent: my-agent
-\`\`\`
-
-## Existing Adapters
-
-| Adapter | Package | Backend |
-|---------|---------|---------|
-| \`uwf-hermes\` | \`@uncaged/workflow-agent-hermes\` | Hermes ACP (chat sessions) |
-| \`uwf-builtin\` | \`@uncaged/workflow-agent-builtin\` | Direct OpenAI API (tools + loop) |
-| \`uwf-claude-code\` | \`@uncaged/workflow-agent-claude-code\` | Claude Code CLI |
-
-Study these for patterns on prompt building, session management, and detail storage.
-
-## Checklist
-
-1. Implement \`run(ctx)\` — build prompt, call LLM, return output + detailHash + sessionId
-2. Implement \`continue(sessionId, message, store)\` — resume session for frontmatter correction
-3. Store session detail as CAS nodes (for debugging)
-4. Ensure output starts with \`---\` frontmatter block
-5. Add a \`bin\` entry in \`package.json\` for the CLI command
-6. Register in config.yaml and test with \`uwf thread exec --agent <name>\`
-`;
-}
@@ -1,183 +0,0 @@
-export function generateAuthorReference(): string {
-  return `# Author Reference
-
-Guide for designing and writing workflow YAML definitions.
-
-## Workflow Structure
-
-\`\`\`yaml
-name: solve-issue              # verb-first kebab-case
-description: "..."             # human-readable summary
-
-roles:                         # named actors
-  planner:
-    description: "..."         # short purpose
-    goal: "..."                # system-level goal for the agent
-    capabilities: [...]        # skill keywords the agent should load
-    procedure: |               # step-by-step instructions
-      1. Do this
-      2. Do that
-    output: "..."              # what the agent should produce
-    frontmatter:               # JSON Schema for structured output
-      oneOf:
-        - properties:
-            $status: { const: "ready" }
-            plan: { type: string }
-          required: [$status, plan]
-        - properties:
-            $status: { const: "failed" }
-            error: { type: string }
-          required: [$status, error]
-
-graph:                         # status-based routing
-  $START:
-    _: { role: planner, prompt: "Analyze the issue." }
-  planner:
-    ready: { role: developer, prompt: "Implement {{{plan}}}." }
-    failed: { role: $END, prompt: "Failed: {{{error}}}" }
-\`\`\`
-
-## Role Definition
-
-| Field | Purpose |
-|-------|---------|
-| \`description\` | Short description for humans and moderator context |
-| \`goal\` | Injected as the agent's system-level objective |
-| \`capabilities\` | Keyword tags — agent loads matching skills before starting |
-| \`procedure\` | Step-by-step instructions the agent follows |
-| \`output\` | Describes what to produce and which \`$status\` values to use |
-| \`frontmatter\` | JSON Schema defining the structured output fields |
-
-### Role Design Principles
-
- **Single responsibility** — each role does one thing well
- **Minimal context** — don't overload a role with too many steps; split if needed
- **Clear status values** — each status should map to a distinct graph edge
- **Explicit output** — tell the agent exactly what \`$status\` values are valid
-
-## Frontmatter Schema
-
-The \`frontmatter\` field is a standard JSON Schema. It defines the structured fields the agent must output in YAML frontmatter.
-
-### \`$status\` Field
-
-\`$status\` is the only standard field. Its value determines which graph edge the moderator follows. Use \`const\` to constrain each variant:
-
-\`\`\`yaml
-frontmatter:
-  oneOf:
-    - properties:
-        $status: { const: "done" }
-        result: { type: string }
-      required: [$status, result]
-    - properties:
-        $status: { const: "failed" }
-        error: { type: string }
-      required: [$status, error]
-\`\`\`
-
-### Custom Fields
-
-Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
-
-### Flat Schema (Single Status)
-
-When a role has only one outcome:
-
-\`\`\`yaml
-frontmatter:
-  properties:
-    $status: { const: "done" }
-    summary: { type: string }
-  required: [$status, summary]
-\`\`\`
-
-## Graph Routing
-
-The graph maps each role's \`$status\` values to the next role:
-
-\`\`\`
-graph[role][$status] → { role: nextRole, prompt: edgePrompt }
-\`\`\`
-
-### Special Nodes
-
-| Node | Purpose |
-|------|---------|
-| \`$START\` | Entry point — status key is always \`_\` (unconditional) |
-| \`$END\` | Terminal — thread completes and is archived |
-
-### Edge Prompts
-
-Use triple-brace Mustache (\`{{{field}}}\`) to pass data from the previous step's output:
-
-\`\`\`yaml
-graph:
-  planner:
-    ready: { role: developer, prompt: "Implement plan {{{plan}}} in {{{repoPath}}}." }
-\`\`\`
-
-The fields referenced must exist in the source role's frontmatter schema.
-
-### Loops and Branching
-
-Roles can route back to previous roles (loops) or to different roles based on status (branching):
-
-\`\`\`yaml
-graph:
-  reviewer:
-    approved: { role: tester, prompt: "Run tests." }
-    rejected: { role: developer, prompt: "Fix: {{{comments}}}" }  # loop back
-\`\`\`
-
-### Fail Routing
-
-Route failures to a cleanup role or \`$END\`:
-
-\`\`\`yaml
-graph:
-  developer:
-    done: { role: reviewer, prompt: "Review changes." }
-    failed: { role: cleanup, prompt: "Clean up: {{{error}}}" }
-\`\`\`
-
-## Self-Testing
-
-### Step-by-Step Verification
-
-\`\`\`bash
-# Start a thread directly from YAML file (no registration needed)
-uwf thread start my-workflow.yaml -p "Test prompt"
-
-# Or register first, then start by name
-uwf workflow add my-workflow.yaml
-uwf thread start my-workflow -p "Test prompt"
-
-# Execute one step at a time to verify routing
-uwf thread exec <thread-id>
-
-# Inspect step output
-uwf step list <thread-id>
-uwf step show <step-hash>
-
-# Check the CAS data
-uwf cas get <output-hash>
-\`\`\`
-
-### Validation Checklist
-
-1. Every \`$status\` value in a role's frontmatter has a matching edge in the graph
-2. Every field referenced in edge prompts (\`{{{field}}}\`) exists in the source role's schema
-3. Every role referenced in the graph exists in \`roles\`
-4. \`$START\` has exactly one edge with key \`_\`
-5. At least one path leads to \`$END\`
-6. No orphan roles (defined but never routed to)
-
-## Common Pitfalls
-
- **Missing graph edge** — if a role can produce \`$status: failed\` but the graph has no \`failed\` edge, the moderator will error
- **Mustache field mismatch** — referencing \`{{{branch}}}\` in an edge prompt but the source schema has \`branchName\` instead
- **Overly complex roles** — a role with 20 steps should be split; each role should be completable in one agent turn
- **No fail path** — always handle failure; route to cleanup or \`$END\`
-`;
-}
@@ -1,140 +0,0 @@
-export function generateDeveloperReference(): string {
-  return `# Developer Reference
-
-Guide for contributing to the workflow engine codebase.
-
-## Monorepo Structure
-
-\`\`\`
-packages/
-  workflow-protocol/      # Shared types (WorkflowPayload, StepNodePayload, etc.)
-  workflow-util/          # Base32, ULID, logger, frontmatter parsing, skill references
-  workflow-util-agent/    # createAgent factory, context builder, extract pipeline
-  workflow-agent-hermes/  # uwf-hermes CLI (spawns Hermes chat sessions)
-  workflow-agent-builtin/ # uwf-builtin CLI (direct LLM calls via OpenAI API)
-  cli-workflow/           # uwf CLI (moderator, thread/step/cas/config commands)
-\`\`\`
-
-Dependency layers (each only imports from packages above it):
-\`\`\`
-protocol → util → util-agent → agent-hermes / agent-builtin / cli-workflow
-\`\`\`
-
-External CAS: \`@uncaged/json-cas\` (store API, hashing, schema validation) + \`@uncaged/json-cas-fs\` (filesystem backend).
-
-## Coding Conventions
-
-### Functional-first
-
-| Rule | Description |
-|------|-------------|
-| \`type\` over \`interface\` | All type definitions use \`type\` |
-| \`function\` over \`class\` | Pure functions + closures, no class |
-| No \`this\` | Functions must not depend on \`this\` context |
-| No inheritance | No \`extends\`, \`implements\`, \`abstract\` |
-| No optional properties | Use \`T \\| null\` instead of \`?:\` |
-| Immutability first | Use \`Readonly<T>\`, \`as const\`, avoid mutation |
-
-Classes allowed only when required by third-party libraries or for Error subclasses.
-
-### Error Handling
-
- \`Result<T, E>\` type for expected failures (\`ok\`/\`err\` constructors from \`@uncaged/workflow-util\`)
- \`throw\` only for unrecoverable bugs
- No try-catch for flow control
-
-### Async
-
-Always \`async/await\`, never \`.then()\` chains.
-
-### Logging
-
-\`console.*\` is banned (Biome \`noConsole\` rule). Use the structured logger:
-
-\`\`\`typescript
-import { createLogger } from "@uncaged/workflow-util";
-const log = createLogger();
-log("4KNMR2PX", "Loading workflow...");  // 8-char Crockford Base32 tag
-\`\`\`
-
-Each call site gets a unique hand-written tag. \`grep "4KNMR2PX"\` in logs → instant code location.
-
-CLI package (\`@uncaged/cli-workflow\`) may use \`console.log\` for user-facing output with a biome-ignore comment.
-
-### No Dynamic Import
-
-No \`await import()\` in production code. Always static top-level \`import\`. Test files are exempt.
-
-### Naming
-
- Workflow names: verb-first kebab-case (\`solve-issue\`, \`review-code\`)
- IDs: Crockford Base32 — CAS hash (XXH64, 13-char), Thread ID (ULID, 26-char)
-
-## Development Workflow
-
-\`\`\`bash
-bun install                 # install all workspace deps
-bun run build               # tsc --build (all packages)
-bun run check               # tsc + biome check + lint-log-tags
-bun run format              # biome format --write
-bun test                    # run all tests
-\`\`\`
-
-Before committing: \`bun run check\` + \`bun test\` must both pass.
-
-### Testing
-
- \`cli-workflow\`: vitest
- Other packages: \`bun test\`
- Test files live in \`__tests__/\` directories
-
-### Publishing
-
-Fixed-mode versioning — all \`@uncaged/*\` packages share the same version number.
-
-\`\`\`bash
-bun changeset               # describe the change
-bun version                 # bump versions + changelogs
-bun release                 # build + test + publish to npmjs
-\`\`\`
-
-## Key Modules
-
-### Moderator (\`cli-workflow/src/moderator/\`)
-
-Status-based graph evaluator. Reads \`graph[lastRole][output.$status]\` to determine the next role. Zero LLM cost.
-
-### Extract Pipeline (\`workflow-util-agent/src/\`)
-
-1. Agent produces frontmatter markdown
-2. \`parseFrontmatterMarkdown()\` extracts YAML frontmatter
-3. \`tryFrontmatterFastPath()\` validates against role's output schema
-4. If fast path fails, retries up to 2 times via agent continue
-5. Validated output stored as CAS node
-
-### createAgent Factory (\`workflow-util-agent/src/run.ts\`)
-
-Shared entry point for all agent CLIs. Handles:
- Argument parsing (\`--thread\`, \`--role\`, \`--prompt\`)
- Context building (thread history, workflow definition)
- Output extraction and CAS persistence
- Frontmatter retry loop
-
-### CAS Integration
-
-All data is CAS-addressed via \`@uncaged/json-cas\`:
- \`store.put(schemaHash, data)\` → content hash
- \`store.get(hash)\` → node
- \`validate(store, node)\` → schema check
- Schemas registered at workflow add time
-
-## Commit Convention
-
-\`\`\`
-<type>(<scope>): <description>
-
-type: feat | fix | refactor | docs | chore | test
-scope: workflow | cli | moderator | util-agent | hermes | util | protocol
-\`\`\`
-`;
-}
@@ -1,10 +1,6 @@
-export { generateActorReference } from "./actor-reference.js";
-export { generateAdapterReference } from "./adapter-reference.js";
 export { generateArchitectureReference } from "./architecture-reference.js";
-export { generateAuthorReference } from "./author-reference.js";
 export { encodeUint64AsCrockford } from "./base32.js";
 export { generateCliReference } from "./cli-reference.js";
-export { generateDeveloperReference } from "./developer-reference.js";
 export { env } from "./env.js";
 export type {
  AgentFrontmatter,
@@ -31,5 +27,4 @@ export { err, ok } from "./result.js";
 export { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "./storage-root.js";
 export type { LogFn, Result } from "./types.js";
 export { extractUlidTimestamp, generateUlid } from "./ulid.js";
-export { generateUserReference } from "./user-reference.js";
 export { generateYamlReference } from "./yaml-reference.js";
@@ -1,125 +0,0 @@
-export function generateUserReference(): string {
-  return `# User Reference
-
-Guide for using the uwf CLI to manage workflows and threads.
-
-## Quick Start
-
-\`\`\`bash
-# 1. Configure provider and model
-uwf setup
-
-# 2. Register a workflow
-uwf workflow add my-workflow.yaml
-
-# 3. Start a thread (creates but does not execute)
-uwf thread start my-workflow -p "Build a login page"
-
-# 4. Execute the thread (runs moderator → agent → extract cycles)
-uwf thread exec <thread-id>          # one step
-uwf thread exec <thread-id> -c 10    # up to 10 steps
-uwf thread exec <thread-id> -c 10 --background  # run in background
-\`\`\`
-
-## Concepts
-
- **Workflow** — YAML definition with roles and a routing graph; stored as a CAS node
- **Thread** — A running instance of a workflow; a chain of step nodes in CAS
- **Step** — One moderator → agent → extract cycle; contains the role's structured output
- **CAS** — Content-addressable store; every artifact is hashed (XXH64, Crockford Base32)
-
-## Setup
-
-\`\`\`
-uwf setup                                          # interactive wizard
-uwf setup --provider <name> --base-url <url> \\
-           --api-key <key> --model <name>           # non-interactive
-           [--agent <name>]                         # optional default agent
-\`\`\`
-
-Config is stored at \`~/.uncaged/workflow/config.yaml\`. Override storage root with \`UNCAGED_WORKFLOW_STORAGE_ROOT\`.
-
-## Workflow Commands
-
-\`\`\`
-uwf workflow add <file>            # register from YAML file
-uwf workflow show <id>             # show by name or CAS hash
-uwf workflow list                  # list all registered workflows
-\`\`\`
-
-You can also pass a file path directly to \`uwf thread start\` without registering first.
-
-## Thread Lifecycle
-
-\`\`\`
-uwf thread start <workflow> -p <prompt>            # create thread
-uwf thread exec <thread-id>                        # execute one step
-               [--agent <cmd>]                     # override agent
-               [-c, --count <n>]                   # run n steps
-               [--background]                      # run in background
-uwf thread show <thread-id>                        # show head pointer
-uwf thread list                                    # list all threads
-               [--status <filter>]                 # idle, running, completed, cancelled, active (comma-separated)
-               [--after <thread-id>]               # pagination: after this thread
-               [--before <thread-id>]              # pagination: before this thread
-               [--skip <n>]                        # skip first n results
-               [--take <n>]                        # limit results
-uwf thread read <thread-id>                        # render context as markdown
-               [--quota <chars>]                   # max output chars (default 4000)
-               [--before <step-hash>]              # pagination
-               [--start]                           # include start step
-uwf thread stop <thread-id>                        # stop background execution
-uwf thread cancel <thread-id>                      # cancel and archive thread
-\`\`\`
-
-### Typical Lifecycle
-
-\`\`\`
-start → exec (repeat) → thread reaches $END → auto-completed
-                       → or: cancel to abort
-\`\`\`
-
-## Step Commands
-
-\`\`\`
-uwf step list <thread-id>         # list all steps
-uwf step show <step-hash>         # show step details
-uwf step fork <step-hash>         # fork thread from a step (branch)
-\`\`\`
-
-Forking creates a new thread that shares history up to the fork point — useful for retrying from a known-good state.
-
-## CAS Commands
-
-\`\`\`
-uwf cas get <hash>                 # read a node (type + payload)
-            [--timestamp]          # include timestamp
-uwf cas put <type-hash> <data>     # store typed JSON, print hash
-uwf cas put-text <text>            # store plain text, print hash
-uwf cas has <hash>                 # check existence
-uwf cas refs <hash>                # list direct references
-uwf cas walk <hash>                # recursive traversal
-uwf cas reindex                    # rebuild type index
-uwf cas schema list                # list schemas
-uwf cas schema get <hash>          # show schema definition
-\`\`\`
-
-## Log Commands
-
-\`\`\`
-uwf log list                       # list log files
-uwf log show                       # show log entries
-           [--thread <id>]         # filter by thread
-           [--process <pid>]       # filter by process
-           [--date <YYYY-MM-DD>]   # filter by date
-uwf log clean --before <date>      # delete old logs
-\`\`\`
-
-## Global Options
-
-\`\`\`
-uwf --format <json|yaml>           # output format (default: json)
-uwf -V, --version                  # print version
-\`\`\`
-`;
-}
@@ -1,120 +0,0 @@
-#!/usr/bin/env bash
-# Check development environment prerequisites for uncaged/workflow.
-# Non-interactive — prints actionable fix instructions on failure.
-# Exit 0 = all good, exit 1 = missing dependencies.
-set -euo pipefail
-
-errors=0
-
-check() {
-  local name="$1" check_cmd="$2" fix_msg="$3"
-  if eval "$check_cmd" >/dev/null 2>&1; then
-    echo "✅ $name"
-  else
-    echo "❌ $name"
-    echo "   Fix: $fix_msg"
-    errors=$((errors + 1))
-  fi
-}
-
-check_version() {
-  local name="$1" cmd="$2" fix_msg="$3"
-  local version
-  if version=$(eval "$cmd" 2>/dev/null | head -1); then
-    echo "✅ $name — $version"
-  else
-    echo "❌ $name"
-    echo "   Fix: $fix_msg"
-    errors=$((errors + 1))
-  fi
-}
-
-echo "=== Runtime ==="
-check_version "bun" "bun --version" \
-  "curl -fsSL https://bun.sh/install | bash"
-
-check_version "node" "node --version" \
-  "Install Node.js 20+: https://nodejs.org/"
-
-check_version "python3" "python3 --version" \
-  "Install Python 3.11+: https://www.python.org/ or use uv: curl -LsSf https://astral.sh/uv/install.sh | sh && uv python install 3.11"
-
-echo ""
-echo "=== Tools ==="
-check_version "hermes" "hermes --version" \
-  "See https://github.com/hermes-ai/hermes-agent for installation. Typical: pip install hermes-agent (or uv pip install -e . for dev)"
-
-check_version "claude" "claude --version" \
-  "npm install -g @anthropic-ai/claude-code"
-
-echo ""
-echo "=== Workflow ==="
-
-# Check repo location
-REPO_DIR="${WORKFLOW_REPO:-$(cd "$(dirname "$0")/.." && pwd)}"
-check "repo at ~/repos/workflow or WORKFLOW_REPO set" \
-  "[ -f '$REPO_DIR/packages/cli-workflow/src/cli.ts' ]" \
-  "Clone the repo: git clone https://git.shazhou.work/uncaged/workflow ~/repos/workflow"
-
-# Check bun install
-check "node_modules installed" \
-  "[ -d '$REPO_DIR/node_modules' ]" \
-  "cd $REPO_DIR && bun install"
-
-# Check build
-check "packages built (dist/)" \
-  "[ -f '$REPO_DIR/packages/cli-workflow/dist/cli.js' ]" \
-  "cd $REPO_DIR && bun run build"
-
-# Check uwf is runnable
-check_version "uwf" "bun $REPO_DIR/packages/cli-workflow/src/cli.ts --version" \
-  "cd $REPO_DIR && bun install && bun run build"
-
-# Check uwf symlink
-check "uwf in PATH" \
-  "command -v uwf" \
-  "sudo ln -sf $REPO_DIR/packages/cli-workflow/dist/cli.js /usr/bin/uwf && sudo chmod +x /usr/bin/uwf"
-
-# Check uwf-hermes
-check "uwf-hermes in PATH" \
-  "command -v uwf-hermes" \
-  "bun link in packages/workflow-agent-hermes, or: echo '#!/usr/bin/env bun' > ~/.local/bin/uwf-hermes && echo 'import \"$REPO_DIR/packages/workflow-agent-hermes/src/cli.ts\"' >> ~/.local/bin/uwf-hermes && chmod +x ~/.local/bin/uwf-hermes"
-
-# Check uwf-claude-code
-check "uwf-claude-code in PATH" \
-  "command -v uwf-claude-code" \
-  "Create wrapper: echo '#!/bin/bash\nexec bun run $REPO_DIR/packages/workflow-agent-claude-code/src/cli.ts \"\$@\"' > ~/.local/bin/uwf-claude-code && chmod +x ~/.local/bin/uwf-claude-code"
-
-echo ""
-echo "=== Config ==="
-
-# Check workflow config exists
-CONFIG_DIR="${UNCAGED_WORKFLOW_STORAGE_ROOT:-$HOME/.uncaged/workflow}"
-check "config.yaml exists" \
-  "[ -f '$CONFIG_DIR/config.yaml' ]" \
-  "Run: uwf setup"
-
-# Check config has apiKey (not apiKeyEnv)
-if [ -f "$CONFIG_DIR/config.yaml" ]; then
-  check "config uses apiKey (not legacy apiKeyEnv)" \
-    "grep -q 'apiKey:' '$CONFIG_DIR/config.yaml' && ! grep -q 'apiKeyEnv:' '$CONFIG_DIR/config.yaml'" \
-    "Run: uwf setup (re-configure to write apiKey directly)"
-fi
-
-echo ""
-echo "=== Docker (optional, for E2E tests) ==="
-check_version "docker" "docker --version" \
-  "sudo apt install -y docker.io && sudo usermod -aG docker \$USER"
-
-check "docker daemon running" \
-  "docker info" \
-  "sudo systemctl start docker"
-
-echo ""
-if [ "$errors" -gt 0 ]; then
-  echo "⚠️  $errors issue(s) found. Fix them and re-run this script."
-  exit 1
-else
-  echo "🎉 All checks passed!"
-  exit 0
-fi
@@ -1,377 +0,0 @@
-#!/usr/bin/env bash
-# E2E walkthrough for uncaged/workflow.
-# Runs inside Docker with isolated UNCAGED_WORKFLOW_STORAGE_ROOT.
-# Exercises: setup → workflow add → thread start/exec → cancel/fork → read/inspect.
-#
-# Usage:
-#   sudo -E scripts/e2e-walkthrough.sh [--agent <agent>] [--provider <provider>] [--model <model>] [--api-key <key>]
-#
-# Requires: Docker running, $HOME mount approach (see scripts/check-dev-env.sh).
-# Produces: JSON report on stdout, logs in $E2E_DIR.
-#
-# IMPORTANT: Must run with `sudo -E` to preserve $HOME (Docker needs root).
-#
-# Known Issues (WIP):
-#   1. `echo '$OUT' | jq` breaks when $OUT contains single quotes (e.g. workflow show
-#      output with YAML). Fix: use heredoc or pipe variable directly.
-#   2. Config may still have old `apiKeyEnv` field — thread exec will fail with
-#      "no API key". Fix: re-run `uwf setup` or manually set `apiKey` in config.
-#   3. Bootstrap installs jq via apt-get which adds ~30s startup time.
-#      Consider baking a custom image or using node's JSON.parse instead.
-#   4. `bun install` in container may modify host's lockfile/node_modules.
-#      Consider `--frozen-lockfile` or read-only mount for non-essential paths.
-
-set -euo pipefail
-
-# --- Args ---
-AGENT="uwf-builtin"
-PROVIDER=""
-MODEL=""
-API_KEY=""
-KEEP_CONTAINER=false
-
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --agent)     AGENT="$2";    shift 2 ;;
-    --provider)  PROVIDER="$2"; shift 2 ;;
-    --model)     MODEL="$2";    shift 2 ;;
-    --api-key)   API_KEY="$2";  shift 2 ;;
-    --keep)      KEEP_CONTAINER=true; shift ;;
-    *) echo "Unknown arg: $1" >&2; exit 1 ;;
-  esac
-done
-
-# --- Resolve paths ---
-SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-REPO_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
-E2E_DIR=$(mktemp -d /tmp/uwf-e2e-XXXXXX)
-CONTAINER_NAME="uwf-e2e-$(date +%s)"
-
-echo "=== uwf E2E walkthrough ===" >&2
-echo "Agent:     $AGENT" >&2
-echo "Provider:  ${PROVIDER:-"(from config)"}" >&2
-echo "Model:     ${MODEL:-"(from config)"}" >&2
-echo "E2E dir:   $E2E_DIR" >&2
-echo "Container: $CONTAINER_NAME" >&2
-echo "" >&2
-
-# --- Cleanup ---
-cleanup() {
-  if [ "$KEEP_CONTAINER" = false ]; then
-    docker rm -f "$CONTAINER_NAME" 2>/dev/null || true
-  fi
-}
-trap cleanup EXIT
-
-# --- Build inner script ---
-# This runs INSIDE the container with an isolated storage root.
-cat > "$E2E_DIR/run.sh" << 'INNER_SCRIPT'
-#!/usr/bin/env bash
-set -euo pipefail
-
-# Isolated storage — never touches host's ~/.uncaged/workflow
-export UNCAGED_WORKFLOW_STORAGE_ROOT="/tmp/uwf-e2e-storage"
-mkdir -p "$UNCAGED_WORKFLOW_STORAGE_ROOT"
-
-REPO_DIR="$1"
-AGENT="$2"
-PROVIDER="$3"
-MODEL="$4"
-API_KEY="$5"
-
-# Ensure tools are in PATH (derive HOME from REPO_DIR to avoid container HOME issues)
-REAL_HOME="${6:-$HOME}"
-export HOME="$REAL_HOME"
-export PATH="$REAL_HOME/.bun/bin:$REAL_HOME/.hermes/hermes-agent/venv/bin:$REAL_HOME/.local/share/npm/bin:$PATH"
-
-# Resolve uwf
-UWF="bun $REPO_DIR/packages/cli-workflow/src/cli.ts"
-
-PASS=0
-FAIL=0
-RESULTS=()
-
-run_test() {
-  local name="$1"
-  shift
-  local output exit_code
-  echo "--- TEST: $name ---" >&2
-  output=$("$@" 2>&1) && exit_code=0 || exit_code=$?
-  if [ $exit_code -eq 0 ]; then
-    PASS=$((PASS + 1))
-    RESULTS+=("{\"name\":\"$name\",\"status\":\"pass\"}")
-    echo "  ✅ PASS" >&2
-  else
-    FAIL=$((FAIL + 1))
-    # Escape output for JSON
-    local escaped
-    escaped=$(echo "$output" | head -5 | tr '\n' ' ' | sed 's/"/\\"/g' | cut -c1-200)
-    RESULTS+=("{\"name\":\"$name\",\"status\":\"fail\",\"error\":\"$escaped\"}")
-    echo "  ❌ FAIL: $output" >&2
-  fi
-  echo "$output"
-}
-
-assert_contains() {
-  local haystack="$1" needle="$2"
-  if echo "$haystack" | grep -q "$needle"; then
-    return 0
-  else
-    echo "Expected to contain: $needle" >&2
-    echo "Got: $haystack" >&2
-    return 1
-  fi
-}
-
-assert_json_field() {
-  local json="$1" field="$2"
-  if echo "$json" | jq -e ".$field" >/dev/null 2>&1; then
-    return 0
-  else
-    echo "Missing JSON field: $field" >&2
-    return 1
-  fi
-}
-
-# ============================================================
-# Phase 1: Environment check
-# ============================================================
-echo "" >&2
-echo "=== Phase 1: Environment ===" >&2
-
-run_test "uwf --version" bash -c "$UWF --version"
-
-# ============================================================
-# Phase 2: Setup (non-interactive)
-# ============================================================
-echo "" >&2
-echo "=== Phase 2: Setup ===" >&2
-
-if [ -n "$PROVIDER" ] && [ -n "$MODEL" ] && [ -n "$API_KEY" ]; then
-  SETUP_CMD="$UWF setup --provider $PROVIDER --base-url https://api.openai.com/v1 --api-key $API_KEY --model $MODEL"
-  if [ -n "$AGENT" ]; then
-    SETUP_CMD="$SETUP_CMD --agent $AGENT"
-  fi
-  run_test "uwf setup (non-interactive)" bash -c "$SETUP_CMD"
-else
-  # Copy host config if available
-  if [ -f "$HOME/.uncaged/workflow/config.yaml" ]; then
-    cp "$HOME/.uncaged/workflow/config.yaml" "$UNCAGED_WORKFLOW_STORAGE_ROOT/config.yaml"
-    echo "  Copied host config.yaml" >&2
-  fi
-fi
-
-# Test config commands
-OUT=$(run_test "uwf config list" bash -c "$UWF config list")
-run_test "config list is valid JSON" bash -c "echo '$OUT' | jq . >/dev/null"
-
-# ============================================================
-# Phase 3: Workflow registration
-# ============================================================
-echo "" >&2
-echo "=== Phase 3: Workflow registration ===" >&2
-
-# Use the example workflow
-EXAMPLE_WF="$REPO_DIR/examples/solve-issue.yaml"
-if [ ! -f "$EXAMPLE_WF" ]; then
-  echo "No example workflow found, creating minimal test workflow" >&2
-  EXAMPLE_WF="/tmp/test-workflow.yaml"
-  cat > "$EXAMPLE_WF" << 'WF'
-name: test-e2e
-roles:
-  worker:
-    goal: "Respond to the prompt with a brief answer."
-    outputSchema:
-      type: object
-      required: ["$status", "answer"]
-      properties:
-        $status:
-          type: string
-          enum: ["done"]
-        answer:
-          type: string
-graph:
-  - from: $START
-    to: worker
-  - from: worker
-    condition:
-      $status: done
-    to: $END
-WF
-fi
-
-OUT=$(run_test "uwf workflow add" bash -c "$UWF workflow add $EXAMPLE_WF")
-run_test "workflow add returns hash" bash -c "echo '$OUT' | jq -e '.hash'"
-
-OUT=$(run_test "uwf workflow list" bash -c "$UWF workflow list")
-run_test "workflow list is non-empty" bash -c "echo '$OUT' | jq -e 'length > 0'"
-
-# Get workflow name
-WF_NAME=$(echo "$OUT" | jq -r '.[0].name // empty')
-run_test "workflow has a name" bash -c "[ -n '$WF_NAME' ]"
-
-OUT=$(run_test "uwf workflow show" bash -c "$UWF workflow show $WF_NAME")
-run_test "workflow show returns roles" bash -c "echo '$OUT' | jq -e '.payload.roles'"
-
-# ============================================================
-# Phase 4: Thread lifecycle
-# ============================================================
-echo "" >&2
-echo "=== Phase 4: Thread lifecycle ===" >&2
-
-# Start a thread
-OUT=$(run_test "uwf thread start" bash -c "$UWF thread start $WF_NAME -p 'E2E test: what is 2+2?'")
-THREAD_ID=$(echo "$OUT" | jq -r '.thread // empty')
-run_test "thread start returns thread ID" bash -c "[ -n '$THREAD_ID' ]"
-
-# List threads
-OUT=$(run_test "uwf thread list" bash -c "$UWF thread list")
-run_test "thread appears in list" bash -c "echo '$OUT' | jq -e '.[] | select(.thread==\"$THREAD_ID\")'"
-
-# Show thread
-OUT=$(run_test "uwf thread show" bash -c "$UWF thread show $THREAD_ID")
-run_test "thread show returns head" bash -c "echo '$OUT' | jq -e '.head'"
-
-# Execute one step
-EXEC_ARGS=""
-if [ -n "$AGENT" ]; then
-  EXEC_ARGS="--agent $AGENT"
-fi
-OUT=$(run_test "uwf thread exec (1 step)" bash -c "$UWF thread exec $THREAD_ID $EXEC_ARGS")
-run_test "thread exec returns step info" bash -c "echo '$OUT' | jq -e '.head'"
-
-# ============================================================
-# Phase 5: Read & Inspect
-# ============================================================
-echo "" >&2
-echo "=== Phase 5: Read & Inspect ===" >&2
-
-# Step list
-OUT=$(run_test "uwf step list" bash -c "$UWF step list $THREAD_ID")
-STEP_COUNT=$(echo "$OUT" | jq '.steps | length')
-run_test "step list has steps" bash -c "[ $STEP_COUNT -gt 1 ]"
-
-# Get last step hash
-LAST_STEP=$(echo "$OUT" | jq -r '.steps[-1].hash // empty')
-run_test "last step has hash" bash -c "[ -n '$LAST_STEP' ]"
-
-# Step show
-if [ -n "$LAST_STEP" ]; then
-  OUT=$(run_test "uwf step show" bash -c "$UWF step show $LAST_STEP")
-  run_test "step show returns role" bash -c "echo '$OUT' | jq -e '.role'"
-fi
-
-# Thread read
-OUT=$(run_test "uwf thread read" bash -c "$UWF thread read $THREAD_ID")
-run_test "thread read produces output" bash -c "[ -n '$OUT' ]"
-
-# CAS operations
-if [ -n "$LAST_STEP" ]; then
-  OUT=$(run_test "uwf cas get" bash -c "$UWF cas get $LAST_STEP")
-  run_test "cas get returns type" bash -c "echo '$OUT' | jq -e '.type'"
-
-  OUT=$(run_test "uwf cas has" bash -c "$UWF cas has $LAST_STEP")
-
-  OUT=$(run_test "uwf cas refs" bash -c "$UWF cas refs $LAST_STEP")
-
-  OUT=$(run_test "uwf cas walk" bash -c "$UWF cas walk $LAST_STEP")
-  run_test "cas walk returns nodes" bash -c "echo '$OUT' | jq -e 'length > 0'"
-fi
-
-# ============================================================
-# Phase 6: Cancel & Fork
-# ============================================================
-echo "" >&2
-echo "=== Phase 6: Cancel & Fork ===" >&2
-
-# Start a second thread for cancel test
-OUT=$(run_test "thread start (for cancel)" bash -c "$UWF thread start $WF_NAME -p 'E2E cancel test'")
-CANCEL_THREAD=$(echo "$OUT" | jq -r '.thread // empty')
-
-if [ -n "$CANCEL_THREAD" ]; then
-  OUT=$(run_test "uwf thread cancel" bash -c "$UWF thread cancel $CANCEL_THREAD")
-  run_test "cancelled thread status" bash -c "$UWF thread list --status completed | jq -e '.[] | select(.thread==\"$CANCEL_THREAD\")'"
-fi
-
-# Fork from the first thread's last step
-if [ -n "$LAST_STEP" ]; then
-  OUT=$(run_test "uwf step fork" bash -c "$UWF step fork $LAST_STEP")
-  FORK_THREAD=$(echo "$OUT" | jq -r '.thread // empty')
-  run_test "fork creates new thread" bash -c "[ -n '$FORK_THREAD' ] && [ '$FORK_THREAD' != '$THREAD_ID' ]"
-fi
-
-# ============================================================
-# Phase 7: Log inspection
-# ============================================================
-echo "" >&2
-echo "=== Phase 7: Logs ===" >&2
-
-OUT=$(run_test "uwf log list" bash -c "$UWF log list")
-OUT=$(run_test "uwf log show" bash -c "$UWF log show --thread $THREAD_ID 2>&1 || true")
-
-# ============================================================
-# Phase 8: Config operations
-# ============================================================
-echo "" >&2
-echo "=== Phase 8: Config get/set ===" >&2
-
-OUT=$(run_test "uwf config get defaultAgent" bash -c "$UWF config get defaultAgent")
-OUT=$(run_test "uwf config set (test key)" bash -c "$UWF config set models.test.name test-model")
-OUT=$(run_test "uwf config get (verify set)" bash -c "$UWF config get models.test.name")
-run_test "config set value persisted" bash -c "echo '$OUT' | grep -q 'test-model'"
-
-# ============================================================
-# Report
-# ============================================================
-echo "" >&2
-echo "=== Results ===" >&2
-echo "Pass: $PASS  Fail: $FAIL" >&2
-
-# JSON report
-echo "{"
-echo "  \"pass\": $PASS,"
-echo "  \"fail\": $FAIL,"
-echo "  \"agent\": \"$AGENT\","
-echo "  \"tests\": [$(IFS=,; echo "${RESULTS[*]}")]"
-echo "}"
-
-[ $FAIL -eq 0 ]
-INNER_SCRIPT
-
-chmod +x "$E2E_DIR/run.sh"
-
-# --- Run in Docker ---
-echo "Starting Docker container..." >&2
-
-# --- Build bootstrap script (runs first inside container) ---
-cat > "$E2E_DIR/bootstrap.sh" << BOOTSTRAP
-#!/usr/bin/env bash
-set -uo pipefail
-echo "Installing jq..." >&2
-apt-get update -qq >&2 && apt-get install -y -qq jq >&2
-echo "jq installed" >&2
-
-# All tools come from host via mount
-export HOME='$HOME'
-export PATH="$HOME/.bun/bin:$HOME/.hermes/hermes-agent/venv/bin:$HOME/.local/share/npm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
-
-# Ensure bun modules are resolved for this environment
-cd '$REPO_DIR'
-echo "Running bun install..." >&2
-which bun >&2
-bun install 2>&1 | tail -3 >&2
-echo "bun install done" >&2
-
-# Run E2E (pass HOME explicitly as 6th arg)
-bash /e2e/run.sh '$REPO_DIR' '$AGENT' '$PROVIDER' '$MODEL' '$API_KEY' '$HOME'
-BOOTSTRAP
-chmod +x "$E2E_DIR/bootstrap.sh"
-
-docker run --rm \
-  --name "$CONTAINER_NAME" \
-  -v "$HOME:$HOME" \
-  -v "$E2E_DIR:/e2e" \
-  -e HOME="$HOME" \
-  -w "$REPO_DIR" \
-  node:22-bookworm \
-  bash /e2e/bootstrap.sh
Author	SHA1	Message	Date
xingyue	86205f1a15	improve: committer — check git status before staging (from retrospect PR #578 ) Developer already commits changes, so committer's git add -A is redundant. Now checks git status first and skips to push if tree is clean.	2026-05-30 15:45:17 +08:00
xingyue	f741729b41	feat: retrospect-workflow — add Phase 0 validation - Check workflow exists in current project, block with wrong_project if not - Compare thread's workflow hash vs current version - If versions differ, diff and filter out already-fixed findings - New status: wrong_project → $END with clear error message	2026-05-30 15:32:33 +08:00
xingyue	5b26602fd4	fix: retrospect-workflow — proposer/developer must work in workflow repo, not analyzed repo First trial run revealed that proposer set repoPath to the analyzed repo (json-cas), causing developer to create worktrees there and pick up unrelated changes. Reviewer correctly rejected 3 times but developer couldn't fix it because the procedure was fundamentally pointing at the wrong repo.	2026-05-30 15:28:24 +08:00
xingyue	f12b60385a	test: update worktree test to match new tea pr create procedure tea pr create should run from main repo dir instead of using --repo flag, because tea cannot detect repo from worktree .git files.	2026-05-30 14:24:33 +08:00
xingyue	d54d448585	fix: committer procedure — tea pr create must run from main repo, not worktree tea cannot detect repo from worktree .git files, causing repeated failures. Also removed --repo flag guidance — let tea auto-detect from git remote.	2026-05-30 14:23:37 +08:00