fix(validate): support enum-based multi-exit and upgrade json-cas to 0.5.3

Two fixes for 'uwf thread start solve-issue' failures: 1. json-cas 0.5.2 (npm) was missing oneOf in ALLOWED_SCHEMA_KEYS. Published json-cas 0.5.3 with the fix, bumped all packages to ^0.5.3. 2. Semantic validator only recognized oneOf-based multi-exit schemas. Roles using $status with enum (e.g. enum: [approved, rejected]) were incorrectly treated as single-exit. Added isEnumMultiExit() support. Changes: - validate-semantic.ts: isEnumMultiExit(), getEnumStatuses(), checkSingleExitMustache() - All package.json: @uncaged/json-cas ^0.5.2 → ^0.5.3 - validate-semantic.test.ts: 5 new enum multi-exit tests (Suite 3b) - solve-issue-tea-worktree.test.ts: updated for current workflow structure 小橘 🍊
fix(cli): remove Chinese text from uwf --help description
2026-05-25 13:13:51 +00:00 · 2026-05-25 12:42:10 +00:00 · 2026-05-25 12:39:34 +00:00 · 2026-05-25 12:38:40 +00:00 · 2026-05-25 12:31:27 +00:00 · 2026-05-25 12:29:18 +00:00
26 changed files with 606 additions and 102 deletions
@@ -0,0 +1,28 @@
+name: CI
+
+on:
+  push:
+    branches: ['*']
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Bun
+        uses: oven-sh/setup-bun@v2
+
+      - name: Install dependencies
+        run: bun install
+
+      - name: Lint
+        run: bun run lint
+
+      - name: Type check
+        run: bun run typecheck
+
+      - name: Test
+        run: bun test
@@ -1,92 +1,198 @@
 name: "solve-issue"
-description: "End-to-end issue resolution"
+description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
 roles:
  planner:
-    description: "Creates implementation plan"
-    goal: "You are a planning agent. You analyze issues and create implementation plans grounded in the actual codebase."
+    description: "Analyzes issue and outputs a TDD test spec"
+    goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
    capabilities:
      - issue-analysis
      - planning
-      - file-read
-      - shell
    procedure: |
-      1. Locate the code repository:
-         - Check if the current working directory is the repo (look for package.json, .git, etc.)
-         - If the task mentions a repo URL, clone it first.
-         - If this is a new project, create the repo and note the path.
-      2. Explore the codebase — read the relevant source files mentioned in the issue. Understand the current architecture, types, and conventions (check CLAUDE.md, CONTRIBUTING.md, .cursor/rules/).
-      3. Identify which files need changes and what the changes should be, with specific code references.
-      4. Output the plan with:
-         - `repoPath`: absolute path to the repository root
-         - `plan`: detailed implementation plan with file paths and code references
-         - `steps`: concrete action items for the developer
-    output: |
-      Provide repoPath, plan summary, and steps in the frontmatter.
-      The plan MUST reference actual file paths and code structures you found by reading the source.
-      Do NOT guess — if you haven't read a file, read it before referencing it.
+      On first run (no previous steps):
+      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
+      2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
+      3. Assess whether the issue has enough information to produce a test spec
+      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
+      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
+
+      On subsequent runs (bounced back by tester with fix_spec):
+      1. Read the tester's output from the previous step to understand what's wrong with the spec
+      2. Revise the test spec accordingly
+
+      After producing the test spec:
+      1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
+      2. Put the hash in frontmatter.plan (required when $status=ready)
+      3. Set repoPath to the absolute path of the repository root
+    output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
    frontmatter:
-      type: object
-      properties:
-        $status:
-          enum: ["_"]
-        repoPath:
-          type: string
-        plan:
-          type: string
-      required: [$status, repoPath, plan]
+      oneOf:
+        - properties:
+            $status: { const: "ready" }
+            plan: { type: string }
+            repoPath: { type: string }
+          required: [$status, plan, repoPath]
+        - properties:
+            $status: { const: "insufficient_info" }
+          required: [$status]
  developer:
-    description: "Implements code changes"
-    goal: "You are a developer agent. You implement code changes according to plans."
+    description: "TDD implementation per test spec"
+    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
    capabilities:
-      - file-edit
-      - shell
-      - testing
+      - coding
    procedure: |
-      1. Read the planner's output to get the repoPath and implementation plan.
-      2. cd to the repoPath before making any changes.
-      3. Create a feature branch from the default branch.
-      4. Implement the plan — write code, tests, and ensure existing tests pass.
-      5. Run the project's lint/check command (e.g. `bun run check`, `npm run lint`) and fix ALL errors before proceeding. Build and lint must pass cleanly.
-      6. Commit your changes with a descriptive message referencing the issue.
-    output: "List all files changed and provide a summary of the implementation."
+      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
+      The repo path and other details are provided in your task prompt.
+
+      Before starting any work, set up an isolated worktree:
+      1. cd into the repo path provided in your task prompt
+      2. `git fetch origin` to get latest refs
+      3. First time (no existing branch):
+         - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
+         - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
+      4. If bounced back from reviewer or tester (branch already exists):
+         - cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
+         - `git fetch origin && git rebase origin/main`
+      5. ALL subsequent work must happen inside the worktree directory.
+
+      Then implement TDD:
+      6. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner's output in your task prompt)
+      7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
+      8. Write tests first based on the spec
+      9. Implement the code to make tests pass
+      10. Ensure `bun run build` passes with no errors
+      11. Run `bun test` to verify all tests pass
+
+      If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
+      or repeated attempts fail), set $status=failed with a reason.
+    output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
    frontmatter:
-      type: object
-      properties:
-        $status:
-          enum: ["_"]
-        filesChanged:
-          type: array
-          items:
-            type: string
-        summary:
-          type: string
-      required: [$status, filesChanged, summary]
+      oneOf:
+        - properties:
+            $status: { const: "done" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "failed" }
+            reason: { type: string }
+          required: [$status, reason]
  reviewer:
-    description: "Reviews code changes"
-    goal: "You are a code reviewer. You review implementations for correctness and quality."
+    description: "Code standards compliance check"
+    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
    capabilities:
      - code-review
      - static-analysis
    procedure: |
-      1. Run hard checks first — build (`bun run build` or equivalent) and lint (`bunx biome check .` or equivalent) MUST pass with zero errors. If they fail, reject immediately.
-      2. Then review code quality: correctness, edge cases, naming, project conventions (CLAUDE.md), and test coverage.
-      3. Only reject for hard check failures or genuine correctness/security issues. Style suggestions alone should not block approval.
-    output: "Approve or reject with detailed comments explaining your decision."
+      The worktree path is provided in your task prompt. cd into it first.
+
+      Before reviewing, verify the git branch:
+      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
+      2. If the branch doesn't correspond to the issue, flag it in your output and reject
+
+      Then perform code review:
+      Hard checks (must all pass):
+      3. `bun run build` — no build errors
+      4. `bunx biome check` — no lint violations
+      5. TypeScript strict mode — no type errors
+
+      Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
+      - Naming conventions, module boundaries, code style
+      - No `console.log` in production code
+      - No dynamic imports in production code
+
+      Only review standards compliance. Do NOT test functionality.
+      If rejecting, you MUST explain the specific reason in your output.
+    output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
    frontmatter:
-      type: object
-      properties:
-        $status:
-          enum: ["approved", "rejected"]
-        comments:
-          type: string
-      required: [$status, comments]
+      oneOf:
+        - properties:
+            $status: { const: "approved" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "rejected" }
+            comments: { type: string }
+            worktree: { type: string }
+          required: [$status, comments, worktree]
+  tester:
+    description: "Functional correctness verification"
+    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
+    capabilities:
+      - testing
+    procedure: |
+      The worktree path is provided in your task prompt. cd into it first.
+
+      1. Run `bun test` for automated test verification
+      2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner step in the thread history)
+      3. Verify each scenario in the spec is covered and passing
+      4. Determine outcome:
+         - passed: all scenarios verified, tests pass
+         - fix_code: tests fail or implementation doesn't match spec → send back to developer
+         - fix_spec: the spec itself is wrong or incomplete → send back to planner
+    output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "passed" }
+            branch: { type: string }
+            worktree: { type: string }
+          required: [$status, branch, worktree]
+        - properties:
+            $status: { const: "fix_code" }
+            report: { type: string }
+          required: [$status, report]
+        - properties:
+            $status: { const: "fix_spec" }
+            report: { type: string }
+          required: [$status, report]
+  committer:
+    description: "Commits and creates PR"
+    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
+    capabilities: []
+    procedure: |
+      The worktree path, branch name, and repo info are provided in your task prompt.
+      cd into the worktree first.
+
+      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
+      1. Stage all changes: `git add -A`
+      2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
+      3. Push the branch: `git push -u origin <branch-name>`
+         - If push hook fails: capture the error log in your output, mark hook_failed
+      4. On push success: create a PR via `tea pr create --repo <owner/repo> --title "..." --description "..."`
+         - Extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
+         - PR description must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
+         - On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
+      5. After PR creation, clean up the worktree:
+         - cd to the repo root (parent of .worktrees)
+         - `git worktree remove <worktree-path>`
+    output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
+    frontmatter:
+      oneOf:
+        - properties:
+            $status: { const: "committed" }
+            prUrl: { type: string }
+          required: [$status, prUrl]
+        - properties:
+            $status: { const: "hook_failed" }
+            error: { type: string }
+          required: [$status, error]
 graph:
  $START:
-    _: { role: "planner", prompt: "Analyze the issue described in the task and produce a detailed implementation plan." }
+    _: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
  planner:
-    _: { role: "developer", prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass." }
+    insufficient_info: { role: "$END", prompt: "Insufficient information to proceed; end the workflow." }
+    ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}." }
  developer:
-    _: { role: "reviewer", prompt: "Review the developer's implementation against the plan for correctness and quality." }
+    done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance." }
+    failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
  reviewer:
-    approved: { role: "$END", prompt: "The review passed. Complete the workflow." }
-    rejected: { role: "developer", prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues: {{{comments}}}" }
+    rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}." }
+    approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}." }
+  tester:
+    fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit." }
+    fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec." }
+    passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}." }
+  committer:
+    hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit." }
+    committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
@@ -11,8 +11,8 @@
    "uwf": "./src/cli.ts"
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
-    "@uncaged/json-cas-fs": "^0.5.2",
+    "@uncaged/json-cas": "^0.5.3",
+    "@uncaged/json-cas-fs": "^0.5.3",
    "@uncaged/workflow-protocol": "workspace:^",
    "@uncaged/workflow-util": "workspace:^",
    "@uncaged/workflow-util-agent": "workspace:^",
@@ -1,5 +1,5 @@
-import { describe, expect, test } from "vitest";
 import type { Target, WorkflowPayload } from "@uncaged/workflow-protocol";
+import { describe, expect, test } from "vitest";

 import { evaluate } from "../moderator/evaluate.js";

@@ -0,0 +1,137 @@
+import { readFileSync } from "node:fs";
+import { mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { afterEach, beforeEach, describe, expect, test, vi } from "vitest";
+import { parse } from "yaml";
+import { _agentNameFromBinary, _printAgentMenu, cmdSetup } from "../commands/setup.js";
+
+// ─── _agentNameFromBinary ────────────────────────────────────────────────────
+
+describe("_agentNameFromBinary", () => {
+  test("strips uwf- prefix", () => {
+    expect(_agentNameFromBinary("uwf-hermes")).toBe("hermes");
+  });
+
+  test("strips uwf- prefix for compound names", () => {
+    expect(_agentNameFromBinary("uwf-claude-code")).toBe("claude-code");
+  });
+
+  test("returns as-is when no uwf- prefix", () => {
+    expect(_agentNameFromBinary("hermes")).toBe("hermes");
+  });
+
+  test("handles uwf-builtin", () => {
+    expect(_agentNameFromBinary("uwf-builtin")).toBe("builtin");
+  });
+});
+
+// ─── _printAgentMenu ─────────────────────────────────────────────────────────
+
+describe("_printAgentMenu", () => {
+  test("prints known agents with labels", () => {
+    const logs: string[] = [];
+    vi.spyOn(console, "log").mockImplementation((...args: unknown[]) => {
+      logs.push(args.join(" "));
+    });
+
+    _printAgentMenu(["uwf-hermes", "uwf-claude-code"]);
+
+    expect(logs.some((l) => l.includes("Hermes"))).toBe(true);
+    expect(logs.some((l) => l.includes("Claude Code"))).toBe(true);
+
+    vi.restoreAllMocks();
+  });
+
+  test("prints unknown agents with binary name as label", () => {
+    const logs: string[] = [];
+    vi.spyOn(console, "log").mockImplementation((...args: unknown[]) => {
+      logs.push(args.join(" "));
+    });
+
+    _printAgentMenu(["uwf-custom-agent"]);
+
+    expect(logs.some((l) => l.includes("uwf-custom-agent"))).toBe(true);
+
+    vi.restoreAllMocks();
+  });
+});
+
+// ─── cmdSetup agent config ───────────────────────────────────────────────────
+
+describe("cmdSetup agent configuration", () => {
+  let storageRoot: string;
+
+  beforeEach(async () => {
+    storageRoot = await mkdtemp(join(tmpdir(), "uwf-setup-agent-"));
+  });
+
+  afterEach(async () => {
+    vi.restoreAllMocks();
+    await rm(storageRoot, { recursive: true, force: true });
+  });
+
+  const baseArgs = () => ({
+    provider: "testprovider",
+    baseUrl: "https://api.test.com/v1",
+    apiKey: "sk-test",
+    model: "test-model",
+    storageRoot,
+  });
+
+  test("defaults to hermes agent when no agent specified", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    const result = await cmdSetup(baseArgs());
+
+    expect(result.defaultAgent).toBe("hermes");
+    const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
+    expect(config.agents.hermes).toEqual({ command: "uwf-hermes", args: [] });
+    expect(config.defaultAgent).toBe("hermes");
+  });
+
+  test("writes specified agent as default", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    const result = await cmdSetup({ ...baseArgs(), agent: "claude-code" });
+
+    expect(result.defaultAgent).toBe("claude-code");
+    const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
+    expect(config.agents["claude-code"]).toEqual({ command: "uwf-claude-code", args: [] });
+    expect(config.defaultAgent).toBe("claude-code");
+  });
+
+  test("preserves existing agents when adding new one", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    // First setup with hermes
+    await cmdSetup(baseArgs());
+    // Second setup with claude-code
+    await cmdSetup({ ...baseArgs(), agent: "claude-code" });
+
+    const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
+    expect(config.agents.hermes).toBeDefined();
+    expect(config.agents["claude-code"]).toBeDefined();
+    expect(config.defaultAgent).toBe("claude-code");
+  });
+
+  test("updates defaultAgent on re-run with different agent", async () => {
+    vi.spyOn(globalThis, "fetch").mockResolvedValue(
+      new Response(JSON.stringify({}), { status: 200 }),
+    );
+
+    await cmdSetup(baseArgs());
+    const config1 = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
+    expect(config1.defaultAgent).toBe("hermes");
+
+    await cmdSetup({ ...baseArgs(), agent: "builtin" });
+    const config2 = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
+    expect(config2.defaultAgent).toBe("builtin");
+  });
+});
@@ -250,6 +250,110 @@ describe("Suite 3: Status-Edge Consistency", () => {
  });
 });

+describe("Suite 3b: Enum-Based Multi-Exit", () => {
+  test("3b.1 enum multi-exit passes with matching graph keys", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        properties: {
+          $status: { enum: ["approved", "rejected"] },
+          comments: { type: "string" },
+        },
+        required: ["$status", "comments"],
+      } as unknown as string,
+    };
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done" },
+      rejected: { role: "writer", prompt: "Fix: {{{comments}}}" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+
+  test("3b.2 enum multi-exit with extra graph key", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        properties: {
+          $status: { enum: ["approved", "rejected"] },
+          comments: { type: "string" },
+        },
+        required: ["$status", "comments"],
+      } as unknown as string,
+    };
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done" },
+      rejected: { role: "writer", prompt: "Fix" },
+      timeout: { role: "$END", prompt: "Timed out" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("extra status keys: timeout"))).toBe(true);
+  });
+
+  test("3b.3 enum multi-exit with missing graph key", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        properties: {
+          $status: { enum: ["approved", "rejected"] },
+          comments: { type: "string" },
+        },
+        required: ["$status", "comments"],
+      } as unknown as string,
+    };
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("missing status keys: rejected"))).toBe(true);
+  });
+
+  test("3b.4 enum with single value (not multi-exit) treated as single-exit", () => {
+    const wf = makeWorkflow();
+    wf.roles.writer = {
+      ...wf.roles.writer,
+      frontmatter: {
+        type: "object",
+        properties: {
+          $status: { enum: ["_"] },
+          plan: { type: "string" },
+        },
+        required: ["$status", "plan"],
+      } as unknown as string,
+    };
+    wf.graph.writer = { _: { role: "reviewer", prompt: "Review: {{{plan}}}" } };
+    const errors = validateWorkflow(wf);
+    expect(errors).toEqual([]);
+  });
+
+  test("3b.5 enum multi-exit mustache var not in frontmatter", () => {
+    const wf = makeWorkflow();
+    wf.roles.reviewer = {
+      ...wf.roles.reviewer,
+      frontmatter: {
+        type: "object",
+        properties: {
+          $status: { enum: ["approved", "rejected"] },
+          comments: { type: "string" },
+        },
+        required: ["$status", "comments"],
+      } as unknown as string,
+    };
+    wf.graph.reviewer = {
+      approved: { role: "$END", prompt: "Done: {{{nonexistent}}}" },
+      rejected: { role: "writer", prompt: "Fix: {{{comments}}}" },
+    };
+    const errors = validateWorkflow(wf);
+    expect(errors.some((e) => e.includes("nonexistent") && e.includes("not found"))).toBe(true);
+  });
+});
+
 describe("Suite 4: Mustache Template Variable Existence", () => {
  test("4.1 prompt references nonexistent variable (single-exit)", () => {
    const wf = makeWorkflow();
@@ -31,7 +31,13 @@ function makeMinimalPayload(name: string, description: string): WorkflowPayload
        capabilities: [],
        procedure: "",
        output: "",
-        frontmatter: { type: "0000000000000" } as unknown as CasRef,
+        frontmatter: {
+          type: "object",
+          properties: {
+            $status: { type: "string" },
+          },
+          required: ["$status"],
+        } as unknown as CasRef,
      },
    },
    graph: {
@@ -55,8 +55,7 @@ program
  .description(
    "Stateless workflow CLI\n\n" +
      "Four-layer architecture:\n" +
-      "  workflow → thread → step → turn\n" +
-      "  模板定义   执行实例   单步结果   agent内部交互",
+      "  workflow → thread → step → turn",
  )
  .version(pkg.default.version, "-V, --version");
 program.option("--format <fmt>", "Output format: json or yaml", "json");
@@ -297,6 +297,80 @@ export function _printModelMenu(models: string[], termCols: number): void {
  }
 }

+// ──────────────────────────────────────────────────────────────────────────────
+// Agent selection prompt
+// ──────────────────────────────────────────────────────────────────────────────
+
+/** Known agent binary → display label mapping. */
+const KNOWN_AGENTS: Record<string, string> = {
+  "uwf-hermes": "Hermes (hermes-agent)",
+  "uwf-claude-code": "Claude Code",
+  "uwf-cursor": "Cursor",
+  "uwf-builtin": "Built-in (lightweight, no external agent)",
+};
+
+/** Extract short agent name from binary name: uwf-claude-code → claude-code */
+export function _agentNameFromBinary(binary: string): string {
+  return binary.replace(/^uwf-/, "");
+}
+
+/** Prints numbered agent list to stdout. */
+export function _printAgentMenu(agents: string[]): void {
+  const numWidth = String(agents.length).length;
+  for (let i = 0; i < agents.length; i++) {
+    const bin = agents[i] ?? "";
+    const label = KNOWN_AGENTS[bin] ?? bin;
+    const num = String(i + 1).padStart(numWidth);
+    console.log(`  ${num}) ${label}  (${bin})`);
+  }
+  console.log("");
+}
+
+/**
+ * Interactive agent selection. Discovers uwf-* binaries, lets user pick default.
+ * Returns short agent name (e.g. "hermes", "claude-code").
+ */
+export async function _promptAgentSelection(
+  rl: ReturnType<typeof createInterface>,
+): Promise<string> {
+  console.log("Discovering installed agents...\n");
+  const agents = await _discoverAgents();
+
+  if (agents.length === 0) {
+    console.log("  No uwf-* agent binaries found in PATH.\n");
+    console.log("  Install one first, for example:");
+    console.log("    npm i -g @uncaged/workflow-agent-hermes");
+    console.log("    npm i -g @uncaged/workflow-agent-claude-code\n");
+    const manual = (
+      await rl.question("Agent binary name (e.g. uwf-hermes), or press Enter to skip: ")
+    ).trim();
+    if (!manual) return "hermes";
+    return _agentNameFromBinary(manual.startsWith("uwf-") ? manual : `uwf-${manual}`);
+  }
+
+  if (agents.length === 1) {
+    const name = _agentNameFromBinary(agents[0] ?? "uwf-hermes");
+    const label = KNOWN_AGENTS[agents[0] ?? ""] ?? agents[0];
+    console.log(`  Found 1 agent: ${label} — auto-selected.\n`);
+    return name;
+  }
+
+  console.log(`  Found ${agents.length} agents:\n`);
+  _printAgentMenu(agents);
+  const choice = (await rl.question(`Choose default agent [1-${agents.length}]: `)).trim();
+  const n = Number.parseInt(choice, 10);
+  if (!Number.isNaN(n) && n >= 1 && n <= agents.length) {
+    const selected = agents[n - 1] ?? "uwf-hermes";
+    const name = _agentNameFromBinary(selected);
+    console.log(`  → ${name}\n`);
+    return name;
+  }
+  // Treat as literal name
+  const name = _agentNameFromBinary(choice.startsWith("uwf-") ? choice : `uwf-${choice}`);
+  console.log(`  → ${name}\n`);
+  return name;
+}
+
 type ValidationResult = { ok: boolean; error: string | null };

 /** Prints the model validation result to stdout. */
@@ -340,8 +414,9 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
  ) as Record<string, unknown>;

  const agentName = args.agent ?? "hermes";
-  if (Object.keys(agents).length === 0) {
-    agents.hermes = { command: "uwf-hermes", args: [] };
+  // Ensure the selected agent has an entry
+  if (!agents[agentName]) {
+    agents[agentName] = { command: `uwf-${agentName}`, args: [] };
  }

  return {
@@ -349,7 +424,7 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
    providers,
    models,
    agents,
-    defaultAgent: existing.defaultAgent ?? agentName,
+    defaultAgent: agentName,
    defaultModel: existing.defaultModel ?? "default",
  };
 }
@@ -543,11 +618,17 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s
    rl2.close();
    console.log(`  → ${providerName}/${model}\n`);

+    // 4. Agent discovery & selection
+    const rl3 = createInterface({ input, output });
+    const agentName = await _promptAgentSelection(rl3);
+    rl3.close();
+
    const setupResult = await cmdSetup({
      provider: providerName,
      baseUrl,
      apiKey,
      model,
+      agent: agentName,
      storageRoot,
    });

@@ -2,8 +2,6 @@ import { execFileSync, spawn } from "node:child_process";
 import { access, readFile } from "node:fs/promises";
 import { dirname, isAbsolute, resolve as resolvePath } from "node:path";
 import { validate } from "@uncaged/json-cas";
-import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-util-agent";
-import { evaluate } from "../moderator/index.js";
 import type {
  AgentAlias,
  AgentConfig,
@@ -24,9 +22,11 @@ import {
  generateUlid,
  type ProcessLogger,
 } from "@uncaged/workflow-util";
+import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-util-agent";
 import { config as loadDotenv } from "dotenv";
 import { parse } from "yaml";
 import { createMarker, deleteMarker, isThreadRunning } from "../background/index.js";
+import { evaluate } from "../moderator/index.js";
 import {
  appendThreadHistory,
  createUwfStore,
@@ -23,6 +23,28 @@ function isOneOfSchema(fm: unknown): fm is SchemaObj & { oneOf: SchemaObj[] } {
  return Array.isArray(obj.oneOf);
 }

+/** Check if a frontmatter schema uses enum-based multi-exit ($status with multiple enum values). */
+function isEnumMultiExit(fm: unknown): boolean {
+  if (typeof fm !== "object" || fm === null) return false;
+  const obj = fm as SchemaObj;
+  const props = obj.properties as Record<string, SchemaObj> | undefined;
+  if (!props?.$status) return false;
+  const statusDef = props.$status;
+  if (!Array.isArray(statusDef.enum)) return false;
+  // Filter out "_" (wildcard) — if remaining values > 1, it's multi-exit
+  const statuses = (statusDef.enum as string[]).filter((s) => s !== "_");
+  return statuses.length > 1;
+}
+
+/** Extract status values from an enum-based $status field. */
+function getEnumStatuses(fm: SchemaObj): string[] {
+  const props = fm.properties as Record<string, SchemaObj> | undefined;
+  if (!props?.$status) return [];
+  const statusDef = props.$status;
+  if (!Array.isArray(statusDef.enum)) return [];
+  return (statusDef.enum as string[]).filter((s) => s !== "_");
+}
+
 /** Get property names from a schema object. */
 function getPropertyNames(schema: SchemaObj): Set<string> {
  const props = schema.properties;
@@ -230,6 +252,11 @@ function checkRoleConsistency(payload: WorkflowPayload, errors: string[]): void
      checkOneOfDiscriminant(roleName, variants, statuses, errors);
      checkMultiExitEdges(roleName, graphKeys, new Set(statuses), errors);
      checkMultiExitMustache(roleName, graphEntry, variants, errors);
+    } else if (isEnumMultiExit(fm)) {
+      const statuses = getEnumStatuses(fm as SchemaObj);
+      checkMultiExitEdges(roleName, graphKeys, new Set(statuses), errors);
+      // For enum-based schemas, mustache vars come from the flat properties
+      checkSingleExitMustache(roleName, graphEntry, fm as SchemaObj, errors);
    } else {
      checkSingleExitRole(roleName, graphKeys, graphEntry, fm as SchemaObj | null, errors);
    }
@@ -265,6 +292,27 @@ function checkSingleExitRole(
  }
 }

+/** Check mustache vars in all edge prompts against flat schema properties. */
+function checkSingleExitMustache(
+  roleName: string,
+  graphEntry: Record<string, { role: string; prompt: string }>,
+  fm: SchemaObj,
+  errors: string[],
+): void {
+  const propNames = getPropertyNames(fm);
+  for (const [status, target] of Object.entries(graphEntry)) {
+    const vars = extractMustacheVars(target.prompt);
+    for (const v of vars) {
+      if (v === "$status") continue;
+      if (!propNames.has(v)) {
+        errors.push(
+          `prompt variable "${v}" in graph[${roleName}][${status}] not found in role "${roleName}" frontmatter`,
+        );
+      }
+    }
+  }
+}
+
 /**
 * Validate a parsed WorkflowPayload for semantic correctness.
 * Returns an array of error messages. Empty array = valid.
@@ -5,8 +5,5 @@
    "outDir": "dist"
  },
  "include": ["src"],
-  "references": [
-    { "path": "../workflow-protocol" },
-    { "path": "../workflow-util-agent" }
-  ]
+  "references": [{ "path": "../workflow-protocol" }, { "path": "../workflow-util-agent" }]
 }
@@ -22,7 +22,7 @@
    "test:ci": "bun test"
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
+    "@uncaged/json-cas": "^0.5.3",
    "@uncaged/workflow-util-agent": "workspace:^",
    "@uncaged/workflow-util": "workspace:^"
  },
@@ -1,4 +1,5 @@
 import type { Store } from "@uncaged/json-cas";
+import { createLogger, generateUlid } from "@uncaged/workflow-util";
 import {
  type AgentContext,
  type AgentRunResult,
@@ -7,7 +8,6 @@ import {
  resolveModel,
  resolveStorageRoot,
 } from "@uncaged/workflow-util-agent";
-import { createLogger, generateUlid } from "@uncaged/workflow-util";

 import { storeBuiltinDetail } from "./detail.js";
 import type { ChatMessage } from "./llm/index.js";
@@ -1,5 +1,5 @@
-import type { ResolvedLlmProvider } from "@uncaged/workflow-util-agent";
 import { createLogger } from "@uncaged/workflow-util";
+import type { ResolvedLlmProvider } from "@uncaged/workflow-util-agent";

 import {
  type ChatMessage,
@@ -1,6 +1,6 @@
 import { describe, expect, test } from "bun:test";
-import type { AgentContext } from "@uncaged/workflow-util-agent";
 import type { ThreadId } from "@uncaged/workflow-protocol";
+import type { AgentContext } from "@uncaged/workflow-util-agent";
 import { buildClaudeCodePrompt } from "../src/claude-code.js";

 function makeCtx(overrides: Partial<AgentContext> = {}): AgentContext {
@@ -22,7 +22,7 @@
    "test:ci": "bun test"
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
+    "@uncaged/json-cas": "^0.5.3",
    "@uncaged/workflow-util-agent": "workspace:^",
    "@uncaged/workflow-util": "workspace:^"
  },
@@ -1,5 +1,6 @@
 import { spawn } from "node:child_process";
 import type { Store } from "@uncaged/json-cas";
+import { createLogger } from "@uncaged/workflow-util";
 import {
  type AgentContext,
  type AgentRunResult,
@@ -9,7 +10,6 @@ import {
  getCachedSessionId,
  setCachedSessionId,
 } from "@uncaged/workflow-util-agent";
-import { createLogger } from "@uncaged/workflow-util";

 import { parseClaudeCodeStreamOutput, storeClaudeCodeDetail } from "./session-detail.js";

@@ -91,5 +91,3 @@ describe("handleSessionUpdate — helper extraction", () => {
    expect((client as any).messageChunks).toHaveLength(0);
  });
 });
-
-
@@ -1,6 +1,6 @@
 import { describe, expect, test } from "bun:test";
-import type { AgentContext } from "@uncaged/workflow-util-agent";
 import type { ThreadId } from "@uncaged/workflow-protocol";
+import type { AgentContext } from "@uncaged/workflow-util-agent";
 import { buildHermesPrompt } from "../src/hermes.js";

 function makeCtx(overrides: Partial<AgentContext> = {}): AgentContext {
@@ -1,6 +1,6 @@
 import { afterEach, describe, expect, it } from "bun:test";

-import { HermesAcpClient } from "../src/acp-client.js";
+import { HermesAcpClient } from "../../src/acp-client.js";

 /**
 * E2E test for cross-process session resume.
@@ -19,10 +19,10 @@
  },
  "scripts": {
    "test": "bun test",
-"test:ci": "bun test __tests__/*.test.ts"
+    "test:ci": "bun test __tests__/*.test.ts"
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
+    "@uncaged/json-cas": "^0.5.3",
    "@uncaged/workflow-util-agent": "workspace:^",
    "@uncaged/workflow-protocol": "workspace:^",
    "@uncaged/workflow-util": "workspace:^"
@@ -1,4 +1,5 @@
 import type { Store } from "@uncaged/json-cas";
+import { createLogger } from "@uncaged/workflow-util";
 import {
  type AgentContext,
  type AgentRunResult,
@@ -6,7 +7,6 @@ import {
  buildRolePrompt,
  createAgent,
 } from "@uncaged/workflow-util-agent";
-import { createLogger } from "@uncaged/workflow-util";

 import { HermesAcpClient } from "./acp-client.js";
 import { getCachedSessionId, isResumeDisabled, setCachedSessionId } from "./session-cache.js";
@@ -1,10 +1,10 @@
 // Re-export session cache from the shared agent-kit package with agent name injected.

+import type { ThreadId } from "@uncaged/workflow-protocol";
 import {
  getCachedSessionId as getCachedSessionIdBase,
  setCachedSessionId as setCachedSessionIdBase,
 } from "@uncaged/workflow-util-agent";
-import type { ThreadId } from "@uncaged/workflow-protocol";

 export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
  return getCachedSessionIdBase("hermes", threadId, role);
@@ -15,8 +15,8 @@
    }
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
-    "@uncaged/json-cas-fs": "^0.5.2"
+    "@uncaged/json-cas": "^0.5.3",
+    "@uncaged/json-cas-fs": "^0.5.3"
  },
  "devDependencies": {
    "typescript": "^5.8.3"
@@ -19,8 +19,8 @@
    "test:ci": "bun test"
  },
  "dependencies": {
-    "@uncaged/json-cas": "^0.5.2",
-    "@uncaged/json-cas-fs": "^0.5.2",
+    "@uncaged/json-cas": "^0.5.3",
+    "@uncaged/json-cas-fs": "^0.5.3",
    "@uncaged/workflow-protocol": "workspace:^",
    "@uncaged/workflow-util": "workspace:^",
    "dotenv": "^16.6.1",
Author	SHA1	Message	Date
xiaoju	54dc8fcb39	fix(validate): support enum-based multi-exit and upgrade json-cas to 0.5.3 CI / test (pull_request) Failing after 5m55s Details Two fixes for 'uwf thread start solve-issue' failures: 1. json-cas 0.5.2 (npm) was missing oneOf in ALLOWED_SCHEMA_KEYS. Published json-cas 0.5.3 with the fix, bumped all packages to ^0.5.3. 2. Semantic validator only recognized oneOf-based multi-exit schemas. Roles using $status with enum (e.g. enum: [approved, rejected]) were incorrectly treated as single-exit. Added isEnumMultiExit() support. Changes: - validate-semantic.ts: isEnumMultiExit(), getEnumStatuses(), checkSingleExitMustache() - All package.json: @uncaged/json-cas ^0.5.2 → ^0.5.3 - validate-semantic.test.ts: 5 new enum multi-exit tests (Suite 3b) - solve-issue-tea-worktree.test.ts: updated for current workflow structure 小橘 🍊	2026-05-25 13:13:51 +00:00
xiaoju	a40e1bb847	fix(cli): remove Chinese text from uwf --help description CI / test (pull_request) Failing after 33s Details Remove the annotation line entirely — the layer names are self-explanatory. 小橘 🍊	2026-05-25 12:42:10 +00:00
xiaomo	2c8bcf7996	Merge pull request 'feat(setup): auto-discover and configure agents during uwf setup' (#515 ) from feat/424-setup-agent-discovery into main CI / test (push) Failing after 1m34s Details	2026-05-25 12:39:34 +00:00
xiaoju	af2a25bf87	feat(setup): auto-discover and configure agents during uwf setup CI / test (pull_request) Failing after 14m13s Details - Add agent discovery step to cmdSetupInteractive flow - _promptAgentSelection: discover uwf-* binaries, auto-select if only one, prompt user to choose if multiple, show install hints if none found - mergeConfig: always write selected agent entry, update defaultAgent - Known agent labels for hermes, claude-code, cursor, builtin - 10 new tests for _agentNameFromBinary, _printAgentMenu, cmdSetup agent config Fixes #424	2026-05-25 12:38:40 +00:00
xiaomo	0abc8bcb3e	Merge pull request 'fix(test): correct import path in resume-e2e integration test' (#514 ) from fix/hermes-integration-test-import into main CI / test (push) Failing after 1m44s Details	2026-05-25 12:31:27 +00:00
xiaoju	524e00a0a6	fix(test): correct import path in resume-e2e integration test CI / check (pull_request) Failing after 3m7s Details The test file moved to __tests__/integration/ but the import path was not updated from ../src/ to ../../src/. 小橘 🍊	2026-05-25 12:29:18 +00:00
xingyue	eba3c70e76	ci: add gitea actions workflow CI / test (push) Failing after 6m22s Details	2026-05-25 19:43:57 +08:00
xiaoju	e2d60fa72e	fix(test): use valid JSON Schema in workflow-resolution test fixture CI / check (push) Failing after 32s Details The test used a fake CasRef string as frontmatter, which fails putSchema validation when loading from YAML files. Replace with a proper JSON Schema object. Fixes pre-existing failures in workflow-resolution, cas-exit-code, and thread-step-count tests.	2026-05-25 11:29:05 +00:00
xiaoju	dfae96ad45	style: fix biome import ordering after package rename CI / check (push) Has been cancelled Details	2026-05-25 11:26:01 +00:00
xiaomo	2f4473f22c	Merge pull request 'refactor: rename workflow-agent-kit → workflow-util-agent, merge moderator' (#513 ) from refactor/512-rename-packages into main CI / check (push) Has been cancelled Details	2026-05-25 11:20:39 +00:00