Compare commits

...

12 Commits

Author SHA1 Message Date
xiaoju 5230462b8d fix: bin entry point to dist/cli.js for node compatibility
CI / test (pull_request) Failing after 9m12s
Fixes #523
2026-05-25 15:35:55 +00:00
xiaomo 4a39d3fdef Merge pull request 'feat(skill): expand uwf skill with architecture, yaml, moderator, list subcommands' (#521) from fix/517-expand-skill into main
CI / test (push) Failing after 22m38s
2026-05-25 15:00:34 +00:00
xingyue 4de13cea44 fix: correct skill references and remove hardcoded test path
CI / test (pull_request) Failing after 23m48s
- moderator-reference: use nested map graph format matching evaluate.ts
- yaml-reference: use goal/procedure/output/capabilities/frontmatter fields
  matching actual WorkflowPayload, not fabricated system/outputSchema
- skill.test.ts: replace hardcoded absolute path with __dirname-relative
- skill.test.ts: assert 'frontmatter' instead of 'outputSchema'
2026-05-25 22:59:38 +08:00
xingyue d9d542c570 fix: correct biome suppressions and formatting for #517
CI / test (pull_request) Failing after 9m9s
2026-05-25 22:47:00 +08:00
xingyue cf6115517c fix: auto-fix biome lint violations in skill.test.ts 2026-05-25 22:44:32 +08:00
xingyue 108f134020 feat(skill): add architecture, yaml, moderator, list subcommands (#517) 2026-05-25 22:42:05 +08:00
xiaomo 8123399189 Merge pull request 'fix(uwf-hermes): read turn data from session file instead of ACP stream' (#520) from fix/519-read-session-file into main
CI / test (push) Failing after 17m33s
2026-05-25 14:24:41 +00:00
xingyue 6324122168 fix(uwf-hermes): read turn data from Hermes session file instead of ACP stream
CI / test (pull_request) Failing after 12m19s
Closes #519

The ACP protocol's tool_call updates only carry a display title (not a
structured tool name) and omit rawInput for polished tools, making the
reconstructed messages unusable for step read/show.

Changes:
- hermes.ts: storePromptResult reads ~/.hermes/sessions/session_{id}.json
  via loadHermesSession() instead of using ACP-reconstructed messages
- acp-client.ts: strip message/tool-call collection logic, keep only
  text chunk accumulation for final response extraction
- step.ts: TurnData gains role + toolCalls fields; formatTurnBody
  renders them in step read markdown output
- README: document sessions.write_json_snapshots requirement
2026-05-25 22:21:03 +08:00
xiaoju 25b411f22e Merge pull request 'fix(validate): support enum-based multi-exit frontmatter schemas' (#518) from fix/enum-multi-exit-validation into main
CI / test (push) Failing after 15m56s
2026-05-25 13:23:10 +00:00
xiaoju 54dc8fcb39 fix(validate): support enum-based multi-exit and upgrade json-cas to 0.5.3
CI / test (pull_request) Failing after 5m55s
Two fixes for 'uwf thread start solve-issue' failures:

1. json-cas 0.5.2 (npm) was missing oneOf in ALLOWED_SCHEMA_KEYS.
   Published json-cas 0.5.3 with the fix, bumped all packages to ^0.5.3.

2. Semantic validator only recognized oneOf-based multi-exit schemas.
   Roles using $status with enum (e.g. enum: [approved, rejected]) were
   incorrectly treated as single-exit. Added isEnumMultiExit() support.

Changes:
- validate-semantic.ts: isEnumMultiExit(), getEnumStatuses(), checkSingleExitMustache()
- All package.json: @uncaged/json-cas ^0.5.2 → ^0.5.3
- validate-semantic.test.ts: 5 new enum multi-exit tests (Suite 3b)
- solve-issue-tea-worktree.test.ts: updated for current workflow structure

小橘 🍊
2026-05-25 13:13:51 +00:00
xiaoju a40e1bb847 fix(cli): remove Chinese text from uwf --help description
CI / test (pull_request) Failing after 33s
Remove the annotation line entirely — the layer names are self-explanatory.

小橘 🍊
2026-05-25 12:42:10 +00:00
xiaomo 2c8bcf7996 Merge pull request 'feat(setup): auto-discover and configure agents during uwf setup' (#515) from feat/424-setup-agent-discovery into main
CI / test (push) Failing after 1m34s
2026-05-25 12:39:34 +00:00
23 changed files with 868 additions and 317 deletions
+175 -69
View File
@@ -1,92 +1,198 @@
name: "solve-issue"
description: "End-to-end issue resolution"
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
roles:
planner:
description: "Creates implementation plan"
goal: "You are a planning agent. You analyze issues and create implementation plans grounded in the actual codebase."
description: "Analyzes issue and outputs a TDD test spec"
goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
capabilities:
- issue-analysis
- planning
- file-read
- shell
procedure: |
1. Locate the code repository:
- Check if the current working directory is the repo (look for package.json, .git, etc.)
- If the task mentions a repo URL, clone it first.
- If this is a new project, create the repo and note the path.
2. Explore the codebase — read the relevant source files mentioned in the issue. Understand the current architecture, types, and conventions (check CLAUDE.md, CONTRIBUTING.md, .cursor/rules/).
3. Identify which files need changes and what the changes should be, with specific code references.
4. Output the plan with:
- `repoPath`: absolute path to the repository root
- `plan`: detailed implementation plan with file paths and code references
- `steps`: concrete action items for the developer
output: |
Provide repoPath, plan summary, and steps in the frontmatter.
The plan MUST reference actual file paths and code structures you found by reading the source.
Do NOT guess — if you haven't read a file, read it before referencing it.
On first run (no previous steps):
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
3. Assess whether the issue has enough information to produce a test spec
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
On subsequent runs (bounced back by tester with fix_spec):
1. Read the tester's output from the previous step to understand what's wrong with the spec
2. Revise the test spec accordingly
After producing the test spec:
1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
2. Put the hash in frontmatter.plan (required when $status=ready)
3. Set repoPath to the absolute path of the repository root
output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
frontmatter:
type: object
properties:
$status:
enum: ["_"]
repoPath:
type: string
plan:
type: string
required: [$status, repoPath, plan]
oneOf:
- properties:
$status: { const: "ready" }
plan: { type: string }
repoPath: { type: string }
required: [$status, plan, repoPath]
- properties:
$status: { const: "insufficient_info" }
required: [$status]
developer:
description: "Implements code changes"
goal: "You are a developer agent. You implement code changes according to plans."
description: "TDD implementation per test spec"
goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
capabilities:
- file-edit
- shell
- testing
- coding
procedure: |
1. Read the planner's output to get the repoPath and implementation plan.
2. cd to the repoPath before making any changes.
3. Create a feature branch from the default branch.
4. Implement the plan — write code, tests, and ensure existing tests pass.
5. Run the project's lint/check command (e.g. `bun run check`, `npm run lint`) and fix ALL errors before proceeding. Build and lint must pass cleanly.
6. Commit your changes with a descriptive message referencing the issue.
output: "List all files changed and provide a summary of the implementation."
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
The repo path and other details are provided in your task prompt.
Before starting any work, set up an isolated worktree:
1. cd into the repo path provided in your task prompt
2. `git fetch origin` to get latest refs
3. First time (no existing branch):
- `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
- `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
4. If bounced back from reviewer or tester (branch already exists):
- cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
- `git fetch origin && git rebase origin/main`
5. ALL subsequent work must happen inside the worktree directory.
Then implement TDD:
6. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner's output in your task prompt)
7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
8. Write tests first based on the spec
9. Implement the code to make tests pass
10. Ensure `bun run build` passes with no errors
11. Run `bun test` to verify all tests pass
If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
or repeated attempts fail), set $status=failed with a reason.
output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
frontmatter:
type: object
properties:
$status:
enum: ["_"]
filesChanged:
type: array
items:
type: string
summary:
type: string
required: [$status, filesChanged, summary]
oneOf:
- properties:
$status: { const: "done" }
branch: { type: string }
worktree: { type: string }
required: [$status, branch, worktree]
- properties:
$status: { const: "failed" }
reason: { type: string }
required: [$status, reason]
reviewer:
description: "Reviews code changes"
goal: "You are a code reviewer. You review implementations for correctness and quality."
description: "Code standards compliance check"
goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
capabilities:
- code-review
- static-analysis
procedure: |
1. Run hard checks first — build (`bun run build` or equivalent) and lint (`bunx biome check .` or equivalent) MUST pass with zero errors. If they fail, reject immediately.
2. Then review code quality: correctness, edge cases, naming, project conventions (CLAUDE.md), and test coverage.
3. Only reject for hard check failures or genuine correctness/security issues. Style suggestions alone should not block approval.
output: "Approve or reject with detailed comments explaining your decision."
The worktree path is provided in your task prompt. cd into it first.
Before reviewing, verify the git branch:
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
2. If the branch doesn't correspond to the issue, flag it in your output and reject
Then perform code review:
Hard checks (must all pass):
3. `bun run build` — no build errors
4. `bunx biome check` — no lint violations
5. TypeScript strict mode — no type errors
Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
- Naming conventions, module boundaries, code style
- No `console.log` in production code
- No dynamic imports in production code
Only review standards compliance. Do NOT test functionality.
If rejecting, you MUST explain the specific reason in your output.
output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
frontmatter:
type: object
properties:
$status:
enum: ["approved", "rejected"]
comments:
type: string
required: [$status, comments]
oneOf:
- properties:
$status: { const: "approved" }
branch: { type: string }
worktree: { type: string }
required: [$status, branch, worktree]
- properties:
$status: { const: "rejected" }
comments: { type: string }
worktree: { type: string }
required: [$status, comments, worktree]
tester:
description: "Functional correctness verification"
goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
capabilities:
- testing
procedure: |
The worktree path is provided in your task prompt. cd into it first.
1. Run `bun test` for automated test verification
2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the planner step in the thread history)
3. Verify each scenario in the spec is covered and passing
4. Determine outcome:
- passed: all scenarios verified, tests pass
- fix_code: tests fail or implementation doesn't match spec → send back to developer
- fix_spec: the spec itself is wrong or incomplete → send back to planner
output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
frontmatter:
oneOf:
- properties:
$status: { const: "passed" }
branch: { type: string }
worktree: { type: string }
required: [$status, branch, worktree]
- properties:
$status: { const: "fix_code" }
report: { type: string }
required: [$status, report]
- properties:
$status: { const: "fix_spec" }
report: { type: string }
required: [$status, report]
committer:
description: "Commits and creates PR"
goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
capabilities: []
procedure: |
The worktree path, branch name, and repo info are provided in your task prompt.
cd into the worktree first.
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Stage all changes: `git add -A`
2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
3. Push the branch: `git push -u origin <branch-name>`
- If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --repo <owner/repo> --title "..." --description "..."`
- Extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
- PR description must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
- On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
5. After PR creation, clean up the worktree:
- cd to the repo root (parent of .worktrees)
- `git worktree remove <worktree-path>`
output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
frontmatter:
oneOf:
- properties:
$status: { const: "committed" }
prUrl: { type: string }
required: [$status, prUrl]
- properties:
$status: { const: "hook_failed" }
error: { type: string }
required: [$status, error]
graph:
$START:
_: { role: "planner", prompt: "Analyze the issue described in the task and produce a detailed implementation plan." }
_: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
planner:
_: { role: "developer", prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass." }
insufficient_info: { role: "$END", prompt: "Insufficient information to proceed; end the workflow." }
ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}." }
developer:
_: { role: "reviewer", prompt: "Review the developer's implementation against the plan for correctness and quality." }
done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance." }
failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
reviewer:
approved: { role: "$END", prompt: "The review passed. Complete the workflow." }
rejected: { role: "developer", prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues: {{{comments}}}" }
rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}." }
approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}." }
tester:
fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit." }
fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec." }
passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}." }
committer:
hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit." }
committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
+3 -3
View File
@@ -8,11 +8,11 @@
],
"type": "module",
"bin": {
"uwf": "./src/cli.ts"
"uwf": "./dist/cli.js"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas-fs": "^0.5.2",
"@uncaged/json-cas": "^0.5.3",
"@uncaged/json-cas-fs": "^0.5.3",
"@uncaged/workflow-protocol": "workspace:^",
"@uncaged/workflow-util": "workspace:^",
"@uncaged/workflow-util-agent": "workspace:^",
@@ -0,0 +1,78 @@
import { execFileSync } from "node:child_process";
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";
import { describe, expect, test } from "vitest";
const __dirname = dirname(fileURLToPath(import.meta.url));
import {
cmdSkillArchitecture,
cmdSkillCli,
cmdSkillList,
cmdSkillModerator,
cmdSkillYaml,
} from "../commands/skill.js";
describe("skill commands", () => {
test("skill list returns all skill names", () => {
const result = cmdSkillList();
expect(result).toBeInstanceOf(Array);
expect(result).toContain("cli");
expect(result).toContain("architecture");
expect(result).toContain("yaml");
expect(result).toContain("moderator");
for (const name of result) {
expect(typeof name).toBe("string");
expect(name).toMatch(/^\S+$/);
}
});
test("skill architecture returns non-empty markdown string", () => {
const result = cmdSkillArchitecture();
expect(typeof result).toBe("string");
expect(result).toContain("CAS");
expect(result).toContain("Thread");
expect(result).toContain("Workflow");
expect(result).toContain("Step");
expect(result.length).toBeGreaterThan(200);
});
test("skill yaml returns non-empty markdown string", () => {
const result = cmdSkillYaml();
expect(typeof result).toBe("string");
expect(result).toContain("roles");
expect(result).toContain("graph");
expect(result).toContain("frontmatter");
expect(result.length).toBeGreaterThan(200);
});
test("skill moderator returns non-empty markdown string", () => {
const result = cmdSkillModerator();
expect(typeof result).toBe("string");
expect(result).toContain("routing");
expect(result).toContain("status");
expect(result.length).toBeGreaterThan(200);
// Check for edge or graph
expect(result).toMatch(/edge|graph/i);
});
test("skill cli returns CLI reference markdown", () => {
const result = cmdSkillCli();
expect(typeof result).toBe("string");
expect(result).toContain("uwf");
});
test("skill help subcommand is suppressed", () => {
const output = execFileSync("bun", ["src/cli.ts", "skill", "--help"], {
cwd: join(__dirname, "..", ".."),
encoding: "utf-8",
env: { ...process.env, PATH: `/opt/homebrew/bin:${process.env.PATH}` },
});
expect(output).not.toMatch(/help\s+\[command\]/i);
expect(output).toContain("cli");
expect(output).toContain("architecture");
expect(output).toContain("yaml");
expect(output).toContain("moderator");
expect(output).toContain("list");
});
});
@@ -453,7 +453,78 @@ describe("step read", () => {
expect(markdown).not.toContain("## Turn");
});
test("test 6: turn content with special characters", async () => {
test("test 6: displays role and tool calls in turn body", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "",
toolCalls: [{ name: "terminal", args: '{"command":"echo hi"}' }],
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-hermes",
startedAtMs: 1000000000000,
completedAtMs: 1000000005000,
});
const markdown = await cmdStepRead(tmpDir, stepHash, 4000);
expect(markdown).toContain("**Turn role:** assistant");
expect(markdown).toContain("**terminal**");
expect(markdown).toContain('{"command":"echo hi"}');
});
test("test 7: turn content with special characters", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
@@ -250,6 +250,110 @@ describe("Suite 3: Status-Edge Consistency", () => {
});
});
describe("Suite 3b: Enum-Based Multi-Exit", () => {
test("3b.1 enum multi-exit passes with matching graph keys", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done" },
rejected: { role: "writer", prompt: "Fix: {{{comments}}}" },
};
const errors = validateWorkflow(wf);
expect(errors).toEqual([]);
});
test("3b.2 enum multi-exit with extra graph key", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done" },
rejected: { role: "writer", prompt: "Fix" },
timeout: { role: "$END", prompt: "Timed out" },
};
const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("extra status keys: timeout"))).toBe(true);
});
test("3b.3 enum multi-exit with missing graph key", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done" },
};
const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("missing status keys: rejected"))).toBe(true);
});
test("3b.4 enum with single value (not multi-exit) treated as single-exit", () => {
const wf = makeWorkflow();
wf.roles.writer = {
...wf.roles.writer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["_"] },
plan: { type: "string" },
},
required: ["$status", "plan"],
} as unknown as string,
};
wf.graph.writer = { _: { role: "reviewer", prompt: "Review: {{{plan}}}" } };
const errors = validateWorkflow(wf);
expect(errors).toEqual([]);
});
test("3b.5 enum multi-exit mustache var not in frontmatter", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done: {{{nonexistent}}}" },
rejected: { role: "writer", prompt: "Fix: {{{comments}}}" },
};
const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("nonexistent") && e.includes("not found"))).toBe(true);
});
});
describe("Suite 4: Mustache Template Variable Existence", () => {
test("4.1 prompt references nonexistent variable (single-exit)", () => {
const wf = makeWorkflow();
+41 -7
View File
@@ -1,4 +1,4 @@
#!/usr/bin/env bun
#!/usr/bin/env node
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { Command } from "commander";
@@ -15,7 +15,13 @@ import {
} from "./commands/cas.js";
import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
import { cmdSkillCli } from "./commands/skill.js";
import {
cmdSkillArchitecture,
cmdSkillCli,
cmdSkillList,
cmdSkillModerator,
cmdSkillYaml,
} from "./commands/skill.js";
import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
import {
cmdThreadCancel,
@@ -55,8 +61,7 @@ program
.description(
"Stateless workflow CLI\n\n" +
"Four-layer architecture:\n" +
" workflow → thread → step → turn\n" +
" 模板定义 执行实例 单步结果 agent内部交互",
" workflow → thread → step → turn",
)
.version(pkg.default.version, "-V, --version");
program.option("--format <fmt>", "Output format: json or yaml", "json");
@@ -176,11 +181,11 @@ function parseStatusFilter(status: string | undefined): ThreadStatus[] | null {
if (raw === "active") return ["idle", "running"];
const parts = raw.split(",").map((s) => s.trim());
const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
const validStatuses: ThreadStatus[] = ["idle", "running", "completed", "cancelled"];
for (const part of parts) {
if (!validStatuses.includes(part as ThreadStatus)) {
process.stderr.write(
`Invalid status: ${part}. Must be one of: idle, running, completed, active\n`,
`Invalid status: ${part}. Must be one of: idle, running, completed, cancelled, active\n`,
);
process.exit(1);
}
@@ -233,7 +238,7 @@ thread
.description("List threads")
.option(
"--status <status>",
"Filter by status: idle, running, completed, active (idle+running), or comma-separated values",
"Filter by status: idle, running, completed, cancelled, active (idle+running), or comma-separated values",
)
.option("--after <date>", "Filter threads created after this date (ISO or relative like '7d')")
.option("--before <date>", "Filter threads created before this date (ISO or relative like '7d')")
@@ -474,6 +479,7 @@ For more information, see: uwf help thread list
});
const skill = program.command("skill").description("Built-in skill references for agents");
skill.addHelpCommand(false);
skill
.command("cli")
@@ -482,6 +488,34 @@ skill
console.log(cmdSkillCli());
});
skill
.command("architecture")
.description("Print the architecture reference")
.action(() => {
console.log(cmdSkillArchitecture());
});
skill
.command("yaml")
.description("Print the workflow YAML schema reference")
.action(() => {
console.log(cmdSkillYaml());
});
skill
.command("moderator")
.description("Print the moderator reference")
.action(() => {
console.log(cmdSkillModerator());
});
skill
.command("list")
.description("List all available skill names")
.action(() => {
console.log(cmdSkillList().join("\n"));
});
program
.command("setup")
.description("Configure provider, model, and agent")
+12 -1
View File
@@ -1 +1,12 @@
export { generateCliReference as cmdSkillCli } from "@uncaged/workflow-util";
export {
generateArchitectureReference as cmdSkillArchitecture,
generateCliReference as cmdSkillCli,
generateModeratorReference as cmdSkillModerator,
generateYamlReference as cmdSkillYaml,
} from "@uncaged/workflow-util";
const SKILL_NAMES = ["cli", "architecture", "yaml", "moderator"] as const;
export function cmdSkillList(): ReadonlyArray<string> {
return [...SKILL_NAMES];
}
+79 -16
View File
@@ -19,9 +19,16 @@ import {
walkChain,
} from "./shared.js";
type TurnToolCall = {
name: string;
args: string;
};
type TurnData = {
index: number;
role: string;
content: string;
toolCalls: TurnToolCall[] | null;
};
/**
@@ -128,8 +135,74 @@ function loadStepDetail(store: BootstrapCapableStore, detailRef: CasRef): Record
return detailNode.payload as Record<string, unknown>;
}
function parseTurnToolCalls(raw: unknown): TurnToolCall[] | null {
if (!Array.isArray(raw) || raw.length === 0) {
return null;
}
const calls: TurnToolCall[] = [];
for (const entry of raw) {
if (typeof entry !== "object" || entry === null) {
continue;
}
const record = entry as Record<string, unknown>;
const name = record.name;
const args = record.args;
if (typeof name === "string") {
calls.push({ name, args: typeof args === "string" ? args : "" });
}
}
return calls.length > 0 ? calls : null;
}
function formatTurnBody(turn: TurnData): string {
const parts: string[] = [];
parts.push(`**Turn role:** ${turn.role}`);
if (turn.toolCalls !== null) {
for (const call of turn.toolCalls) {
const argsSuffix = call.args !== "" ? `\`${call.args}\`` : "";
parts.push(`- **${call.name}**${argsSuffix}`);
}
}
if (turn.content !== "") {
if (parts.length > 0) {
parts.push("");
}
parts.push(turn.content);
}
return parts.join("\n");
}
function parseSingleTurn(
store: BootstrapCapableStore,
turnRef: unknown,
fallbackIndex: number,
): TurnData | null {
if (typeof turnRef !== "string") {
return null;
}
const turnNode = store.get(turnRef as CasRef);
if (turnNode === null) {
return null;
}
const turn = turnNode.payload as Record<string, unknown>;
const content = typeof turn.content === "string" ? turn.content : "";
const toolCalls = parseTurnToolCalls(turn.toolCalls);
if (content === "" && toolCalls === null) {
return null;
}
return {
index: typeof turn.index === "number" ? turn.index : fallbackIndex,
role: typeof turn.role === "string" ? turn.role : "assistant",
content,
toolCalls,
};
}
/**
* Load all turn nodes from CAS store and extract content
* Load all turn nodes from CAS store and extract display fields
*/
function loadTurnData(store: BootstrapCapableStore, turns: unknown): TurnData[] {
if (!Array.isArray(turns) || turns.length === 0) {
@@ -138,19 +211,9 @@ function loadTurnData(store: BootstrapCapableStore, turns: unknown): TurnData[]
const turnData: TurnData[] = [];
for (const turnRef of turns) {
if (typeof turnRef !== "string") {
continue;
}
const turnNode = store.get(turnRef as CasRef);
if (turnNode === null) {
continue;
}
const turn = turnNode.payload as Record<string, unknown>;
if (typeof turn.content === "string") {
turnData.push({
index: typeof turn.index === "number" ? turn.index : turnData.length,
content: turn.content,
});
const parsed = parseSingleTurn(store, turnRef, turnData.length);
if (parsed !== null) {
turnData.push(parsed);
}
}
return turnData;
@@ -168,7 +231,7 @@ function selectTurnsForQuota(turnData: TurnData[], availableQuota: number): Turn
if (turn === undefined) continue;
const turnHeader = `## Turn ${turn.index + 1}\n\n`;
const turnBlock = turnHeader + turn.content;
const turnBlock = turnHeader + formatTurnBody(turn);
const separatorCost = selectedTurns.length > 0 ? 2 : 0;
const addCost = turnBlock.length + separatorCost;
@@ -213,7 +276,7 @@ function formatStepMarkdown(
parts.push("");
parts.push(`## Turn ${turn.index + 1}`);
parts.push("");
parts.push(turn.content);
parts.push(formatTurnBody(turn));
}
return parts.join("\n");
@@ -23,6 +23,28 @@ function isOneOfSchema(fm: unknown): fm is SchemaObj & { oneOf: SchemaObj[] } {
return Array.isArray(obj.oneOf);
}
/** Check if a frontmatter schema uses enum-based multi-exit ($status with multiple enum values). */
function isEnumMultiExit(fm: unknown): boolean {
if (typeof fm !== "object" || fm === null) return false;
const obj = fm as SchemaObj;
const props = obj.properties as Record<string, SchemaObj> | undefined;
if (!props?.$status) return false;
const statusDef = props.$status;
if (!Array.isArray(statusDef.enum)) return false;
// Filter out "_" (wildcard) — if remaining values > 1, it's multi-exit
const statuses = (statusDef.enum as string[]).filter((s) => s !== "_");
return statuses.length > 1;
}
/** Extract status values from an enum-based $status field. */
function getEnumStatuses(fm: SchemaObj): string[] {
const props = fm.properties as Record<string, SchemaObj> | undefined;
if (!props?.$status) return [];
const statusDef = props.$status;
if (!Array.isArray(statusDef.enum)) return [];
return (statusDef.enum as string[]).filter((s) => s !== "_");
}
/** Get property names from a schema object. */
function getPropertyNames(schema: SchemaObj): Set<string> {
const props = schema.properties;
@@ -230,6 +252,11 @@ function checkRoleConsistency(payload: WorkflowPayload, errors: string[]): void
checkOneOfDiscriminant(roleName, variants, statuses, errors);
checkMultiExitEdges(roleName, graphKeys, new Set(statuses), errors);
checkMultiExitMustache(roleName, graphEntry, variants, errors);
} else if (isEnumMultiExit(fm)) {
const statuses = getEnumStatuses(fm as SchemaObj);
checkMultiExitEdges(roleName, graphKeys, new Set(statuses), errors);
// For enum-based schemas, mustache vars come from the flat properties
checkSingleExitMustache(roleName, graphEntry, fm as SchemaObj, errors);
} else {
checkSingleExitRole(roleName, graphKeys, graphEntry, fm as SchemaObj | null, errors);
}
@@ -265,6 +292,27 @@ function checkSingleExitRole(
}
}
/** Check mustache vars in all edge prompts against flat schema properties. */
function checkSingleExitMustache(
roleName: string,
graphEntry: Record<string, { role: string; prompt: string }>,
fm: SchemaObj,
errors: string[],
): void {
const propNames = getPropertyNames(fm);
for (const [status, target] of Object.entries(graphEntry)) {
const vars = extractMustacheVars(target.prompt);
for (const v of vars) {
if (v === "$status") continue;
if (!propNames.has(v)) {
errors.push(
`prompt variable "${v}" in graph[${roleName}][${status}] not found in role "${roleName}" frontmatter`,
);
}
}
}
}
/**
* Validate a parsed WorkflowPayload for semantic correctness.
* Returns an array of error messages. Empty array = valid.
+1 -1
View File
@@ -22,7 +22,7 @@
"test:ci": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas": "^0.5.3",
"@uncaged/workflow-util-agent": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
},
@@ -22,7 +22,7 @@
"test:ci": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas": "^0.5.3",
"@uncaged/workflow-util-agent": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
},
+9
View File
@@ -18,6 +18,15 @@ bun add -g @uncaged/workflow-agent-hermes
Requires the `hermes` CLI on `PATH`.
Hermes must write session JSON snapshots so `uwf-hermes` can load structured tool calls from disk. Add this to `~/.hermes/config.yaml`:
```yaml
sessions:
write_json_snapshots: true
```
Session files are stored at `~/.hermes/sessions/session_{sessionId}.json`.
## CLI Usage
Invoked by `uwf thread step` (not typically run directly):
@@ -2,7 +2,7 @@ import { afterEach, beforeEach, describe, expect, it } from "bun:test";
import { HermesAcpClient } from "../src/acp-client.js";
describe("handleSessionUpdate — helper extraction", () => {
describe("handleSessionUpdate — text extraction", () => {
let client: HermesAcpClient;
beforeEach(() => {
@@ -14,80 +14,41 @@ describe("handleSessionUpdate — helper extraction", () => {
});
it("agent_message_chunk accumulates text in messageChunks", () => {
(client as any).handleSessionUpdate({
(
client as unknown as { handleSessionUpdate: (u: Record<string, unknown>) => void }
).handleSessionUpdate({
sessionUpdate: "agent_message_chunk",
content: { type: "text", text: "hello" },
});
(client as any).handleSessionUpdate({
(
client as unknown as { handleSessionUpdate: (u: Record<string, unknown>) => void }
).handleSessionUpdate({
sessionUpdate: "agent_message_chunk",
content: { type: "text", text: " world" },
});
expect((client as any).messageChunks).toEqual(["hello", " world"]);
expect((client as unknown as { messageChunks: string[] }).messageChunks).toEqual([
"hello",
" world",
]);
});
it("agent_thought_chunk accumulates reasoning in reasoningChunks", () => {
(client as any).handleSessionUpdate({
sessionUpdate: "agent_thought_chunk",
content: { type: "text", text: "thinking" },
it("non-text chunks and other update types are ignored", () => {
(
client as unknown as { handleSessionUpdate: (u: Record<string, unknown>) => void }
).handleSessionUpdate({
sessionUpdate: "agent_message_chunk",
content: { type: "image", text: "ignored" },
});
expect((client as any).reasoningChunks).toEqual(["thinking"]);
});
it("tool_call registers a pending tool and flushes message chunks", () => {
(client as any).messageChunks = ["pre-tool text"];
(client as any).handleSessionUpdate({
(
client as unknown as { handleSessionUpdate: (u: Record<string, unknown>) => void }
).handleSessionUpdate({
sessionUpdate: "tool_call",
title: "Bash",
rawInput: { command: "ls" },
toolCallId: "tc-1",
});
expect((client as any).pendingTools.get("tc-1")).toEqual({
name: "Bash",
args: JSON.stringify({ command: "ls" }),
});
expect((client as any).messageChunks).toEqual([]);
expect((client as any).messages).toHaveLength(1);
expect((client as any).messages[0].role).toBe("assistant");
});
it("tool_call_update completed pushes tool_call and tool messages", () => {
(client as any).pendingTools.set("tc-2", { name: "Read", args: '{"path":"/foo"}' });
(client as any).handleSessionUpdate({
sessionUpdate: "tool_call_update",
status: "completed",
toolCallId: "tc-2",
rawOutput: "file contents",
});
const msgs = (client as any).messages as Array<{
role: string;
tool_calls: unknown;
content: string | null;
}>;
expect(msgs).toHaveLength(2);
expect(msgs[0].role).toBe("assistant");
expect(msgs[0].tool_calls).toEqual([
{ function: { name: "Read", arguments: '{"path":"/foo"}' } },
]);
expect(msgs[1].role).toBe("tool");
expect(msgs[1].content).toBe("file contents");
expect((client as any).pendingTools.has("tc-2")).toBe(false);
});
it("tool_call_update with non-string rawOutput JSON-stringifies it", () => {
(client as any).pendingTools.set("tc-3", { name: "Fetch", args: "" });
(client as any).handleSessionUpdate({
sessionUpdate: "tool_call_update",
status: "completed",
toolCallId: "tc-3",
rawOutput: { html: "<p>page</p>" },
});
const msgs = (client as any).messages as Array<{ role: string; content: string | null }>;
expect(msgs[1].content).toBe(JSON.stringify({ html: "<p>page</p>" }));
});
it("unknown updateType is a no-op", () => {
(client as any).handleSessionUpdate({ sessionUpdate: "unknown_type", data: {} });
expect((client as any).messages).toHaveLength(0);
expect((client as any).messageChunks).toHaveLength(0);
(
client as unknown as { handleSessionUpdate: (u: Record<string, unknown>) => void }
).handleSessionUpdate({ sessionUpdate: "unknown_type", data: {} });
expect((client as unknown as { messageChunks: string[] }).messageChunks).toHaveLength(0);
});
});
@@ -53,23 +53,4 @@ describe("HermesAcpClient", () => {
},
{ timeout: 2 * 60 * 1000 },
);
// TODO(#435): flaky — depends on live LLM; mock or move to integration suite
it.skip(
"prompt() collects structured messages including tool calls",
async () => {
await client.connect(process.cwd());
const result = await client.prompt("Run this command: echo TOOL_DETAIL_TEST");
expect(result.messages.length).toBeGreaterThan(0);
const toolMessages = result.messages.filter((m) => m.role === "tool");
expect(toolMessages.length).toBeGreaterThan(0);
const toolContent = toolMessages[0]?.content ?? "";
expect(toolContent).toContain("TOOL_DETAIL_TEST");
const assistantWithTools = result.messages.filter(
(m) => m.role === "assistant" && m.tool_calls !== null,
);
expect(assistantWithTools.length).toBeGreaterThan(0);
},
{ timeout: 2 * 60 * 1000 },
);
});
+1 -1
View File
@@ -22,7 +22,7 @@
"test:ci": "bun test __tests__/*.test.ts"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas": "^0.5.3",
"@uncaged/workflow-util-agent": "workspace:^",
"@uncaged/workflow-protocol": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
@@ -2,8 +2,6 @@ import type { ChildProcess } from "node:child_process";
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";
import type { HermesSessionMessage } from "./types.js";
const HERMES_COMMAND = "hermes";
const PROTOCOL_VERSION = 1;
@@ -19,16 +17,9 @@ type PendingRequest = {
reject: (reason: Error) => void;
};
/** Tracks in-flight tool calls so we can build complete messages when they finish. */
type PendingToolCall = {
name: string;
args: string;
};
export type AcpPromptResult = {
text: string;
sessionId: string;
messages: HermesSessionMessage[];
};
export class HermesAcpClient {
@@ -38,11 +29,8 @@ export class HermesAcpClient {
private stderrBuffer = "";
private pending = new Map<number, PendingRequest>();
// Message collection state
/** Accumulated assistant text chunks from agent_message_chunk updates. */
private messageChunks: string[] = [];
private reasoningChunks: string[] = [];
private pendingTools = new Map<string, PendingToolCall>();
messages: HermesSessionMessage[] = [];
/** Spawn hermes acp, initialize, create session */
async connect(cwd: string): Promise<string> {
@@ -84,14 +72,13 @@ export class HermesAcpClient {
return sessionId;
}
/** Send prompt and collect full response text + structured messages. */
/** Send prompt and collect final assistant text from ACP stream chunks. */
async prompt(text: string): Promise<AcpPromptResult> {
if (this.sessionId === null) {
throw new Error("Not connected — call connect() first");
}
this.messageChunks = [];
this.reasoningChunks = [];
const response = await this.sendRequest("session/prompt", {
sessionId: this.sessionId,
@@ -104,28 +91,9 @@ export class HermesAcpClient {
);
}
// Flush any trailing assistant text that wasn't followed by a tool call.
this.flushAssistantMessage();
// Extract the final assistant text from collected messages.
let finalText = "";
for (let i = this.messages.length - 1; i >= 0; i--) {
const msg = this.messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
finalText = msg.content;
break;
}
}
return {
text: finalText,
text: this.messageChunks.join(""),
sessionId: this.sessionId,
messages: this.messages,
};
}
@@ -242,94 +210,16 @@ export class HermesAcpClient {
}
}
// ---- Session update → structured messages ----
private handleSessionUpdate(update: Record<string, unknown>): void {
switch (update.sessionUpdate as string) {
case "agent_message_chunk":
this.handleAgentMessageChunk(update);
break;
case "agent_thought_chunk":
this.handleAgentThoughtChunk(update);
break;
case "tool_call":
this.handleToolCall(update);
break;
case "tool_call_update":
this.handleToolCallUpdate(update);
break;
default:
break;
if (update.sessionUpdate !== "agent_message_chunk") {
return;
}
}
private handleAgentMessageChunk(update: Record<string, unknown>): void {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.messageChunks.push(content.text);
}
}
private handleAgentThoughtChunk(update: Record<string, unknown>): void {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.reasoningChunks.push(content.text);
}
}
private handleToolCall(update: Record<string, unknown>): void {
const title = (update.title as string) ?? "";
const rawInput = update.rawInput;
const args = rawInput !== undefined && rawInput !== null ? JSON.stringify(rawInput) : "";
const toolCallId = update.toolCallId as string;
this.pendingTools.set(toolCallId, { name: title, args });
this.flushAssistantMessage();
}
private handleToolCallUpdate(update: Record<string, unknown>): void {
const status = update.status as string | undefined;
if (status !== "completed" && status !== "failed") return;
const toolCallId = update.toolCallId as string;
const pending = this.pendingTools.get(toolCallId);
const toolName = pending?.name ?? toolCallId;
const rawOutput = update.rawOutput;
const outputStr =
rawOutput !== undefined && rawOutput !== null
? typeof rawOutput === "string"
? rawOutput
: JSON.stringify(rawOutput)
: "";
this.messages.push({
role: "assistant",
content: null,
reasoning: null,
tool_calls: [{ function: { name: toolName, arguments: pending?.args ?? "" } }],
});
this.messages.push({
role: "tool",
content: outputStr,
reasoning: null,
tool_calls: null,
});
this.pendingTools.delete(toolCallId);
}
/** Flush any accumulated text/reasoning into an assistant message. */
private flushAssistantMessage(): void {
const text = this.messageChunks.join("");
const reasoning = this.reasoningChunks.join("");
if (text !== "" || reasoning !== "") {
this.messages.push({
role: "assistant",
content: text || null,
reasoning: reasoning || null,
tool_calls: null,
});
}
this.messageChunks = [];
this.reasoningChunks = [];
}
private rejectAll(err: Error): void {
for (const handler of this.pending.values()) {
handler.reject(err);
+10 -16
View File
@@ -10,7 +10,7 @@ import {
import { HermesAcpClient } from "./acp-client.js";
import { getCachedSessionId, isResumeDisabled, setCachedSessionId } from "./session-cache.js";
import { storeHermesSessionDetail } from "./session-detail.js";
import { loadHermesSession, storeHermesSessionDetail } from "./session-detail.js";
const log = createLogger({ sink: { kind: "stderr" } });
@@ -49,17 +49,11 @@ export function buildHermesPrompt(ctx: AgentContext): string {
return parts.join("\n");
}
async function storePromptResult(
store: Store,
sessionId: string,
messages: Awaited<ReturnType<HermesAcpClient["prompt"]>>["messages"],
): Promise<{ detailHash: string }> {
const session = {
session_id: sessionId,
model: "",
session_start: new Date().toISOString(),
messages,
};
async function storePromptResult(store: Store, sessionId: string): Promise<{ detailHash: string }> {
const session = await loadHermesSession(sessionId);
if (session === null) {
throw new Error(`Hermes session file not found: ${sessionId}`);
}
return storeHermesSessionDetail(store, session);
}
@@ -116,8 +110,8 @@ export function createHermesAgent(): () => Promise<void> {
async function runPrompt(ctx: AgentContext, useContinuation: boolean): Promise<AgentRunResult> {
const effectiveCtx = useContinuation ? ctx : { ...ctx, isFirstVisit: true };
const fullPrompt = buildHermesPrompt(effectiveCtx);
const { text, sessionId, messages } = await client.prompt(fullPrompt);
const { detailHash } = await storePromptResult(ctx.store, sessionId, messages);
const { text, sessionId } = await client.prompt(fullPrompt);
const { detailHash } = await storePromptResult(ctx.store, sessionId);
if (!isResumeDisabled()) {
await setCachedSessionId(ctx.threadId, ctx.role, sessionId);
@@ -152,8 +146,8 @@ export function createHermesAgent(): () => Promise<void> {
): Promise<AgentRunResult> {
// Client is already connected from runHermes — same ACP session,
// so the agent sees the full conversation history (crucial for retries).
const { text, sessionId, messages } = await client.prompt(message);
const { detailHash } = await storePromptResult(store, sessionId, messages);
const { text, sessionId } = await client.prompt(message);
const { detailHash } = await storePromptResult(store, sessionId);
return { output: text, detailHash, sessionId };
}
+2 -2
View File
@@ -15,8 +15,8 @@
}
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas-fs": "^0.5.2"
"@uncaged/json-cas": "^0.5.3",
"@uncaged/json-cas-fs": "^0.5.3"
},
"devDependencies": {
"typescript": "^5.8.3"
+2 -2
View File
@@ -19,8 +19,8 @@
"test:ci": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.2",
"@uncaged/json-cas-fs": "^0.5.2",
"@uncaged/json-cas": "^0.5.3",
"@uncaged/json-cas-fs": "^0.5.3",
"@uncaged/workflow-protocol": "workspace:^",
"@uncaged/workflow-util": "workspace:^",
"dotenv": "^16.6.1",
@@ -0,0 +1,60 @@
export function generateArchitectureReference(): string {
return `# Workflow Engine — Architecture Reference
## Key Concepts
### CAS (Content-Addressed Storage)
Every artifact in the workflow engine is stored as a CAS node — an immutable, content-addressed record identified by its XXH64 hash (13-char Crockford Base32). CAS provides deduplication, integrity verification, and an append-only audit trail.
Stored artifacts include:
- **Workflow definitions** — the YAML-parsed payload
- **Step nodes** — each moderator→agent→extract cycle
- **Detail nodes** — per-step metadata and turn history
- **Turn records** — individual agent interactions within a step
### Thread
A Thread is a single execution of a Workflow, identified by a ULID (26-char Crockford Base32: 10 timestamp + 16 random). Thread state is an immutable CAS chain — each step points to its predecessor via a \`prev\` hash, forming a linked list.
Active threads are indexed in \`threads.yaml\`; completed threads move to \`history.jsonl\`.
A thread progresses by running \`uwf thread exec\`, which performs one moderator→agent→extract cycle per step.
### Workflow
A Workflow is a YAML definition (\`WorkflowPayload\`) stored as a CAS node. It defines:
- **Roles** — named actors with system prompts and output schemas
- **Graph** — status-based routing edges between roles
- **Conditions** — edge predicates evaluated by the moderator
Workflow names follow verb-first kebab-case: \`solve-issue\`, \`review-code\`.
### Step
A Step is one moderator→agent→extract cycle, stored as a CAS node (\`StepNodePayload\`). Each step contains:
- **output** — the agent's extracted frontmatter output
- **detail** — a CAS reference to turn-level records
- **prev** — CAS hash of the previous step (forming the chain)
- **role** — which role produced this step
### Turn
A Turn is an agent-internal interaction within a single Step. Turns are stored per-turn in the detail node, capturing the raw agent I/O before extraction.
## Data Flow
\`\`\`
uwf thread exec <thread-id>
→ Moderator evaluates graph edges based on current status
→ Selects next role (or $END)
→ Agent CLI is spawned with context
→ Agent produces frontmatter markdown
→ Extract pipeline parses output into structured data
→ New CAS step node is appended to the thread chain
\`\`\`
## Storage Layout
All data lives under \`~/.uncaged/workflow/\`:
- \`cas/\` — content-addressed store (XXH64-keyed)
- \`threads.yaml\` — active thread index
- \`history.jsonl\` — completed thread archive
- \`registry.yaml\` — workflow name → CAS hash mapping
`;
}
+3
View File
@@ -1,3 +1,4 @@
export { generateArchitectureReference } from "./architecture-reference.js";
export { encodeUint64AsCrockford } from "./base32.js";
export { generateCliReference } from "./cli-reference.js";
export { env } from "./env.js";
@@ -13,6 +14,7 @@ export {
validateFrontmatter,
} from "./frontmatter-markdown/index.js";
export { createLogger } from "./logger.js";
export { generateModeratorReference } from "./moderator-reference.js";
export type {
CreateProcessLoggerOptions,
ProcessLogFn,
@@ -25,3 +27,4 @@ export { err, ok } from "./result.js";
export { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "./storage-root.js";
export type { LogFn, Result } from "./types.js";
export { extractUlidTimestamp, generateUlid } from "./ulid.js";
export { generateYamlReference } from "./yaml-reference.js";
@@ -0,0 +1,56 @@
export function generateModeratorReference(): string {
return `# Moderator Reference
## Overview
The moderator is the workflow engine's routing component. It evaluates the directed graph defined in the workflow YAML to determine the next role (or \`$END\`) after each step — with zero LLM cost.
## Status-Based Routing
The moderator uses **status-based routing**: it inspects the previous step's extracted output (specifically the \`$status\` field) and looks up the corresponding edge in the graph.
### Graph Structure
The graph is a nested map: \`Record<Role | "$START", Record<Status, Target>>\`. Each role maps its possible \`$status\` values to a target with a \`role\` and \`prompt\`:
\`\`\`yaml
graph:
$START:
_: { role: planner, prompt: "Analyze the issue." }
planner:
ready: { role: developer, prompt: "Implement the plan (CAS hash: {{{plan}}})." }
insufficient_info: { role: $END, prompt: "Not enough info." }
developer:
done: { role: reviewer, prompt: "Review branch {{{branch}}} at {{{worktree}}}." }
failed: { role: $END, prompt: "Developer failed: {{{reason}}}." }
reviewer:
approved: { role: tester, prompt: "Run tests on {{{branch}}} at {{{worktree}}}." }
rejected: { role: developer, prompt: "Fix issues: {{{comments}}}." }
\`\`\`
### Routing Algorithm
1. Look up \`graph[lastRole]\` to get the status map for the current role
2. Look up \`statusMap[lastOutput.$status]\` to get the target
3. If target role is \`$END\`, mark thread as completed
4. Otherwise, render the edge prompt (Mustache templates with \`{{{field}}}\` from output) and spawn the next agent
### Edge Prompts and Mustache Templates
Edge prompts use triple-brace Mustache syntax (\`{{{field}}}\`) to interpolate values from the previous step's output into the next agent's task prompt. This passes structured data (branch names, file paths, CAS hashes) between roles without manual wiring.
## Special Nodes
- \`$START\` — entry point; uses status key \`_\` (unconditional) since there is no previous output
- \`$END\` — terminal node; thread completes when reached and is moved to history
## Integration with Steps
Each \`uwf thread exec\` cycle:
1. Moderator reads the thread's head step output
2. Looks up \`graph[lastRole][output.$status]\` to pick the next role
3. If next is \`$END\`, marks thread as completed
4. Otherwise, renders the edge prompt and spawns the agent for the selected role
5. Extract pipeline parses agent output → new step node → append to CAS chain
`;
}
@@ -0,0 +1,82 @@
export function generateYamlReference(): string {
return `# Workflow YAML Schema Reference
## Top-Level Structure
A workflow YAML file defines the complete workflow specification:
\`\`\`yaml
name: solve-issue # verb-first kebab-case identifier
description: "..." # human-readable description
roles: # named actors in the workflow
planner:
description: "Analyzes issue and outputs a plan"
goal: "You are a planning agent."
capabilities:
- issue-analysis
- planning
procedure: |
1. Read the issue
2. Produce a test spec
output: "Output the plan summary. Set $status to ready or insufficient_info."
frontmatter: # JSON Schema for structured output (drives routing)
oneOf:
- properties:
$status: { const: ready }
plan: { type: string }
required: [$status, plan]
- properties:
$status: { const: insufficient_info }
required: [$status]
graph: # status-based routing (nested map)
$START:
_: { role: planner, prompt: "Analyze the issue." }
planner:
ready: { role: developer, prompt: "Implement plan {{{plan}}}." }
insufficient_info: { role: $END, prompt: "Not enough info." }
\`\`\`
## roles
Each role defines an actor in the workflow:
| Field | Type | Description |
|-------|------|-------------|
| \`description\` | string | Short description of the role's purpose |
| \`goal\` | string | System-level goal statement for the agent |
| \`capabilities\` | string[] | Tags describing what the role can do |
| \`procedure\` | string | Step-by-step instructions for the agent |
| \`output\` | string | Description of expected output format |
| \`frontmatter\` | JSON Schema | Defines the structured output the agent must produce |
### frontmatter
The \`frontmatter\` field is a standard JSON Schema object. The extract pipeline validates agent output against it. Key conventions:
- \`$status\` field drives routing decisions in the graph
- Use \`const\` or \`enum\` to constrain status values
- Use \`oneOf\` to define multiple valid output shapes (one per status)
- All \`required\` fields must appear in the agent's frontmatter output
## graph
The graph is a nested map defining status-based routing:
\`\`\`
Record<Role | "$START", Record<Status, { role: string, prompt: string }>>
\`\`\`
| Level | Key | Value |
|-------|-----|-------|
| Outer | Role name or \`$START\` | Status map for that role |
| Inner | \`$status\` value (or \`_\` for unconditional) | Target: \`{ role, prompt }\` |
### Special Nodes
- \`$START\` — entry point; uses status key \`_\` (unconditional, no previous output)
- \`$END\` — terminal node; thread completes when reached
### Edge Prompts
Prompts use triple-brace Mustache templates (\`{{{field}}}\`) to interpolate values from the previous step's output. Example: \`"Implement plan {{{plan}}} in repo {{{repoPath}}}."\`
`;
}