Compare commits

..

4 Commits

Author SHA1 Message Date
xingyue 318f8c7fa6 ci: test runner v4 2026-05-25 19:42:50 +08:00
xingyue 32569e4248 ci: test runner v3 2026-05-25 19:41:54 +08:00
xingyue 3e4cc4cd33 ci: retry actions runner test 2026-05-25 19:38:54 +08:00
xingyue 1d4692db50 ci: add gitea actions workflow 2026-05-25 19:36:04 +08:00
86 changed files with 1658 additions and 6766 deletions
+28
View File
@@ -0,0 +1,28 @@
name: CI
on:
push:
branches: ['*']
pull_request:
branches: [main]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Bun
uses: oven-sh/setup-bun@v2
- name: Install dependencies
run: bun install
- name: Lint
run: bun run lint
- name: Type check
run: bun run typecheck
- name: Test
run: bun test
+75 -53
View File
@@ -38,26 +38,19 @@ roles:
capabilities:
- coding
procedure: |
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
Before starting any work, set up an isolated worktree:
1. `cd ~/repos/workflow && git fetch origin` to get latest refs
2. First time (no existing branch):
- `git worktree add ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
- `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> && bun install`
3. If bounced back from reviewer or tester (branch already exists):
- The worktree should already exist at `~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
- `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
- `git fetch origin && git rebase origin/main`
4. ALL subsequent work must happen inside the worktree directory.
Before starting any work, ensure a clean worktree:
1. `git checkout main && git pull` to get the latest code
2. `git checkout -b fix/<issue-number>-<short-description>` to create a fresh branch
- If bounced back from reviewer or tester, reuse the existing branch and rebase onto latest main:
`git checkout main && git pull && git checkout <branch> && git rebase main`
Then implement TDD:
5. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
6. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
7. Write tests first based on the spec
8. Implement the code to make tests pass
9. Ensure `bun run build` passes with no errors
10. Run `bun test` to verify all tests pass
3. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
4. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
5. Write tests first based on the spec
6. Implement the code to make tests pass
7. Ensure `bun run build` passes with no errors
8. Run `bun test` to verify all tests pass
output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
frontmatter:
type: object
@@ -73,8 +66,6 @@ roles:
- code-review
- static-analysis
procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
Before reviewing, verify the git branch:
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
2. If the branch doesn't correspond to the issue, flag it in your output and reject
@@ -95,22 +86,19 @@ roles:
Only review standards compliance. Do NOT test functionality.
If rejecting, you MUST explain the specific reason in your output.
output: "Explain your decision with specific file/line references. Frontmatter must include: status (approved or rejected)."
output: "Explain your decision with specific file/line references. Frontmatter must include: approved (true or false)."
frontmatter:
type: object
properties:
status:
type: string
enum: [approved, rejected]
required: [status]
approved:
type: boolean
required: [approved]
tester:
description: "Functional correctness verification"
goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
capabilities:
- testing
procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
1. Run `bun test` for automated test verification
2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
3. Verify each scenario in the spec is covered and passing
@@ -131,45 +119,79 @@ roles:
goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
capabilities: []
procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Stage all changes: `git add -A`
2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
3. Push the branch: `git push -u origin <branch-name>`
- If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --repo uncaged/workflow --title "..." --description "..."`
- The `--repo` flag is required to work in worktree directories (fixes #474 "path segment [0] is empty" error)
- If working on a different repo, extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
4. On push success: create a PR via `tea pr create --title "..." --description "..."`
- PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
- On tea failure: capture stderr/stdout, log the error clearly, include PR details (title, description, branch) for manual creation, and mark success=false
5. After PR creation, clean up the worktree:
- `cd ~/repos/workflow`
- `git worktree remove ~/repos/workflow-worktrees/fix/<issue-number>-<slug>`
output: "Include PR URL on success or error log on failure. Frontmatter must include: status (committed or hook_failed)."
output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
frontmatter:
type: object
properties:
status:
type: string
enum: [committed, hook_failed]
required: [status]
success:
type: boolean
required: [success]
conditions:
insufficientInfo:
description: "Planner determined there's not enough info to proceed"
expression: "$last('planner').status = 'insufficient_info'"
devFailed:
description: "Developer failed to implement"
expression: "$last('developer').status = 'failed'"
rejected:
description: "Reviewer rejected the implementation"
expression: "$last('reviewer').approved = false"
fixCode:
description: "Tester found code issues"
expression: "$last('tester').status = 'fix_code'"
fixSpec:
description: "Tester found spec issues"
expression: "$last('tester').status = 'fix_spec'"
hookFailed:
description: "Push hook failed"
expression: "$last('committer').success = false"
graph:
$START:
_: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
- role: "planner"
condition: null
prompt: "Analyze the issue and produce an implementation plan."
planner:
insufficient_info: { role: "$END", prompt: "Insufficient information to proceed; end the workflow." }
ready: { role: "developer", prompt: "Implement the plan from the planner." }
- role: "$END"
condition: "insufficientInfo"
prompt: "Insufficient information to proceed; end the workflow."
- role: "developer"
condition: null
prompt: "Implement the plan from the planner."
developer:
failed: { role: "$END", prompt: "Development failed; end the workflow." }
done: { role: "reviewer", prompt: "Send the implementation to the reviewer." }
- role: "$END"
condition: "devFailed"
prompt: "Development failed; end the workflow."
- role: "reviewer"
condition: null
prompt: "Send the implementation to the reviewer."
reviewer:
rejected: { role: "developer", prompt: "Reviewer rejected the implementation; fix the issues." }
approved: { role: "tester", prompt: "Review passed; run tests on the implementation." }
- role: "developer"
condition: "rejected"
prompt: "Reviewer rejected the implementation; fix the issues."
- role: "tester"
condition: null
prompt: "Review passed; run tests on the implementation."
tester:
fix_code: { role: "developer", prompt: "Tests found code issues; return to developer." }
fix_spec: { role: "planner", prompt: "Tests found spec issues; return to planner." }
passed: { role: "committer", prompt: "Tests passed; commit and push the changes." }
- role: "developer"
condition: "fixCode"
prompt: "Tests found code issues; return to developer."
- role: "planner"
condition: "fixSpec"
prompt: "Tests found spec issues; return to planner."
- role: "committer"
condition: null
prompt: "Tests passed; commit and push the changes."
committer:
hook_failed: { role: "developer", prompt: "Push hook failed; return to developer to fix." }
committed: { role: "$END", prompt: "Commit succeeded; complete the workflow." }
- role: "developer"
condition: "hookFailed"
prompt: "Push hook failed; return to developer to fix."
- role: "$END"
condition: null
prompt: "Commit succeeded; complete the workflow."
+5 -6
View File
@@ -62,16 +62,16 @@ See [docs/architecture.md](docs/architecture.md) for the full design — three-p
uwf setup
# 2. Register a workflow from YAML
uwf workflow add examples/solve-issue.yaml
uwf workflow put examples/solve-issue.yaml
# 3. Start a thread (creates head pointer; does not execute)
uwf thread start solve-issue -p "Fix the login redirect bug"
# 4. Execute steps (one at a time, until done)
uwf thread exec <thread-id>
uwf thread step <thread-id>
```
Use `-c, --count <number>` on `thread exec` to run multiple steps in one invocation. Override the agent with `--agent <cmd>`.
Use `-c, --count <number>` on `thread step` to run multiple steps in one invocation. Override the agent with `--agent <cmd>`.
## CLI Reference
@@ -79,9 +79,8 @@ Global options: `-V, --version`, `--format <json|yaml>`, `-h, --help`.
| Group | Commands |
|-------|----------|
| **thread** | `start`, `exec`, `show`, `list`, `stop`, `cancel`, `read` |
| **step** | `list`, `show`, `read`, `fork` |
| **workflow** | `add`, `show`, `list` |
| **thread** | `start`, `step`, `show`, `list`, `kill`, `steps`, `read`, `fork`, `step-details` |
| **workflow** | `put`, `show`, `list` |
| **cas** | `get`, `put`, `put-text`, `has`, `refs`, `walk`, `reindex`, `schema list`, `schema get` |
| **setup** | Interactive or `--provider`, `--base-url`, `--api-key`, `--model`, `--agent` |
| **skill** | `cli` — print markdown reference of all uwf commands |
+8 -5
View File
@@ -22,8 +22,6 @@ roles:
frontmatter:
type: object
properties:
status:
enum: ["_"]
thesis:
type: string
keyPoints:
@@ -32,9 +30,14 @@ roles:
type: string
caveats:
type: string
required: [status, thesis, keyPoints]
required: [thesis, keyPoints]
conditions: {}
graph:
$START:
_: { role: "analyst", prompt: "Analyze the topic in the task and produce a structured summary with key points." }
- role: "analyst"
condition: null
prompt: "Analyze the topic in the task and produce a structured summary with key points."
analyst:
_: { role: "$END", prompt: "Analysis complete. Finish the workflow." }
- role: "$END"
condition: null
prompt: "Analysis complete. Finish the workflow."
+30 -15
View File
@@ -16,16 +16,15 @@ roles:
3. If you find yourself genuinely convinced by the other side, you may concede.
output: |
Provide your argument in the frontmatter.
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
Otherwise set status to "continue".
Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
frontmatter:
type: object
properties:
status:
enum: ["continue", "conceded"]
argument:
type: string
required: [status, argument]
conceded:
type: boolean
required: [argument, conceded]
for:
description: "Argues for the proposition"
goal: |
@@ -41,22 +40,38 @@ roles:
3. If you find yourself genuinely convinced by the other side, you may concede.
output: |
Provide your argument in the frontmatter.
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
Otherwise set status to "continue".
Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
frontmatter:
type: object
properties:
status:
enum: ["continue", "conceded"]
argument:
type: string
required: [status, argument]
conceded:
type: boolean
required: [argument, conceded]
conditions:
againstConceded:
description: "The against side conceded"
expression: "$last('against').conceded = true"
forConceded:
description: "The for side conceded"
expression: "$last('for').conceded = true"
graph:
$START:
_: { role: "against", prompt: "Present your opening argument against the proposition." }
- role: "against"
condition: null
prompt: "Present your opening argument against the proposition."
against:
conceded: { role: "$END", prompt: "The against side conceded. Debate over." }
continue: { role: "for", prompt: "Counter the opposing argument: {{{argument}}}" }
- role: "$END"
condition: "againstConceded"
prompt: "The against side conceded. Debate over."
- role: "for"
condition: null
prompt: "Counter the opposing argument. Address their points directly."
for:
conceded: { role: "$END", prompt: "The for side conceded. Debate over." }
continue: { role: "against", prompt: "Counter the opposing argument: {{{argument}}}" }
- role: "$END"
condition: "forConceded"
prompt: "The for side conceded. Debate over."
- role: "against"
condition: null
prompt: "Counter the opposing argument. Address their points directly."
+26 -20
View File
@@ -27,13 +27,11 @@ roles:
frontmatter:
type: object
properties:
status:
enum: ["_"]
repoPath:
type: string
plan:
type: string
required: [status, repoPath, plan]
required: [repoPath, plan]
developer:
description: "Implements code changes"
goal: "You are a developer agent. You implement code changes according to plans."
@@ -46,47 +44,55 @@ roles:
2. cd to the repoPath before making any changes.
3. Create a feature branch from the default branch.
4. Implement the plan — write code, tests, and ensure existing tests pass.
5. Run the project's lint/check command (e.g. `bun run check`, `npm run lint`) and fix ALL errors before proceeding. Build and lint must pass cleanly.
6. Commit your changes with a descriptive message referencing the issue.
5. Commit your changes with a descriptive message referencing the issue.
output: "List all files changed and provide a summary of the implementation."
frontmatter:
type: object
properties:
status:
enum: ["_"]
filesChanged:
type: array
items:
type: string
summary:
type: string
required: [status, filesChanged, summary]
required: [filesChanged, summary]
reviewer:
description: "Reviews code changes"
goal: "You are a code reviewer. You review implementations for correctness and quality."
capabilities:
- code-review
- static-analysis
procedure: |
1. Run hard checks first — build (`bun run build` or equivalent) and lint (`bunx biome check .` or equivalent) MUST pass with zero errors. If they fail, reject immediately.
2. Then review code quality: correctness, edge cases, naming, project conventions (CLAUDE.md), and test coverage.
3. Only reject for hard check failures or genuine correctness/security issues. Style suggestions alone should not block approval.
procedure: "Review the implementation against the plan. Check for bugs, edge cases, and style."
output: "Approve or reject with detailed comments explaining your decision."
frontmatter:
type: object
properties:
status:
enum: ["approved", "rejected"]
approved:
type: boolean
comments:
type: string
required: [status, comments]
required: [approved, comments]
conditions:
notApproved:
description: "Reviewer rejected the implementation"
expression: "$last('reviewer').approved = false"
graph:
$START:
_: { role: "planner", prompt: "Analyze the issue described in the task and produce a detailed implementation plan." }
- role: "planner"
condition: null
prompt: "Analyze the issue described in the task and produce a detailed implementation plan."
planner:
_: { role: "developer", prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass." }
- role: "developer"
condition: null
prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass."
developer:
_: { role: "reviewer", prompt: "Review the developer's implementation against the plan for correctness and quality." }
- role: "reviewer"
condition: null
prompt: "Review the developer's implementation against the plan for correctness and quality."
reviewer:
approved: { role: "$END", prompt: "The review passed. Complete the workflow." }
rejected: { role: "developer", prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues: {{{comments}}}" }
- role: "developer"
condition: "notApproved"
prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues."
- role: "$END"
condition: null
prompt: "The review passed. Complete the workflow."
@@ -531,25 +531,13 @@ export async function executeThread(
timestamp: nowMs,
parentState: options.parentStateHash,
},
steps: await Promise.all(
input.steps.map(async (out, i) => {
// Resolve content for the last step (most relevant for the next agent).
// Earlier steps only carry meta summaries to avoid bloating the prompt.
const isLast = i === input.steps.length - 1;
let content: string | null = null;
if (isLast) {
content = await getContentMerklePayload(io.cas, out.contentHash);
}
return {
role: out.role,
contentHash: out.contentHash,
content,
meta: out.meta,
refs: out.refs,
timestamp: replayTs?.[i] ?? prefilled?.[i]?.timestamp ?? nowMs + i,
};
}),
),
steps: input.steps.map((out, i) => ({
role: out.role,
contentHash: out.contentHash,
meta: out.meta,
refs: out.refs,
timestamp: replayTs?.[i] ?? prefilled?.[i]?.timestamp ?? nowMs + i,
})),
};
const runtime: WorkflowRuntime = {
@@ -71,7 +71,6 @@ export type RoleStep<M extends RoleMeta> = {
role: K;
meta: M[K];
contentHash: string;
content: string | null;
refs: string[];
timestamp: number;
};
@@ -71,8 +71,7 @@ async function buildRoleStepsFromStates<M extends RoleMeta>(
cas: CasStore,
): Promise<RoleStep<M>[]> {
const steps: RoleStep<M>[] = [];
for (let idx = 0; idx < chronologicalStates.length; idx++) {
const st = chronologicalStates[idx];
for (const st of chronologicalStates) {
if (st.payload.role === END) {
continue;
}
@@ -80,13 +79,10 @@ async function buildRoleStepsFromStates<M extends RoleMeta>(
if (contentParsed === null || contentParsed.kind !== "content") {
throw new Error(`buildThreadContext: expected content node at ${st.payload.content}`);
}
// Resolve full text content for the last step only
const isLast = idx === chronologicalStates.length - 1;
steps.push({
role: st.payload.role,
meta: st.payload.meta,
contentHash: st.payload.content,
content: isLast ? contentParsed.node.payload : null,
refs: [...contentParsed.node.refs],
timestamp: st.payload.timestamp,
} as RoleStep<M>);
@@ -88,7 +88,6 @@ async function advanceOneRound<M extends RoleMeta>(
const step = {
role: next,
contentHash,
content: contentPayload,
meta,
refs,
timestamp: Date.now(),
@@ -30,7 +30,7 @@ describe("buildAgentPrompt", () => {
expect(text).not.toContain("## Tools");
});
test("single step shows meta and content, and includes tools", async () => {
test("single step shows hash and meta, and includes tools", async () => {
const onlyHash = "01HASHSINGLESTEP0000000001";
const ctx: AgentContext = {
start: startTask("user task"),
@@ -42,7 +42,6 @@ describe("buildAgentPrompt", () => {
{
role: "coder",
contentHash: onlyHash,
content: "Here is my implementation of the feature.",
meta: { files: ["a.ts"] },
refs: [onlyHash],
timestamp: 2,
@@ -53,39 +52,13 @@ describe("buildAgentPrompt", () => {
expect(text).toContain("## Task");
expect(text).toContain("user task");
expect(text).toContain("## Step: coder");
expect(text).toContain(`ContentHash: ${onlyHash}`);
expect(text).toContain('Meta: {"files":["a.ts"]}');
expect(text).toContain("<output>");
expect(text).toContain("Here is my implementation of the feature.");
expect(text).toContain("</output>");
expect(text).toContain("## Tools");
expect(text).toContain("uncaged-workflow thread 01TEST000000000000000000TR");
});
test("single step with null content omits output tag", async () => {
const onlyHash = "01HASHSINGLESTEP0000000001";
const ctx: AgentContext = {
start: startTask("user task"),
depth: 0,
bundleHash: "TESTHASH00001",
threadId: "01TEST000000000000000000TR",
currentRole: { name: "coder", systemPrompt: "Be helpful." },
steps: [
{
role: "coder",
contentHash: onlyHash,
content: null,
meta: { files: ["a.ts"] },
refs: [onlyHash],
timestamp: 2,
},
],
};
const text = await buildAgentPrompt(ctx);
expect(text).not.toContain("<output>");
expect(text).toContain('Meta: {"files":["a.ts"]}');
});
test("two or more steps: previous steps are meta-only; latest step includes content", async () => {
test("two or more steps: previous steps are meta-only; latest step includes hash", async () => {
const plannerHash = "01HASHPLANNER0000000000001";
const coderHash = "01HASHCODER0000000000000001";
const ctx: AgentContext = {
@@ -98,7 +71,6 @@ describe("buildAgentPrompt", () => {
{
role: "planner",
contentHash: plannerHash,
content: null,
meta: { plan: "short" },
refs: [plannerHash],
timestamp: 2,
@@ -106,7 +78,6 @@ describe("buildAgentPrompt", () => {
{
role: "coder",
contentHash: coderHash,
content: "I reviewed the code and found 4 lint issues:\n1. Missing semicolon on line 42\n2. Unused import on line 3",
meta: { done: true },
refs: [coderHash],
timestamp: 3,
@@ -119,11 +90,10 @@ describe("buildAgentPrompt", () => {
expect(text).toContain("### Step 1: planner");
expect(text).toContain('Summary: {"plan":"short"}');
expect(text).toContain("## Latest Step: coder");
expect(text).toContain(`ContentHash: ${coderHash}`);
expect(text).toContain('Meta: {"done":true}');
expect(text).toContain("<output>");
expect(text).toContain("I reviewed the code and found 4 lint issues:");
expect(text).toContain("</output>");
expect(text).toContain("## Tools");
expect(text).toContain("uncaged-workflow thread 01TEST000000000000000000TR");
});
test("parentState null omits Parent Context section", async () => {
@@ -155,7 +125,7 @@ describe("buildAgentPrompt", () => {
expect(text).toContain(`uncaged-workflow cas get ${parentHash}`);
});
test("middle steps show meta summary only and latest shows content", async () => {
test("middle steps show meta summary only and latest shows hash", async () => {
const ha = "01HASHA00000000000000000001";
const hb = "01HASHB00000000000000000001";
const hc = "01HASHC00000000000000000001";
@@ -169,7 +139,6 @@ describe("buildAgentPrompt", () => {
{
role: "a",
contentHash: ha,
content: null,
meta: { n: 1 },
refs: [ha],
timestamp: 2,
@@ -177,7 +146,6 @@ describe("buildAgentPrompt", () => {
{
role: "b",
contentHash: hb,
content: null,
meta: { n: 2 },
refs: [hb],
timestamp: 3,
@@ -185,7 +153,6 @@ describe("buildAgentPrompt", () => {
{
role: "c",
contentHash: hc,
content: "Final output from role c",
meta: { n: 3 },
refs: [hc],
timestamp: 4,
@@ -195,35 +162,7 @@ describe("buildAgentPrompt", () => {
const text = await buildAgentPrompt(ctx);
expect(text).toContain('Summary: {"n":1}');
expect(text).toContain('Summary: {"n":2}');
expect(text).toContain(`ContentHash: ${hc}`);
expect(text).toContain("## Latest Step: c");
expect(text).toContain("<output>");
expect(text).toContain("Final output from role c");
expect(text).toContain("</output>");
});
test("content is truncated when exceeding quota", async () => {
const longContent = "x".repeat(20_000);
const hash = "01HASHLONG000000000000000001";
const ctx: AgentContext = {
start: startTask("task"),
depth: 0,
bundleHash: "TESTHASH00001",
threadId: "01TEST000000000000000000TR",
currentRole: { name: "r", systemPrompt: "S" },
steps: [
{
role: "r",
contentHash: hash,
content: longContent,
meta: {},
refs: [],
timestamp: 2,
},
],
};
const text = await buildAgentPrompt(ctx);
expect(text).toContain("<output>");
expect(text).toContain("... (truncated)");
expect(text.length).toBeLessThan(20_000);
});
});
-1
View File
@@ -5,7 +5,6 @@
"packages/*"
],
"scripts": {
"uwf": "bun packages/cli-workflow/src/cli.ts",
"build": "bunx tsc --build",
"check": "bunx tsc --build && biome check . && bash scripts/lint-log-tags.sh",
"typecheck": "bunx tsc --build",
+13 -96
View File
@@ -6,18 +6,6 @@
Layer 4 entry point for the workflow engine. The `uwf` binary orchestrates one step per invocation: load thread head from `threads.yaml`, run the moderator, spawn the configured agent CLI, run extract, append a CAS step node, and update the head pointer (or archive when `$END`).
### Four-Layer Architecture
```
workflow → thread → step → turn
模板定义 执行实例 单步结果 agent内部交互
```
- **Workflow** (layer 1): YAML template with roles and routing graph
- **Thread** (layer 2): Single workflow execution instance
- **Step** (layer 3): One moderator→agent→extract cycle
- **Turn** (layer 4): Agent-internal interactions (use `step show` or CAS to inspect)
This package has no library `src/index.ts` — it is consumed as a CLI binary only.
**Dependencies:** `@uncaged/json-cas`, `@uncaged/json-cas-fs`, `@uncaged/workflow-agent-kit`, `@uncaged/workflow-moderator`, `@uncaged/workflow-protocol`, `@uncaged/workflow-util`, `commander`, `dotenv`, `yaml`
@@ -42,58 +30,34 @@ bun link packages/cli-workflow
-h, --help Show help
```
### Thread (Layer 2: Execution Instances)
### Thread
| Command | Description |
|---------|-------------|
| `uwf thread start <workflow> -p <prompt>` | Create a thread without executing |
| `uwf thread exec <thread-id> [--agent <cmd>] [-c <count>] [--background]` | Execute one or more moderator→agent→extract cycles |
| `uwf thread step <thread-id> [--agent <cmd>] [-c <count>]` | Execute one or more moderator→agent→extract cycles |
| `uwf thread show <thread-id>` | Show thread head pointer |
| `uwf thread list [--status <status>] [--after <date>] [--before <date>] [--skip <n>] [--take <n>]` | List threads filtered by status (idle, running, completed, active, or comma-separated), time range (ISO or relative like '7d'), with pagination |
| `uwf thread list [--all]` | List active threads (`--all` includes archived) |
| `uwf thread steps <thread-id>` | List all steps chronologically |
| `uwf thread read <thread-id> [--quota N] [--before <hash>] [--start]` | Render thread as readable markdown |
`thread read`, `step list`, and `step show` work on both active and completed threads.
| `uwf thread stop <thread-id>` | Stop background execution (keep thread active) |
| `uwf thread cancel <thread-id>` | Cancel thread (stop + archive to history) |
| `uwf thread fork <step-hash>` | Fork from a specific step |
| `uwf thread step-details <step-hash>` | Dump full detail node as YAML |
| `uwf thread kill <thread-id>` | Terminate and archive |
Examples:
```bash
uwf thread start solve-issue -p "Fix the login redirect bug"
uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV
uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin
uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV --background
uwf thread list --status running
uwf thread list --status active
uwf thread list --status idle,completed
uwf thread list --after 7d --take 10
uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV
uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin
uwf thread read 01ARZ3NDEKTSV4RRFFQ69G5FAV --quota 8000
uwf thread stop 01ARZ3NDEKTSV4RRFFQ69G5FAV
```
### Step (Layer 3: Single Cycle Results)
### Workflow
| Command | Description |
|---------|-------------|
| `uwf step list <thread-id>` | List all steps in a thread chronologically |
| `uwf step show <step-hash>` | Show step metadata and frontmatter |
| `uwf step read <step-hash> [--quota <chars>]` | Read a step's turns as human-readable markdown |
| `uwf step fork <step-hash>` | Fork a thread from a specific step |
Examples:
```bash
uwf step list 01ARZ3NDEKTSV4RRFFQ69G5FAV
uwf step show 32GCDE899RRQ3
uwf step read 32GCDE899RRQ3 --quota 2000
uwf step fork 32GCDE899RRQ3
```
### Workflow (Layer 1: Templates)
| Command | Description |
|---------|-------------|
| `uwf workflow add <file.yaml>` | Register a workflow from YAML |
| `uwf workflow put <file.yaml>` | Register a workflow from YAML |
| `uwf workflow show <name-or-hash>` | Show workflow definition |
| `uwf workflow list` | List registered workflows |
@@ -135,52 +99,6 @@ Config: `~/.uncaged/workflow/config.yaml`. API keys: `~/.uncaged/workflow/.env`.
| `uwf log show [--thread <id>] [--process <pid>] [--date YYYY-MM-DD]` | Show filtered log entries |
| `uwf log clean [--before YYYY-MM-DD]` | Delete old log files |
## Migration Guide
### Breaking Changes (v0.x → v1.x)
The CLI was reorganized to clarify the four-layer architecture. **No backward compatibility** — old commands have been removed.
#### Renamed Commands
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `workflow put` | `workflow add` | More intuitive verb |
| `thread step` | `thread exec` | Eliminates ambiguity with "step" noun |
| `thread list --all` | `thread list --status completed` | Unified status filtering |
#### Removed Commands (Merged)
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `thread running` | `thread list --status running` | Merged into unified list |
#### Removed Commands (Split)
| Old Command | New Commands | Notes |
|------------|-------------|-------|
| `thread kill` | `thread stop` or `thread cancel` | `stop` keeps thread active, `cancel` archives it |
#### Moved Commands
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `thread steps` | `step list` | Moved to step layer |
| `thread step-details` | `step show` | Moved to step layer |
| `thread fork` | `step fork` | Moved to step layer (forks are step-based) |
#### Deprecation Errors
Old commands now show helpful error messages:
```bash
$ uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV
Error: Command 'thread step' has been removed.
Use 'thread exec' instead.
For more information, see: uwf help thread exec
```
## Internal Structure
```
@@ -191,9 +109,8 @@ src/
├── validate.ts Workflow YAML validation
├── schemas.ts CLI-local schema registration
└── commands/
├── thread.ts Thread lifecycle and exec
├── step.ts Step operations (list/show/read/fork)
├── workflow.ts Workflow registry (add/show/list)
├── thread.ts Thread lifecycle and step execution
├── workflow.ts Workflow registry (put/show/list)
├── cas.ts CAS inspection and schema ops
├── setup.ts Interactive/non-interactive setup
├── skill.ts Built-in skill references
@@ -1,152 +0,0 @@
import { execSync } from "node:child_process";
import { mkdir, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdCasPutText } from "../commands/cas.js";
let storageRoot: string;
let uwfPath: string;
beforeEach(async () => {
storageRoot = join(
tmpdir(),
`uwf-cas-exit-test-${Date.now()}-${Math.random().toString(36).slice(2)}`,
);
await mkdir(storageRoot, { recursive: true });
// Find the uwf CLI path
uwfPath = join(__dirname, "../../src/cli.ts");
});
afterEach(async () => {
await rm(storageRoot, { recursive: true, force: true });
});
type ExecResult = {
stdout: string;
stderr: string;
exitCode: number;
};
function execUwf(args: string[]): ExecResult {
try {
const stdout = execSync(`bun ${uwfPath} ${args.join(" ")}`, {
env: { ...process.env, WORKFLOW_STORAGE_ROOT: storageRoot },
encoding: "utf-8",
stdio: ["pipe", "pipe", "pipe"],
});
return { stdout, stderr: "", exitCode: 0 };
} catch (error: unknown) {
if (
error &&
typeof error === "object" &&
"stdout" in error &&
"stderr" in error &&
"status" in error
) {
return {
stdout: (error.stdout as Buffer | string).toString(),
stderr: (error.stderr as Buffer | string).toString(),
exitCode: error.status as number,
};
}
throw error;
}
}
describe("uwf cas has CLI exit codes", () => {
test("exits 0 when hash exists", async () => {
// Setup: Create a temp storage root, put a text node, capture hash
const putResult = await cmdCasPutText(storageRoot, "test content");
const hash = putResult.hash;
// Execute: uwf cas has <hash>
const result = execUwf(["cas", "has", hash]);
// Assert: stdout contains {"exists":true}, exit code === 0
expect(result.stdout).toContain('"exists":true');
expect(result.exitCode).toBe(0);
});
test("exits 1 when hash does not exist", () => {
// Setup: Create a temp storage root (empty CAS store)
// Execute: uwf cas has NOSUCHHASH123
const result = execUwf(["cas", "has", "NOSUCHHASH123"]);
// Assert: stdout contains {"exists":false}, exit code === 1
expect(result.stdout).toContain('"exists":false');
expect(result.exitCode).toBe(1);
});
test("JSON output format unchanged for exists=true", async () => {
// Setup: Create store, put node
const putResult = await cmdCasPutText(storageRoot, "test");
const hash = putResult.hash;
// Execute: uwf cas has <hash>
const result = execUwf(["cas", "has", hash]);
// Assert: stdout JSON parses correctly to {exists: true}
const parsed = JSON.parse(result.stdout.trim());
expect(parsed).toEqual({ exists: true });
});
test("JSON output format unchanged for exists=false", () => {
// Setup: Create empty store
// Execute: uwf cas has INVALID
const result = execUwf(["cas", "has", "INVALID"]);
// Assert: stdout JSON parses correctly to {exists: false}
const parsed = JSON.parse(result.stdout.trim());
expect(parsed).toEqual({ exists: false });
});
test("YAML output format preserves exit code behavior for exists=true", async () => {
// Setup: Create store with node
const putResult = await cmdCasPutText(storageRoot, "test");
const hash = putResult.hash;
// Execute: uwf --format yaml cas has <hash>
const result = execUwf(["--format", "yaml", "cas", "has", hash]);
// Assert: exit code === 0, output is YAML format
expect(result.exitCode).toBe(0);
expect(result.stdout).toContain("exists:");
expect(result.stdout).toContain("true");
});
test("YAML output format preserves exit code behavior for exists=false", () => {
// Setup: Create empty store
// Execute: uwf --format yaml cas has INVALID
const result = execUwf(["--format", "yaml", "cas", "has", "INVALID"]);
// Assert: exit code === 1, output is YAML format
expect(result.exitCode).toBe(1);
expect(result.stdout).toContain("exists:");
expect(result.stdout).toContain("false");
});
});
describe("regression: other cas commands unaffected", () => {
test("uwf cas get still exits 1 on not-found with error message", () => {
// Execute: uwf cas get NOSUCHHASH
const result = execUwf(["cas", "get", "NOSUCHHASH"]);
// Assert: exit code === 1, stderr contains "Node not found"
expect(result.exitCode).toBe(1);
expect(result.stderr).toContain("Node not found");
});
test("uwf cas put-text behavior unchanged", () => {
// Execute: uwf cas put-text "hello"
const result = execUwf(["cas", "put-text", "hello"]);
// Assert: exit code === 0, returns hash
expect(result.exitCode).toBe(0);
const parsed = JSON.parse(result.stdout.trim());
expect(parsed).toHaveProperty("hash");
expect(typeof parsed.hash).toBe("string");
expect(parsed.hash.length).toBe(13); // Crockford Base32 XXH64 hash length
});
});
@@ -1,74 +0,0 @@
import { mkdir, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdCasHas, cmdCasPutText } from "../commands/cas.js";
let storageRoot: string;
beforeEach(async () => {
storageRoot = join(tmpdir(), `uwf-cas-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
await mkdir(storageRoot, { recursive: true });
});
afterEach(async () => {
await rm(storageRoot, { recursive: true, force: true });
});
describe("cmdCasHas", () => {
test("returns {exists: true} for existing hash", async () => {
// Setup: Create a test store, put a node, get its hash
const putResult = await cmdCasPutText(storageRoot, "test content");
const hash = putResult.hash;
// Execute: Call cmdCasHas with the valid hash
const result = await cmdCasHas(storageRoot, hash);
// Assert: Result equals {exists: true}
expect(result).toEqual({ exists: true });
});
test("returns {exists: false} for non-existent hash", async () => {
// Setup: Create an empty test store
// (storageRoot already created in beforeEach)
// Execute: Call cmdCasHas with an invalid hash
const result = await cmdCasHas(storageRoot, "INVALIDHASH12");
// Assert: Result equals {exists: false}
expect(result).toEqual({ exists: false });
});
test("does not throw for non-existent hash", async () => {
// Setup: Create an empty test store
// Execute & Assert: Does not throw, returns {exists: false}
await expect(cmdCasHas(storageRoot, "NOSUCHHASH123")).resolves.toEqual({
exists: false,
});
});
test("handles malformed hash gracefully", async () => {
// Setup: Create a test store
// Execute: Call cmdCasHas with a too-short hash
const result = await cmdCasHas(storageRoot, "xyz");
// Assert: Returns {exists: false} (store.has() returns false)
expect(result).toEqual({ exists: false });
});
test("handles empty hash string", async () => {
// Execute: Call cmdCasHas with an empty string
const result = await cmdCasHas(storageRoot, "");
// Assert: Returns {exists: false}
expect(result).toEqual({ exists: false });
});
test("handles hash with special characters", async () => {
// Execute: Call cmdCasHas with special characters
const result = await cmdCasHas(storageRoot, "HASH!@#");
// Assert: Returns {exists: false}
expect(result).toEqual({ exists: false });
});
});
@@ -1,108 +0,0 @@
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { resolveHeadHash } from "../commands/shared.js";
import { appendThreadHistory, saveThreadsIndex } from "../store.js";
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-resolve-head-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
describe("resolveHeadHash", () => {
test("returns head hash from threads.yaml for active thread", async () => {
const threadId = "01JTEST0000000000000000001" as ThreadId;
const headHash = "active_hash_123" as CasRef;
await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const result = await resolveHeadHash(tmpDir, threadId);
expect(result).toBe(headHash);
});
test("falls back to history.jsonl when thread not in threads.yaml", async () => {
const threadId = "01JTEST0000000000000000002" as ThreadId;
const headHash = "completed_hash_456" as CasRef;
const workflowHash = "workflow_hash_789" as CasRef;
// No entry in threads.yaml, only in history.jsonl
await saveThreadsIndex(tmpDir, {});
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: headHash,
completedAt: Date.now(),
});
const result = await resolveHeadHash(tmpDir, threadId);
expect(result).toBe(headHash);
});
// Note: Testing the error case requires CLI-level testing because resolveHeadHash
// calls fail() which does process.exit(1), terminating the test runner.
// The error behavior is tested in integration tests below via CLI invocation.
test("prioritizes active thread over history when thread exists in both", async () => {
const threadId = "01JTEST0000000000000000004" as ThreadId;
const activeHash = "active_hash_v2" as CasRef;
const historicalHash = "historical_hash_v1" as CasRef;
const workflowHash = "workflow_hash_xyz" as CasRef;
// Thread exists in both locations (should not happen normally, but test the precedence)
await saveThreadsIndex(tmpDir, { [threadId]: activeHash });
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: historicalHash,
completedAt: Date.now(),
});
const result = await resolveHeadHash(tmpDir, threadId);
// Should return the active head, not the historical one
expect(result).toBe(activeHash);
});
test("finds thread from multiple history entries", async () => {
const threadId1 = "01JTEST0000000000000000005" as ThreadId;
const threadId2 = "01JTEST0000000000000000006" as ThreadId;
const threadId3 = "01JTEST0000000000000000007" as ThreadId;
const hash1 = "hash_thread1" as CasRef;
const hash2 = "hash_thread2" as CasRef;
const hash3 = "hash_thread3" as CasRef;
const workflowHash = "workflow_hash_abc" as CasRef;
await saveThreadsIndex(tmpDir, {});
await appendThreadHistory(tmpDir, {
thread: threadId1,
workflow: workflowHash,
head: hash1,
completedAt: Date.now() - 2000,
});
await appendThreadHistory(tmpDir, {
thread: threadId2,
workflow: workflowHash,
head: hash2,
completedAt: Date.now() - 1000,
});
await appendThreadHistory(tmpDir, {
thread: threadId3,
workflow: workflowHash,
head: hash3,
completedAt: Date.now(),
});
const result = await resolveHeadHash(tmpDir, threadId2);
expect(result).toBe(hash2);
});
});
@@ -1,97 +0,0 @@
import { readFile } from "node:fs/promises";
import { join } from "node:path";
import type { WorkflowPayload } from "@uncaged/workflow-protocol";
import { describe, expect, test } from "vitest";
import { parse } from "yaml";
/**
* Test: Issue #474 - tea pr create fails in git worktree directories
*
* This test verifies that the solve-issue workflow's committer role
* includes the --repo flag when running tea pr create, which fixes
* the "path segment [0] is empty" error in worktree directories.
*/
describe("solve-issue workflow: tea pr create worktree fix", () => {
// Navigate up from packages/cli-workflow to repo root
const workflowPath = join(process.cwd(), "..", "..", ".workflows", "solve-issue.yaml");
test("committer procedure should include --repo flag in tea pr create command", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload;
expect(workflow.roles.committer).toBeDefined();
const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined();
// Verify the procedure includes tea pr create with --repo flag
expect(committerProcedure).toContain("tea pr create");
expect(committerProcedure).toContain("--repo");
// Verify the --repo flag appears before or together with tea pr create
// This ensures the command is: tea pr create --repo <owner/repo> ...
const teaPrCreateMatch = committerProcedure?.match(/tea pr create[^\n]*/);
expect(teaPrCreateMatch).not.toBeNull();
if (teaPrCreateMatch) {
const teaCommandLine = teaPrCreateMatch[0];
expect(teaCommandLine).toContain("--repo");
}
});
test("committer procedure should mention repo extraction from git remote", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload;
const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined();
// Verify the procedure mentions extracting repo info from git remote
// This ensures fallback logic is documented
expect(committerProcedure).toMatch(/git remote/i);
});
test("committer procedure should include error handling for tea failures", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload;
const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined();
// Verify the procedure includes error handling guidance
// This ensures we capture failures and provide actionable output
expect(committerProcedure).toMatch(/error|fail/i);
});
test("workflow should be parseable as valid WorkflowPayload", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload;
// Basic structure validation
expect(workflow.name).toBe("solve-issue");
expect(workflow.roles).toBeDefined();
expect(workflow.graph).toBeDefined();
// Verify committer role exists with required fields
expect(workflow.roles.committer).toBeDefined();
expect(workflow.roles.committer?.description).toBeDefined();
expect(workflow.roles.committer?.goal).toBeDefined();
expect(workflow.roles.committer?.procedure).toBeDefined();
expect(workflow.roles.committer?.output).toBeDefined();
expect(workflow.roles.committer?.frontmatter).toBeDefined();
});
test("committer frontmatter schema should require status field", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
// Parse as any to access the raw YAML structure (frontmatter is inline JSON Schema in YAML)
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const workflow = parse(yamlContent) as any;
const frontmatter = workflow.roles.committer?.frontmatter;
expect(frontmatter).toBeDefined();
expect(frontmatter?.type).toBe("object");
expect(frontmatter?.properties?.status).toBeDefined();
expect(frontmatter?.properties?.status?.enum).toContain("committed");
expect(frontmatter?.required).toContain("status");
});
});
@@ -1,519 +0,0 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdStepRead } from "../commands/step.js";
import { registerUwfSchemas } from "../schemas.js";
// ── schemas used in tests ────────────────────────────────────────────────────
const TURN_SCHEMA = {
title: "hermes-turn",
type: "object" as const,
required: ["index", "role", "content"],
properties: {
index: { type: "integer" as const },
role: { type: "string" as const },
content: { type: "string" as const },
toolCalls: {
anyOf: [
{ type: "array" as const, items: { type: "object" as const } },
{ type: "null" as const },
],
},
reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
},
additionalProperties: false,
};
const DETAIL_SCHEMA = {
title: "hermes-detail",
type: "object" as const,
required: ["sessionId", "model", "duration", "turnCount", "turns"],
properties: {
sessionId: { type: "string" as const },
model: { type: "string" as const },
duration: { type: "integer" as const },
turnCount: { type: "integer" as const },
turns: {
type: "array" as const,
items: { type: "string" as const, format: "cas_ref" },
},
},
additionalProperties: false,
};
// ── helpers ───────────────────────────────────────────────────────────────────
async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
await bootstrap(store);
const [turn, detail] = await Promise.all([
putSchema(store, TURN_SCHEMA),
putSchema(store, DETAIL_SCHEMA),
]);
return { turn, detail };
}
function generateContent(size: number, prefix = "Content"): string {
const base = `${prefix} `;
const repeat = Math.ceil(size / base.length);
return base.repeat(repeat).slice(0, size);
}
// ── fixture ───────────────────────────────────────────────────────────────────
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-step-read-test-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── step read tests ───────────────────────────────────────────────────────────
describe("step read", () => {
test("test 1: basic single-step read with 3 turns", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 3 turns
const turnHashes: CasRef[] = [];
for (let i = 1; i <= 3; i++) {
const content = `Turn ${i} content with some text to make it readable.`;
const turnHash = await store.put(detailSchemas.turn, {
index: i - 1,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
turnHashes.push(turnHash);
}
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 3,
turns: turnHashes,
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
// Read step with large quota
const markdown = await cmdStepRead(tmpDir, stepHash, 10000);
// Assert structure
expect(markdown).toContain(`# Step ${stepHash}`);
expect(markdown).toContain("**Role:** worker");
expect(markdown).toContain("**Agent:** uwf-test");
expect(markdown).toContain("## Turn 1");
expect(markdown).toContain("## Turn 2");
expect(markdown).toContain("## Turn 3");
expect(markdown).toContain("Turn 1 content with some text to make it readable.");
expect(markdown).toContain("Turn 2 content with some text to make it readable.");
expect(markdown).toContain("Turn 3 content with some text to make it readable.");
});
test("test 2: quota enforcement - multiple turns", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 4 turns of ~300 chars each
const turnHashes: CasRef[] = [];
for (let i = 1; i <= 4; i++) {
const content = generateContent(300, `Turn${i}`);
const turnHash = await store.put(detailSchemas.turn, {
index: i - 1,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
turnHashes.push(turnHash);
}
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 4,
turns: turnHashes,
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
// Read step with limited quota (700 chars)
const markdown = await cmdStepRead(tmpDir, stepHash, 700);
// Assert only most recent turns fit
expect(markdown).toContain(`# Step ${stepHash}`);
// Should have skip hint
expect(markdown).toContain("Earlier turns omitted");
// Should include at least Turn 4 (most recent)
expect(markdown).toContain("Turn4");
// Total length should respect quota (with tolerance for structural overhead)
expect(markdown.length).toBeLessThanOrEqual(900); // 700 quota + 200 buffer tolerance
});
test("test 3: minimal quota edge case - always show at least one turn", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 1 turn of 500 chars
const content = generateContent(500, "LongTurn");
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
// Read step with minimal quota (1 char)
const markdown = await cmdStepRead(tmpDir, stepHash, 1);
// Assert at least one turn is always shown
expect(markdown).toContain("LongTurn");
expect(markdown.length).toBeGreaterThan(1);
});
test("test 4: step with no detail field", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: null,
agent: "uwf-test",
});
// Read step - should return metadata only (no error)
const markdown = await cmdStepRead(tmpDir, stepHash, 4000);
// Assert metadata is present
expect(markdown).toContain(`# Step ${stepHash}`);
expect(markdown).toContain("**Role:** worker");
expect(markdown).toContain("**Agent:** uwf-test");
// Should not have turn sections
expect(markdown).not.toContain("## Turn");
});
test("test 5: step with detail but no turns array", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create detail with different schema (no turns)
const SIMPLE_DETAIL_SCHEMA = {
title: "simple-detail",
type: "object" as const,
required: ["sessionId"],
properties: {
sessionId: { type: "string" as const },
},
additionalProperties: false,
};
await bootstrap(store);
const simpleDetailType = await putSchema(store, SIMPLE_DETAIL_SCHEMA);
const detailHash = await store.put(simpleDetailType, {
sessionId: "session-1",
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
// Read step - should return metadata only (no error)
const markdown = await cmdStepRead(tmpDir, stepHash, 4000);
// Assert metadata is present
expect(markdown).toContain(`# Step ${stepHash}`);
expect(markdown).toContain("**Role:** worker");
// Should not have turn sections
expect(markdown).not.toContain("## Turn");
});
test("test 6: turn content with special characters", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create turn with special markdown characters
const content = "This has `backticks`, **bold**, *italic*, and [links](http://example.com)";
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
// Read step
const markdown = await cmdStepRead(tmpDir, stepHash, 4000);
// Assert content is rendered correctly without corruption
expect(markdown).toContain("`backticks`");
expect(markdown).toContain("**bold**");
expect(markdown).toContain("*italic*");
expect(markdown).toContain("[links](http://example.com)");
});
});
@@ -1,550 +0,0 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { extractUlidTimestamp, generateUlid } from "@uncaged/workflow-util";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { createMarker, deleteMarker } from "../background/index.js";
import { cmdThreadList } from "../commands/thread.js";
import { parseTimeInput } from "../commands/thread-time-parser.js";
import type { UwfStore } from "../store.js";
import { appendThreadHistory, createUwfStore, saveThreadsIndex } from "../store.js";
// ── helpers ───────────────────────────────────────────────────────────────────
async function makeUwfStore(storageRoot: string): Promise<UwfStore> {
const casDir = join(storageRoot, "cas");
await mkdir(casDir, { recursive: true });
return createUwfStore(storageRoot);
}
async function createTestWorkflow(uwf: UwfStore): Promise<CasRef> {
const workflowPayload = {
name: "test-workflow",
roles: {
role1: {
goal: "test goal",
outputSchema: { type: "object" as const, properties: {} },
},
},
graph: { start: "role1" },
conditions: {},
};
return await uwf.store.put(uwf.schemas.workflow, workflowPayload);
}
async function createTestThread(
uwf: UwfStore,
storageRoot: string,
workflowHash: CasRef,
timestamp: number,
): Promise<ThreadId> {
const threadId = generateUlid(timestamp) as ThreadId;
const startPayload = {
workflow: workflowHash,
prompt: "test prompt",
};
const headHash = await uwf.store.put(uwf.schemas.startNode, startPayload);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(storageRoot));
index[threadId] = headHash;
await saveThreadsIndex(storageRoot, index);
return threadId;
}
async function markThreadRunning(storageRoot: string, threadId: ThreadId, workflow: CasRef) {
await createMarker(storageRoot, {
thread: threadId,
workflow,
pid: process.pid, // Use current process PID so isPidAlive returns true
startedAt: Date.now(),
});
}
async function completeThread(
storageRoot: string,
threadId: ThreadId,
workflowHash: CasRef,
headHash: CasRef,
) {
const index = await import("../store.js").then((m) => m.loadThreadsIndex(storageRoot));
delete index[threadId];
await saveThreadsIndex(storageRoot, index);
await appendThreadHistory(storageRoot, {
thread: threadId,
workflow: workflowHash,
head: headHash,
completedAt: Date.now(),
});
}
// ── test setup ────────────────────────────────────────────────────────────────
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "thread-list-filters-test-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── status filter tests ───────────────────────────────────────────────────────
describe("cmdThreadList status filter", () => {
test("should return idle and running threads when status=active", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
const thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const thread3 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
await markThreadRunning(tmpDir, thread2, workflowHash);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const thread3Head = index[thread3];
if (thread3Head === undefined) throw new Error("thread3 head not found");
await completeThread(tmpDir, thread3, workflowHash, thread3Head);
const result = await cmdThreadList(tmpDir, ["idle", "running"], null, null, null, null);
expect(result).toHaveLength(2);
expect(result.map((r) => r.thread).sort()).toEqual([thread1, thread2].sort());
// Clean up marker after test
await deleteMarker(tmpDir, thread2);
});
test("should support comma-separated status values", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
const thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const thread3 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
await markThreadRunning(tmpDir, thread2, workflowHash);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const thread3Head = index[thread3];
if (thread3Head === undefined) throw new Error("thread3 head not found");
await completeThread(tmpDir, thread3, workflowHash, thread3Head);
const result = await cmdThreadList(tmpDir, ["idle", "completed"], null, null, null, null);
// Clean up marker
await deleteMarker(tmpDir, thread2);
// thread2 is running (not idle), so should not be included
// Expected: thread1 (idle) and thread3 (completed)
expect(result).toHaveLength(2);
expect(result.map((r) => r.thread).sort()).toEqual([thread1, thread3].sort());
});
test("should support single status filter (backward compat)", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const _thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
const _thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const thread3 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const thread3Head = index[thread3];
if (thread3Head === undefined) throw new Error("thread3 head not found");
await completeThread(tmpDir, thread3, workflowHash, thread3Head);
const result = await cmdThreadList(tmpDir, ["completed"], null, null, null, null);
expect(result).toHaveLength(1);
expect(result[0]?.thread).toBe(thread3);
expect(result[0]?.status).toBe("completed");
});
test("should return all threads when no status filter provided", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
const thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const thread3 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
await markThreadRunning(tmpDir, thread2, workflowHash);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const thread3Head = index[thread3];
if (thread3Head === undefined) throw new Error("thread3 head not found");
await completeThread(tmpDir, thread3, workflowHash, thread3Head);
const result = await cmdThreadList(tmpDir, null, null, null, null, null);
expect(result).toHaveLength(3);
expect(result.map((r) => r.thread).sort()).toEqual([thread1, thread2, thread3].sort());
});
});
// ── time range filtering tests ────────────────────────────────────────────────
describe("cmdThreadList time filters", () => {
test("should filter threads created after given timestamp", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const ts1 = Date.UTC(2026, 4, 20, 0, 0, 0);
const ts2 = Date.UTC(2026, 4, 21, 0, 0, 0);
const ts3 = Date.UTC(2026, 4, 22, 0, 0, 0);
const _threadA = await createTestThread(uwf, tmpDir, workflowHash, ts1);
const threadB = await createTestThread(uwf, tmpDir, workflowHash, ts2);
const threadC = await createTestThread(uwf, tmpDir, workflowHash, ts3);
// Use a timestamp slightly before ts2 to include threadB
const afterMs = Date.UTC(2026, 4, 20, 12, 0, 0);
const result = await cmdThreadList(tmpDir, null, afterMs, null, null, null);
expect(result).toHaveLength(2);
expect(result.map((r) => r.thread).sort()).toEqual([threadB, threadC].sort());
});
test("should filter threads created before given timestamp", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const ts1 = Date.UTC(2026, 4, 20, 0, 0, 0);
const ts2 = Date.UTC(2026, 4, 21, 0, 0, 0);
const ts3 = Date.UTC(2026, 4, 22, 0, 0, 0);
const threadA = await createTestThread(uwf, tmpDir, workflowHash, ts1);
const threadB = await createTestThread(uwf, tmpDir, workflowHash, ts2);
const _threadC = await createTestThread(uwf, tmpDir, workflowHash, ts3);
const beforeMs = Date.UTC(2026, 4, 22, 0, 0, 0);
const result = await cmdThreadList(tmpDir, null, null, beforeMs, null, null);
expect(result).toHaveLength(2);
expect(result.map((r) => r.thread).sort()).toEqual([threadA, threadB].sort());
});
test("should support both after and before filters (time range)", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const ts1 = Date.UTC(2026, 4, 20, 0, 0, 0);
const ts2 = Date.UTC(2026, 4, 21, 0, 0, 0);
const ts3 = Date.UTC(2026, 4, 22, 0, 0, 0);
const _threadA = await createTestThread(uwf, tmpDir, workflowHash, ts1);
const threadB = await createTestThread(uwf, tmpDir, workflowHash, ts2);
const _threadC = await createTestThread(uwf, tmpDir, workflowHash, ts3);
const afterMs = Date.UTC(2026, 4, 20, 12, 0, 0);
const beforeMs = Date.UTC(2026, 4, 22, 0, 0, 0);
const result = await cmdThreadList(tmpDir, null, afterMs, beforeMs, null, null);
expect(result).toHaveLength(1);
expect(result[0]?.thread).toBe(threadB);
});
});
// ── pagination tests ──────────────────────────────────────────────────────────
describe("cmdThreadList pagination", () => {
test("should limit results with --take", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads: ThreadId[] = [];
for (let i = 0; i < 10; i++) {
threads.push(await createTestThread(uwf, tmpDir, workflowHash, Date.now() - i * 1000));
}
const result = await cmdThreadList(tmpDir, null, null, null, null, 5);
expect(result).toHaveLength(5);
});
test("should skip first N threads with --skip", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads: ThreadId[] = [];
// Create threads in chronological order, but they'll be sorted newest first
for (let i = 0; i < 10; i++) {
threads.push(await createTestThread(uwf, tmpDir, workflowHash, Date.now() + i * 100));
// Small delay to ensure distinct timestamps
await new Promise((resolve) => setTimeout(resolve, 10));
}
const result = await cmdThreadList(tmpDir, null, null, null, 3, null);
expect(result).toHaveLength(7);
// The 3 newest threads should be skipped, so we should get the 7 oldest
});
test("should support skip + take for pagination", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads: ThreadId[] = [];
for (let i = 0; i < 10; i++) {
threads.push(await createTestThread(uwf, tmpDir, workflowHash, Date.now() + i * 100));
await new Promise((resolve) => setTimeout(resolve, 10));
}
const result = await cmdThreadList(tmpDir, null, null, null, 5, 3);
expect(result).toHaveLength(3);
// Should skip first 5 (newest), then take 3
});
test("should handle take > available threads", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const _thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
const _thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const _thread3 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
const result = await cmdThreadList(tmpDir, null, null, null, null, 10);
expect(result).toHaveLength(3);
});
test("should return empty array when skip >= thread count", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 3000);
await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
const result = await cmdThreadList(tmpDir, null, null, null, 5, null);
expect(result).toHaveLength(0);
});
});
// ── combined filters tests ────────────────────────────────────────────────────
describe("combined filters", () => {
test("should combine status and time range filters", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const ts1 = Date.UTC(2026, 4, 20, 0, 0, 0);
const ts2 = Date.UTC(2026, 4, 21, 0, 0, 0);
const ts3 = Date.UTC(2026, 4, 22, 0, 0, 0);
const ts4 = Date.UTC(2026, 4, 23, 0, 0, 0);
const _thread1 = await createTestThread(uwf, tmpDir, workflowHash, ts1);
const thread2 = await createTestThread(uwf, tmpDir, workflowHash, ts2);
const thread3 = await createTestThread(uwf, tmpDir, workflowHash, ts3);
const thread4 = await createTestThread(uwf, tmpDir, workflowHash, ts4);
await markThreadRunning(tmpDir, thread2, workflowHash);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const thread3Head = index[thread3];
if (thread3Head === undefined) throw new Error("thread3 head not found");
await completeThread(tmpDir, thread3, workflowHash, thread3Head);
const afterMs = Date.UTC(2026, 4, 20, 12, 0, 0);
const result = await cmdThreadList(tmpDir, ["idle"], afterMs, null, null, null);
expect(result).toHaveLength(1);
expect(result[0]?.thread).toBe(thread4);
expect(result[0]?.status).toBe("idle");
// Clean up marker
await deleteMarker(tmpDir, thread2);
});
test("should combine status filter and pagination", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads: ThreadId[] = [];
for (let i = 9; i >= 0; i--) {
const thread = await createTestThread(uwf, tmpDir, workflowHash, Date.now() + i * 1000);
threads.push(thread);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const headHash = index[thread];
if (headHash === undefined) throw new Error("head not found");
await completeThread(tmpDir, thread, workflowHash, headHash);
}
const result = await cmdThreadList(tmpDir, ["completed"], null, null, 3, 5);
expect(result).toHaveLength(5);
for (const r of result) {
expect(r.status).toBe("completed");
}
});
test("should combine time range and pagination", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads: ThreadId[] = [];
for (let i = 0; i < 20; i++) {
const ts = Date.UTC(2026, 4, 1 + i, 0, 0, 0);
threads.push(await createTestThread(uwf, tmpDir, workflowHash, ts));
}
const afterMs = Date.UTC(2026, 4, 10, 0, 0, 0);
const result = await cmdThreadList(tmpDir, null, afterMs, null, 2, 5);
expect(result).toHaveLength(5);
for (const r of result) {
const ts = extractUlidTimestamp(r.thread);
expect(ts).not.toBeNull();
if (ts !== null) {
expect(ts).toBeGreaterThan(afterMs);
}
}
});
async function setupMixedStatusThreads(
uwf: UwfStore,
workflowHash: string,
count: number,
): Promise<ThreadId[]> {
const threads: ThreadId[] = [];
for (let i = 0; i < count; i++) {
const ts = Date.UTC(2026, 4, 10 + i, 0, 0, 0);
const thread = await createTestThread(uwf, tmpDir, workflowHash, ts);
threads.push(thread);
if (i % 2 === 0) {
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
const headHash = index[thread];
if (headHash === undefined) throw new Error("head not found");
await completeThread(tmpDir, thread, workflowHash, headHash);
} else {
await markThreadRunning(tmpDir, thread, workflowHash);
}
}
return threads;
}
async function cleanupRunningMarkers(threads: ThreadId[]): Promise<void> {
for (let i = 0; i < threads.length; i++) {
if (i % 2 !== 0) {
await deleteMarker(tmpDir, threads[i] as ThreadId);
}
}
}
test("should combine all filters (status + time + pagination)", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const threads = await setupMixedStatusThreads(uwf, workflowHash, 15);
const afterMs = Date.UTC(2026, 4, 14, 12, 0, 0);
const beforeMs = Date.UTC(2026, 4, 20, 0, 0, 0);
const result = await cmdThreadList(tmpDir, ["idle", "running"], afterMs, beforeMs, 1, 3);
expect(result.length).toBeLessThanOrEqual(3);
for (const r of result) {
expect(["idle", "running"]).toContain(r.status);
const ts = extractUlidTimestamp(r.thread);
if (ts !== null) {
expect(ts).toBeGreaterThan(afterMs);
expect(ts).toBeLessThan(beforeMs);
}
}
await cleanupRunningMarkers(threads);
});
});
// ── edge cases tests ──────────────────────────────────────────────────────────
describe("edge cases", () => {
test("should handle empty thread list", async () => {
await makeUwfStore(tmpDir);
const result = await cmdThreadList(tmpDir, null, null, null, null, null);
expect(result).toHaveLength(0);
});
test("should skip threads with invalid ULID when time filtering", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await createTestWorkflow(uwf);
const thread1 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 2000);
const thread2 = await createTestThread(uwf, tmpDir, workflowHash, Date.now() - 1000);
const index = await import("../store.js").then((m) => m.loadThreadsIndex(tmpDir));
index["INVALID_ULID_FORMAT_HERE" as ThreadId] = "01J6HMVRNQKJV2";
await saveThreadsIndex(tmpDir, index);
const afterMs = Date.now() - 3000;
const result = await cmdThreadList(tmpDir, null, afterMs, null, null, null);
expect(result).toHaveLength(2);
expect(result.map((r) => r.thread).sort()).toEqual([thread1, thread2].sort());
});
});
// ── time parsing tests ────────────────────────────────────────────────────────
describe("relative time parsing", () => {
test("should parse '7d' as 7 days ago", () => {
const nowMs = Date.UTC(2026, 4, 24, 12, 0, 0);
const result = parseTimeInput("7d", nowMs);
const expected = Date.UTC(2026, 4, 17, 12, 0, 0);
expect(result).toBe(expected);
});
test("should parse '24h' as 24 hours ago", () => {
const nowMs = Date.UTC(2026, 4, 24, 12, 0, 0);
const result = parseTimeInput("24h", nowMs);
const expected = Date.UTC(2026, 4, 23, 12, 0, 0);
expect(result).toBe(expected);
});
test("should parse '30m' as 30 minutes ago", () => {
const nowMs = Date.UTC(2026, 4, 24, 12, 30, 0);
const result = parseTimeInput("30m", nowMs);
const expected = Date.UTC(2026, 4, 24, 12, 0, 0);
expect(result).toBe(expected);
});
test("should parse '1d' as 1 day ago", () => {
const nowMs = Date.UTC(2026, 4, 24, 0, 0, 0);
const result = parseTimeInput("1d", nowMs);
const expected = Date.UTC(2026, 4, 23, 0, 0, 0);
expect(result).toBe(expected);
});
});
describe("ISO date parsing", () => {
test("should parse ISO date (YYYY-MM-DD)", () => {
const nowMs = Date.now();
const result = parseTimeInput("2026-05-20", nowMs);
const expected = Date.UTC(2026, 4, 20, 0, 0, 0);
expect(result).toBe(expected);
});
test("should parse ISO datetime (YYYY-MM-DDTHH:MM:SS)", () => {
const nowMs = Date.now();
const result = parseTimeInput("2026-05-20T14:30:00", nowMs);
const expected = Date.parse("2026-05-20T14:30:00");
expect(result).toBe(expected);
});
test("should parse ISO datetime with Z suffix", () => {
const nowMs = Date.now();
const result = parseTimeInput("2026-05-20T14:30:00Z", nowMs);
const expected = Date.UTC(2026, 4, 20, 14, 30, 0);
expect(result).toBe(expected);
});
test("should reject invalid date formats", () => {
const nowMs = Date.now();
expect(() => parseTimeInput("not-a-date", nowMs)).toThrow();
expect(() => parseTimeInput("2026-13-01", nowMs)).toThrow();
expect(() => parseTimeInput("invalid", nowMs)).toThrow();
});
});
@@ -1,583 +0,0 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdThreadRead } from "../commands/thread.js";
import { registerUwfSchemas } from "../schemas.js";
import { saveThreadsIndex } from "../store.js";
// ── schemas used in tests ────────────────────────────────────────────────────
const TURN_SCHEMA = {
title: "hermes-turn",
type: "object" as const,
required: ["index", "role", "content"],
properties: {
index: { type: "integer" as const },
role: { type: "string" as const },
content: { type: "string" as const },
toolCalls: {
anyOf: [
{ type: "array" as const, items: { type: "object" as const } },
{ type: "null" as const },
],
},
reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
},
additionalProperties: false,
};
const DETAIL_SCHEMA = {
title: "hermes-detail",
type: "object" as const,
required: ["sessionId", "model", "duration", "turnCount", "turns"],
properties: {
sessionId: { type: "string" as const },
model: { type: "string" as const },
duration: { type: "integer" as const },
turnCount: { type: "integer" as const },
turns: {
type: "array" as const,
items: { type: "string" as const, format: "cas_ref" },
},
},
additionalProperties: false,
};
// ── helpers ───────────────────────────────────────────────────────────────────
async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
await bootstrap(store);
const [turn, detail] = await Promise.all([
putSchema(store, TURN_SCHEMA),
putSchema(store, DETAIL_SCHEMA),
]);
return { turn, detail };
}
function generateContent(size: number, prefix = "Content"): string {
const base = `${prefix} `;
const repeat = Math.ceil(size / base.length);
return base.repeat(repeat).slice(0, size);
}
// ── fixture ───────────────────────────────────────────────────────────────────
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-quota-test-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── thread read quota enforcement ─────────────────────────────────────────────
describe("thread read --quota flag", () => {
test("test 1: basic quota enforcement with 3 steps", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 3 steps with ~500 chars each
const steps: CasRef[] = [];
for (let i = 1; i <= 3; i++) {
const content = generateContent(500, `Step${i}`);
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: `session-${i}`,
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: steps[i - 2] ?? null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
steps.push(stepHash);
}
const threadId = "01HX2Q3R4S5T6V7W8X9YZ0" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[2] as CasRef });
// Set quota to 800 chars - should only fit most recent steps
const markdown = await cmdThreadRead(tmpDir, threadId, 800, null, false);
// Quota must be reasonably enforced (allow ~200 char tolerance for skip hint)
expect(markdown.length).toBeLessThanOrEqual(1000);
// Should contain skip hint since not all steps fit
expect(markdown).toMatch(/earlier step/);
// Most recent step should be included
expect(markdown).toMatch(/Step3/);
});
test("test 2: quota check order - verifies bug is fixed", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 2 steps: first=300 chars, second=600 chars
const step1Content = generateContent(300, "First");
const step1TurnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: step1Content,
toolCalls: null,
reasoning: null,
});
const step1DetailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [step1TurnHash],
});
const step1Hash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: step1DetailHash,
agent: "uwf-test",
});
const step2Content = generateContent(600, "Second");
const step2TurnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: step2Content,
toolCalls: null,
reasoning: null,
});
const step2DetailHash = await store.put(detailSchemas.detail, {
sessionId: "session-2",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [step2TurnHash],
});
const step2Hash = await store.put(schemas.stepNode, {
start: startHash,
prev: step1Hash,
role: "worker",
output: outputHash,
detail: step2DetailHash,
agent: "uwf-test",
});
const threadId = "01HX2Q3R4S5T6V7W8X9YZ1" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step2Hash });
// Set quota to 500 chars
const markdown = await cmdThreadRead(tmpDir, threadId, 500, null, false);
// Bug fix verification: output must be limited (allow ~200 char tolerance)
expect(markdown.length).toBeLessThanOrEqual(1100);
// Should contain "Second" (most recent step)
expect(markdown).toMatch(/Second/);
// Should skip first step
expect(markdown).toMatch(/earlier step/);
// Verify improvement: before fix would be ~1264, now should be much closer to 500
expect(markdown.length).toBeLessThan(1200);
});
test("test 3: quota with --start section", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task with a moderately long prompt to test quota accounting",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 2 steps
const steps: CasRef[] = [];
for (let i = 1; i <= 2; i++) {
const content = generateContent(400, `Step${i}`);
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: `session-${i}`,
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: steps[i - 2] ?? null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
steps.push(stepHash);
}
const threadId = "01HX2Q3R4S5T6V7W8X9YZ2" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[1] as CasRef });
// Set tight quota with --start flag
const markdown = await cmdThreadRead(tmpDir, threadId, 600, null, true);
// Quota must be reasonably enforced (allow ~210 char tolerance for structure)
expect(markdown.length).toBeLessThanOrEqual(810);
// Should contain thread header
expect(markdown).toMatch(/# Thread/);
expect(markdown).toMatch(/test-wf/);
});
test("test 5a: quota edge case - minimal quota", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const content = generateContent(500, "Test");
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: "session-1",
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
const threadId = "01HX2Q3R4S5T6V7W8X9YZ4" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
// Minimal quota
const markdown = await cmdThreadRead(tmpDir, threadId, 1, null, false);
// Should handle gracefully - always shows at least one step
expect(markdown.length).toBeGreaterThan(1);
expect(markdown).toMatch(/Test/);
});
test("test 5b: quota edge case - very large quota", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 3 steps
const steps: CasRef[] = [];
for (let i = 1; i <= 3; i++) {
const content = generateContent(300, `Step${i}`);
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: `session-${i}`,
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: steps[i - 2] ?? null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
steps.push(stepHash);
}
const threadId = "01HX2Q3R4S5T6V7W8X9YZ5" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[2] as CasRef });
// Very large quota
const markdown = await cmdThreadRead(tmpDir, threadId, 1000000, null, false);
// Should show all steps (no skipping)
expect(markdown).not.toMatch(/earlier step/);
expect(markdown).toMatch(/Step1/);
expect(markdown).toMatch(/Step2/);
expect(markdown).toMatch(/Step3/);
});
test("test 6: quota with --before parameter", async () => {
const casDir = join(tmpDir, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
const detailSchemas = await registerDetailSchemas(store);
const workflowHash = await store.put(schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do the work.",
output: "Summarize the work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await store.put(schemas.startNode, {
workflow: workflowHash,
prompt: "Test task",
});
const outputHash = await store.put(schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// Create 5 steps
const steps: CasRef[] = [];
for (let i = 1; i <= 5; i++) {
const content = generateContent(300, `Step${i}`);
const turnHash = await store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content,
toolCalls: null,
reasoning: null,
});
const detailHash = await store.put(detailSchemas.detail, {
sessionId: `session-${i}`,
model: "test-model",
duration: 1000,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await store.put(schemas.stepNode, {
start: startHash,
prev: steps[i - 2] ?? null,
role: "worker",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
steps.push(stepHash);
}
const threadId = "01HX2Q3R4S5T6V7W8X9YZ6" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[4] as CasRef });
// Use --before to limit to steps 1-2, then set quota that allows only 1
const markdown = await cmdThreadRead(tmpDir, threadId, 500, steps[2] as CasRef, false);
// Should not contain Step3 or later
expect(markdown).not.toMatch(/Step3/);
expect(markdown).not.toMatch(/Step4/);
expect(markdown).not.toMatch(/Step5/);
// Quota should select most recent of candidates (Step2)
expect(markdown).toMatch(/Step2/);
// Quota enforcement (allow ~200 char tolerance)
expect(markdown.length).toBeLessThanOrEqual(700);
});
});
@@ -1,683 +0,0 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdThreadRead, THREAD_READ_DEFAULT_QUOTA } from "../commands/thread.js";
import { registerUwfSchemas } from "../schemas.js";
import type { UwfStore } from "../store.js";
import { saveThreadsIndex } from "../store.js";
// ── schemas used in tests ────────────────────────────────────────────────────
const TURN_SCHEMA = {
title: "hermes-turn",
type: "object" as const,
required: ["index", "role", "content"],
properties: {
index: { type: "integer" as const },
role: { type: "string" as const },
content: { type: "string" as const },
toolCalls: {
anyOf: [
{ type: "array" as const, items: { type: "object" as const } },
{ type: "null" as const },
],
},
reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
},
additionalProperties: false,
};
const DETAIL_SCHEMA = {
title: "hermes-detail",
type: "object" as const,
required: ["sessionId", "model", "duration", "turnCount", "turns"],
properties: {
sessionId: { type: "string" as const },
model: { type: "string" as const },
duration: { type: "integer" as const },
turnCount: { type: "integer" as const },
turns: {
type: "array" as const,
items: { type: "string" as const, format: "cas_ref" },
},
},
additionalProperties: false,
};
// ── helpers ───────────────────────────────────────────────────────────────────
async function makeUwfStore(storageRoot: string): Promise<UwfStore> {
const casDir = join(storageRoot, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
return { storageRoot, store, schemas };
}
async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
await bootstrap(store);
const [turn, detail] = await Promise.all([
putSchema(store, TURN_SCHEMA),
putSchema(store, DETAIL_SCHEMA),
]);
return { turn, detail };
}
// ── fixture ───────────────────────────────────────────────────────────────────
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-test-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── thread read XML tag isolation ─────────────────────────────────────────────
describe("thread read XML tag isolation", () => {
test("scenario 1: wraps output in XML tags instead of heading", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
planner: {
description: "Planner",
goal: "You are a planning agent. Your task is to...",
capabilities: [],
procedure: "Plan the work.",
output: "Summarize the plan.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Fix issue #459",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content:
"---\nstatus: ready\nplan: CMWGHQKT58RY4\n---\n\n# Analysis Complete\n## Issue Summary\nThe issue requires XML tag isolation.",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "planner",
output: outputHash,
detail: detailHash,
agent: "uwf-claude-code",
});
const threadId = "01JTEST0000000000000001" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should wrap output in XML tags
expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
// Should not have ### Content heading
expect(markdown).not.toContain("### Content");
// Should preserve markdown headings inside output tags
expect(markdown).toContain("# Analysis Complete");
expect(markdown).toContain("## Issue Summary");
});
test("scenario 2: wraps prompt in XML tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
planner: {
description: "Planner",
goal: "You are a planning agent. Your task is to analyze and plan.",
capabilities: [],
procedure: "Plan the work.",
output: "Summarize the plan.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Fix issue",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "---\nstatus: ready\n---\n\nContent here...",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "planner",
output: outputHash,
detail: detailHash,
agent: "uwf-claude-code",
});
const threadId = "01JTEST0000000000000002" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should wrap prompt in XML tags
expect(markdown).toContain("<prompt>");
expect(markdown).toContain("</prompt>");
expect(markdown).toContain("You are a planning agent. Your task is to analyze and plan.");
// Should not have ### Prompt heading
expect(markdown).not.toContain("### Prompt");
// Should wrap output in XML tags
expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
});
test("scenario 3: same role repeated does not show prompt twice", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
writer: {
description: "Writer",
goal: "You are a writer agent.",
capabilities: [],
procedure: "Write content.",
output: "Summarize writing.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Write something",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "writer",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1 as CasRef,
role: "writer",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000003" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step2 });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should only show prompt tags once
const promptCount = (markdown.match(/<prompt>/g) ?? []).length;
expect(promptCount).toBe(1);
});
test("scenario 4: step with no detail shows no output tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do work.",
output: "Summarize work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Do stuff",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000004" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should not have output tags
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
// Step header should still be displayed
expect(markdown).toContain("## Step 1: worker");
// Prompt should still be shown
expect(markdown).toContain("<prompt>");
});
test("scenario 5: empty content shows no output tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Do stuff",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// A detail ref that doesn't exist → extractLastAssistantContent returns null
const missingDetailRef = "missingdetail0" as CasRef;
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: missingDetailRef,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000005" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should not have output tags
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
});
test("scenario 6: thread read with --start flag shows task section", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000006" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, true);
// Should include task section
expect(markdown).toContain("# Thread");
expect(markdown).toContain("## Task");
expect(markdown).toContain("Initial prompt");
// Prompts should use XML tags
expect(markdown).toContain("<prompt>");
});
test("scenario 7: thread read with --before parameter", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
roleB: {
description: "Role B",
goal: "Goal for roleB",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
roleC: {
description: "Role C",
goal: "Goal for roleC",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1 as CasRef,
role: "roleB",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step3 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step2 as CasRef,
role: "roleC",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000007" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step3 });
const markdown = await cmdThreadRead(
tmpDir,
threadId,
THREAD_READ_DEFAULT_QUOTA,
step2 as CasRef,
false,
);
// Should only show roleA
expect(markdown).toContain("roleA");
expect(markdown).not.toContain("roleB");
expect(markdown).not.toContain("roleC");
// Should use XML tags
expect(markdown).toContain("<prompt>");
});
test("scenario 9: special characters in content are preserved", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
writer: {
description: "Writer",
goal: "You are a writer.",
capabilities: [],
procedure: "Write content.",
output: "Summarize.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Write something",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "Content with <special> & characters > like <this>",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "writer",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000008" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Special characters should be preserved as-is
expect(markdown).toContain("Content with <special> & characters > like <this>");
});
test("scenario 10: quota limit with XML tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const steps: CasRef[] = [];
let prev: CasRef | null = null;
for (let i = 0; i < 5; i++) {
const step = (await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
})) as CasRef;
steps.push(step);
prev = step;
}
const threadId = "01JTEST0000000000000009" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[steps.length - 1]! });
// Use very small quota
const markdown = await cmdThreadRead(tmpDir, threadId, 1, null, false);
// Should have skip hint
expect(markdown).toContain("earlier step");
// Should have XML tags for displayed steps
if (markdown.includes("<prompt>")) {
expect(markdown).toContain("</prompt>");
}
});
});
@@ -22,48 +22,48 @@ function runCli(args: string[]): { stdout: string; stderr: string; exitCode: num
}
}
describe("thread exec --count CLI parsing", () => {
describe("thread step --count CLI parsing", () => {
test("--help shows -c/--count option", () => {
const result = runCli(["thread", "exec", "--help"]);
const result = runCli(["thread", "step", "--help"]);
expect(result.stdout).toContain("--count");
expect(result.stdout).toContain("-c");
});
test("description says 'one or more steps'", () => {
const result = runCli(["thread", "exec", "--help"]);
const result = runCli(["thread", "step", "--help"]);
expect(result.stdout).toContain("one or more steps");
});
});
describe("cmdThreadExec count logic", () => {
describe("cmdThreadStep count logic", () => {
test("count=0 fails with validation error", () => {
const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "0"]);
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("negative count fails with validation error", () => {
const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "-1"]);
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("non-integer count fails with validation error", () => {
const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "1.5"]);
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("count=1 is the default (no -c flag)", () => {
// Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
const result = runCli(["thread", "exec", "FAKE_THREAD_ID"]);
const result = runCli(["thread", "step", "FAKE_THREAD_ID"]);
expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread lookup instead
expect(result.stderr).not.toContain("positive integer");
});
test("count=3 passes validation (fails on thread lookup)", () => {
const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "3"]);
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]);
expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread/storage lookup
expect(result.stderr).not.toContain("positive integer");
@@ -5,15 +5,15 @@ import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdStepList, cmdStepShow } from "../commands/step.js";
import {
cmdThreadRead,
cmdThreadStepDetails,
extractLastAssistantContent,
THREAD_READ_DEFAULT_QUOTA,
} from "../commands/thread.js";
import { registerUwfSchemas } from "../schemas.js";
import type { UwfStore } from "../store.js";
import { appendThreadHistory, saveThreadsIndex } from "../store.js";
import { saveThreadsIndex } from "../store.js";
// ── schemas used in tests ────────────────────────────────────────────────────
@@ -198,10 +198,10 @@ describe("extractLastAssistantContent", () => {
});
});
// ── cmdThreadRead: <output> section ──────────────────────────────────────────
// ── cmdThreadRead: ### Content section ───────────────────────────────────────
describe("cmdThreadRead <output> section", () => {
test("includes <output> tags when detail has assistant turns", async () => {
describe("cmdThreadRead ### Content section", () => {
test("includes ### Content before ### Output when detail has assistant turns", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -264,13 +264,12 @@ describe("cmdThreadRead <output> section", () => {
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
expect(markdown).toContain("### Content");
expect(markdown).toContain("The assistant response text");
expect(markdown).not.toContain("### Content");
expect(markdown).not.toContain("### Output");
});
test("omits <output> tags when detail has no matching assistant turns", async () => {
test("omits ### Content when detail has no matching assistant turns", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
@@ -309,15 +308,14 @@ describe("cmdThreadRead <output> section", () => {
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
expect(markdown).not.toContain("### Content");
expect(markdown).not.toContain("### Output");
});
});
// ── cmdStepShow ───────────────────────────────────────────────────────────────
// ── cmdThreadStepDetails ──────────────────────────────────────────────────────
describe("cmdStepShow", () => {
describe("cmdThreadStepDetails", () => {
test("returns expanded detail node with turns inlined", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -365,7 +363,7 @@ describe("cmdStepShow", () => {
agent: "uwf-hermes",
});
const result = await cmdStepShow(tmpDir, stepHash);
const result = await cmdThreadStepDetails(tmpDir, stepHash);
expect(result).toMatchObject({
sessionId: "sess42",
@@ -386,9 +384,9 @@ describe("cmdStepShow", () => {
});
});
// ── cmdThreadRead: <prompt> deduplication ───────────────────────────────────
// ── cmdThreadRead: ### Prompt deduplication ───────────────────────────────────
describe("cmdThreadRead <prompt> deduplication", () => {
describe("cmdThreadRead ### Prompt deduplication", () => {
async function makeThreadWithRoles(uwf: UwfStore, roles: string[]): Promise<string> {
const roleMap: Record<string, unknown> = {};
for (const r of [...new Set(roles)]) {
@@ -436,36 +434,36 @@ describe("cmdThreadRead <prompt> deduplication", () => {
return stepHash;
}
test("same consecutive role shows <prompt> once", async () => {
test("same consecutive role shows ### Prompt once", async () => {
const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["writer", "writer"]);
const threadId = "01JTEST0000000000000003" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/<prompt>/g) ?? []).length;
const count = (markdown.match(/### Prompt/g) ?? []).length;
expect(count).toBe(1);
});
test("different consecutive roles each show <prompt>", async () => {
test("different consecutive roles each show ### Prompt", async () => {
const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["planner", "coder"]);
const threadId = "01JTEST0000000000000004" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/<prompt>/g) ?? []).length;
const count = (markdown.match(/### Prompt/g) ?? []).length;
expect(count).toBe(2);
});
test("non-consecutive same role shows <prompt> twice", async () => {
test("non-consecutive same role shows ### Prompt twice", async () => {
const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["roleA", "roleB", "roleA"]);
const threadId = "01JTEST0000000000000005" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/<prompt>/g) ?? []).length;
const count = (markdown.match(/### Prompt/g) ?? []).length;
expect(count).toBe(2);
});
});
@@ -586,9 +584,9 @@ describe("cmdThreadRead start section / before / quota", () => {
// ── Tests that call process.exit must be last ─────────────────────────────────
describe("cmdStepShow (process.exit tests - must be last)", () => {
describe("cmdThreadStepDetails (process.exit tests - must be last)", () => {
test("throws when step hash does not exist", async () => {
await expect(cmdStepShow(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow();
await expect(cmdThreadStepDetails(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow();
});
test("before with unknown hash rejects", async () => {
@@ -647,383 +645,3 @@ describe("cmdStepShow (process.exit tests - must be last)", () => {
).rejects.toThrow();
});
});
// ── cmdStepList / cmdStepShow: completed threads ──────────────────────────────
describe("cmdStepList with completed threads", () => {
test("lists steps from active thread", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-active",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Start prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "role1",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1Hash,
role: "role2",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step3Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step2Hash,
role: "role3",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000000A1" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step3Hash });
const result = await cmdStepList(tmpDir, threadId);
expect(result.thread).toBe(threadId);
expect(result.steps).toHaveLength(4); // start + 3 steps
expect(result.steps[1].role).toBe("role1");
expect(result.steps[2].role).toBe("role2");
expect(result.steps[3].role).toBe("role3");
});
test("lists steps from completed thread", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-completed",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Start prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1Hash,
role: "roleB",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000000A2" as ThreadId;
// Thread is NOT in threads.yaml (simulating completed thread)
await saveThreadsIndex(tmpDir, {});
// But it IS in history.jsonl
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: step2Hash,
completedAt: Date.now(),
});
const result = await cmdStepList(tmpDir, threadId);
expect(result.thread).toBe(threadId);
expect(result.steps).toHaveLength(3); // start + 2 steps
expect(result.steps[1].role).toBe("roleA");
expect(result.steps[2].role).toBe("roleB");
});
});
describe("cmdStepShow with completed threads", () => {
test("shows step detail from active thread", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-step-active",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "p",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "Active thread response",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sess-active",
model: "model-x",
duration: 1234,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "coder",
output: outputHash,
detail: detailHash,
agent: "uwf-hermes",
});
const threadId = "01JTEST0000000000000000B1" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const result = await cmdStepShow(tmpDir, stepHash);
expect(result).toMatchObject({
sessionId: "sess-active",
model: "model-x",
duration: 1234,
turnCount: 1,
});
});
test("shows step detail from completed thread", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-step-completed",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "p",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "Completed thread response",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sess-completed",
model: "model-y",
duration: 5678,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "reviewer",
output: outputHash,
detail: detailHash,
agent: "uwf-hermes",
});
const threadId = "01JTEST0000000000000000B2" as ThreadId;
// Thread is NOT in threads.yaml
await saveThreadsIndex(tmpDir, {});
// But it IS in history.jsonl
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: stepHash,
completedAt: Date.now(),
});
const result = await cmdStepShow(tmpDir, stepHash);
expect(result).toMatchObject({
sessionId: "sess-completed",
model: "model-y",
duration: 5678,
turnCount: 1,
});
});
});
describe("cmdThreadRead with completed threads", () => {
test("reads completed thread context", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-read-completed",
description: "desc",
roles: {
writer: {
description: "Write",
goal: "You are a writer.",
capabilities: [],
procedure: "Write content.",
output: "Summary.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Write something",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "writer",
output: outputHash,
detail: null,
agent: "uwf-hermes",
});
const threadId = "01JTEST0000000000000000C1" as ThreadId;
// Thread is NOT in threads.yaml
await saveThreadsIndex(tmpDir, {});
// But it IS in history.jsonl
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: stepHash,
completedAt: Date.now(),
});
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
expect(markdown).toContain("writer");
expect(markdown).toContain("Write something");
});
test("reads completed thread with before filter", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf-read-before",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Do task",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleX",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1Hash,
role: "roleY",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step3Hash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step2Hash,
role: "roleZ",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000000C2" as ThreadId;
await saveThreadsIndex(tmpDir, {});
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: workflowHash,
head: step3Hash,
completedAt: Date.now(),
});
const markdown = await cmdThreadRead(
tmpDir,
threadId,
THREAD_READ_DEFAULT_QUOTA,
step2Hash,
false,
);
// Should contain step1 (roleX) but not step2 (roleY) or step3 (roleZ)
expect(markdown).toContain("roleX");
expect(markdown).not.toContain("roleY");
expect(markdown).not.toContain("roleZ");
});
});
@@ -25,6 +25,7 @@ async function storeWorkflow(uwf: UwfStore, name: string): Promise<CasRef> {
name,
description: "Test workflow",
roles: {},
conditions: {},
graph: {},
};
return await uwf.store.put(uwf.schemas.workflow, payload);
@@ -35,6 +36,7 @@ async function createWorkflowYaml(name: string, version: string | null = null):
name,
description: version !== null ? `Test workflow (${version})` : "Test workflow",
roles: {},
conditions: {},
graph: {},
};
const yaml = stringify(payload);
@@ -143,7 +145,7 @@ describe("Strategy 2: File Path Resolution", () => {
test("should fail on valid YAML with invalid WorkflowPayload shape", async () => {
await makeUwfStore(storageRoot);
const yamlPath = join(tmpDir, "invalid-workflow.yaml");
await writeFile(yamlPath, "name: test\n# missing roles and graph");
await writeFile(yamlPath, "name: test\n# missing roles, conditions, and graph");
await expect(cmdThreadStart(storageRoot, yamlPath, "prompt", projectRoot)).rejects.toThrow();
});
@@ -1,147 +0,0 @@
import { mkdir, readdir, readFile, rename, rm, writeFile } from "node:fs/promises";
import { join } from "node:path";
import type { RunningThreadItem, ThreadId } from "@uncaged/workflow-protocol";
import type { RunningMarker } from "./types.js";
/**
* Get the path to the running markers directory.
*/
export function getRunningDir(storageRoot: string): string {
return join(storageRoot, "running");
}
/**
* Get the path to a specific thread's marker file.
*/
export function getMarkerPath(storageRoot: string, threadId: ThreadId): string {
return join(getRunningDir(storageRoot), `${threadId}.json`);
}
/**
* Check if a PID is still running.
* Returns true if the process exists, false otherwise.
*/
export function isPidAlive(pid: number): boolean {
try {
// process.kill with signal 0 checks existence without killing
process.kill(pid, 0);
return true;
} catch {
// ESRCH means process doesn't exist
return false;
}
}
/**
* Create a marker file for a running thread.
* Writes to a temp file in the same directory, then atomically renames.
*/
export async function createMarker(storageRoot: string, marker: RunningMarker): Promise<void> {
const runningDir = getRunningDir(storageRoot);
await mkdir(runningDir, { recursive: true });
const markerPath = getMarkerPath(storageRoot, marker.thread);
const tempPath = join(runningDir, `.${marker.thread}-${process.pid}.tmp`);
const content = JSON.stringify(marker, null, 2);
await writeFile(tempPath, content, "utf8");
await rename(tempPath, markerPath);
}
/**
* Delete a marker file for a thread.
*/
export async function deleteMarker(storageRoot: string, threadId: ThreadId): Promise<void> {
const markerPath = getMarkerPath(storageRoot, threadId);
try {
await rm(markerPath);
} catch {
// Ignore errors if file doesn't exist
}
}
/**
* Read a marker file. Returns null if file doesn't exist or is invalid.
*/
export async function readMarker(
storageRoot: string,
threadId: ThreadId,
): Promise<RunningMarker | null> {
const markerPath = getMarkerPath(storageRoot, threadId);
try {
const content = await readFile(markerPath, "utf8");
const marker = JSON.parse(content) as RunningMarker;
return marker;
} catch {
return null;
}
}
/**
* List all running threads, filtering out stale markers.
*/
export async function listRunningThreads(storageRoot: string): Promise<RunningThreadItem[]> {
const runningDir = getRunningDir(storageRoot);
let files: string[];
try {
files = await readdir(runningDir);
} catch {
// Directory doesn't exist or can't be read
return [];
}
const results: RunningThreadItem[] = [];
for (const filename of files) {
if (!filename.endsWith(".json")) {
continue;
}
const threadId = filename.slice(0, -5) as ThreadId;
const marker = await readMarker(storageRoot, threadId);
if (marker === null) {
// Invalid marker file
continue;
}
if (!isPidAlive(marker.pid)) {
// Stale marker - process no longer exists
await deleteMarker(storageRoot, threadId);
continue;
}
results.push({
thread: marker.thread,
workflow: marker.workflow,
pid: marker.pid,
startedAt: marker.startedAt,
});
}
return results;
}
/**
* Check if a thread is currently executing in the background.
* Returns the marker if running, null otherwise.
*/
export async function isThreadRunning(
storageRoot: string,
threadId: ThreadId,
): Promise<RunningMarker | null> {
const marker = await readMarker(storageRoot, threadId);
if (marker === null) {
return null;
}
if (!isPidAlive(marker.pid)) {
// Stale marker
await deleteMarker(storageRoot, threadId);
return null;
}
return marker;
}
@@ -1,11 +0,0 @@
export {
createMarker,
deleteMarker,
getMarkerPath,
getRunningDir,
isPidAlive,
isThreadRunning,
listRunningThreads,
readMarker,
} from "./background.js";
export type { RunningMarker } from "./types.js";
@@ -1,9 +0,0 @@
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
/** Marker file stored at ~/.uncaged/workflow/running/<thread-id>.json */
export type RunningMarker = {
thread: ThreadId;
workflow: CasRef;
pid: number;
startedAt: number;
};
+57 -305
View File
@@ -1,7 +1,8 @@
#!/usr/bin/env bun
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { Command } from "commander";
import { stringify as yamlStringify } from "yaml";
import {
cmdCasGet,
cmdCasHas,
@@ -16,20 +17,19 @@ import {
import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
import { cmdSkillCli } from "./commands/skill.js";
import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
import {
cmdThreadCancel,
cmdThreadExec,
cmdThreadFork,
cmdThreadKill,
cmdThreadList,
cmdThreadRead,
cmdThreadShow,
cmdThreadStart,
cmdThreadStop,
cmdThreadStep,
cmdThreadStepDetails,
cmdThreadSteps,
THREAD_READ_DEFAULT_QUOTA,
type ThreadStatus,
} from "./commands/thread.js";
import { parseTimeInput } from "./commands/thread-time-parser.js";
import { cmdWorkflowAdd, cmdWorkflowList, cmdWorkflowShow } from "./commands/workflow.js";
import { cmdWorkflowList, cmdWorkflowPut, cmdWorkflowShow } from "./commands/workflow.js";
import { formatOutput, type OutputFormat } from "./format.js";
import { resolveStorageRoot } from "./store.js";
@@ -52,27 +52,20 @@ const program = new Command();
const pkg = await import("../package.json", { with: { type: "json" } });
program
.name("uwf")
.description(
"Stateless workflow CLI\n\n" +
"Four-layer architecture:\n" +
" workflow → thread → step → turn\n" +
" 模板定义 执行实例 单步结果 agent内部交互",
)
.description("Stateless workflow CLI")
.version(pkg.default.version, "-V, --version");
program.option("--format <fmt>", "Output format: json or yaml", "json");
const workflow = program
.command("workflow")
.description("Workflow definitions (layer 1: templates)");
const workflow = program.command("workflow").description("Workflow registry and CAS");
workflow
.command("add")
.command("put")
.description("Register a workflow from YAML")
.argument("<file>", "Workflow YAML file")
.action((file: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdWorkflowAdd(storageRoot, file);
const result = await cmdWorkflowPut(storageRoot, file);
writeOutput(result);
});
});
@@ -100,7 +93,7 @@ workflow
});
});
const thread = program.command("thread").description("Thread execution (layer 2: instances)");
const thread = program.command("thread").description("Thread lifecycle and execution");
thread
.command("start")
@@ -116,46 +109,24 @@ thread
});
thread
.command("exec")
.command("step")
.description("Execute one or more steps")
.argument("<thread-id>", "Thread ULID")
.option("--agent <cmd>", "Override agent command")
.option("-c, --count <number>", "Number of steps to run (default: 1)")
.option("--background", "Run in background and return immediately")
.option("--_background-worker", "Internal flag for background worker process", false)
.action(
(
threadId: string,
opts: {
agent: string | undefined;
count: string | undefined;
background: boolean;
_backgroundWorker: boolean;
},
) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const agentOverride = opts.agent ?? null;
const count = opts.count !== undefined ? Number(opts.count) : 1;
const background = opts.background ?? false;
const backgroundWorker = opts._backgroundWorker ?? false;
const results = await cmdThreadExec(
storageRoot,
threadId,
agentOverride,
count,
background,
backgroundWorker,
);
if (results.length === 1) {
writeOutput(results[0]);
} else {
writeOutput(results);
}
});
},
);
.action((threadId: string, opts: { agent: string | undefined; count: string | undefined }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const agentOverride = opts.agent ?? null;
const count = opts.count !== undefined ? Number(opts.count) : 1;
const results = await cmdThreadStep(storageRoot, threadId, agentOverride, count);
if (results.length === 1) {
writeOutput(results[0]);
} else {
writeOutput(results);
}
});
});
thread
.command("show")
@@ -169,124 +140,38 @@ thread
});
});
// Helper functions for thread list command parsing
function parseStatusFilter(status: string | undefined): ThreadStatus[] | null {
if (status === undefined) return null;
const raw = status.trim();
if (raw === "active") return ["idle", "running"];
const parts = raw.split(",").map((s) => s.trim());
const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
for (const part of parts) {
if (!validStatuses.includes(part as ThreadStatus)) {
process.stderr.write(
`Invalid status: ${part}. Must be one of: idle, running, completed, active\n`,
);
process.exit(1);
}
}
return parts as ThreadStatus[];
}
function parseTimeFilters(
after: string | undefined,
before: string | undefined,
nowMs: number,
): { afterMs: number | null; beforeMs: number | null } {
try {
const afterMs = after !== undefined ? parseTimeInput(after, nowMs) : null;
const beforeMs = before !== undefined ? parseTimeInput(before, nowMs) : null;
return { afterMs, beforeMs };
} catch (e) {
const message = e instanceof Error ? e.message : String(e);
process.stderr.write(`${message}\n`);
process.exit(1);
}
}
function parsePaginationOptions(
skip: string | undefined,
take: string | undefined,
): { skip: number | null; take: number | null } {
let skipVal: number | null = null;
let takeVal: number | null = null;
if (skip !== undefined) {
skipVal = Number.parseInt(skip, 10);
if (!Number.isInteger(skipVal) || skipVal < 0) {
process.stderr.write("--skip must be a non-negative integer\n");
process.exit(1);
}
}
if (take !== undefined) {
takeVal = Number.parseInt(take, 10);
if (!Number.isInteger(takeVal) || takeVal < 1) {
process.stderr.write("--take must be a positive integer\n");
process.exit(1);
}
}
return { skip: skipVal, take: takeVal };
}
thread
.command("list")
.description("List threads")
.option(
"--status <status>",
"Filter by status: idle, running, completed, active (idle+running), or comma-separated values",
)
.option("--after <date>", "Filter threads created after this date (ISO or relative like '7d')")
.option("--before <date>", "Filter threads created before this date (ISO or relative like '7d')")
.option("--skip <n>", "Skip first n threads")
.option("--take <n>", "Return at most n threads")
.action(
(opts: {
status: string | undefined;
after: string | undefined;
before: string | undefined;
skip: string | undefined;
take: string | undefined;
}) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const statusFilter = parseStatusFilter(opts.status);
const nowMs = Date.now();
const { afterMs, beforeMs } = parseTimeFilters(opts.after, opts.before, nowMs);
const { skip, take } = parsePaginationOptions(opts.skip, opts.take);
const result = await cmdThreadList(
storageRoot,
statusFilter,
afterMs,
beforeMs,
skip,
take,
);
writeOutput(result);
});
},
);
thread
.command("stop")
.description("Stop background execution of a thread (keep thread active)")
.argument("<thread-id>", "Thread ULID")
.action((threadId: string) => {
.description("List active threads")
.option("--all", "Include archived threads")
.action((opts: { all: boolean }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdThreadStop(storageRoot, threadId);
const result = await cmdThreadList(storageRoot, opts.all);
writeOutput(result);
});
});
thread
.command("cancel")
.description("Cancel a thread (stop execution and move to history)")
.command("kill")
.description("Terminate and archive a thread")
.argument("<thread-id>", "Thread ULID")
.action((threadId: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdThreadCancel(storageRoot, threadId);
const result = await cmdThreadKill(storageRoot, threadId);
writeOutput(result);
});
});
thread
.command("steps")
.description("List all steps in a thread")
.argument("<thread-id>", "Thread ULID")
.action((threadId: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdThreadSteps(storageRoot, threadId);
writeOutput(result);
});
});
@@ -320,157 +205,28 @@ thread
},
);
const step = program.command("step").description("Step results (layer 3: single cycle)");
step
.command("list")
.description("List all steps in a thread")
.argument("<thread-id>", "Thread ULID")
.action((threadId: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdStepList(storageRoot, threadId);
writeOutput(result);
});
});
step
.command("show")
.description("Show details of a specific step")
.argument("<step-hash>", "CAS hash of the StepNode")
.action((stepHash: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const detail = await cmdStepShow(storageRoot, stepHash as CasRef);
writeOutput(detail);
});
});
step
.command("read")
.description("Read a step's turns as human-readable markdown")
.argument("<step-hash>", "CAS hash of the StepNode")
.option("--quota <chars>", "Max output characters", "4000")
.action((stepHash: string, opts: { quota: string }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const quota = Number.parseInt(opts.quota, 10);
if (!Number.isFinite(quota) || quota < 1) {
process.stderr.write("invalid --quota: must be a positive integer\n");
process.exit(1);
}
const markdown = await cmdStepRead(storageRoot, stepHash as CasRef, quota);
process.stdout.write(markdown.endsWith("\n") ? markdown : `${markdown}\n`);
});
});
step
thread
.command("fork")
.description("Fork a thread from a specific step")
.argument("<step-hash>", "CAS hash of the StartNode or StepNode to fork from")
.action((stepHash: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdStepFork(storageRoot, stepHash as CasRef);
const result = await cmdThreadFork(storageRoot, stepHash);
writeOutput(result);
});
});
// ── Deprecation Handlers ──────────────────────────────────────────────────────
// These commands have been removed. Show helpful error messages.
workflow
.command("put")
.description("[DEPRECATED] Use 'workflow add' instead")
.argument("<file>", "Workflow YAML file")
.action(() => {
process.stderr.write(`Error: Command 'workflow put' has been removed.
Use 'workflow add' instead.
For more information, see: uwf help workflow add
`);
process.exit(1);
});
thread
.command("step")
.description("[DEPRECATED] Use 'thread exec' instead")
.argument("<thread-id>", "Thread ULID")
.allowUnknownOption()
.action(() => {
process.stderr.write(`Error: Command 'thread step' has been removed.
Use 'thread exec' instead.
For more information, see: uwf help thread exec
`);
process.exit(1);
});
thread
.command("steps")
.description("[DEPRECATED] Use 'step list' instead")
.argument("<thread-id>", "Thread ULID")
.action(() => {
process.stderr.write(`Error: Command 'thread steps' has been removed.
Use 'step list' instead.
For more information, see: uwf help step list
`);
process.exit(1);
});
thread
.command("step-details")
.description("[DEPRECATED] Use 'step show' instead")
.argument("<step-hash>", "Step hash")
.action(() => {
process.stderr.write(`Error: Command 'thread step-details' has been removed.
Use 'step show' instead.
For more information, see: uwf help step show
`);
process.exit(1);
});
thread
.command("fork")
.description("[DEPRECATED] Use 'step fork' instead")
.argument("<step-hash>", "Step hash")
.action(() => {
process.stderr.write(`Error: Command 'thread fork' has been removed.
Use 'step fork' instead.
For more information, see: uwf help step fork
`);
process.exit(1);
});
thread
.command("kill")
.description("[DEPRECATED] Use 'thread stop' or 'thread cancel' instead")
.argument("<thread-id>", "Thread ULID")
.action(() => {
process.stderr.write(`Error: Command 'thread kill' has been removed.
Use 'thread stop' to stop background execution (keep thread active),
or 'thread cancel' to cancel and archive the thread.
For more information, see:
uwf help thread stop
uwf help thread cancel
`);
process.exit(1);
});
thread
.command("running")
.description("[DEPRECATED] Use 'thread list --status running' instead")
.action(() => {
process.stderr.write(`Error: Command 'thread running' has been removed.
Use 'thread list --status running' instead.
For more information, see: uwf help thread list
`);
process.exit(1);
.description("Dump the full detail node of a step as YAML")
.argument("<step-hash>", "CAS hash of the StepNode")
.action((stepHash: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const detail = await cmdThreadStepDetails(storageRoot, stepHash);
process.stdout.write(yamlStringify(detail));
});
});
const skill = program.command("skill").description("Built-in skill references for agents");
@@ -565,11 +321,7 @@ cas
.action((hash: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdCasHas(storageRoot, hash);
writeOutput(result);
if (!result.exists) {
process.exit(1);
}
writeOutput(await cmdCasHas(storageRoot, hash));
});
});
@@ -1,231 +0,0 @@
import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
import { getSchema } from "@uncaged/json-cas";
import type {
CasRef,
StartNodePayload,
StepNodePayload,
ThreadId,
} from "@uncaged/workflow-protocol";
import { findThreadInHistory, loadThreadsIndex, type UwfStore } from "../store.js";
type ChainState = {
startHash: CasRef;
start: StartNodePayload;
stepsNewestFirst: StepNodePayload[];
headIsStart: boolean;
};
type OrderedStepItem = {
hash: CasRef;
payload: StepNodePayload;
timestamp: number;
};
function fail(message: string): never {
process.stderr.write(`${message}\n`);
process.exit(1);
}
function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
const headNode = uwf.store.get(headHash);
if (headNode === null) {
fail(`CAS node not found: ${headHash}`);
}
if (headNode.type === uwf.schemas.startNode) {
return {
startHash: headHash,
start: headNode.payload as StartNodePayload,
stepsNewestFirst: [],
headIsStart: true,
};
}
if (headNode.type !== uwf.schemas.stepNode) {
fail(`head ${headHash} is not a StartNode or StepNode`);
}
const stepsNewestFirst: StepNodePayload[] = [];
let hash: CasRef | null = headHash;
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null) {
fail(`CAS node not found while walking chain: ${hash}`);
}
if (node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
stepsNewestFirst.push(payload);
hash = payload.prev;
}
const newest = stepsNewestFirst[0];
if (newest === undefined) {
fail(`empty step chain at head ${headHash}`);
}
const startNode = uwf.store.get(newest.start);
if (startNode === null || startNode.type !== uwf.schemas.startNode) {
fail(`StartNode not found: ${newest.start}`);
}
return {
startHash: newest.start,
start: startNode.payload as StartNodePayload,
stepsNewestFirst,
headIsStart: false,
};
}
function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
const node = uwf.store.get(outputRef);
if (node === null) {
return {};
}
return node.payload;
}
/**
* Recursively expand all cas_ref fields in a CAS node's payload,
* replacing hash strings with the referenced node's expanded payload.
*/
function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
const seen = visited ?? new Set<string>();
if (seen.has(hash)) return hash; // cycle guard
seen.add(hash);
const node = store.get(hash);
if (node === null) return hash;
const schema = getSchema(store, node.type);
if (schema === null) return node.payload;
return expandValue(store, schema, node.payload, seen);
}
function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
if (typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
return value;
}
function expandAnyOfField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!Array.isArray(schema.anyOf)) return value;
for (const sub of schema.anyOf as JSONSchema[]) {
if (sub.format === "cas_ref" && typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
}
return value;
}
function expandArrayField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!schema.items || !Array.isArray(value)) return value;
const itemSchema = schema.items as JSONSchema;
return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
}
function expandObjectField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
return value;
}
const props = schema.properties as Record<string, JSONSchema>;
const obj = value as Record<string, unknown>;
const result: Record<string, unknown> = {};
for (const [key, val] of Object.entries(obj)) {
const propSchema = props[key];
result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
}
return result;
}
function expandValue(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
if (schema.type === "array") return expandArrayField(store, schema, value, visited);
return expandObjectField(store, schema, value, visited);
}
function collectOrderedSteps(
uwf: UwfStore,
headHash: CasRef,
chain: ChainState,
): OrderedStepItem[] {
let hash: CasRef | null = headHash;
const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null || node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
hashToNode.set(hash, { payload, timestamp: node.timestamp });
hash = payload.prev;
}
let cur: CasRef | null = chain.headIsStart ? null : headHash;
const ordered: OrderedStepItem[] = [];
while (cur !== null) {
const entry = hashToNode.get(cur);
if (entry === undefined) {
break;
}
ordered.push({ hash: cur, ...entry });
cur = entry.payload.prev;
}
ordered.reverse();
return ordered;
}
async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise<CasRef> {
const index = await loadThreadsIndex(storageRoot);
const activeHead = index[threadId];
if (activeHead !== undefined) {
return activeHead;
}
const hist = await findThreadInHistory(storageRoot, threadId);
if (hist !== null) {
return hist.head;
}
fail(`thread not found: ${threadId}`);
}
export {
type ChainState,
collectOrderedSteps,
expandAnyOfField,
expandArrayField,
expandCasRefField,
expandDeep,
expandObjectField,
expandOutput,
expandValue,
fail,
type OrderedStepItem,
resolveHeadHash,
walkChain,
};
-256
View File
@@ -1,256 +0,0 @@
import type { BootstrapCapableStore } from "@uncaged/json-cas";
import type {
CasRef,
StartEntry,
StepEntry,
StepNodePayload,
ThreadForkOutput,
ThreadId,
ThreadStepsOutput,
} from "@uncaged/workflow-protocol";
import { generateUlid } from "@uncaged/workflow-util";
import { createUwfStore, loadThreadsIndex, saveThreadsIndex } from "../store.js";
import {
collectOrderedSteps,
expandDeep,
expandOutput,
fail,
resolveHeadHash,
walkChain,
} from "./shared.js";
type TurnData = {
index: number;
content: string;
};
/**
* List all steps in a thread (previously: thread steps)
*/
export async function cmdStepList(
storageRoot: string,
threadId: ThreadId,
): Promise<ThreadStepsOutput> {
const headHash = await resolveHeadHash(storageRoot, threadId);
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
const startNode = uwf.store.get(chain.startHash);
if (startNode === null) {
fail(`StartNode not found: ${chain.startHash}`);
}
const startEntry: StartEntry = {
hash: chain.startHash,
workflow: chain.start.workflow,
prompt: chain.start.prompt,
timestamp: startNode.timestamp,
};
const stepEntries: StepEntry[] = [];
const ordered = collectOrderedSteps(uwf, headHash, chain);
for (const item of ordered) {
stepEntries.push({
hash: item.hash,
role: item.payload.role,
output: expandOutput(uwf, item.payload.output),
detail: item.payload.detail ?? null,
agent: item.payload.agent,
timestamp: item.timestamp,
});
}
return {
thread: threadId,
workflow: chain.start.workflow,
steps: [startEntry, ...stepEntries],
};
}
/**
* Show details of a specific step (previously: thread step-details)
*/
export async function cmdStepShow(storageRoot: string, stepHash: CasRef): Promise<unknown> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (!payload.detail) {
fail(`step ${stepHash} has no detail`);
}
return expandDeep(uwf.store, payload.detail);
}
/**
* Fork a thread from a specific step (previously: thread fork)
*/
export async function cmdStepFork(
storageRoot: string,
stepHash: CasRef,
): Promise<ThreadForkOutput> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StartNode or StepNode`);
}
const newThreadId = generateUlid(Date.now()) as ThreadId;
const index = await loadThreadsIndex(storageRoot);
index[newThreadId] = stepHash;
await saveThreadsIndex(storageRoot, index);
return {
thread: newThreadId,
forkedFrom: {
step: stepHash,
},
};
}
/**
* Load and validate step detail node from CAS store
*/
function loadStepDetail(store: BootstrapCapableStore, detailRef: CasRef): Record<string, unknown> {
const detailNode = store.get(detailRef);
if (detailNode === null) {
fail(`detail node not found: ${detailRef}`);
}
return detailNode.payload as Record<string, unknown>;
}
/**
* Load all turn nodes from CAS store and extract content
*/
function loadTurnData(store: BootstrapCapableStore, turns: unknown): TurnData[] {
if (!Array.isArray(turns) || turns.length === 0) {
return [];
}
const turnData: TurnData[] = [];
for (const turnRef of turns) {
if (typeof turnRef !== "string") {
continue;
}
const turnNode = store.get(turnRef as CasRef);
if (turnNode === null) {
continue;
}
const turn = turnNode.payload as Record<string, unknown>;
if (typeof turn.content === "string") {
turnData.push({
index: typeof turn.index === "number" ? turn.index : turnData.length,
content: turn.content,
});
}
}
return turnData;
}
/**
* Select turns that fit within quota, working backwards from most recent
*/
function selectTurnsForQuota(turnData: TurnData[], availableQuota: number): TurnData[] {
const selectedTurns: TurnData[] = [];
let totalChars = 0;
for (let i = turnData.length - 1; i >= 0; i--) {
const turn = turnData[i];
if (turn === undefined) continue;
const turnHeader = `## Turn ${turn.index + 1}\n\n`;
const turnBlock = turnHeader + turn.content;
const separatorCost = selectedTurns.length > 0 ? 2 : 0;
const addCost = turnBlock.length + separatorCost;
if (totalChars + addCost > availableQuota && selectedTurns.length > 0) {
break;
}
selectedTurns.unshift(turn);
totalChars += addCost;
}
return selectedTurns;
}
/**
* Assemble final markdown output from header and selected turns
*/
function formatStepMarkdown(
stepHash: CasRef,
role: string,
agent: string,
turnData: TurnData[],
selectedTurns: TurnData[],
): string {
const parts: string[] = [];
parts.push(`# Step ${stepHash}`);
parts.push("");
parts.push(`**Role:** ${role}`);
parts.push(`**Agent:** ${agent}`);
if (selectedTurns.length === 0) {
return parts.join("\n");
}
const skippedCount = turnData.length - selectedTurns.length;
if (skippedCount > 0) {
parts.push("");
parts.push(`_[Earlier turns omitted due to quota. Use --quota to increase.]_`);
}
for (const turn of selectedTurns) {
parts.push("");
parts.push(`## Turn ${turn.index + 1}`);
parts.push("");
parts.push(turn.content);
}
return parts.join("\n");
}
/**
* Read a step's agent turns as human-readable markdown with quota enforcement
*/
export async function cmdStepRead(
storageRoot: string,
stepHash: CasRef,
quota: number,
): Promise<string> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (payload.detail === null) {
return formatStepMarkdown(stepHash, payload.role, payload.agent, [], []);
}
const detail = loadStepDetail(uwf.store, payload.detail);
const turnData = loadTurnData(uwf.store, detail.turns);
if (turnData.length === 0) {
return formatStepMarkdown(stepHash, payload.role, payload.agent, [], []);
}
const headerSection = formatStepMarkdown(stepHash, payload.role, payload.agent, [], []);
const BUFFER = 200;
const availableQuota = quota - headerSection.length - BUFFER;
const selectedTurns = selectTurnsForQuota(turnData, availableQuota);
return formatStepMarkdown(stepHash, payload.role, payload.agent, turnData, selectedTurns);
}
@@ -1,23 +0,0 @@
/**
* Parse time input: ISO date (YYYY-MM-DD, YYYY-MM-DDTHH:MM:SS) or relative (7d, 24h, 30m)
* Returns Unix timestamp in milliseconds.
*/
export function parseTimeInput(input: string, nowMs: number): number {
const trimmed = input.trim();
// Relative time: 7d, 24h, 30m
const relativeMatch = /^(\d+)(d|h|m)$/.exec(trimmed);
if (relativeMatch !== null) {
const value = Number.parseInt(relativeMatch[1], 10);
const unit = relativeMatch[2];
const multiplier = unit === "d" ? 86400000 : unit === "h" ? 3600000 : 60000;
return nowMs - value * multiplier;
}
// ISO date: try parsing
const parsed = Date.parse(trimmed);
if (Number.isNaN(parsed)) {
throw new Error(`invalid time format: ${trimmed} (expected ISO date or relative like '7d')`);
}
return parsed;
}
+367 -371
View File
@@ -1,32 +1,33 @@
import { execFileSync, spawn } from "node:child_process";
import { execFileSync } from "node:child_process";
import { access, readFile } from "node:fs/promises";
import { dirname, isAbsolute, resolve as resolvePath } from "node:path";
import { validate } from "@uncaged/json-cas";
import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
import { getSchema, validate } from "@uncaged/json-cas";
import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-agent-kit";
import { evaluate } from "@uncaged/workflow-moderator";
import type {
AgentAlias,
AgentConfig,
CasRef,
ModeratorContext,
StartEntry,
StartNodePayload,
StartOutput,
StepContext,
StepEntry,
StepNodePayload,
StepOutput,
ThreadForkOutput,
ThreadId,
ThreadListItem,
ThreadsIndex,
ThreadStepsOutput,
WorkflowConfig,
WorkflowPayload,
} from "@uncaged/workflow-protocol";
import {
createProcessLogger,
extractUlidTimestamp,
generateUlid,
type ProcessLogger,
} from "@uncaged/workflow-util";
import { createProcessLogger, generateUlid, type ProcessLogger } from "@uncaged/workflow-util";
import { config as loadDotenv } from "dotenv";
import { parse } from "yaml";
import { createMarker, deleteMarker, isThreadRunning } from "../background/index.js";
import { parse, stringify } from "yaml";
import {
appendThreadHistory,
createUwfStore,
@@ -40,18 +41,9 @@ import {
type UwfStore,
} from "../store.js";
import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js";
import {
type ChainState,
collectOrderedSteps,
expandOutput,
fail,
type OrderedStepItem,
walkChain,
} from "./shared.js";
import { materializeWorkflowPayload } from "./workflow.js";
const END_ROLE = "$END";
const START_ROLE = "$START";
export const THREAD_READ_DEFAULT_QUOTA = 4000;
const PL_THREAD_START = "7HNQ4B2X";
@@ -60,13 +52,35 @@ const PL_AGENT_SPAWN = "R5J2W8N4";
const PL_AGENT_DONE = "C6P9E3H7";
const PL_THREAD_ARCHIVED = "F4D8Q2K5";
const PL_STEP_ERROR = "B8T5N1V6";
const PL_BACKGROUND_START = "X7Q4W9M2";
function failStep(plog: ProcessLogger, message: string): never {
plog.log(PL_STEP_ERROR, message, null);
fail(message);
}
type ChainState = {
startHash: CasRef;
start: StartNodePayload;
stepsNewestFirst: StepNodePayload[];
headIsStart: boolean;
};
type OrderedStepItem = {
hash: CasRef;
payload: StepNodePayload;
timestamp: number;
};
export type KillOutput = {
thread: ThreadId;
archived: boolean;
};
function fail(message: string): never {
process.stderr.write(`${message}\n`);
process.exit(1);
}
/**
* Check if a string looks like a file path (contains path separators or has .yaml/.yml extension).
*/
@@ -307,7 +321,6 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
thread: threadId,
head: activeHead,
done: false,
background: null,
};
}
@@ -318,146 +331,249 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
thread: threadId,
head: hist.head,
done: true,
background: null,
};
}
fail(`thread not found: ${threadId}`);
}
export type ThreadStatus = "idle" | "running" | "completed";
export type ThreadListItemWithStatus = ThreadListItem & {
status: ThreadStatus;
};
async function threadListItemFromActive(
storageRoot: string,
uwf: UwfStore,
threadId: ThreadId,
head: CasRef,
): Promise<ThreadListItemWithStatus | null> {
): Promise<ThreadListItem | null> {
const workflow = resolveWorkflowFromHead(uwf, head);
if (workflow === null) {
return null;
}
// Check if thread is currently running in background
const runningMarker = await isThreadRunning(storageRoot, threadId);
const status: ThreadStatus = runningMarker !== null ? "running" : "idle";
return { thread: threadId, workflow, head, status };
}
async function collectActiveThreads(
storageRoot: string,
uwf: UwfStore,
index: ThreadsIndex,
): Promise<ThreadListItemWithStatus[]> {
const items: ThreadListItemWithStatus[] = [];
for (const [threadId, head] of Object.entries(index)) {
const item = await threadListItemFromActive(
storageRoot,
uwf,
threadId as ThreadId,
head as CasRef,
);
if (item !== null) {
items.push(item);
}
}
return items;
}
async function collectCompletedThreads(
storageRoot: string,
activeIds: Set<ThreadId>,
): Promise<ThreadListItemWithStatus[]> {
const items: ThreadListItemWithStatus[] = [];
const history = await loadThreadHistory(storageRoot);
const seen = new Set<ThreadId>(); // Deduplication (issue #470)
for (const entry of history) {
if (!activeIds.has(entry.thread) && !seen.has(entry.thread)) {
seen.add(entry.thread);
items.push({
thread: entry.thread,
workflow: entry.workflow,
head: entry.head,
status: "completed",
});
}
}
return items;
}
function applyTimeFilters(
items: ThreadListItemWithStatus[],
afterMs: number | null,
beforeMs: number | null,
): ThreadListItemWithStatus[] {
if (afterMs === null && beforeMs === null) return items;
return items.filter((item) => {
const ts = extractUlidTimestamp(item.thread);
if (ts === null) return false;
if (afterMs !== null && ts <= afterMs) return false;
if (beforeMs !== null && ts >= beforeMs) return false;
return true;
});
}
function sortByNewestFirst(items: ThreadListItemWithStatus[]): ThreadListItemWithStatus[] {
return items.sort((a, b) => {
const tsA = extractUlidTimestamp(a.thread) ?? 0;
const tsB = extractUlidTimestamp(b.thread) ?? 0;
return tsB - tsA;
});
}
function applyPagination(
items: ThreadListItemWithStatus[],
skip: number | null,
take: number | null,
): ThreadListItemWithStatus[] {
const skipCount = skip ?? 0;
const takeCount = take ?? items.length;
return items.slice(skipCount, skipCount + takeCount);
return { thread: threadId, workflow, head };
}
export async function cmdThreadList(
storageRoot: string,
statusFilter: ThreadStatus[] | null,
afterMs: number | null,
beforeMs: number | null,
skip: number | null,
take: number | null,
): Promise<ThreadListItemWithStatus[]> {
includeAll: boolean,
): Promise<ThreadListItem[]> {
const uwf = await createUwfStore(storageRoot);
const index = await loadThreadsIndex(storageRoot);
const items: ThreadListItem[] = [];
// Collect active threads
let items = await collectActiveThreads(storageRoot, uwf, index);
// Collect completed threads (if relevant for status filter)
const includeCompleted = statusFilter === null || statusFilter.includes("completed");
if (includeCompleted) {
const activeIds = new Set(items.map((i) => i.thread));
const completedItems = await collectCompletedThreads(storageRoot, activeIds);
items = items.concat(completedItems);
for (const [threadId, head] of Object.entries(index)) {
const item = await threadListItemFromActive(uwf, threadId as ThreadId, head);
if (item !== null) {
items.push(item);
}
}
// Apply status filter
if (statusFilter !== null) {
items = items.filter((item) => statusFilter.includes(item.status));
if (!includeAll) {
return items;
}
// Apply time range filters
items = applyTimeFilters(items, afterMs, beforeMs);
const activeIds = new Set(items.map((i) => i.thread));
const history = await loadThreadHistory(storageRoot);
for (const entry of history) {
if (!activeIds.has(entry.thread)) {
items.push({
thread: entry.thread,
workflow: entry.workflow,
head: entry.head,
});
}
}
// Sort by timestamp descending (newest first)
items = sortByNewestFirst(items);
return items;
}
// Apply pagination
return applyPagination(items, skip, take);
function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
const headNode = uwf.store.get(headHash);
if (headNode === null) {
fail(`CAS node not found: ${headHash}`);
}
if (headNode.type === uwf.schemas.startNode) {
return {
startHash: headHash,
start: headNode.payload as StartNodePayload,
stepsNewestFirst: [],
headIsStart: true,
};
}
if (headNode.type !== uwf.schemas.stepNode) {
fail(`head ${headHash} is not a StartNode or StepNode`);
}
const stepsNewestFirst: StepNodePayload[] = [];
let hash: CasRef | null = headHash;
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null) {
fail(`CAS node not found while walking chain: ${hash}`);
}
if (node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
stepsNewestFirst.push(payload);
hash = payload.prev;
}
const newest = stepsNewestFirst[0];
if (newest === undefined) {
fail(`empty step chain at head ${headHash}`);
}
const startNode = uwf.store.get(newest.start);
if (startNode === null || startNode.type !== uwf.schemas.startNode) {
fail(`StartNode not found: ${newest.start}`);
}
return {
startHash: newest.start,
start: startNode.payload as StartNodePayload,
stepsNewestFirst,
headIsStart: false,
};
}
function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
const node = uwf.store.get(outputRef);
if (node === null) {
return {};
}
return node.payload;
}
/**
* Recursively expand all cas_ref fields in a CAS node's payload,
* replacing hash strings with the referenced node's expanded payload.
*/
function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
const seen = visited ?? new Set<string>();
if (seen.has(hash)) return hash; // cycle guard
seen.add(hash);
const node = store.get(hash);
if (node === null) return hash;
const schema = getSchema(store, node.type);
if (schema === null) return node.payload;
return expandValue(store, schema, node.payload, seen);
}
function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
if (typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
return value;
}
function expandAnyOfField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!Array.isArray(schema.anyOf)) return value;
for (const sub of schema.anyOf as JSONSchema[]) {
if (sub.format === "cas_ref" && typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
}
return value;
}
function expandArrayField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!schema.items || !Array.isArray(value)) return value;
const itemSchema = schema.items as JSONSchema;
return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
}
function expandObjectField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
return value;
}
const props = schema.properties as Record<string, JSONSchema>;
const obj = value as Record<string, unknown>;
const result: Record<string, unknown> = {};
for (const [key, val] of Object.entries(obj)) {
const propSchema = props[key];
result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
}
return result;
}
function expandValue(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
if (schema.type === "array") return expandArrayField(store, schema, value, visited);
return expandObjectField(store, schema, value, visited);
}
function collectOrderedSteps(
uwf: UwfStore,
headHash: CasRef,
chain: ChainState,
): OrderedStepItem[] {
let hash: CasRef | null = headHash;
const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null || node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
hashToNode.set(hash, { payload, timestamp: node.timestamp });
hash = payload.prev;
}
let cur: CasRef | null = chain.headIsStart ? null : headHash;
const ordered: OrderedStepItem[] = [];
while (cur !== null) {
const entry = hashToNode.get(cur);
if (entry === undefined) {
break;
}
ordered.push({ hash: cur, ...entry });
cur = entry.payload.prev;
}
ordered.reverse();
return ordered;
}
function formatYaml(value: unknown): string {
return stringify(value, { aliasDuplicateObjects: false }).trimEnd();
}
function formatCompactStep(index: number, item: OrderedStepItem, outputYaml: string): string {
return [
`## Step ${index}: ${item.payload.role}`,
"",
`- **Hash:** \`${item.hash}\``,
`- **Agent:** ${item.payload.agent}`,
"",
"### Output",
"",
"```yaml",
outputYaml,
"```",
].join("\n");
}
export function extractLastAssistantContent(uwf: UwfStore, detailRef: CasRef): string | null {
@@ -503,60 +619,22 @@ function sliceBeforeHash(
return candidates.slice(0, idx);
}
function calculateFormattedStepLength(
stepNum: number,
item: OrderedStepItem,
uwf: UwfStore,
workflow: WorkflowPayload,
): number {
// Calculate using the same format as formatStepHeader, formatStepPrompt, formatStepContent
// Use a temporary set to avoid mutating the actual shownPromptRoles during calculation
const tempShownRoles = new Set<string>();
const header = formatStepHeader(stepNum, item);
const roleDef = workflow.roles[item.payload.role];
const prompt = formatStepPrompt(roleDef, item.payload.role, tempShownRoles);
const content = formatStepContent(uwf, item);
const stepBlock = [header, prompt, content].filter((s) => s !== "").join("");
// Don't add separator here - it will be counted when we know the final structure
return stepBlock.length;
}
function selectByQuota(
candidates: OrderedStepItem[],
uwf: UwfStore,
workflow: WorkflowPayload,
quota: number,
startSectionLength: number,
): { selected: OrderedStepItem[]; skippedCount: number } {
const selected: OrderedStepItem[] = [];
// Start with start section length
let totalChars = startSectionLength;
let totalChars = 0;
for (let i = candidates.length - 1; i >= 0; i--) {
const item = candidates[i];
if (item === undefined) continue;
// Calculate the actual formatted length using the same format as final output
const blockLen = calculateFormattedStepLength(i + 1, item, uwf, workflow);
// Calculate cost of adding this step:
// - blockLen: the step content
// - 6: separator before this step (if there are already parts)
const separatorCost = totalChars > 0 || selected.length > 0 ? 6 : 0;
const addCost = blockLen + separatorCost;
// Check quota BEFORE adding - but always include at least one step
if (totalChars + addCost > quota && selected.length > 0) {
break;
}
const outputYaml = formatYaml(expandOutput(uwf, item.payload.output));
const blockLen = formatCompactStep(i + 1, item, outputYaml).length;
selected.unshift(item);
totalChars += addCost;
totalChars += blockLen;
if (totalChars > quota) break;
}
return { selected, skippedCount: candidates.length - selected.length };
}
@@ -578,14 +656,14 @@ function formatStepPrompt(
): string {
if (!roleDef || shownPromptRoles.has(role)) return "";
shownPromptRoles.add(role);
return ["", "", "<prompt>", roleDef.goal, "</prompt>"].join("\n");
return ["", "", "### Prompt", "", roleDef.goal].join("\n");
}
function formatStepContent(uwf: UwfStore, item: OrderedStepItem): string {
if (!item.payload.detail) return "";
const content = extractLastAssistantContent(uwf, item.payload.detail);
if (content === null) return "";
return ["", "", "<output>", content, "</output>"].join("\n");
return ["", "", "### Content", "", content].join("\n");
}
function formatStartSection(options: {
@@ -623,21 +701,11 @@ function formatThreadReadMarkdown(options: {
const { ordered, uwf, workflow, quota, before } = options;
const candidates = before !== null ? sliceBeforeHash(ordered, before, options.threadId) : ordered;
// Calculate start section length for quota accounting
const startSection = formatStartSection(options);
const startSectionLength = startSection !== "" ? startSection.length : 0;
const { selected, skippedCount } = selectByQuota(
candidates,
uwf,
workflow,
quota,
startSectionLength,
);
const { selected, skippedCount } = selectByQuota(candidates, uwf, quota);
const parts: string[] = [];
const startSection = formatStartSection(options);
if (startSection !== "") parts.push(startSection);
if (skippedCount > 0 && selected.length > 0) {
@@ -669,32 +737,16 @@ function formatThreadReadMarkdown(options: {
return parts.join("\n\n---\n\n");
}
type EvaluateLastOutput = Record<string, unknown> & { status: string };
function resolveEvaluateArgs(
uwf: UwfStore,
chain: ChainState,
): { lastRole: string; lastOutput: EvaluateLastOutput } {
if (chain.headIsStart) {
return { lastRole: START_ROLE, lastOutput: { status: "_" } };
}
const lastStep = chain.stepsNewestFirst[0];
if (lastStep === undefined) {
fail("empty step chain");
}
const raw = expandOutput(uwf, lastStep.output);
const base =
typeof raw === "object" && raw !== null && !Array.isArray(raw)
? (raw as Record<string, unknown>)
: {};
const status = typeof base.status === "string" ? base.status : "_";
return {
lastRole: lastStep.role,
lastOutput: { ...base, status },
};
function buildModeratorContext(uwf: UwfStore, chain: ChainState): ModeratorContext {
const chronological = [...chain.stepsNewestFirst].reverse();
const steps: StepContext[] = chronological.map((step) => ({
role: step.role,
output: expandOutput(uwf, step.output),
detail: step.detail,
agent: step.agent,
edgePrompt: step.edgePrompt ?? "",
}));
return { start: chain.start, steps };
}
function loadWorkflowPayload(uwf: UwfStore, workflowRef: CasRef): WorkflowPayload {
@@ -752,11 +804,13 @@ function spawnAgent(
role: string,
edgePrompt: string,
): CasRef {
const argv = [...agent.args, "--thread", threadId, "--role", role, "--prompt", edgePrompt];
const argv = [...agent.args, threadId, role];
const env = { ...process.env, UWF_EDGE_PROMPT: edgePrompt };
let stdout: string;
try {
stdout = execFileSync(agent.command, argv, {
encoding: "utf8",
env,
stdio: ["ignore", "pipe", "pipe"],
maxBuffer: 50 * 1024 * 1024, // 50 MB — stream-json output can be large
});
@@ -796,65 +850,31 @@ async function archiveThread(
});
}
export async function cmdThreadExec(
export async function cmdThreadStep(
storageRoot: string,
threadId: ThreadId,
agentOverride: string | null,
count: number,
background: boolean,
backgroundWorker: boolean,
): Promise<StepOutput[]> {
if (count < 1 || !Number.isInteger(count)) {
fail(`--count must be a positive integer, got: ${count}`);
}
// Check if thread is already running in background (unless we ARE the background worker)
if (!backgroundWorker) {
const runningMarker = await isThreadRunning(storageRoot, threadId);
if (runningMarker !== null) {
fail(`thread already executing in background (PID: ${runningMarker.pid})`);
}
}
const workflowHash = await resolveActiveThreadWorkflowHash(storageRoot, threadId);
const plog = createProcessLogger({
storageRoot,
context: { thread: threadId, workflow: workflowHash },
});
if (background && !backgroundWorker) {
// Spawn background process
return cmdThreadStepBackground(storageRoot, threadId, agentOverride, count, plog, workflowHash);
}
// If we're the background worker, create marker before execution
let markerCreated = false;
if (backgroundWorker) {
await createMarker(storageRoot, {
thread: threadId,
workflow: workflowHash,
pid: process.pid,
startedAt: Date.now(),
});
markerCreated = true;
}
try {
const results: StepOutput[] = [];
for (let i = 0; i < count; i++) {
const result = await cmdThreadStepOnce(storageRoot, threadId, agentOverride, plog);
results.push(result);
if (result.done) {
break;
}
}
return results;
} finally {
// Cleanup marker if we created one
if (markerCreated) {
await deleteMarker(storageRoot, threadId);
const results: StepOutput[] = [];
for (let i = 0; i < count; i++) {
const result = await cmdThreadStepOnce(storageRoot, threadId, agentOverride, plog);
results.push(result);
if (result.done) {
break;
}
}
return results;
}
async function resolveActiveThreadWorkflowHash(
@@ -871,57 +891,6 @@ async function resolveActiveThreadWorkflowHash(
return chain.start.workflow;
}
async function cmdThreadStepBackground(
storageRoot: string,
threadId: ThreadId,
agentOverride: string | null,
count: number,
plog: ProcessLogger,
workflowHash: CasRef,
): Promise<StepOutput[]> {
// Get current head to return to caller
const index = await loadThreadsIndex(storageRoot);
const headHash = index[threadId];
if (headHash === undefined) {
failStep(plog, `thread not active: ${threadId}`);
}
// Spawn detached background process
const scriptPath = process.argv[1];
if (scriptPath === undefined) {
failStep(plog, "unable to determine script path for background execution");
}
const args = ["thread", "exec", threadId, "--count", String(count)];
if (agentOverride !== null) {
args.push("--agent", agentOverride);
}
// Internal flag to signal the background worker to create/cleanup markers
args.push("--_background-worker");
plog.log(PL_BACKGROUND_START, `spawning background process count=${count}`, null);
const child = spawn(scriptPath, args, {
detached: true,
stdio: "ignore",
});
child.unref();
// Return immediately with current state and background flag
return [
{
workflow: workflowHash,
thread: threadId,
head: headHash,
done: false,
background: true,
},
];
}
async function cmdThreadStepOnce(
storageRoot: string,
threadId: ThreadId,
@@ -938,9 +907,9 @@ async function cmdThreadStepOnce(
const chain = walkChain(uwf, headHash);
const workflowHash = chain.start.workflow;
const workflow = loadWorkflowPayload(uwf, workflowHash);
const { lastRole, lastOutput } = resolveEvaluateArgs(uwf, chain);
const context = buildModeratorContext(uwf, chain);
const nextResult = evaluate(workflow.graph, lastRole, lastOutput);
const nextResult = await evaluate(workflow, context);
if (!nextResult.ok) {
failStep(plog, `moderator evaluate failed: ${nextResult.error.message}`);
}
@@ -959,7 +928,6 @@ async function cmdThreadStepOnce(
thread: threadId,
head: headHash,
done: true,
background: null,
};
}
@@ -990,11 +958,8 @@ async function cmdThreadStepOnce(
await saveThreadsIndex(storageRoot, freshIndex);
const chainAfter = walkChain(uwfAfter, newHead);
const { lastRole: lastRoleAfter, lastOutput: lastOutputAfter } = resolveEvaluateArgs(
uwfAfter,
chainAfter,
);
const afterResult = evaluate(workflow.graph, lastRoleAfter, lastOutputAfter);
const contextAfter = buildModeratorContext(uwfAfter, chainAfter);
const afterResult = await evaluate(workflow, contextAfter);
if (!afterResult.ok) {
failStep(plog, `post-step moderator evaluate failed: ${afterResult.error.message}`);
}
@@ -1010,7 +975,6 @@ async function cmdThreadStepOnce(
thread: threadId,
head: newHead,
done,
background: null,
};
}
@@ -1027,6 +991,47 @@ async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise
fail(`thread not found: ${threadId}`);
}
export async function cmdThreadSteps(
storageRoot: string,
threadId: ThreadId,
): Promise<ThreadStepsOutput> {
const headHash = await resolveHeadHash(storageRoot, threadId);
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
const startNode = uwf.store.get(chain.startHash);
if (startNode === null) {
fail(`StartNode not found: ${chain.startHash}`);
}
const startEntry: StartEntry = {
hash: chain.startHash,
workflow: chain.start.workflow,
prompt: chain.start.prompt,
timestamp: startNode.timestamp,
};
const stepEntries: StepEntry[] = [];
const ordered = collectOrderedSteps(uwf, headHash, chain);
for (const item of ordered) {
stepEntries.push({
hash: item.hash,
role: item.payload.role,
output: expandOutput(uwf, item.payload.output),
detail: item.payload.detail,
agent: item.payload.agent,
timestamp: item.timestamp,
});
}
return {
thread: threadId,
workflow: chain.start.workflow,
steps: [startEntry, ...stepEntries],
};
}
export async function cmdThreadRead(
storageRoot: string,
threadId: ThreadId,
@@ -1054,67 +1059,58 @@ export async function cmdThreadRead(
});
}
export type StopOutput = {
thread: ThreadId;
stopped: boolean;
};
export async function cmdThreadFork(
storageRoot: string,
stepHash: CasRef,
): Promise<ThreadForkOutput> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StartNode or StepNode`);
}
export type CancelOutput = {
thread: ThreadId;
cancelled: boolean;
};
/**
* Stop background execution of a thread (but keep thread active)
*/
export async function cmdThreadStop(storageRoot: string, threadId: ThreadId): Promise<StopOutput> {
const newThreadId = generateUlid(Date.now()) as ThreadId;
const index = await loadThreadsIndex(storageRoot);
const head = index[threadId];
if (head === undefined) {
fail(`thread not active: ${threadId}`);
}
index[newThreadId] = stepHash;
await saveThreadsIndex(storageRoot, index);
// Check if thread is running in background and terminate it
const runningMarker = await isThreadRunning(storageRoot, threadId);
if (runningMarker === null) {
process.stderr.write(`Warning: thread ${threadId} is not currently running\n`);
return { thread: threadId, stopped: false };
}
try {
process.kill(runningMarker.pid, "SIGTERM");
} catch {
// Process may have already exited, ignore error
}
await deleteMarker(storageRoot, threadId);
return { thread: threadId, stopped: true };
return {
thread: newThreadId,
forkedFrom: {
step: stepHash,
},
};
}
/**
* Cancel a thread (stop execution + move to history)
*/
export async function cmdThreadCancel(
export async function cmdThreadStepDetails(
storageRoot: string,
threadId: ThreadId,
): Promise<CancelOutput> {
stepHash: CasRef,
): Promise<unknown> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (!payload.detail) {
fail(`step ${stepHash} has no detail`);
}
return expandDeep(uwf.store, payload.detail);
}
export async function cmdThreadKill(storageRoot: string, threadId: ThreadId): Promise<KillOutput> {
const index = await loadThreadsIndex(storageRoot);
const head = index[threadId];
if (head === undefined) {
fail(`thread not active: ${threadId}`);
}
// Check if thread is running in background and terminate it
const runningMarker = await isThreadRunning(storageRoot, threadId);
if (runningMarker !== null) {
try {
process.kill(runningMarker.pid, "SIGTERM");
} catch {
// Process may have already exited, ignore error
}
await deleteMarker(storageRoot, threadId);
}
const uwf = await createUwfStore(storageRoot);
const workflow = resolveWorkflowFromHead(uwf, head);
if (workflow === null) {
@@ -1132,5 +1128,5 @@ export async function cmdThreadCancel(
};
await appendThreadHistory(storageRoot, historyEntry);
return { thread: threadId, cancelled: true };
return { thread: threadId, archived: true };
}
+22 -19
View File
@@ -2,7 +2,12 @@ import { readFile } from "node:fs/promises";
import type { JSONSchema } from "@uncaged/json-cas";
import { putSchema, validate } from "@uncaged/json-cas";
import type { CasRef, RoleDefinition, Target, WorkflowPayload } from "@uncaged/workflow-protocol";
import type {
CasRef,
RoleDefinition,
Transition,
WorkflowPayload,
} from "@uncaged/workflow-protocol";
import { parse } from "yaml";
import {
@@ -24,7 +29,7 @@ export type WorkflowListEntry = {
origin: WorkflowOrigin;
};
export type WorkflowAddOutput = {
export type WorkflowPutOutput = {
name: string;
hash: CasRef;
};
@@ -46,23 +51,20 @@ function isJsonSchema(value: unknown): value is JSONSchema {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
/** Normalize graph: validate each status → target mapping. */
function normalizeGraph(
graph: Record<string, Record<string, Target>>,
): Record<string, Record<string, Target>> {
const result: Record<string, Record<string, Target>> = {};
for (const [node, statusMap] of Object.entries(graph)) {
const normalized: Record<string, Target> = {};
for (const [status, target] of Object.entries(statusMap)) {
if (typeof target.prompt !== "string" || target.prompt.trim() === "") {
fail(`graph[${node}][${status}] → "${target.role}": prompt is required (non-empty string)`);
/** Normalize graph transitions: ensure condition is null (not undefined) for fallback entries. */
function normalizeGraph(graph: Record<string, Transition[]>): Record<string, Transition[]> {
const result: Record<string, Transition[]> = {};
for (const [node, transitions] of Object.entries(graph)) {
result[node] = transitions.map((t) => {
if (typeof t.prompt !== "string" || t.prompt.trim() === "") {
fail(`graph[${node}] transition to "${t.role}": prompt is required (non-empty string)`);
}
normalized[status] = {
role: target.role,
prompt: target.prompt,
return {
role: t.role,
condition: t.condition ?? null,
prompt: t.prompt,
};
}
result[node] = normalized;
});
}
return result;
}
@@ -104,14 +106,15 @@ export async function materializeWorkflowPayload(
name: raw.name,
description: raw.description,
roles,
conditions: raw.conditions,
graph: normalizeGraph(raw.graph),
};
}
export async function cmdWorkflowAdd(
export async function cmdWorkflowPut(
storageRoot: string,
filePath: string,
): Promise<WorkflowAddOutput> {
): Promise<WorkflowPutOutput> {
let text: string;
try {
text = await readFile(filePath, "utf8");
+19 -4
View File
@@ -30,12 +30,23 @@ function isRoleDefinition(value: unknown): boolean {
);
}
function isTarget(value: unknown): boolean {
function isConditionDefinition(value: unknown): boolean {
if (!isRecord(value)) {
return false;
}
return typeof value.description === "string" && typeof value.expression === "string";
}
function isTransition(value: unknown): boolean {
if (!isRecord(value)) {
return false;
}
const condition = value.condition;
return (
typeof value.role === "string" && typeof value.prompt === "string" && value.prompt.trim() !== ""
typeof value.role === "string" &&
typeof value.prompt === "string" &&
value.prompt.trim() !== "" &&
(condition === null || condition === undefined || typeof condition === "string")
);
}
@@ -51,7 +62,7 @@ function isGraph(value: unknown): boolean {
return false;
}
return Object.values(value).every(
(statusMap) => isRecord(statusMap) && Object.values(statusMap).every((t) => isTarget(t)),
(transitions) => Array.isArray(transitions) && transitions.every((t) => isTransition(t)),
);
}
@@ -90,7 +101,11 @@ export function parseWorkflowPayload(raw: unknown): WorkflowPayload | null {
if (typeof raw.name !== "string" || typeof raw.description !== "string") {
return null;
}
if (!isStringRecord(raw.roles, isRoleDefinition) || !isGraph(raw.graph)) {
if (
!isStringRecord(raw.roles, isRoleDefinition) ||
!isStringRecord(raw.conditions, isConditionDefinition) ||
!isGraph(raw.graph)
) {
return null;
}
return raw as WorkflowPayload;
@@ -19,14 +19,7 @@ mock.module("../src/tools/index.js", () => ({
getBuiltinTools: () => [],
}));
import {
executeTurnTools,
extractFinalText,
runBuiltinLoop,
shouldInjectDeadlineWarning,
shouldNudge,
shouldProcessToolCalls,
} from "../src/loop.js";
import { executeTurnTools, runBuiltinLoop, shouldNudge } from "../src/loop.js";
const fakeProvider = {} as any;
const fakeToolCtx = {} as any;
@@ -161,96 +154,3 @@ describe("runBuiltinLoop integration", () => {
expect(original.length).toBe(1);
});
});
describe("shouldInjectDeadlineWarning", () => {
test("5.1 returns true when turn count reaches warning threshold and not yet warned", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
});
test("5.2 returns false when already warned", () => {
expect(shouldInjectDeadlineWarning(7, 10, true, false)).toBe(false);
});
test("5.3 returns false when noTools is true", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, true)).toBe(false);
});
test("5.4 returns false when turns remaining > DEADLINE_WARNING_TURNS", () => {
expect(shouldInjectDeadlineWarning(5, 10, false, false)).toBe(false);
});
test("5.5 returns true when exactly at warning threshold", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
});
test("5.6 returns false when turns remaining is 0", () => {
expect(shouldInjectDeadlineWarning(10, 10, false, false)).toBe(false);
});
});
describe("shouldProcessToolCalls", () => {
test("6.1 returns true when toolCalls present and noTools=false", () => {
expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], false)).toBe(true);
});
test("6.2 returns false when toolCalls is null", () => {
expect(shouldProcessToolCalls(null, false)).toBe(false);
});
test("6.3 returns false when toolCalls is empty array", () => {
expect(shouldProcessToolCalls([], false)).toBe(false);
});
test("6.4 returns false when noTools=true", () => {
expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], true)).toBe(false);
});
test("6.5 returns true when multiple tool calls present", () => {
expect(
shouldProcessToolCalls(
[
{ id: "x1", name: "read", arguments: "{}" },
{ id: "x2", name: "write", arguments: "{}" },
],
false,
),
).toBe(true);
});
});
describe("extractFinalText", () => {
test("7.1 returns last assistant message content", () => {
const messages = [
{ role: "system" as const, content: "sys", tool_calls: null },
{ role: "assistant" as const, content: "first", tool_calls: null },
{ role: "assistant" as const, content: "last", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("last");
});
test("7.2 returns empty string when no assistant messages", () => {
expect(extractFinalText([{ role: "system" as const, content: "sys", tool_calls: null }])).toBe(
"",
);
});
test("7.3 skips assistant messages with null content", () => {
const messages = [
{ role: "assistant" as const, content: "first", tool_calls: null },
{
role: "assistant" as const,
content: null,
tool_calls: [{ id: "x", name: "t", arguments: "{}" }],
},
{ role: "assistant" as const, content: "second", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("second");
});
test("7.4 skips assistant messages with empty content", () => {
const messages = [
{ role: "assistant" as const, content: "first", tool_calls: null },
{ role: "assistant" as const, content: "", tool_calls: null },
{ role: "user" as const, content: "nudge", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("first");
});
test("7.5 handles empty messages array", () => {
expect(extractFinalText([])).toBe("");
});
test("7.6 handles messages with only user and system roles", () => {
const messages = [
{ role: "system" as const, content: "sys", tool_calls: null },
{ role: "user" as const, content: "query", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("");
});
});
+82 -191
View File
@@ -1,12 +1,7 @@
import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit";
import { createLogger } from "@uncaged/workflow-util";
import {
type ChatMessage,
chatCompletionWithTools,
type LlmToolCall,
type OpenAiToolDefinition,
} from "./llm/index.js";
import { type ChatMessage, chatCompletionWithTools, type LlmToolCall } from "./llm/index.js";
import { appendSessionTurn } from "./session.js";
import {
builtinToolsToOpenAi,
@@ -85,184 +80,10 @@ export type ShouldNudgeOptions = {
const MAX_NUDGES = 3;
const DEADLINE_WARNING_TURNS = 3;
export function shouldInjectDeadlineWarning(
turn: number,
maxTurns: number,
alreadyWarned: boolean,
noTools: boolean,
): boolean {
const turnsRemaining = maxTurns - turn;
return (
!noTools && !alreadyWarned && turnsRemaining > 0 && turnsRemaining <= DEADLINE_WARNING_TURNS
);
}
export function shouldProcessToolCalls(toolCalls: LlmToolCall[] | null, noTools: boolean): boolean {
return !noTools && toolCalls !== null && toolCalls.length > 0;
}
export function extractFinalText(messages: ChatMessage[]): string {
for (let i = messages.length - 1; i >= 0; i--) {
const msg = messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
return msg.content;
}
}
return "";
}
function injectDeadlineWarning(messages: ChatMessage[], turnsRemaining: number): void {
log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`);
messages.push({
role: "user",
content:
`⚠️ You have ${turnsRemaining} turns remaining. ` +
"Wrap up your work and output the YAML frontmatter starting with `---`. " +
"If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
});
}
type HandleTextOnlyTurnResult = {
shouldBreak: boolean;
finalText: string;
turnCount: number;
nudgeCount: number;
turnAdjustment: number;
};
async function handleTextOnlyTurn(
text: string,
messages: ChatMessage[],
storageRoot: string,
sessionId: string,
noTools: boolean,
turn: number,
maxTurns: number,
currentNudgeCount: number,
): Promise<HandleTextOnlyTurnResult> {
await appendTurn(storageRoot, sessionId, {
role: "assistant",
content: text,
toolCalls: null,
reasoning: null,
});
const turnCount = 1;
let nudgeCount = currentNudgeCount;
let turnAdjustment = 0;
if (shouldNudge({ noTools, text, turn, maxTurns })) {
nudgeCount += 1;
log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
const nudge =
"You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
"Either continue using tools to complete your work, or output your final response starting with `---`.";
messages.push({ role: "user", content: nudge });
// Nudge doesn't consume turn budget (up to MAX_NUDGES)
if (nudgeCount <= MAX_NUDGES) {
turnAdjustment = -1;
}
return { shouldBreak: false, finalText: "", turnCount, nudgeCount, turnAdjustment };
}
return { shouldBreak: true, finalText: text, turnCount, nudgeCount, turnAdjustment };
}
async function handleToolCallTurn(
content: string,
toolCalls: LlmToolCall[],
messages: ChatMessage[],
storageRoot: string,
sessionId: string,
toolCtx: ToolContext,
): Promise<number> {
await appendTurn(storageRoot, sessionId, {
role: "assistant",
content,
toolCalls: mapToolCallsForPayload(toolCalls),
reasoning: null,
});
let turnCount = 1;
// Execute tools
turnCount += await executeTurnTools(toolCalls, toolCtx, messages, storageRoot, sessionId);
return turnCount;
}
export function shouldNudge({ noTools, text, turn, maxTurns }: ShouldNudgeOptions): boolean {
return !noTools && !text.trimStart().startsWith("---") && turn < maxTurns - 1;
}
type ProcessLoopIterationResult = {
shouldBreak: boolean;
finalText: string;
turnCount: number;
nudgeCount: number;
turnAdjustment: number;
};
async function processLoopIteration(
options: RunBuiltinLoopOptions,
messages: ChatMessage[],
openAiTools: OpenAiToolDefinition[],
turn: number,
nudgeCount: number,
): Promise<ProcessLoopIterationResult> {
const response = await chatCompletionWithTools(
options.provider,
messages,
openAiTools.length > 0 ? openAiTools : null,
);
// When noTools is set, ignore any tool_calls the LLM might still return
const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null);
const assistantMessage: ChatMessage = {
role: "assistant",
content: response.content,
tool_calls: effectiveToolCalls,
};
messages.push(assistantMessage);
if (!shouldProcessToolCalls(effectiveToolCalls, options.noTools)) {
const text = response.content ?? "";
const result = await handleTextOnlyTurn(
text,
messages,
options.storageRoot,
options.sessionId,
options.noTools,
turn,
options.maxTurns,
nudgeCount,
);
return result;
}
// At this point, effectiveToolCalls is guaranteed to be non-null and non-empty
const turnCount = await handleToolCallTurn(
response.content ?? "",
effectiveToolCalls as LlmToolCall[],
messages,
options.storageRoot,
options.sessionId,
options.toolCtx,
);
return {
shouldBreak: false,
finalText: "",
turnCount,
nudgeCount,
turnAdjustment: 0,
};
}
/** Agent run loop: LLM ↔ tools until no tool_calls or maxTurns. */
export async function runBuiltinLoop(
options: RunBuiltinLoopOptions,
@@ -278,25 +99,95 @@ export async function runBuiltinLoop(
log("8K2M4N7P", `builtin loop turn ${turn + 1}/${options.maxTurns}`);
// Warn agent when approaching turn limit
if (shouldInjectDeadlineWarning(turn, options.maxTurns, deadlineWarned, options.noTools)) {
const turnsRemaining = options.maxTurns - turn;
if (!options.noTools && !deadlineWarned && turnsRemaining <= DEADLINE_WARNING_TURNS) {
deadlineWarned = true;
const turnsRemaining = options.maxTurns - turn;
injectDeadlineWarning(messages, turnsRemaining);
log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`);
messages.push({
role: "user",
content:
`⚠️ You have ${turnsRemaining} turns remaining. ` +
"Wrap up your work and output the YAML frontmatter starting with `---`. " +
"If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
});
}
const result = await processLoopIteration(options, messages, openAiTools, turn, nudgeCount);
turnCount += result.turnCount;
nudgeCount = result.nudgeCount;
turn += result.turnAdjustment;
const response = await chatCompletionWithTools(
options.provider,
messages,
openAiTools.length > 0 ? openAiTools : null,
);
if (result.shouldBreak) {
finalText = result.finalText;
// When noTools is set, ignore any tool_calls the LLM might still return
const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null);
const assistantMessage: ChatMessage = {
role: "assistant",
content: response.content,
tool_calls: effectiveToolCalls,
};
messages.push(assistantMessage);
if (effectiveToolCalls === null || effectiveToolCalls.length === 0) {
const text = response.content ?? "";
await appendTurn(options.storageRoot, options.sessionId, {
role: "assistant",
content: text,
toolCalls: null,
reasoning: null,
});
turnCount += 1;
if (shouldNudge({ noTools: options.noTools, text, turn, maxTurns: options.maxTurns })) {
nudgeCount += 1;
log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
const nudge =
"You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
"Either continue using tools to complete your work, or output your final response starting with `---`.";
messages.push({ role: "user", content: nudge });
// Nudge doesn't consume turn budget (up to MAX_NUDGES)
if (nudgeCount <= MAX_NUDGES) {
turn -= 1;
}
continue;
}
finalText = text;
break;
}
// Assistant turn with tool calls
await appendTurn(options.storageRoot, options.sessionId, {
role: "assistant",
content: response.content ?? "",
toolCalls: mapToolCallsForPayload(effectiveToolCalls),
reasoning: null,
});
turnCount += 1;
// Execute tools
turnCount += await executeTurnTools(
effectiveToolCalls,
options.toolCtx,
messages,
options.storageRoot,
options.sessionId,
);
}
if (finalText === "") {
finalText = extractFinalText(messages);
if (finalText === "" && messages.length > 0) {
for (let i = messages.length - 1; i >= 0; i--) {
const msg = messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
finalText = msg.content;
break;
}
}
}
return { finalText, messages, turnCount };
@@ -39,7 +39,7 @@ describe("buildClaudeCodePrompt", () => {
expect(result).toContain("## Task\nFix the bug");
});
test("includes previous steps with content on first visit", () => {
test("includes previous steps as history summary", () => {
const ctx = makeCtx({
steps: [
{
@@ -48,50 +48,18 @@ describe("buildClaudeCodePrompt", () => {
agent: "hermes",
detail: "detail-1",
edgePrompt: "Create a plan.",
content: "Here is my detailed plan for doing X.",
},
],
});
const result = buildClaudeCodePrompt(ctx);
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("## Previous Steps");
expect(result).toContain("Step 1: planner");
expect(result).toContain("do X");
// First visit should include step content
expect(result).toContain("Here is my detailed plan for doing X.");
});
test("re-entry shows steps since last visit without content", () => {
const ctx = makeCtx({
isFirstVisit: false,
steps: [
{
role: "developer",
output: '{"status":"done"}',
agent: "claude-code",
detail: "detail-1",
edgePrompt: "Implement.",
content: "I implemented everything.",
},
{
role: "reviewer",
output: '{"approved":false}',
agent: "claude-code",
detail: "detail-2",
edgePrompt: "Review.",
content: "Rejected: complexity too high, refactor cmdStepRead.",
},
],
});
const result = buildClaudeCodePrompt(ctx);
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("reviewer");
expect(result).toContain("approved");
});
test("omits history section when steps array is empty", () => {
const result = buildClaudeCodePrompt(makeCtx({ steps: [] }));
expect(result).not.toContain("## What Happened Since Your Last Turn");
expect(result).toContain("## Current Instruction");
expect(result).not.toContain("## Previous Steps");
});
test("works without outputFormatInstruction", () => {
@@ -154,99 +154,6 @@ describe("parseClaudeCodeStreamOutput", () => {
});
});
describe("parseClaudeCodeStreamOutput — helper extraction", () => {
test("processSystemLine sets model from system message", () => {
const lines = [
JSON.stringify({ type: "system", model: "claude-opus-4" }),
JSON.stringify({
type: "result",
subtype: "success",
result: "ok",
session_id: "s1",
num_turns: 0,
total_cost_usd: 0,
duration_ms: 0,
stop_reason: "end_turn",
}),
];
const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
expect(parsed).not.toBeNull();
expect(parsed!.model).toBe("claude-opus-4");
});
test("processAssistantLine skips empty content", () => {
const lines = [
JSON.stringify({ type: "assistant", message: { role: "assistant", content: [] } }),
JSON.stringify({
type: "result",
subtype: "success",
result: "ok",
session_id: "s1",
num_turns: 0,
total_cost_usd: 0,
duration_ms: 0,
stop_reason: "end_turn",
}),
];
const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
expect(parsed).not.toBeNull();
expect(parsed!.turns).toHaveLength(0);
});
test("processUserLine skips when no tool_result items", () => {
const lines = [
JSON.stringify({
type: "user",
message: { role: "user", content: [{ type: "text", text: "hi" }] },
}),
JSON.stringify({
type: "result",
subtype: "success",
result: "ok",
session_id: "s1",
num_turns: 0,
total_cost_usd: 0,
duration_ms: 0,
stop_reason: "end_turn",
}),
];
const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
expect(parsed).not.toBeNull();
expect(parsed!.turns).toHaveLength(0);
});
test("turn indices are sequential across mixed assistant and user lines", () => {
const lines = [
JSON.stringify({
type: "assistant",
message: { role: "assistant", content: [{ type: "text", text: "A" }] },
}),
JSON.stringify({
type: "user",
message: { role: "user", content: [{ type: "tool_result", content: "R" }] },
}),
JSON.stringify({
type: "assistant",
message: { role: "assistant", content: [{ type: "text", text: "B" }] },
}),
JSON.stringify({
type: "result",
subtype: "success",
result: "ok",
session_id: "s1",
num_turns: 3,
total_cost_usd: 0,
duration_ms: 0,
stop_reason: "end_turn",
}),
];
const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
expect(parsed).not.toBeNull();
expect(parsed!.turns).toHaveLength(3);
expect(parsed!.turns.map((t) => t.index)).toEqual([0, 1, 2]);
});
});
describe("storeClaudeCodeDetail", () => {
const baseParsed: ClaudeCodeParsedResult = {
type: "result",
@@ -22,8 +22,7 @@
},
"dependencies": {
"@uncaged/json-cas": "^0.4.0",
"@uncaged/workflow-agent-kit": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
"@uncaged/workflow-agent-kit": "workspace:^"
},
"devDependencies": {
"typescript": "^5.8.3"
@@ -3,7 +3,6 @@ import type { Store } from "@uncaged/json-cas";
import {
type AgentContext,
type AgentRunResult,
buildContinuationPrompt,
buildRolePrompt,
createAgent,
getCachedSessionId,
@@ -19,6 +18,25 @@ const CLAUDE_COMMAND = "claude";
const CLAUDE_MAX_TURNS = 90;
const CLAUDE_MODEL = process.env.CLAUDE_MODEL ?? null;
function buildHistorySummary(steps: AgentContext["steps"]): string {
if (steps.length === 0) {
return "";
}
const lines: string[] = ["## Previous Steps"];
for (let i = 0; i < steps.length; i++) {
const step = steps[i];
if (step === undefined) {
continue;
}
lines.push("");
lines.push(`### Step ${i + 1}: ${step.role}`);
lines.push(`Output: ${JSON.stringify(step.output)}`);
lines.push(`Agent: ${step.agent}`);
}
return lines.join("\n");
}
/** Assemble system prompt, task, and prior step outputs for Claude Code. */
export function buildClaudeCodePrompt(ctx: AgentContext): string {
const roleDef = ctx.workflow.roles[ctx.role];
@@ -28,23 +46,11 @@ export function buildClaudeCodePrompt(ctx: AgentContext): string {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
if (!ctx.isFirstVisit) {
// Re-entry (session will be resumed): show only steps since last visit, meta only
parts.push("", buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
} else if (ctx.steps.length > 0) {
// First visit: show all steps with content for recent ones
parts.push(
"",
buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt, {
includeContent: true,
quota: 32000,
}),
);
} else {
parts.push("", "## Current Instruction", "", ctx.edgePrompt);
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
parts.push("", historyBlock);
}
parts.push("", "## Current Instruction", "", ctx.edgePrompt);
return parts.join("\n");
}
@@ -140,13 +146,13 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
if (!ctx.isFirstVisit) {
const cachedSessionId = await getCachedSessionId("claude-code", ctx.threadId, ctx.role);
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
if (cachedSessionId !== null) {
try {
const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
}
return result;
} catch (err) {
@@ -163,7 +169,7 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
const { stdout } = await spawnClaudeRun(fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
}
return result;
}
@@ -34,7 +34,7 @@ export const CLAUDE_CODE_DETAIL_SCHEMA: JSONSchema = {
},
turns: {
type: "array",
items: { type: "string", format: "cas_ref" },
items: { type: "string" },
},
},
additionalProperties: false,
@@ -67,103 +67,99 @@ function extractToolResultContent(content: unknown[]): string {
return results.join("\n");
}
type ParseState = {
turns: ClaudeCodeTurnPayload[];
resultLine: Record<string, unknown> | null;
model: string;
turnIndex: number;
};
function processSystemLine(parsed: Record<string, unknown>, state: ParseState): void {
if (typeof parsed.model === "string") {
state.model = parsed.model;
}
}
function processAssistantLine(parsed: Record<string, unknown>, state: ParseState): void {
if (!isRecord(parsed.message)) return;
const content = Array.isArray(parsed.message.content) ? parsed.message.content : [];
const textContent = extractTextContent(content as unknown[]);
const toolCalls = extractToolCalls(content as unknown[]);
if (textContent !== "" || toolCalls.length > 0) {
state.turns.push({
index: state.turnIndex++,
role: "assistant",
content: textContent,
toolCalls: toolCalls.length > 0 ? toolCalls : null,
});
}
}
function processUserLine(parsed: Record<string, unknown>, state: ParseState): void {
if (!isRecord(parsed.message)) return;
const content = Array.isArray(parsed.message.content) ? parsed.message.content : [];
const resultContent = extractToolResultContent(content as unknown[]);
if (resultContent !== "") {
state.turns.push({
index: state.turnIndex++,
role: "tool_result",
content: resultContent,
toolCalls: null,
});
}
}
function processLine(line: string, state: ParseState): void {
let parsed: unknown;
try {
parsed = JSON.parse(line);
} catch {
return;
}
if (!isRecord(parsed)) return;
const type = parsed.type;
if (type === "system") processSystemLine(parsed, state);
else if (type === "assistant") processAssistantLine(parsed, state);
else if (type === "user") processUserLine(parsed, state);
else if (type === "result") state.resultLine = parsed;
}
function assembleResult(state: ParseState): ClaudeCodeParsedResult | null {
if (state.resultLine === null) return null;
const sessionId = state.resultLine.session_id;
const result = state.resultLine.result;
const subtype = state.resultLine.subtype;
if (typeof sessionId !== "string" || typeof result !== "string" || typeof subtype !== "string") {
return null;
}
const usage = isRecord(state.resultLine.usage) ? state.resultLine.usage : {};
return {
type: safeString(state.resultLine.type, "result"),
subtype: subtype as ClaudeCodeParsedResult["subtype"],
result,
sessionId,
numTurns: safeNumber(state.resultLine.num_turns),
totalCostUsd: safeNumber(state.resultLine.total_cost_usd),
durationMs: safeNumber(state.resultLine.duration_ms),
model: state.model,
stopReason: safeString(state.resultLine.stop_reason),
usage: {
inputTokens: safeNumber(usage.input_tokens),
outputTokens: safeNumber(usage.output_tokens),
cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
},
turns: state.turns,
};
}
/**
* Parse Claude Code stream-json (NDJSON) output.
* Each line is a JSON object with type: "system" | "assistant" | "user" | "result".
*/
export function parseClaudeCodeStreamOutput(stdout: string): ClaudeCodeParsedResult | null {
const lines = stdout.trim().split("\n");
const state: ParseState = { turns: [], resultLine: null, model: "", turnIndex: 0 };
const turns: ClaudeCodeTurnPayload[] = [];
let resultLine: Record<string, unknown> | null = null;
let model = "";
let turnIndex = 0;
for (const line of lines) {
processLine(line, state);
let parsed: unknown;
try {
parsed = JSON.parse(line);
} catch {
continue;
}
if (!isRecord(parsed)) continue;
const type = parsed.type;
if (type === "system" && typeof parsed.model === "string") {
model = parsed.model;
}
if (type === "assistant" && isRecord(parsed.message)) {
const msg = parsed.message;
const content = Array.isArray(msg.content) ? msg.content : [];
const textContent = extractTextContent(content as unknown[]);
const toolCalls = extractToolCalls(content as unknown[]);
// Only record turns that have actual content
if (textContent !== "" || toolCalls.length > 0) {
turns.push({
index: turnIndex++,
role: "assistant",
content: textContent,
toolCalls: toolCalls.length > 0 ? toolCalls : null,
});
}
}
if (type === "user" && isRecord(parsed.message)) {
const msg = parsed.message;
const content = Array.isArray(msg.content) ? msg.content : [];
const resultContent = extractToolResultContent(content as unknown[]);
if (resultContent !== "") {
turns.push({
index: turnIndex++,
role: "tool_result",
content: resultContent,
toolCalls: null,
});
}
}
if (type === "result") {
resultLine = parsed;
}
}
return assembleResult(state);
if (resultLine === null) return null;
const sessionId = resultLine.session_id;
const result = resultLine.result;
const subtype = resultLine.subtype;
if (typeof sessionId !== "string" || typeof result !== "string" || typeof subtype !== "string") {
return null;
}
const usage = isRecord(resultLine.usage) ? resultLine.usage : {};
return {
type: safeString(resultLine.type, "result"),
subtype: subtype as ClaudeCodeParsedResult["subtype"],
result,
sessionId,
numTurns: safeNumber(resultLine.num_turns),
totalCostUsd: safeNumber(resultLine.total_cost_usd),
durationMs: safeNumber(resultLine.duration_ms),
model,
stopReason: safeString(resultLine.stop_reason),
usage: {
inputTokens: safeNumber(usage.input_tokens),
outputTokens: safeNumber(usage.output_tokens),
cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
},
turns,
};
}
/**
@@ -4,96 +4,6 @@ import { HermesAcpClient } from "../src/acp-client.js";
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
describe("handleSessionUpdate — helper extraction", () => {
let client: HermesAcpClient;
beforeEach(() => {
client = new HermesAcpClient();
});
afterEach(async () => {
await client.close();
});
it("agent_message_chunk accumulates text in messageChunks", () => {
(client as any).handleSessionUpdate({
sessionUpdate: "agent_message_chunk",
content: { type: "text", text: "hello" },
});
(client as any).handleSessionUpdate({
sessionUpdate: "agent_message_chunk",
content: { type: "text", text: " world" },
});
expect((client as any).messageChunks).toEqual(["hello", " world"]);
});
it("agent_thought_chunk accumulates reasoning in reasoningChunks", () => {
(client as any).handleSessionUpdate({
sessionUpdate: "agent_thought_chunk",
content: { type: "text", text: "thinking" },
});
expect((client as any).reasoningChunks).toEqual(["thinking"]);
});
it("tool_call registers a pending tool and flushes message chunks", () => {
(client as any).messageChunks = ["pre-tool text"];
(client as any).handleSessionUpdate({
sessionUpdate: "tool_call",
title: "Bash",
rawInput: { command: "ls" },
toolCallId: "tc-1",
});
expect((client as any).pendingTools.get("tc-1")).toEqual({
name: "Bash",
args: JSON.stringify({ command: "ls" }),
});
expect((client as any).messageChunks).toEqual([]);
expect((client as any).messages).toHaveLength(1);
expect((client as any).messages[0].role).toBe("assistant");
});
it("tool_call_update completed pushes tool_call and tool messages", () => {
(client as any).pendingTools.set("tc-2", { name: "Read", args: '{"path":"/foo"}' });
(client as any).handleSessionUpdate({
sessionUpdate: "tool_call_update",
status: "completed",
toolCallId: "tc-2",
rawOutput: "file contents",
});
const msgs = (client as any).messages as Array<{
role: string;
tool_calls: unknown;
content: string | null;
}>;
expect(msgs).toHaveLength(2);
expect(msgs[0].role).toBe("assistant");
expect(msgs[0].tool_calls).toEqual([
{ function: { name: "Read", arguments: '{"path":"/foo"}' } },
]);
expect(msgs[1].role).toBe("tool");
expect(msgs[1].content).toBe("file contents");
expect((client as any).pendingTools.has("tc-2")).toBe(false);
});
it("tool_call_update with non-string rawOutput JSON-stringifies it", () => {
(client as any).pendingTools.set("tc-3", { name: "Fetch", args: "" });
(client as any).handleSessionUpdate({
sessionUpdate: "tool_call_update",
status: "completed",
toolCallId: "tc-3",
rawOutput: { html: "<p>page</p>" },
});
const msgs = (client as any).messages as Array<{ role: string; content: string | null }>;
expect(msgs[1].content).toBe(JSON.stringify({ html: "<p>page</p>" }));
});
it("unknown updateType is a no-op", () => {
(client as any).handleSessionUpdate({ sessionUpdate: "unknown_type", data: {} });
expect((client as any).messages).toHaveLength(0);
expect((client as any).messageChunks).toHaveLength(0);
});
});
describe("HermesAcpClient", () => {
let client: HermesAcpClient;
@@ -23,7 +23,7 @@ function makeCtx(overrides: Partial<AgentContext> = {}): AgentContext {
graph: {},
},
role: "developer",
start: { prompt: "Fix the bug", workflow: "abc123" },
start: { prompt: "Fix the bug", workflowHash: "abc123", threadId: "t1" },
steps: [],
store: {} as AgentContext["store"],
outputFormatInstruction: "Use YAML frontmatter",
@@ -55,7 +55,6 @@ describe("buildHermesPrompt", () => {
agent: "uwf-hermes",
detail: "detail-1",
edgePrompt: "Implement the fix.",
content: null,
},
{
role: "reviewer",
@@ -63,7 +62,6 @@ describe("buildHermesPrompt", () => {
agent: "uwf-hermes",
detail: "detail-2",
edgePrompt: "Review the code.",
content: null,
},
],
});
@@ -87,7 +85,6 @@ describe("buildHermesPrompt", () => {
agent: "uwf-hermes",
detail: "detail-1",
edgePrompt: "First attempt.",
content: null,
},
],
edgePrompt: "Retry with a fresh approach.",
@@ -98,90 +95,4 @@ describe("buildHermesPrompt", () => {
expect(result).toContain("Retry with a fresh approach.");
expect(result).not.toContain("## What Happened Since Your Last Turn");
});
test("first visit includes content from previous steps", () => {
const ctx = makeCtx({
isFirstVisit: true,
steps: [
{
role: "planner",
output: { plan: "hash1" },
agent: "uwf-hermes",
detail: "detail-1",
edgePrompt: "Create the plan.",
content: "# Plan\nDetailed plan markdown...",
},
{
role: "developer",
output: { files: ["app.ts"] },
agent: "uwf-hermes",
detail: "detail-2",
edgePrompt: "Implement the code.",
content: "# Implementation\nCode changes...",
},
{
role: "reviewer",
output: { approved: true },
agent: "uwf-hermes",
detail: "detail-3",
edgePrompt: "Review the work.",
content: "# Review\nApproved!",
},
],
role: "committer",
edgePrompt: "Commit the reviewed code.",
});
const result = buildHermesPrompt(ctx);
expect(result).toContain("Use YAML frontmatter");
expect(result).toContain("## Task");
expect(result).toContain("Fix the bug");
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("### Step 1: planner");
expect(result).toContain("#### Step Content");
expect(result).toContain("# Plan");
expect(result).toContain("Detailed plan markdown");
expect(result).toContain("### Step 2: developer");
expect(result).toContain("# Implementation");
expect(result).toContain("### Step 3: reviewer");
expect(result).toContain("# Review");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Commit the reviewed code.");
});
test("re-entry omits content from previous steps", () => {
const ctx = makeCtx({
isFirstVisit: false,
steps: [
{
role: "developer",
output: { files: ["app.ts"] },
agent: "uwf-hermes",
detail: "detail-1",
edgePrompt: "Implement the code.",
content: "# Implementation\nCode changes...",
},
{
role: "reviewer",
output: { approved: false },
agent: "uwf-hermes",
detail: "detail-2",
edgePrompt: "Review the work.",
content: "# Review\nNot approved!",
},
],
role: "developer",
edgePrompt: "Fix the issues.",
});
const result = buildHermesPrompt(ctx);
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("### Step 2: reviewer");
expect(result).toContain(JSON.stringify({ approved: false }));
expect(result).not.toContain("#### Step Content");
expect(result).not.toContain("# Review");
expect(result).not.toContain("Not approved!");
});
});
@@ -245,75 +245,72 @@ export class HermesAcpClient {
// ---- Session update → structured messages ----
private handleSessionUpdate(update: Record<string, unknown>): void {
switch (update.sessionUpdate as string) {
case "agent_message_chunk":
this.handleAgentMessageChunk(update);
const updateType = update.sessionUpdate as string;
switch (updateType) {
case "agent_message_chunk": {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.messageChunks.push(content.text);
}
break;
case "agent_thought_chunk":
this.handleAgentThoughtChunk(update);
}
case "agent_thought_chunk": {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.reasoningChunks.push(content.text);
}
break;
case "tool_call":
this.handleToolCall(update);
}
case "tool_call": {
const title = (update.title as string) ?? "";
const rawInput = update.rawInput;
const args = rawInput !== undefined && rawInput !== null ? JSON.stringify(rawInput) : "";
const toolCallId = update.toolCallId as string;
this.pendingTools.set(toolCallId, { name: title, args });
// Flush accumulated assistant text before tool call
this.flushAssistantMessage();
break;
case "tool_call_update":
this.handleToolCallUpdate(update);
}
case "tool_call_update": {
const status = update.status as string | undefined;
if (status === "completed" || status === "failed") {
const toolCallId = update.toolCallId as string;
const pending = this.pendingTools.get(toolCallId);
const toolName = pending?.name ?? toolCallId;
const rawOutput = update.rawOutput;
const outputStr =
rawOutput !== undefined && rawOutput !== null
? typeof rawOutput === "string"
? rawOutput
: JSON.stringify(rawOutput)
: "";
this.messages.push({
role: "assistant",
content: null,
reasoning: null,
tool_calls: [{ function: { name: toolName, arguments: pending?.args ?? "" } }],
});
this.messages.push({
role: "tool",
content: outputStr,
reasoning: null,
tool_calls: null,
});
this.pendingTools.delete(toolCallId);
}
break;
}
default:
break;
}
}
private handleAgentMessageChunk(update: Record<string, unknown>): void {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.messageChunks.push(content.text);
}
}
private handleAgentThoughtChunk(update: Record<string, unknown>): void {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.reasoningChunks.push(content.text);
}
}
private handleToolCall(update: Record<string, unknown>): void {
const title = (update.title as string) ?? "";
const rawInput = update.rawInput;
const args = rawInput !== undefined && rawInput !== null ? JSON.stringify(rawInput) : "";
const toolCallId = update.toolCallId as string;
this.pendingTools.set(toolCallId, { name: title, args });
this.flushAssistantMessage();
}
private handleToolCallUpdate(update: Record<string, unknown>): void {
const status = update.status as string | undefined;
if (status !== "completed" && status !== "failed") return;
const toolCallId = update.toolCallId as string;
const pending = this.pendingTools.get(toolCallId);
const toolName = pending?.name ?? toolCallId;
const rawOutput = update.rawOutput;
const outputStr =
rawOutput !== undefined && rawOutput !== null
? typeof rawOutput === "string"
? rawOutput
: JSON.stringify(rawOutput)
: "";
this.messages.push({
role: "assistant",
content: null,
reasoning: null,
tool_calls: [{ function: { name: toolName, arguments: pending?.args ?? "" } }],
});
this.messages.push({
role: "tool",
content: outputStr,
reasoning: null,
tool_calls: null,
});
this.pendingTools.delete(toolCallId);
}
/** Flush any accumulated text/reasoning into an assistant message. */
private flushAssistantMessage(): void {
const text = this.messageChunks.join("");
+37 -23
View File
@@ -14,39 +14,53 @@ import { storeHermesSessionDetail } from "./session-detail.js";
const log = createLogger({ sink: { kind: "stderr" } });
/** Assemble system prompt, task, and prior step outputs for Hermes. */
export function buildHermesPrompt(ctx: AgentContext): string {
const parts: string[] = [];
function buildHistorySummary(steps: AgentContext["steps"]): string {
if (steps.length === 0) {
return "";
}
const lines: string[] = ["## Previous Steps"];
for (let i = 0; i < steps.length; i++) {
const step = steps[i];
if (step === undefined) {
continue;
}
lines.push("");
lines.push(`### Step ${i + 1}: ${step.role}`);
lines.push(`Output: ${JSON.stringify(step.output)}`);
lines.push(`Agent: ${step.agent}`);
}
return lines.join("\n");
}
function buildInitialPrompt(ctx: AgentContext): string {
const roleDef = ctx.workflow.roles[ctx.role];
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
const parts: string[] = [];
if (ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
parts.push("", historyBlock);
}
parts.push("", "## Moderator Instruction", "", ctx.edgePrompt);
return parts.join("\n");
}
/** Assemble system prompt, task, and prior step outputs for Hermes. */
export function buildHermesPrompt(ctx: AgentContext): string {
if (!ctx.isFirstVisit) {
// Re-entry: show only steps since last visit, meta only
const parts: string[] = [];
if (ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
return parts.join("\n");
}
// First visit: show initial context with content for recent steps
const roleDef = ctx.workflow.roles[ctx.role];
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
// Add history with content (last 2-3 steps within quota)
if (ctx.steps.length > 0) {
parts.push(
"",
buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt, {
includeContent: true,
quota: 32000, // Use THREAD_READ_DEFAULT_QUOTA equivalent
}),
);
} else {
parts.push("", "## Moderator Instruction", "", ctx.edgePrompt);
}
return parts.join("\n");
return buildInitialPrompt(ctx);
}
async function storePromptResult(
@@ -1,22 +1,5 @@
// Re-export session cache from the shared agent-kit package with agent name injected.
import {
getCachedSessionId as getCachedSessionIdBase,
setCachedSessionId as setCachedSessionIdBase,
} from "@uncaged/workflow-agent-kit";
import type { ThreadId } from "@uncaged/workflow-protocol";
export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
return getCachedSessionIdBase("hermes", threadId, role);
}
export async function setCachedSessionId(
threadId: ThreadId,
role: string,
sessionId: string,
): Promise<void> {
return setCachedSessionIdBase("hermes", threadId, role, sessionId);
}
// Re-export session cache from the shared agent-kit package.
export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";
export function isResumeDisabled(): boolean {
// Hermes ACP session/resume is broken: _restore fails for custom providers
+3 -4
View File
@@ -83,10 +83,9 @@ Requires `UWF_EDGE_PROMPT` in the environment (set by `uwf thread step`).
function buildRolePrompt(role: RoleDefinition): string
function buildOutputFormatInstruction(schema: JSONSchema): string
function buildContinuationPrompt(
steps: StepContext[],
role: string,
edgePrompt: string,
options?: { includeContent?: boolean; quota?: number },
ctx: AgentContext,
priorOutput: string,
instruction: string,
): string
```
@@ -8,7 +8,6 @@ const reviewerStep: StepContext = {
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
edgePrompt: "Review the developer's work.",
content: null,
};
const developerStep: StepContext = {
@@ -17,7 +16,6 @@ const developerStep: StepContext = {
detail: "1VPBG9SM5E7WK",
agent: "uwf-hermes",
edgePrompt: "Implement the fix.",
content: null,
};
describe("buildContinuationPrompt", () => {
@@ -31,7 +29,6 @@ describe("buildContinuationPrompt", () => {
detail: "7BQST3VW9F2MA",
agent: "uwf-hermes",
edgePrompt: "Revise the plan.",
content: null,
},
];
@@ -73,162 +70,4 @@ describe("buildContinuationPrompt", () => {
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Please revise your work.");
});
test("includes step content when includeContent option is true", () => {
const stepsWithContent: StepContext[] = [
{
role: "planner",
output: { plan: "hash123" },
detail: "detail1",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Plan\nDetailed plan markdown...",
},
{
role: "developer",
output: { filesChanged: ["app.ts"] },
detail: "detail2",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Implementation\nCode changes...",
},
{
role: "reviewer",
output: { approved: false },
detail: "detail3",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Review\nFeedback...",
},
];
const result = buildContinuationPrompt(stepsWithContent, "committer", "Commit the changes.", {
includeContent: true,
});
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("### Step 1: planner");
expect(result).toContain("#### Step Content");
expect(result).toContain("# Plan");
expect(result).toContain("Detailed plan markdown");
expect(result).toContain("### Step 2: developer");
expect(result).toContain("# Implementation");
expect(result).toContain("### Step 3: reviewer");
expect(result).toContain("# Review");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Commit the changes.");
});
test("omits step content when includeContent is false (default)", () => {
const stepsWithContent: StepContext[] = [
{
role: "developer",
output: { filesChanged: ["app.ts"] },
detail: "detail1",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Implementation\nCode changes...",
},
{
role: "reviewer",
output: { approved: false },
detail: "detail2",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Review\nFeedback...",
},
];
const result = buildContinuationPrompt(stepsWithContent, "developer", "Fix the issues.");
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("### Step 2: reviewer");
expect(result).toContain(JSON.stringify(stepsWithContent[1]?.output));
expect(result).not.toContain("#### Step Content");
expect(result).not.toContain("# Review");
});
test("respects quota when includeContent is true", () => {
const largeContent = "x".repeat(5000);
const stepsWithContent: StepContext[] = [
{
role: "planner",
output: { plan: "hash1" },
detail: "detail1",
agent: "uwf-hermes",
edgePrompt: "",
content: largeContent,
},
{
role: "developer",
output: { files: ["app.ts"] },
detail: "detail2",
agent: "uwf-hermes",
edgePrompt: "",
content: largeContent,
},
{
role: "reviewer",
output: { approved: true },
detail: "detail3",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Review\nLooks good!",
},
];
const result = buildContinuationPrompt(stepsWithContent, "committer", "Commit the changes.", {
includeContent: true,
quota: 1000,
});
// Should include most recent step(s) within quota
expect(result).toContain("### Step 1: reviewer"); // Showing 1 of 3, so step 3 becomes step 1
expect(result).toContain("#### Step Content");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Showing 1 of 3 steps (2 omitted due to quota)");
});
test("handles null content gracefully when includeContent is true", () => {
const stepsWithMixedContent: StepContext[] = [
{
role: "planner",
output: { plan: "hash1" },
detail: "detail1",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Plan\nDetails...",
},
{
role: "developer",
output: { files: ["app.ts"] },
detail: "detail2",
agent: "uwf-hermes",
edgePrompt: "",
content: null, // No content available
},
{
role: "reviewer",
output: { approved: true },
detail: "detail3",
agent: "uwf-hermes",
edgePrompt: "",
content: "# Review\nApproved!",
},
];
const result = buildContinuationPrompt(
stepsWithMixedContent,
"committer",
"Commit the changes.",
{ includeContent: true },
);
expect(result).toContain("### Step 1: planner");
expect(result).toContain("# Plan");
expect(result).toContain("### Step 2: developer");
// Step 2 should not have content section since content is null
expect(result).toContain("### Step 3: reviewer");
expect(result).toContain("# Review");
});
});
@@ -1,14 +0,0 @@
import { describe, expect, test } from "vitest";
// We need to test buildHistory indirectly through buildContext
// since buildHistory is not exported. For now, we'll test the integration
// through the public API in a separate integration test.
describe("context module - content extraction", () => {
test("placeholder - content extraction will be tested via integration tests", () => {
// This test is a placeholder. The actual testing of content extraction
// will be done through integration tests in build-continuation-prompt.test.ts
// where we can verify that StepContext objects have the correct content field.
expect(true).toBe(true);
});
});
@@ -1,247 +0,0 @@
import { mkdir, readdir, readFile, rm, stat, writeFile } from "node:fs/promises";
import { dirname, join } from "node:path";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { getCachedSessionId, getCachePath, setCachedSessionId } from "../src/session-cache.js";
import { resolveStorageRoot } from "../src/storage.js";
describe("session-cache", () => {
let originalStorageRoot: string;
let testStorageRoot: string;
beforeEach(async () => {
// Create a temporary test storage root
originalStorageRoot = resolveStorageRoot();
testStorageRoot = join(originalStorageRoot, "test-cache", `test-${Date.now()}`);
await mkdir(testStorageRoot, { recursive: true });
// Override the storage root for testing
process.env.WORKFLOW_STORAGE_ROOT = testStorageRoot;
});
afterEach(async () => {
// Clean up test storage root
await rm(testStorageRoot, { recursive: true, force: true });
delete process.env.WORKFLOW_STORAGE_ROOT;
});
describe("getCachePath", () => {
test("returns agent-specific file path", () => {
const path = getCachePath("claude-code");
expect(path).toMatch(/\/cache\/claude-code-sessions\.json$/);
});
test("returns different paths for different agents", () => {
const pathClaudeCode = getCachePath("claude-code");
const pathHermes = getCachePath("hermes");
expect(pathClaudeCode).not.toBe(pathHermes);
expect(pathClaudeCode).toMatch(/claude-code-sessions\.json$/);
expect(pathHermes).toMatch(/hermes-sessions\.json$/);
});
test("handles agent names with special characters", () => {
const path1 = getCachePath("my-agent");
const path2 = getCachePath("my_agent");
expect(path1).toMatch(/my-agent-sessions\.json$/);
expect(path2).toMatch(/my_agent-sessions\.json$/);
});
});
describe("session isolation", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("sessions are isolated per agent", async () => {
// Cache different session IDs for each agent
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Each agent should retrieve its own session ID
const sessionCC = await getCachedSessionId("claude-code", threadId, role);
const sessionHermes = await getCachedSessionId("hermes", threadId, role);
expect(sessionCC).toBe("session-cc-001");
expect(sessionHermes).toBe("session-hermes-001");
});
test("updating one agent's cache does not affect another", async () => {
// Set initial sessions for both agents
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Update claude-code's session
await setCachedSessionId("claude-code", threadId, role, "session-cc-002");
// Hermes's session should remain unchanged
const sessionHermes = await getCachedSessionId("hermes", threadId, role);
expect(sessionHermes).toBe("session-hermes-001");
// Claude-code should have the new session
const sessionCC = await getCachedSessionId("claude-code", threadId, role);
expect(sessionCC).toBe("session-cc-002");
});
test("missing session returns null for specific agent", async () => {
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("empty session ID is treated as missing", async () => {
await setCachedSessionId("claude-code", threadId, role, "");
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
});
describe("file system operations", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("cache directory is created if missing", async () => {
const cachePath = getCachePath("claude-code");
const cacheDir = dirname(cachePath);
// Ensure cache dir doesn't exist
await rm(cacheDir, { recursive: true, force: true });
// Write a session
await setCachedSessionId("claude-code", threadId, role, "session-001");
// Cache directory should be created
const stats = await stat(cacheDir);
expect(stats.isDirectory()).toBe(true);
});
test("multiple agents create separate cache files", async () => {
// Cache sessions for multiple agents
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Separate cache files should exist
const pathCC = getCachePath("claude-code");
const pathHermes = getCachePath("hermes");
const contentCC = JSON.parse(await readFile(pathCC, "utf8")) as Record<string, string>;
const contentHermes = JSON.parse(await readFile(pathHermes, "utf8")) as Record<
string,
string
>;
expect(contentCC).toHaveProperty(`${threadId}:${role}`, "session-cc-001");
expect(contentHermes).toHaveProperty(`${threadId}:${role}`, "session-hermes-001");
});
test("atomic writes prevent partial reads", async () => {
// Write a session
await setCachedSessionId("claude-code", threadId, role, "session-001");
// The final file should exist (no .tmp files left behind)
const cachePath = getCachePath("claude-code");
const dir = dirname(cachePath);
const files = await readdir(dir);
expect(files).toContain("claude-code-sessions.json");
expect(files.every((f) => !f.endsWith(".tmp"))).toBe(true);
});
});
describe("legacy migration", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("old agent-sessions.json is ignored", async () => {
// Create old agent-sessions.json file
const oldCachePath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
await mkdir(dirname(oldCachePath), { recursive: true });
await writeFile(
oldCachePath,
JSON.stringify({
"01234567890123456789012345:developer": "old-session-001",
}),
"utf8",
);
// Query with the new per-agent cache
const session = await getCachedSessionId("claude-code", threadId, role);
// Should return null (old cache is ignored)
expect(session).toBeNull();
});
test("new per-agent cache takes precedence", async () => {
// Create both old and new cache files
const oldPath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
await mkdir(dirname(oldPath), { recursive: true });
await writeFile(
oldPath,
JSON.stringify({
[`${threadId}:${role}`]: "old-session",
}),
"utf8",
);
await setCachedSessionId("claude-code", threadId, role, "new-session");
// The new per-agent cache value should be returned
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBe("new-session");
});
});
describe("error handling", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("invalid JSON in cache file returns empty cache", async () => {
// Create a corrupted cache file
const cachePath = getCachePath("claude-code");
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, "{ invalid json }", "utf8");
// Should return null (treating corrupted cache as empty)
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("non-object JSON in cache file returns empty cache", async () => {
// Create a cache file with non-object JSON
const cachePath = getCachePath("claude-code");
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, JSON.stringify(["not", "an", "object"]), "utf8");
// Should return null
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("cache entries with non-string values are ignored", async () => {
// Create a cache file with mixed types
const cachePath = getCachePath("claude-code");
const cacheData = {
"thread1:role1": "valid-session",
"thread2:role2": 12345, // number
"thread3:role3": null, // null
"thread4:role4": "", // empty string
};
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, JSON.stringify(cacheData), "utf8");
// Valid string entries should be returned
const session1 = await getCachedSessionId("claude-code", "thread1" as ThreadId, "role1");
expect(session1).toBe("valid-session");
// Invalid entries should return null
const session2 = await getCachedSessionId("claude-code", "thread2" as ThreadId, "role2");
const session3 = await getCachedSessionId("claude-code", "thread3" as ThreadId, "role3");
const session4 = await getCachedSessionId("claude-code", "thread4" as ThreadId, "role4");
expect(session2).toBeNull();
expect(session3).toBeNull();
expect(session4).toBeNull(); // empty string is treated as missing
});
});
});
@@ -1,20 +1,11 @@
import type { StepContext } from "@uncaged/workflow-protocol";
function formatStep(step: StepContext, stepNumber: number, includeContent: boolean): string {
const lines = [
function formatStep(step: StepContext, stepNumber: number): string {
return [
`### Step ${stepNumber}: ${step.role}`,
`Output: ${JSON.stringify(step.output)}`,
`Agent: ${step.agent}`,
];
if (includeContent && step.content !== null) {
lines.push("");
lines.push("#### Step Content");
lines.push("");
lines.push(step.content);
}
return lines.join("\n");
].join("\n");
}
function findLastRoleIndex(steps: StepContext[], role: string): number {
@@ -27,45 +18,6 @@ function findLastRoleIndex(steps: StepContext[], role: string): number {
return -1;
}
function selectStepsWithinQuota(steps: StepContext[], quota: number): StepContext[] {
const selected: StepContext[] = [];
let totalChars = 0;
// Work backwards (newest first)
for (let i = steps.length - 1; i >= 0; i--) {
const step = steps[i];
if (step === undefined) continue;
// Estimate size: meta + content
const metaSize = JSON.stringify({
role: step.role,
output: step.output,
agent: step.agent,
}).length;
const contentSize = step.content?.length ?? 0;
const stepSize = metaSize + contentSize;
if (totalChars + stepSize > quota && selected.length > 0) {
// Stop adding steps but keep at least 1
break;
}
selected.unshift(step); // Keep chronological order
totalChars += stepSize;
if (totalChars >= quota) {
break;
}
}
return selected;
}
type BuildContinuationPromptOptions = {
includeContent?: boolean;
quota?: number;
};
/**
* Build a continuation prompt for a role re-entry.
*
@@ -76,11 +28,7 @@ export function buildContinuationPrompt(
steps: StepContext[],
role: string,
edgePrompt: string,
options?: BuildContinuationPromptOptions,
): string {
const includeContent = options?.includeContent ?? false;
const quota = options?.quota ?? Number.POSITIVE_INFINITY;
const lastIndex = findLastRoleIndex(steps, role);
const sinceSteps = lastIndex >= 0 ? steps.slice(lastIndex + 1) : steps;
@@ -89,25 +37,13 @@ export function buildContinuationPrompt(
if (sinceSteps.length > 0) {
parts.push("## What Happened Since Your Last Turn");
const baseStepNumber = lastIndex >= 0 ? lastIndex + 2 : 1;
// Select steps within quota (newest-first if includeContent = true)
const selectedSteps = includeContent ? selectStepsWithinQuota(sinceSteps, quota) : sinceSteps;
const skippedCount = sinceSteps.length - selectedSteps.length;
if (skippedCount > 0) {
parts.push("");
parts.push(
`_Showing ${selectedSteps.length} of ${sinceSteps.length} steps (${skippedCount} omitted due to quota)_`,
);
}
for (let i = 0; i < selectedSteps.length; i++) {
const step = selectedSteps[i];
for (let i = 0; i < sinceSteps.length; i++) {
const step = sinceSteps[i];
if (step === undefined) {
continue;
}
parts.push("");
parts.push(formatStep(step, baseStepNumber + i, includeContent));
parts.push(formatStep(step, baseStepNumber + i));
}
parts.push("");
}
+11 -40
View File
@@ -21,6 +21,14 @@ function fail(message: string): never {
throw new Error(message);
}
function readEdgePrompt(): string {
const value = process.env.UWF_EDGE_PROMPT;
if (value === undefined || value === "") {
fail("UWF_EDGE_PROMPT environment variable is required");
}
return value;
}
function walkChain(store: Store, schemas: AgentStore["schemas"], headHash: CasRef): ChainState {
const headNode = store.get(headHash);
if (headNode === null) {
@@ -82,38 +90,6 @@ function expandOutput(store: Store, outputRef: CasRef): unknown {
return node.payload;
}
function extractStepContent(store: Store, detailRef: CasRef): string | null {
const detailNode = store.get(detailRef);
if (detailNode === null) {
return null;
}
const detail = detailNode.payload as Record<string, unknown>;
const turns = detail.turns;
if (!Array.isArray(turns) || turns.length === 0) {
return null;
}
// Find last assistant content (same logic as extractLastAssistantContent in cli-workflow)
for (let i = turns.length - 1; i >= 0; i--) {
const turnRef = turns[i];
if (typeof turnRef !== "string") {
continue;
}
const turnNode = store.get(turnRef as CasRef);
if (turnNode === null) {
continue;
}
const turn = turnNode.payload as Record<string, unknown>;
if (
turn.role === "assistant" &&
typeof turn.content === "string" &&
turn.content.trim() !== ""
) {
return turn.content;
}
}
return null;
}
async function buildHistory(
store: Store,
stepsNewestFirst: StepNodePayload[],
@@ -121,14 +97,12 @@ async function buildHistory(
const chronological = [...stepsNewestFirst].reverse();
const history: StepContext[] = [];
for (const step of chronological) {
const content = extractStepContent(store, step.detail);
history.push({
role: step.role,
output: expandOutput(store, step.output),
detail: step.detail,
agent: step.agent,
edgePrompt: step.edgePrompt ?? "",
content,
});
}
return history;
@@ -149,11 +123,7 @@ async function loadWorkflow(store: Store, schemas: AgentStore["schemas"], workfl
* Build agent execution context from thread head in threads.yaml.
* Walks the CAS chain from head to StartNode and expands step outputs.
*/
export async function buildContext(
threadId: ThreadId,
role: string,
edgePrompt: string,
): Promise<AgentContext> {
export async function buildContext(threadId: ThreadId, role: string): Promise<AgentContext> {
const storageRoot = resolveStorageRoot();
const agentStore = await createAgentStore(storageRoot);
const { store, schemas } = agentStore;
@@ -172,6 +142,7 @@ export async function buildContext(
}
const steps = await buildHistory(store, chain.stepsNewestFirst);
const edgePrompt = readEdgePrompt();
const isFirstVisit = !steps.some((s) => s.role === role);
return {
@@ -201,7 +172,6 @@ export type BuildContextMeta = {
export async function buildContextWithMeta(
threadId: ThreadId,
role: string,
edgePrompt: string,
): Promise<AgentContext & { meta: BuildContextMeta }> {
const storageRoot = resolveStorageRoot();
const agentStore = await createAgentStore(storageRoot);
@@ -221,6 +191,7 @@ export async function buildContextWithMeta(
}
const steps = await buildHistory(store, chain.stepsNewestFirst);
const edgePrompt = readEdgePrompt();
const isFirstVisit = !steps.some((s) => s.role === role);
return {
+1 -1
View File
@@ -12,7 +12,7 @@ export {
export type { FrontmatterFastPathResult } from "./frontmatter.js";
export { tryFrontmatterFastPath } from "./frontmatter.js";
export { createAgent } from "./run.js";
export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
export { getCachedSessionId, setCachedSessionId } from "./session-cache.js";
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
export type {
AgentContext,
+11 -19
View File
@@ -22,24 +22,16 @@ function agentLabel(name: string): string {
return `uwf-${name}`;
}
const USAGE = "usage: <agent-cli> --thread <id> --role <role> --prompt <text>";
function getNamedArg(argv: string[], name: string): string {
const idx = argv.indexOf(name);
if (idx === -1 || idx + 1 >= argv.length) {
return "";
function parseArgv(argv: string[]): { threadId: ThreadId; role: string } {
const threadId = argv[2];
const role = argv[3];
if (threadId === undefined || threadId === "") {
fail("usage: <agent-cli> <thread-id> <role>");
}
return argv[idx + 1];
}
function parseArgv(argv: string[]): { threadId: ThreadId; role: string; prompt: string } {
const threadId = getNamedArg(argv, "--thread");
const role = getNamedArg(argv, "--role");
const prompt = getNamedArg(argv, "--prompt");
if (threadId === "") fail(USAGE);
if (role === "") fail(USAGE);
if (prompt === "") fail(USAGE);
return { threadId: threadId as ThreadId, role, prompt };
if (role === undefined || role === "") {
fail("usage: <agent-cli> <thread-id> <role>");
}
return { threadId: threadId as ThreadId, role };
}
function runWithMessage<T>(label: string, fn: () => Promise<T>): Promise<T> {
@@ -111,11 +103,11 @@ async function persistStep(options: {
export function createAgent(options: AgentOptions): () => Promise<void> {
return async function main(): Promise<void> {
const { threadId, role, prompt } = parseArgv(process.argv);
const { threadId, role } = parseArgv(process.argv);
const storageRoot = resolveStorageRoot();
loadDotenv({ path: getEnvPath(storageRoot) });
const ctx = await runWithMessage("context", () => buildContextWithMeta(threadId, role, prompt));
const ctx = await runWithMessage("context", () => buildContextWithMeta(threadId, role));
const roleDef = ctx.workflow.roles[role];
if (roleDef === undefined) {
@@ -8,8 +8,8 @@ import { resolveStorageRoot } from "./storage.js";
type SessionCache = Record<string, string>;
export function getCachePath(agentName: string): string {
return join(resolveStorageRoot(), "cache", `${agentName}-sessions.json`);
function getCachePath(): string {
return join(resolveStorageRoot(), "cache", "agent-sessions.json");
}
function cacheKey(threadId: ThreadId, role: string): string {
@@ -20,8 +20,8 @@ function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
async function readCache(agentName: string): Promise<SessionCache> {
const path = getCachePath(agentName);
async function readCache(): Promise<SessionCache> {
const path = getCachePath();
try {
const text = await readFile(path, "utf8");
const raw = JSON.parse(text) as unknown;
@@ -40,45 +40,36 @@ async function readCache(agentName: string): Promise<SessionCache> {
if (err.code === "ENOENT") {
return {};
}
// Treat JSON parse errors as empty cache
if (err.name === "SyntaxError") {
return {};
}
throw e;
}
}
async function writeCache(agentName: string, cache: SessionCache): Promise<void> {
const path = getCachePath(agentName);
async function writeCache(cache: SessionCache): Promise<void> {
const path = getCachePath();
const dir = dirname(path);
await mkdir(dir, { recursive: true });
// Atomic write: write to temp file then rename to avoid partial reads on concurrent access.
// NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur.
// This is a safety net for future parallel execution.
const tmpPath = join(dir, `.${agentName}-sessions.${randomBytes(4).toString("hex")}.tmp`);
const tmpPath = join(dir, `.agent-sessions.${randomBytes(4).toString("hex")}.tmp`);
await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
await rename(tmpPath, path);
}
/** Read the cached session ID for a thread+role pair. */
export async function getCachedSessionId(
agentName: string,
threadId: ThreadId,
role: string,
): Promise<string | null> {
const cache = await readCache(agentName);
export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
const cache = await readCache();
const sessionId = cache[cacheKey(threadId, role)];
return sessionId ?? null;
}
/** Write the session ID for a thread+role pair into the cache. */
export async function setCachedSessionId(
agentName: string,
threadId: ThreadId,
role: string,
sessionId: string,
): Promise<void> {
const cache = await readCache(agentName);
const cache = await readCache();
cache[cacheKey(threadId, role)] = sessionId;
await writeCache(agentName, cache);
await writeCache(cache);
}
+1 -1
View File
@@ -13,7 +13,7 @@ export type AgentContext = ModeratorContext & {
*/
outputFormatInstruction: string;
/**
* Edge prompt from the graph transition that led to this role (--prompt CLI arg).
* Edge prompt from the graph transition that led to this role (UWF_EDGE_PROMPT).
* Always the real moderator instruction for this step.
*/
edgePrompt: string;
+1 -1
View File
@@ -57,7 +57,7 @@ export function createApi() {
transitions: t.Array(
t.Object({
target: t.String(),
status: t.String(),
condition: t.Union([t.String(), t.Null()]),
}),
),
}),
+40 -15
View File
@@ -1,6 +1,6 @@
import { mkdir, readdir, readFile, unlink, writeFile } from "node:fs/promises";
import { join } from "node:path";
import type { RoleDefinition, Target, WorkflowPayload } from "@uncaged/workflow-protocol";
import type { RoleDefinition, Transition, WorkflowPayload } from "@uncaged/workflow-protocol";
import YAML from "yaml";
import type { WorkFlowSteps, WorkFlowTransition, WorkflowSummary } from "../shared/types.ts";
@@ -11,12 +11,17 @@ async function ensureDir() {
}
function payloadToSteps(payload: WorkflowPayload): WorkFlowSteps {
const conditionMap = new Map<string, string>();
for (const [name, def] of Object.entries(payload.conditions)) {
conditionMap.set(name, def.expression);
}
const steps: WorkFlowSteps = [];
for (const [roleName, roleDef] of Object.entries(payload.roles)) {
const statusMap = payload.graph[roleName] ?? {};
const transitions: WorkFlowTransition[] = Object.entries(statusMap).map(([status, target]) => ({
target: target.role === "$END" ? "END" : target.role,
status,
const graphTransitions = payload.graph[roleName] ?? [];
const transitions: WorkFlowTransition[] = graphTransitions.map((t) => ({
target: t.role === "$END" ? "END" : t.role,
condition: t.condition ? (conditionMap.get(t.condition) ?? t.condition) : null,
}));
steps.push({
@@ -37,7 +42,11 @@ function payloadToSteps(payload: WorkflowPayload): WorkFlowSteps {
function stepsToPayload(name: string, description: string, steps: WorkFlowSteps): WorkflowPayload {
const roles: Record<string, RoleDefinition> = {};
const graph: Record<string, Record<string, Target>> = {};
const conditions: WorkflowPayload["conditions"] = {};
const graph: Record<string, Transition[]> = {};
const expressionToName = new Map<string, string>();
let condIdx = 0;
for (const step of steps) {
const r = step.role;
@@ -50,28 +59,43 @@ function stepsToPayload(name: string, description: string, steps: WorkFlowSteps)
frontmatter: "",
};
const statusMap: Record<string, Target> = {};
for (const t of step.transitions) {
const transitions: Transition[] = step.transitions.map((t) => {
let condName: string | null = null;
if (t.condition) {
if (expressionToName.has(t.condition)) {
condName = expressionToName.get(t.condition) ?? null;
} else {
condName = `cond${condIdx++}`;
expressionToName.set(t.condition, condName);
conditions[condName] = {
description: "",
expression: t.condition,
};
}
}
const targetRole = t.target === "END" ? "$END" : t.target;
statusMap[t.status] = {
return {
role: targetRole,
condition: condName,
prompt: `Transition to ${targetRole}.`,
};
}
graph[r.name] = statusMap;
});
graph[r.name] = transitions;
}
if (steps.length > 0) {
const firstRole = steps[0].role.name;
graph.$START = {
_: {
graph.$START = [
{
role: firstRole,
condition: null,
prompt: `Begin workflow at role ${firstRole}.`,
},
};
];
}
return { name, description, roles, graph };
return { name, description, roles, conditions, graph };
}
export async function listWorkflows(): Promise<WorkflowSummary[]> {
@@ -101,6 +125,7 @@ export async function createWorkflow(name: string, description: string): Promise
name,
description,
roles: {},
conditions: {},
graph: {},
};
await writeFile(join(WORKFLOW_DIR, `${name}.yaml`), YAML.stringify(payload), "utf-8");
+1 -1
View File
@@ -9,7 +9,7 @@ export type WorkFlowRole = {
export type WorkFlowTransition = {
target: string;
status: string;
condition: string | null;
};
export type WorkFlowStep = {
@@ -6,10 +6,10 @@ import {
useReactFlow,
} from "@xyflow/react";
import { Check } from "lucide-react";
import { type ReactNode, useEffect, useRef, useState } from "react";
import { type ReactNode, useEffect, useMemo, useRef, useState } from "react";
import { cn } from "../../lib/utils.ts";
import { useModel } from "../context.tsx";
import type { StatusEdge as StatusEdgeType } from "../type.ts";
import type { ConditionalEdge as ConditionalEdgeType } from "../type.ts";
const SOURCE_COLOR = "#10b981";
const TARGET_COLOR = "#3b82f6";
@@ -23,7 +23,7 @@ function GradientPath({
sourceY,
targetX,
targetY,
hasStatus,
hasCondition,
selected,
}: {
id: string;
@@ -32,11 +32,11 @@ function GradientPath({
sourceY: number;
targetX: number;
targetY: number;
hasStatus: boolean;
hasCondition: boolean | null;
selected: boolean;
}) {
const gradientId = `gradient-${id}`;
const showLack = !hasStatus;
const showLack = hasCondition === false;
const strokeStyle = selected
? { stroke: "#f59e0b", strokeWidth: 2 }
: { stroke: `url(#${gradientId})`, strokeWidth: 1.5 };
@@ -68,20 +68,35 @@ function GradientPath({
);
}
type StatusLabelProps = {
status: string | undefined;
function ElseBadge({ labelX, labelY }: { labelX: number; labelY: number }): ReactNode {
return (
<div
className="absolute pointer-events-none"
style={{
transform: `translate(-50%, -50%) translate(${labelX}px,${labelY}px)`,
}}
>
<span className="inline-block px-1 bg-white rounded text-[10px] border border-gray-300 text-gray-500">
else
</span>
</div>
);
}
type ConditionLabelProps = {
condition: string | undefined;
labelX: number;
labelY: number;
onSave: (value: string) => void;
};
function StatusLabel({ status, labelX, labelY, onSave }: StatusLabelProps): ReactNode {
function ConditionLabel({ condition, labelX, labelY, onSave }: ConditionLabelProps): ReactNode {
const [isOpen, setIsOpen] = useState(false);
const [inputValue, setInputValue] = useState("");
const containerRef = useRef<HTMLDivElement>(null);
function handleBadgeClick() {
setInputValue(status || "");
setInputValue(condition || "");
setIsOpen(true);
}
@@ -112,8 +127,6 @@ function StatusLabel({ status, labelX, labelY, onSave }: StatusLabelProps): Reac
return () => document.removeEventListener("pointerdown", handleClickOutside, true);
}, [isOpen]);
const displayStatus = status?.trim() || null;
return (
<div
ref={containerRef}
@@ -129,13 +142,11 @@ function StatusLabel({ status, labelX, labelY, onSave }: StatusLabelProps): Reac
<span
className={cn(
"inline-block px-1 bg-white rounded text-[10px]",
displayStatus
? "border border-gray-300 text-black"
: "border border-dashed text-red-500",
condition ? "border border-gray-300 text-black" : "border border-dashed text-red-500",
)}
style={displayStatus ? undefined : { borderColor: LACK_COLOR }}
style={condition ? undefined : { borderColor: LACK_COLOR }}
>
{displayStatus ?? "status"}
if
</span>
</div>
{isOpen && (
@@ -144,7 +155,7 @@ function StatusLabel({ status, labelX, labelY, onSave }: StatusLabelProps): Reac
<input
type="text"
className="w-32 rounded border border-gray-300 px-1 py-0.5 text-[10px] focus:border-blue-500 focus:outline-none"
placeholder="输入状态"
placeholder="输入条件"
value={inputValue}
onChange={(e) => setInputValue(e.target.value)}
onKeyDown={handleKeyDown}
@@ -163,8 +174,14 @@ function StatusLabel({ status, labelX, labelY, onSave }: StatusLabelProps): Reac
);
}
export function StatusEdge({
export function isElseEdge(edgeId: string, source: string, allEdges: Edge[]): boolean {
const siblings = allEdges.filter((e) => e.source === source && e.type === "conditional");
return siblings.length >= 2 && siblings[0].id === edgeId;
}
export function ConditionalEdge({
id,
source,
sourceX,
sourceY,
targetX,
@@ -173,7 +190,7 @@ export function StatusEdge({
targetPosition,
selected,
data,
}: EdgeProps<StatusEdgeType>): ReactNode {
}: EdgeProps<ConditionalEdgeType>): ReactNode {
const [edgePath, labelX, labelY] = getSmoothStepPath({
sourceX,
sourceY,
@@ -186,11 +203,13 @@ export function StatusEdge({
const flow = useReactFlow();
const model = useModel();
const status = data?.status;
const allEdges = flow.getEdges();
const isElse = useMemo(() => isElseEdge(id, source, allEdges), [id, source, allEdges]);
const condition = data?.condition;
function handleSave(value: string) {
model.startTransaction();
flow.updateEdgeData(id, { status: value });
flow.updateEdgeData(id, { condition: value });
requestAnimationFrame(model.endTransaction);
}
@@ -203,11 +222,20 @@ export function StatusEdge({
sourceY={sourceY}
targetX={targetX}
targetY={targetY}
hasStatus={!!status?.trim()}
hasCondition={isElse ? null : !!condition}
selected={!!selected}
/>
<EdgeLabelRenderer>
<StatusLabel status={status} labelX={labelX} labelY={labelY} onSave={handleSave} />
{isElse ? (
<ElseBadge labelX={labelX} labelY={labelY} />
) : (
<ConditionLabel
condition={condition}
labelX={labelX}
labelY={labelY}
onSave={handleSave}
/>
)}
</EdgeLabelRenderer>
</>
);
@@ -241,7 +269,7 @@ export function GradientEdge({
sourceY={sourceY}
targetX={targetX}
targetY={targetY}
hasStatus={true}
hasCondition={null}
selected={!!selected}
/>
);
@@ -1,6 +1,6 @@
import { GradientEdge, StatusEdge } from "./status";
import { ConditionalEdge, GradientEdge } from "./conditional";
export const edgeTypes = {
status: StatusEdge,
conditional: ConditionalEdge,
default: GradientEdge,
};
@@ -65,12 +65,12 @@ export const edgesModel = define.model("edges", makeEdges, (set, get, model) =>
const existingFromSource = currentEdges.filter((e) => e.source === normalized.source);
if (existingFromSource.length > 0) {
edge.type = "status";
edge.data = { status: "" };
edge.type = "conditional";
edge.data = { condition: "" };
const promoted = currentEdges.map((e) => {
if (e.source === normalized.source && e.type !== "status") {
return { ...e, type: "status" as const, data: { status: "_" } };
if (e.source === normalized.source && e.type !== "conditional") {
return { ...e, type: "conditional" as const, data: { condition: "" } };
}
return e;
});
@@ -34,8 +34,21 @@ export const handlers = define.memoize((use, model) => {
return node.type === "start" || node.type === "end";
}
const onBeforeDelete: OnBeforeDelete<AnyWorkNode> = async ({ nodes }) => {
function isFirstConditionalSibling(
edge: { id: string; source: string; type: string | null },
allEdges: { id: string; source: string; type: string | null }[],
): boolean {
if (edge.type !== "conditional") return false;
const siblings = allEdges.filter((e) => e.source === edge.source && e.type === "conditional");
return siblings.length >= 2 && siblings[0].id === edge.id;
}
const onBeforeDelete: OnBeforeDelete<AnyWorkNode> = async ({ nodes, edges }) => {
if (nodes.some(isProtectedNode)) return false;
if (edges.length > 0) {
const allEdges = use(edgesModel)[0];
if (edges.some((e) => isFirstConditionalSibling(e, allEdges))) return false;
}
model.startTransaction();
return true;
};
@@ -43,14 +56,16 @@ export const handlers = define.memoize((use, model) => {
if (deletedEdges.length > 0) {
const currentEdges = use(edgesModel)[0];
const sourcesToCheck = new Set(
deletedEdges.filter((e) => e.type === "status").map((e) => e.source),
deletedEdges.filter((e) => e.type === "conditional").map((e) => e.source),
);
if (sourcesToCheck.size > 0) {
let needsDowngrade = false;
const updatedEdges = currentEdges.map((e) => {
if (!sourcesToCheck.has(e.source) || e.type !== "status") return e;
const siblings = currentEdges.filter((s) => s.source === e.source && s.type === "status");
if (!sourcesToCheck.has(e.source) || e.type !== "conditional") return e;
const siblings = currentEdges.filter(
(s) => s.source === e.source && s.type === "conditional",
);
if (siblings.length === 1) {
needsDowngrade = true;
const { data: _, ...rest } = e;
@@ -36,7 +36,7 @@ describe("transIn", () => {
});
it("4.3 Single step with END transition → edge to end node exists", () => {
const steps = [makeStep("A", [{ status: "_", target: "END" }])];
const steps = [makeStep("A", [{ condition: null, target: "END" }])];
const { edges } = transIn(steps);
const endEdge = edges.find((e) => e.target === "end");
expect(endEdge).toBeDefined();
@@ -44,8 +44,8 @@ describe("transIn", () => {
it("4.4 Two steps with default transitions chain", () => {
const steps = [
makeStep("A", [{ status: "_", target: "B" }]),
makeStep("B", [{ status: "_", target: "END" }]),
makeStep("A", [{ condition: null, target: "B" }]),
makeStep("B", [{ condition: null, target: "END" }]),
];
const { edges } = transIn(steps);
// Should have start→A, A→B, B→end
@@ -53,15 +53,15 @@ describe("transIn", () => {
const nodeAId = edges.find((e) => e.source === "start")?.target;
expect(edges.find((e) => e.source === nodeAId && e.target !== "end")).toBeDefined();
expect(edges.find((e) => e.target === "end")).toBeDefined();
// No status edges for single default transitions
expect(edges.every((e) => e.type !== "status")).toBe(true);
// No conditional edges
expect(edges.every((e) => e.type !== "conditional")).toBe(true);
});
it("4.5 Step with multiple transitions → status edges", () => {
it("4.5 Step with multiple transitions → conditional edges", () => {
const steps = [
makeStep("A", [
{ status: "_", target: "B" },
{ status: "approved", target: "C" },
{ condition: null, target: "B" },
{ condition: "x>0", target: "C" },
]),
makeStep("B", []),
makeStep("C", []),
@@ -69,35 +69,23 @@ describe("transIn", () => {
const { edges } = transIn(steps);
const nodeAId = edges.find((e) => e.source === "start")?.target;
const outEdges = edges.filter((e) => e.source === nodeAId);
expect(outEdges.every((e) => e.type === "status")).toBe(true);
});
it("4.5b Multiple transitions include expected status values", () => {
const steps = [
makeStep("A", [
{ status: "_", target: "B" },
{ status: "approved", target: "C" },
]),
makeStep("B", []),
makeStep("C", []),
];
const { edges } = transIn(steps);
const nodeAId = edges.find((e) => e.source === "start")?.target;
const outEdges = edges.filter((e) => e.source === nodeAId);
const defaultEdge = outEdges.find(
(e) => (e as { data?: { status?: string } }).data?.status === "_",
expect(outEdges.every((e) => e.type === "conditional")).toBe(true);
// else-branch has empty condition
const elseEdge = outEdges.find(
(e) => (e as { data?: { condition?: string } }).data?.condition === "",
);
expect(defaultEdge).toBeDefined();
const approvedEdge = outEdges.find(
(e) => (e as { data?: { status?: string } }).data?.status === "approved",
expect(elseEdge).toBeDefined();
// if-branch has condition
const ifEdge = outEdges.find(
(e) => (e as { data?: { condition?: string } }).data?.condition === "x>0",
);
expect(approvedEdge).toBeDefined();
expect(ifEdge).toBeDefined();
});
it("4.6 With 1 incoming edge: targetHandle = 'input'; with 2: first gets 'input'", () => {
const steps = [
makeStep("A", [{ status: "_", target: "END" }]),
makeStep("B", [{ status: "_", target: "END" }]),
makeStep("A", [{ condition: null, target: "END" }]),
makeStep("B", [{ condition: null, target: "END" }]),
];
const { edges } = transIn(steps);
// start→A and start→B; end has 2 incoming edges
@@ -107,8 +95,8 @@ describe("transIn", () => {
it("4.7 Same role name maps to same node id across steps", () => {
const steps = [
makeStep("A", [{ status: "_", target: "B" }]),
makeStep("B", [{ status: "_", target: "A" }]),
makeStep("A", [{ condition: null, target: "B" }]),
makeStep("B", [{ condition: null, target: "A" }]),
];
const { edges } = transIn(steps);
const aId = edges.find((e) => e.source === "start")?.target;
@@ -33,13 +33,13 @@ function defaultEdge(source: string, target: string): AnyWorkEdge {
return { id: `${source}-${target}`, source, target, animated: true } as AnyWorkEdge;
}
function statusEdge(source: string, target: string, status: string): AnyWorkEdge {
function conditionalEdge(source: string, target: string, condition: string): AnyWorkEdge {
return {
id: `${source}-${target}-status`,
id: `${source}-${target}-cond`,
source,
target,
type: "status" as const,
data: { status },
type: "conditional" as const,
data: { condition },
animated: true,
} as AnyWorkEdge;
}
@@ -76,36 +76,36 @@ describe("validateRoleNodes (via validate)", () => {
expect(nodeErrors.some((e) => e.message.includes("缺少输出连接"))).toBe(true);
});
it("5.3 Empty status on status edge → error", () => {
it("5.3 Empty condition on non-first conditional edge → error", () => {
const n1 = roleNode("n1");
const n2 = roleNode("n2");
const n3 = roleNode("n3");
const nodes = baseNodes(n1, n2, n3);
const edges = [
defaultEdge("start", "n1"),
statusEdge("n1", "n2", "_"),
statusEdge("n1", "n3", ""), // empty status → error
conditionalEdge("n1", "n2", ""), // else-branch (index 0) - exempt
conditionalEdge("n1", "n3", ""), // if-branch (index 1) - empty condition → error
defaultEdge("n2", "end"),
defaultEdge("n3", "end"),
];
const result = validate(nodes, edges);
expect(result.errors.some((e) => e.message.includes("状态值不能为空"))).toBe(true);
expect(result.errors.some((e) => e.message.includes("条件表达式不能为空"))).toBe(true);
});
it("5.4 Mix of status and non-status outgoing → error", () => {
it("5.4 Mix of conditional and non-conditional outgoing → error", () => {
const n1 = roleNode("n1");
const n2 = roleNode("n2");
const n3 = roleNode("n3");
const nodes = baseNodes(n1, n2, n3);
const edges = [
defaultEdge("start", "n1"),
statusEdge("n1", "n2", "approved"),
conditionalEdge("n1", "n2", "x>0"),
defaultEdge("n1", "n3"), // mix → error
defaultEdge("n2", "end"),
defaultEdge("n3", "end"),
];
const result = validate(nodes, edges);
expect(result.errors.some((e) => e.message.includes("所有出边必须附带状态"))).toBe(true);
expect(result.errors.some((e) => e.message.includes("所有出边必须附带条件"))).toBe(true);
});
it("5.5 Valid role node (1 in, 1 out default) → no errors for that node", () => {
@@ -118,15 +118,15 @@ describe("validateRoleNodes (via validate)", () => {
expect(roleErrors).toHaveLength(0);
});
it("5.6 Valid role node (1 in, 2 status out with statuses) → no errors", () => {
it("5.6 Valid role node (1 in, 2 conditional out with conditions) → no errors", () => {
const n1 = roleNode("n1");
const n2 = roleNode("n2");
const n3 = roleNode("n3");
const nodes = baseNodes(n1, n2, n3);
const edges = [
defaultEdge("start", "n1"),
statusEdge("n1", "n2", "_"),
statusEdge("n1", "n3", "approved"),
conditionalEdge("n1", "n2", ""), // else-branch
conditionalEdge("n1", "n3", "x>0"), // if-branch
defaultEdge("n2", "end"),
defaultEdge("n3", "end"),
];
@@ -1,4 +1,4 @@
import type { AnyWorkEdge, AnyWorkNode, StatusEdge } from "../type";
import type { AnyWorkEdge, AnyWorkNode, ConditionalEdge } from "../type";
import { uuid } from "../utils";
import type { WorkFlowStep } from "./type";
@@ -9,7 +9,6 @@ type Result = {
const _OUT_HANDLES = ["output-top", "output", "output-bottom"] as const;
const IN_HANDLES = ["input-top", "input", "input-bottom"] as const;
const DEFAULT_STATUS = "_";
function assignHandles(
indices: number[],
@@ -51,8 +50,8 @@ function buildNodeMap(
function sortTransitions(step: WorkFlowStep): WorkFlowStep["transitions"] {
if (step.transitions.length <= 1) return step.transitions;
return [...step.transitions].sort((a, b) => {
if (a.status === DEFAULT_STATUS && b.status !== DEFAULT_STATUS) return -1;
if (a.status !== DEFAULT_STATUS && b.status === DEFAULT_STATUS) return 1;
if (a.condition === null && b.condition !== null) return -1;
if (a.condition !== null && b.condition === null) return 1;
return 0;
});
}
@@ -61,32 +60,32 @@ function buildStepEdges(
sourceId: string,
step: WorkFlowStep,
nameToId: Map<string, string>,
): { primaryEdges: AnyWorkEdge[]; statusEdges: AnyWorkEdge[] } {
): { elseEdges: AnyWorkEdge[]; ifEdges: AnyWorkEdge[] } {
const hasMultiple = step.transitions.length > 1;
const sorted = sortTransitions(step);
const primaryEdges: AnyWorkEdge[] = [];
const statusEdges: AnyWorkEdge[] = [];
const elseEdges: AnyWorkEdge[] = [];
const ifEdges: AnyWorkEdge[] = [];
for (let i = 0; i < sorted.length; i++) {
const t = sorted[i];
const targetId = nameToId.get(t.target);
if (!targetId) continue;
const edgeId = `e-${sourceId}-${targetId}-${i}`;
if (hasMultiple || t.status !== DEFAULT_STATUS) {
const edge: StatusEdge = {
if (hasMultiple || t.condition !== null) {
const edge: ConditionalEdge = {
id: edgeId,
source: sourceId,
target: targetId,
sourceHandle: "output",
targetHandle: "input",
type: "status",
data: { status: t.status },
type: "conditional",
data: { condition: t.condition ?? "" },
animated: true,
};
if (hasMultiple && t.status === DEFAULT_STATUS) primaryEdges.push(edge);
else statusEdges.push(edge);
if (hasMultiple && i === 0) elseEdges.push(edge);
else ifEdges.push(edge);
} else {
primaryEdges.push({
elseEdges.push({
id: edgeId,
source: sourceId,
target: targetId,
@@ -96,23 +95,23 @@ function buildStepEdges(
});
}
}
return { primaryEdges, statusEdges };
return { elseEdges, ifEdges };
}
function pushStepEdges(
edges: AnyWorkEdge[],
primaryEdges: AnyWorkEdge[],
statusEdges: AnyWorkEdge[],
elseEdges: AnyWorkEdge[],
ifEdges: AnyWorkEdge[],
idToOrder: Map<string, number>,
): void {
for (const e of primaryEdges) edges.push({ ...e, sourceHandle: "output" });
if (statusEdges.length > 0) {
const statusHandles = ["output-top", "output-bottom"] as const;
const sorted = [...statusEdges].sort(
for (const e of elseEdges) edges.push({ ...e, sourceHandle: "output" });
if (ifEdges.length > 0) {
const ifHandles = ["output-top", "output-bottom"] as const;
const sorted = [...ifEdges].sort(
(a, b) => (idToOrder.get(b.target) ?? 0) - (idToOrder.get(a.target) ?? 0),
);
for (let i = 0; i < sorted.length; i++) {
edges.push({ ...sorted[i], sourceHandle: statusHandles[i % statusHandles.length] });
edges.push({ ...sorted[i], sourceHandle: ifHandles[i % ifHandles.length] });
}
}
}
@@ -165,8 +164,8 @@ export function transIn(steps: WorkFlowStep[]): Result {
for (const step of steps) {
const sourceId = nameToId.get(step.role.name) ?? "";
const { primaryEdges, statusEdges } = buildStepEdges(sourceId, step, nameToId);
pushStepEdges(edges, primaryEdges, statusEdges, idToOrder);
const { elseEdges, ifEdges } = buildStepEdges(sourceId, step, nameToId);
pushStepEdges(edges, elseEdges, ifEdges, idToOrder);
}
assignTargetHandles(edges, idToOrder);
@@ -1,8 +1,6 @@
import type { AnyWorkEdge, AnyWorkNode, StatusEdge, WorkNode } from "../type";
import type { AnyWorkEdge, AnyWorkNode, ConditionalEdge, WorkNode } from "../type";
import type { WorkFlowStep, WorkFlowTransition } from "./type";
const DEFAULT_STATUS = "_";
export function transOut(nodes: AnyWorkNode[], edges: AnyWorkEdge[]): WorkFlowStep[] {
const nodeMap = new Map<string, AnyWorkNode>();
for (const node of nodes) {
@@ -45,7 +43,7 @@ function traverse(
const roleNode = node as WorkNode<"role">;
const outEdges = outgoingEdges.get(nodeId) ?? [];
const transitions: WorkFlowTransition[] = outEdges.map((edge) => {
const transitions: WorkFlowTransition[] = outEdges.map((edge, index) => {
const targetNode = nodeMap.get(edge.target);
const target =
edge.target === "end"
@@ -54,12 +52,13 @@ function traverse(
? (targetNode as WorkNode<"role">).data.name
: edge.target;
const status =
edge.type === "status"
? ((edge as StatusEdge).data?.status ?? DEFAULT_STATUS)
: DEFAULT_STATUS;
let condition: string | null = null;
if (edge.type === "conditional") {
const isElse = outEdges.length >= 2 && index === 0;
condition = isElse ? null : ((edge as ConditionalEdge).data?.condition ?? null);
}
return { target, status };
return { target, condition };
});
const { name, description, identity, prepare, execute, report } = roleNode.data;
@@ -1,4 +1,4 @@
import type { AnyWorkEdge, AnyWorkNode, StatusEdge } from "../type";
import type { AnyWorkEdge, AnyWorkNode, ConditionalEdge } from "../type";
export type ValidationError = {
nodeId: string | null;
@@ -91,10 +91,10 @@ function validateEndNode(
}
}
function hasEmptyStatusOnEdge(statusEdges: AnyWorkEdge[]): boolean {
return statusEdges.some((edge) => {
const status = (edge as StatusEdge).data?.status?.trim();
return !status;
function hasEmptyConditionOnIfEdge(conditionalEdges: AnyWorkEdge[]): boolean {
return conditionalEdges.slice(1).some((edge) => {
const cond = (edge as ConditionalEdge).data?.condition?.trim();
return !cond;
});
}
@@ -113,11 +113,11 @@ function validateRoleNodeEdges(
}
if (outEdges.length <= 1) return;
const statusEdges = outEdges.filter((e) => e.type === "status");
if (statusEdges.length !== outEdges.length) {
errors.push({ nodeId: node.id, message: "多输出节点的所有出边必须附带状态" });
} else if (hasEmptyStatusOnEdge(statusEdges)) {
errors.push({ nodeId: node.id, message: "状态边的状态值不能为空" });
const conditionalEdges = outEdges.filter((e) => e.type === "conditional");
if (conditionalEdges.length !== outEdges.length) {
errors.push({ nodeId: node.id, message: "多输出节点的所有出边必须附带条件" });
} else if (hasEmptyConditionOnIfEdge(conditionalEdges)) {
errors.push({ nodeId: node.id, message: "条件边的条件表达式不能为空" });
}
}
@@ -21,9 +21,9 @@ export type WorkNodeType = keyof NodeMap;
export type WorkNode<T extends WorkNodeType> = Node<NodeMap[T], T>;
export type AnyWorkNode = WorkNode<"start"> | WorkNode<"end"> | WorkNode<"role">;
export type StatusEdgeData = AnyKeyBase & {
status: string;
export type ConditionalEdgeData = AnyKeyBase & {
condition: string;
};
export type StatusEdge = Edge<StatusEdgeData, "status">;
export type AnyWorkEdge = StatusEdge | Edge;
export type ConditionalEdge = Edge<ConditionalEdgeData, "conditional">;
export type AnyWorkEdge = ConditionalEdge | Edge;
@@ -11,7 +11,7 @@ const DEFAULT_STEPS: WorkFlowSteps = [
execute: "制定详细的实施计划和步骤分解",
report: "输出结构化的计划文档,包含步骤列表和预期产出",
},
transitions: [{ target: "developer", status: "_" }],
transitions: [{ target: "developer", condition: null }],
},
{
role: {
@@ -22,7 +22,7 @@ const DEFAULT_STEPS: WorkFlowSteps = [
execute: "编写高质量的代码实现",
report: "输出变更文件列表和实现摘要",
},
transitions: [{ target: "reviewer", status: "_" }],
transitions: [{ target: "reviewer", condition: null }],
},
{
role: {
@@ -34,8 +34,8 @@ const DEFAULT_STEPS: WorkFlowSteps = [
report: "输出审查结果,包含 approved 状态和评审意见",
},
transitions: [
{ target: "END", status: "approved" },
{ target: "developer", status: "rejected" },
{ target: "END", condition: null },
{ target: "developer", condition: "steps[-1].output.approved = false" },
],
},
];
@@ -1,6 +1,7 @@
import path from "node:path";
import { defineConfig } from "vitest/config";
// biome-ignore lint/style/noDefaultExport: Vitest loads config from default export.
export default defineConfig({
test: {
environment: "node",
@@ -1,122 +1,312 @@
import { describe, expect, test } from "bun:test";
import type { Target, WorkflowPayload } from "@uncaged/workflow-protocol";
import type { ModeratorContext, WorkflowPayload } from "@uncaged/workflow-protocol";
import { evaluate } from "../src/evaluate.js";
const solveIssueGraph: WorkflowPayload["graph"] = {
$START: {
_: { role: "planner", prompt: "Start planning from the issue in the task." },
const solveIssueWorkflow: WorkflowPayload = {
name: "solve-issue",
description: "End-to-end issue resolution",
roles: {
planner: {
description: "Creates implementation plan",
goal: "You are a planning agent.",
capabilities: ["planning"],
procedure: "Create a step-by-step plan.",
output: "Output the plan and steps.",
frontmatter: "5GWKR8TN1V3JA",
},
developer: {
description: "Implements code changes",
goal: "You are a developer agent.",
capabilities: ["coding"],
procedure: "Implement the plan.",
output: "List files changed and summary.",
frontmatter: "8CNWT4KR6D1HV",
},
reviewer: {
description: "Reviews code changes",
goal: "You are a code reviewer.",
capabilities: ["code-review"],
procedure: "Review the implementation.",
output: "Approve or reject with comments.",
frontmatter: "1VPBG9SM5E7WK",
},
},
planner: {
_: { role: "developer", prompt: "Implement the plan: {{plan}}" },
conditions: {
needsClarification: {
description: "Planner requests clarification from user",
expression: "$exists($last('planner').needsClarification)",
},
rejected: {
description: "Reviewer rejected the implementation",
expression: "$last('reviewer').approved = false",
},
},
developer: {
_: { role: "reviewer", prompt: "Review the changes: {{summary}}" },
},
reviewer: {
approved: { role: "$END", prompt: "Done." },
rejected: { role: "developer", prompt: "Fix: {{comments}}" },
graph: {
$START: [
{
role: "planner",
condition: null,
prompt: "Start planning from the issue in the task.",
},
],
planner: [
{
role: "developer",
condition: "needsClarification",
prompt: "Clarification is needed; hand off to developer.",
},
{ role: "$END", condition: null, prompt: "Planning complete; end workflow." },
],
developer: [
{
role: "reviewer",
condition: null,
prompt: "Implementation done; send to reviewer.",
},
],
reviewer: [
{
role: "developer",
condition: "rejected",
prompt: "Reviewer rejected; return to developer.",
},
{ role: "$END", condition: null, prompt: "Review passed; end workflow." },
],
},
};
function makeContext(steps: ModeratorContext["steps"]): ModeratorContext {
return {
start: {
workflow: "4KNM2PXR3B1QW",
prompt: "Fix the login bug",
},
steps,
};
}
describe("evaluate", () => {
test("$START → first role (unit status _)", () => {
const result = evaluate(solveIssueGraph, "$START", { status: "_" });
test("$START → first role (fallback)", async () => {
const result = await evaluate(solveIssueWorkflow, makeContext([]));
expect(result).toEqual({
ok: true,
value: { role: "planner", prompt: "Start planning from the issue in the task." },
});
});
test("status-based routing (reviewer rejected → developer)", () => {
const result = evaluate(solveIssueGraph, "reviewer", {
status: "rejected",
comments: "missing tests",
});
test("condition match (rejected → developer)", async () => {
const context = makeContext([
{
role: "reviewer",
output: { approved: false },
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
},
]);
const result = await evaluate(solveIssueWorkflow, context);
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: "Fix: missing tests" },
value: { role: "developer", prompt: "Reviewer rejected; return to developer." },
});
});
test("status-based routing (reviewer approved → $END)", () => {
const result = evaluate(solveIssueGraph, "reviewer", { status: "approved" });
test("fallback when condition does not match → $END", async () => {
const context = makeContext([
{
role: "reviewer",
output: { approved: true },
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
},
]);
const result = await evaluate(solveIssueWorkflow, context);
expect(result).toEqual({
ok: true,
value: { role: "$END", prompt: "Done." },
value: { role: "$END", prompt: "Review passed; end workflow." },
});
});
test("missing role in graph → error", () => {
const result = evaluate(solveIssueGraph, "unknown-role", { status: "_" });
test("missing role in graph → error", async () => {
const context = makeContext([
{
role: "unknown-role",
output: {},
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
},
]);
const result = await evaluate(solveIssueWorkflow, context);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error.message).toBe('no transitions defined for role "unknown-role"');
}
});
test("missing status in graph → error", () => {
const result = evaluate(solveIssueGraph, "reviewer", { status: "pending" });
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error.message).toBe('no transition for role "reviewer" with status "pending"');
}
});
test("mustache template rendering with simple fields", () => {
const result = evaluate(solveIssueGraph, "planner", {
status: "_",
plan: "Add auth middleware",
});
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: "Implement the plan: Add auth middleware" },
});
});
test("mustache does not HTML-escape prompt content", () => {
const result = evaluate(solveIssueGraph, "reviewer", {
status: "rejected",
comments: 'use <T> & "Result<T, E>" types',
});
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: 'Fix: use <T> & "Result<T, E>" types' },
});
});
test("triple mustache also works for unescaped output", () => {
const graph: Record<string, Record<string, Target>> = {
reviewer: {
_: { role: "developer", prompt: "Fix: {{{comments}}}" },
test("output expansion in context works with JSONata", async () => {
const context = makeContext([
{
role: "planner",
output: { needsClarification: true },
detail: "7BQST3VW9F2MA",
agent: "uwf-hermes",
},
};
const result = evaluate(graph, "reviewer", {
status: "_",
comments: "<script>alert(1)</script>",
});
]);
const result = await evaluate(solveIssueWorkflow, context);
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: "Fix: <script>alert(1)</script>" },
value: { role: "developer", prompt: "Clarification is needed; hand off to developer." },
});
});
test("mustache template with nested object paths", () => {
const graph: Record<string, Record<string, Target>> = {
reviewer: {
_: {
role: "developer",
prompt: "Address: {{review.comments}}",
test("$last returns most recent matching role's frontmatter", async () => {
const workflow: WorkflowPayload = {
...solveIssueWorkflow,
conditions: {
devFailed: {
description: "Developer failed",
expression: "$last('developer').status = 'failed'",
},
},
graph: {
$START: [
{
role: "developer",
condition: null,
prompt: "Begin development.",
},
],
developer: [
{ role: "$END", condition: "devFailed", prompt: "Development failed; end." },
{
role: "reviewer",
condition: null,
prompt: "Development succeeded; review.",
},
],
},
};
const result = evaluate(graph, "reviewer", {
status: "_",
review: { comments: "refactor the handler" },
});
const context = makeContext([
{
role: "developer",
output: { status: "done" },
detail: "1VPBG9SM5E7WK",
agent: "uwf-hermes",
},
{
role: "reviewer",
output: { approved: false },
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
},
{
role: "developer",
output: { status: "failed" },
detail: "3QNTH7WK8D2PA",
agent: "uwf-hermes",
},
]);
const result = await evaluate(workflow, context);
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: "Address: refactor the handler" },
value: { role: "$END", prompt: "Development failed; end." },
});
});
test("$first returns earliest matching role's frontmatter", async () => {
const workflow: WorkflowPayload = {
...solveIssueWorkflow,
conditions: {
firstPlanReady: {
description: "First planner run was ready",
expression: "$first('planner').status = 'ready'",
},
},
graph: {
$START: [
{
role: "planner",
condition: null,
prompt: "Begin planning.",
},
],
planner: [
{ role: "$END", condition: "firstPlanReady", prompt: "First plan was ready; end." },
{
role: "developer",
condition: null,
prompt: "Plan not ready on first pass; implement.",
},
],
},
};
const context = makeContext([
{
role: "planner",
output: { status: "ready", plan: "ABC123" },
detail: "7BQST3VW9F2MA",
agent: "uwf-hermes",
},
{
role: "developer",
output: { status: "done" },
detail: "1VPBG9SM5E7WK",
agent: "uwf-hermes",
},
{
role: "planner",
output: { status: "revised", plan: "DEF456" },
detail: "4RNMK6PX8B3WQ",
agent: "uwf-hermes",
},
]);
const result = await evaluate(workflow, context);
expect(result).toEqual({
ok: true,
value: { role: "$END", prompt: "First plan was ready; end." },
});
});
test("$last returns undefined for unmatched role", async () => {
const workflow: WorkflowPayload = {
...solveIssueWorkflow,
conditions: {
hasReviewer: {
description: "Reviewer has run",
expression: "$exists($last('reviewer'))",
},
},
graph: {
$START: [
{
role: "planner",
condition: null,
prompt: "Begin planning.",
},
],
planner: [
{ role: "$END", condition: "hasReviewer", prompt: "Reviewer already ran; end." },
{
role: "developer",
condition: null,
prompt: "No reviewer yet; implement.",
},
],
},
};
const context = makeContext([
{
role: "planner",
output: { status: "ready" },
detail: "7BQST3VW9F2MA",
agent: "uwf-hermes",
},
]);
const result = await evaluate(workflow, context);
// no reviewer step → $exists returns false → fallback to developer
expect(result).toEqual({
ok: true,
value: { role: "developer", prompt: "No reviewer yet; implement." },
});
});
});
+1 -2
View File
@@ -19,10 +19,9 @@
},
"dependencies": {
"@uncaged/workflow-protocol": "workspace:^",
"mustache": "^4.2.0"
"jsonata": "^1.8.7"
},
"devDependencies": {
"@types/mustache": "^4.2.6",
"typescript": "^5.8.3"
},
"publishConfig": {
+101 -30
View File
@@ -1,42 +1,65 @@
import type { Target } from "@uncaged/workflow-protocol";
import mustache from "mustache";
import type { ModeratorContext, WorkflowPayload } from "@uncaged/workflow-protocol";
import jsonata from "jsonata";
import type { EvaluateResult, Result } from "./types.js";
// Disable HTML escaping — prompts are plain text, not HTML.
mustache.escape = (text: string) => text;
const START_ROLE = "$START";
const UNIT_STATUS = "_";
type LastOutput = Record<string, unknown> & { status: string };
export function evaluate(
graph: Record<string, Record<string, Target>>,
lastRole: string,
lastOutput: LastOutput,
): Result<EvaluateResult, Error> {
const status = lastRole === START_ROLE ? UNIT_STATUS : lastOutput.status;
const roleTargets = graph[lastRole];
if (roleTargets === undefined) {
return {
ok: false,
error: new Error(`no transitions defined for role "${lastRole}"`),
};
function isTruthy(value: unknown): boolean {
if (value === null || value === undefined) {
return false;
}
const target = roleTargets[status];
if (target === undefined) {
return {
ok: false,
error: new Error(`no transition for role "${lastRole}" with status "${status}"`),
};
if (typeof value === "boolean") {
return value;
}
if (typeof value === "number") {
return value !== 0 && !Number.isNaN(value);
}
if (typeof value === "string") {
return value.length > 0;
}
return true;
}
function findByRole(
steps: ModeratorContext["steps"],
role: string,
direction: "first" | "last",
): unknown {
if (direction === "last") {
for (let i = steps.length - 1; i >= 0; i--) {
if (steps[i].role === role) {
return steps[i].output;
}
}
} else {
for (const step of steps) {
if (step.role === role) {
return step.output;
}
}
}
return undefined;
}
async function evaluateJsonata(
expression: string,
context: ModeratorContext,
): Promise<Result<unknown, Error>> {
try {
const prompt = mustache.render(target.prompt, lastOutput);
return { ok: true, value: { role: target.role, prompt } };
const expr = jsonata(expression);
expr.registerFunction(
"first",
(role: string) => findByRole(context.steps, role, "first"),
"<s:x>",
);
expr.registerFunction(
"last",
(role: string) => findByRole(context.steps, role, "last"),
"<s:x>",
);
const result = await expr.evaluate(context);
return { ok: true, value: result };
} catch (error) {
return {
ok: false,
@@ -44,3 +67,51 @@ export function evaluate(
};
}
}
function currentRole(context: ModeratorContext): string {
if (context.steps.length === 0) {
return START_ROLE;
}
return context.steps[context.steps.length - 1].role;
}
export async function evaluate(
workflow: WorkflowPayload,
context: ModeratorContext,
): Promise<Result<EvaluateResult, Error>> {
const role = currentRole(context);
const transitions = workflow.graph[role];
if (transitions === undefined) {
return {
ok: false,
error: new Error(`no transitions defined for role "${role}"`),
};
}
for (const transition of transitions) {
if (transition.condition === null) {
return { ok: true, value: { role: transition.role, prompt: transition.prompt } };
}
const conditionDef = workflow.conditions[transition.condition];
if (conditionDef === undefined) {
return {
ok: false,
error: new Error(`unknown condition "${transition.condition}"`),
};
}
const evalResult = await evaluateJsonata(conditionDef.expression, context);
if (!evalResult.ok) {
return evalResult;
}
if (isTruthy(evalResult.value)) {
return { ok: true, value: { role: transition.role, prompt: transition.prompt } };
}
}
return {
ok: false,
error: new Error(`no transition matched for role "${role}"`),
};
}
+1 -1
View File
@@ -92,7 +92,7 @@ type StepNodePayload = StepRecord & {
### Moderator context
```typescript
type StepContext = Omit<StepRecord, "output"> & { output: unknown; content: string | null };
type StepContext = Omit<StepRecord, "output"> & { output: unknown };
type ModeratorContext = {
start: StartNodePayload;
+2 -3
View File
@@ -7,6 +7,7 @@ export type {
AgentAlias,
AgentConfig,
CasRef,
ConditionDefinition,
ModelAlias,
ModelConfig,
ModeratorContext,
@@ -14,8 +15,6 @@ export type {
ProviderConfig,
RoleDefinition,
RoleName,
RunningThreadItem,
RunningThreadsOutput,
Scenario,
StartEntry,
StartNodePayload,
@@ -25,12 +24,12 @@ export type {
StepNodePayload,
StepOutput,
StepRecord,
Target,
ThreadForkOutput,
ThreadId,
ThreadListItem,
ThreadStepsOutput,
ThreadsIndex,
Transition,
WorkflowConfig,
WorkflowName,
WorkflowPayload,
+20 -5
View File
@@ -14,11 +14,22 @@ const ROLE_DEFINITION: JSONSchema = {
additionalProperties: false,
};
const TARGET: JSONSchema = {
const CONDITION_DEFINITION: JSONSchema = {
type: "object",
required: ["role", "prompt"],
required: ["description", "expression"],
properties: {
description: { type: "string" },
expression: { type: "string" },
},
additionalProperties: false,
};
const TRANSITION: JSONSchema = {
type: "object",
required: ["role", "condition", "prompt"],
properties: {
role: { type: "string" },
condition: { anyOf: [{ type: "string" }, { type: "null" }] },
prompt: { type: "string" },
},
additionalProperties: false,
@@ -27,7 +38,7 @@ const TARGET: JSONSchema = {
export const WORKFLOW_SCHEMA: JSONSchema = {
title: "Workflow",
type: "object",
required: ["name", "description", "roles", "graph"],
required: ["name", "description", "roles", "conditions", "graph"],
properties: {
name: { type: "string" },
description: { type: "string" },
@@ -35,11 +46,15 @@ export const WORKFLOW_SCHEMA: JSONSchema = {
type: "object",
additionalProperties: ROLE_DEFINITION,
},
conditions: {
type: "object",
additionalProperties: CONDITION_DEFINITION,
},
graph: {
type: "object",
additionalProperties: {
type: "object",
additionalProperties: TARGET,
type: "array",
items: TRANSITION,
},
},
},
+9 -17
View File
@@ -27,16 +27,23 @@ export type RoleDefinition = {
frontmatter: CasRef;
};
export type Target = {
export type Transition = {
role: string;
condition: string | null;
prompt: string;
};
export type ConditionDefinition = {
description: string;
expression: string;
};
export type WorkflowPayload = {
name: string;
description: string;
roles: Record<string, RoleDefinition>;
graph: Record<string, Record<string, Target>>;
conditions: Record<string, ConditionDefinition>;
graph: Record<string, Transition[]>;
};
// ── 4.3 Thread 节点 ─────────────────────────────────────────────────
@@ -56,7 +63,6 @@ export type StepNodePayload = StepRecord & {
/** JSONata 上下文中的 step — output 被展开 */
export type StepContext = Omit<StepRecord, "output"> & {
output: unknown;
content: string | null;
};
export type ModeratorContext = {
@@ -78,7 +84,6 @@ export type StepOutput = {
thread: ThreadId;
head: CasRef;
done: boolean;
background: boolean | null;
};
/** uwf thread steps — single step entry */
@@ -121,19 +126,6 @@ export type ThreadListItem = {
head: CasRef;
};
/** uwf thread running — single running thread entry */
export type RunningThreadItem = {
thread: ThreadId;
workflow: CasRef;
pid: number;
startedAt: number;
};
/** uwf thread running output */
export type RunningThreadsOutput = {
threads: RunningThreadItem[];
};
// ── 4.6 配置 ────────────────────────────────────────────────────────
/** Alias types for config references */
@@ -1,64 +0,0 @@
import type { AgentContext } from "@uncaged/workflow-runtime";
/** Max characters of step content to include in the prompt. */
const CONTENT_QUOTA = 16_000;
/** Builds the full agent prompt: system instructions plus summarized thread history. */
export async function buildAgentPrompt(ctx: AgentContext): Promise<string> {
const lines: string[] = [];
lines.push(ctx.currentRole.systemPrompt);
lines.push("");
lines.push("## Task");
lines.push(ctx.start.content);
const { steps } = ctx;
if (steps.length === 0) {
return lines.join("\n");
}
if (steps.length === 1) {
const s = steps[0];
lines.push("");
lines.push(`## Step: ${s.role}`);
lines.push("");
lines.push(`Meta: ${JSON.stringify(s.meta)}`);
appendContent(lines, s.content);
} else {
lines.push("");
lines.push("## Previous Steps");
for (let i = 0; i < steps.length - 1; i++) {
const s = steps[i];
lines.push("");
lines.push(`### Step ${i + 1}: ${s.role}`);
lines.push(`Summary: ${JSON.stringify(s.meta)}`);
}
const last = steps[steps.length - 1];
lines.push("");
lines.push(`## Latest Step: ${last.role}`);
lines.push("");
lines.push(`Meta: ${JSON.stringify(last.meta)}`);
appendContent(lines, last.content);
}
lines.push("");
lines.push("## Tools");
lines.push(
`Use \`uncaged-workflow thread ${ctx.threadId}\` to read full details of any previous step.`,
);
return lines.join("\n");
}
function appendContent(lines: string[], content: string | null | undefined): void {
if (content === null || content === undefined || content.trim() === "") {
return;
}
const truncated =
content.length > CONTENT_QUOTA
? `${content.slice(0, CONTENT_QUOTA)}\n... (truncated)`
: content;
lines.push("");
lines.push("<output>");
lines.push(truncated);
lines.push("</output>");
}
-1
View File
@@ -23,7 +23,6 @@ All exports come from `src/index.ts`.
```typescript
function encodeUint64AsCrockford(value: bigint): string
function generateUlid(nowMs: number): string
function extractUlidTimestamp(ulid: string): number | null
```
### Logging
@@ -1,55 +0,0 @@
import { describe, expect, it } from "bun:test";
import { extractUlidTimestamp, generateUlid } from "../ulid.js";
describe("extractUlidTimestamp", () => {
it("should extract correct timestamp from ULID", () => {
const knownTimestamp = Date.UTC(2026, 4, 20, 0, 0, 0);
const ulid = generateUlid(knownTimestamp);
const extracted = extractUlidTimestamp(ulid);
expect(extracted).toBe(knownTimestamp);
});
it("should handle epoch timestamp (timestamp 0)", () => {
const ulid = generateUlid(0);
const extracted = extractUlidTimestamp(ulid);
expect(extracted).toBe(0);
});
it("should handle recent timestamps", () => {
const recentTimestamp = Date.now();
const ulid = generateUlid(recentTimestamp);
const extracted = extractUlidTimestamp(ulid);
expect(extracted).toBe(recentTimestamp);
});
it("should handle max 48-bit timestamp", () => {
const maxTimestamp = 2 ** 48 - 1;
const ulid = generateUlid(maxTimestamp);
const extracted = extractUlidTimestamp(ulid);
expect(extracted).toBe(maxTimestamp);
});
it("should return null for invalid ULID length", () => {
expect(extractUlidTimestamp("")).toBe(null);
expect(extractUlidTimestamp("TOOSHORT")).toBe(null);
expect(extractUlidTimestamp("TOOLONGAAAAAAAAAAAAAAAAAA")).toBe(null);
});
it("should return null for invalid Crockford Base32 characters", () => {
expect(extractUlidTimestamp("INVALID!@#$%^&CHARACTERS")).toBe(null);
});
it("should extract timestamps from multiple ULIDs correctly", () => {
const timestamps = [
Date.UTC(2020, 0, 1, 0, 0, 0),
Date.UTC(2023, 5, 15, 12, 30, 45),
Date.UTC(2026, 11, 31, 23, 59, 59),
];
for (const ts of timestamps) {
const ulid = generateUlid(ts);
const extracted = extractUlidTimestamp(ulid);
expect(extracted).toBe(ts);
}
});
});
+13 -19
View File
@@ -15,7 +15,7 @@ uwf setup --provider <name> --base-url <url> \\
## Workflow Commands
\`\`\`
uwf workflow add <file> # register a workflow from YAML file
uwf workflow put <file> # register a workflow from YAML file
uwf workflow show <id> # show workflow by name or CAS hash
uwf workflow list # list all registered workflows
\`\`\`
@@ -24,27 +24,20 @@ uwf workflow list # list all registered workflows
\`\`\`
uwf thread start <workflow> -p <prompt> # create a thread (no execution)
uwf thread exec <thread-id> # execute one moderatoragentextract cycle
uwf thread step <thread-id> # execute one moderatoragentextract cycle
[--agent <cmd>] # override agent command
[-c, --count <number>] # run multiple steps (default: 1)
[--background] # run in background
uwf thread show <thread-id> # show thread head pointer
uwf thread list # list threads
[--status <status>] # filter: idle, running, or completed
uwf thread list # list active threads
[--all] # include archived threads
uwf thread kill <thread-id> # terminate and archive a thread
uwf thread steps <thread-id> # list all steps in a thread
uwf thread read <thread-id> # render thread context as markdown
[--quota <chars>] # max output characters (default 32000)
[--before <step-hash>] # load steps before this hash (exclusive)
[--start] # include start step in output
uwf thread stop <thread-id> # stop background execution (keep thread active)
uwf thread cancel <thread-id> # cancel thread (stop + move to history)
\`\`\`
## Step Commands
\`\`\`
uwf step list <thread-id> # list all steps in a thread
uwf step show <step-hash> # show details of a specific step
uwf step fork <step-hash> # fork a thread from a specific step
uwf thread fork <step-hash> # fork a thread from a specific step
uwf thread step-details <step-hash> # dump full detail node of a step as YAML
\`\`\`
## CAS Commands
@@ -85,9 +78,10 @@ uwf -V, --version # print version
## Key Concepts
- **Workflow**: YAML definition with roles, conditions, and a routing graph; stored as a CAS node identified by its XXH64 hash.
- **Thread**: A running instance of a workflow; points to a chain of CAS step nodes.
- **Step**: One moderatoragentextract cycle; stored as a CAS node with output + detail refs.
- **Turn**: Agent-internal interaction (within a single step); stored per-turn in the detail node.
- **CAS**: Content-addressable store; every artifact (workflows, steps, details, turns) is hashed.
- **Thread**: A single workflow execution (ULID). State is an immutable CAS chain; active threads are indexed in \`threads.yaml\`.
- **Step**: One moderatoragentextract cycle. Run \`uwf thread step\` repeatedly until \`$END\`.
- **CAS**: Content-Addressed Storage all nodes are immutable and identified by hash.
- **Role**: Named actor with goal, capabilities, procedure, output, and frontmatter schema; the moderator routes between roles.
- **Edge Prompt**: Required instruction on each graph edge the moderator's dispatch message to the agent.
`;
}
+1 -1
View File
@@ -24,4 +24,4 @@ export { normalizeRefsField } from "./refs-field.js";
export { err, ok } from "./result.js";
export { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "./storage-root.js";
export type { LogFn, Result } from "./types.js";
export { extractUlidTimestamp, generateUlid } from "./ulid.js";
export { generateUlid } from "./ulid.js";
+1 -17
View File
@@ -1,4 +1,4 @@
import { decodeCrockfordBase32Bits, encodeCrockfordBase32Bits } from "./base32.js";
import { encodeCrockfordBase32Bits } from "./base32.js";
const ULID_TIME_BITS = 48;
const ULID_RANDOM_BITS = 80;
@@ -26,19 +26,3 @@ export function generateUlid(nowMs: number): string {
const payload = (time << BigInt(ULID_RANDOM_BITS)) | rand;
return encodeCrockfordBase32Bits(payload, ULID_TIME_BITS + ULID_RANDOM_BITS);
}
/**
* Extract the timestamp (in milliseconds) from a ULID string.
* Returns null if the ULID is invalid.
*/
export function extractUlidTimestamp(ulid: string): number | null {
if (ulid.length !== 26) {
return null;
}
const timestampPart = ulid.slice(0, 10);
const decoded = decodeCrockfordBase32Bits(timestampPart, ULID_TIME_BITS);
if (!decoded.ok) {
return null;
}
return Number(decoded.value);
}
-89
View File
@@ -1,89 +0,0 @@
#!/usr/bin/env bash
# batch-solve.sh — solve multiple Gitea issues via solve-issue workflow
#
# Usage:
# ./scripts/batch-solve.sh [--agent CMD] [--repo OWNER/REPO] [--count N] ISSUE_NUM...
#
# Examples:
# ./scripts/batch-solve.sh 448 449
# ./scripts/batch-solve.sh --agent "bun run $(pwd)/packages/workflow-agent-claude-code/src/cli.ts" 448 449
# ./scripts/batch-solve.sh --repo uncaged/workflow --count 15 448 449
set -euo pipefail
AGENT=""
REPO="uncaged/workflow"
COUNT=10
ISSUES=()
while [[ $# -gt 0 ]]; do
case "$1" in
--agent) AGENT="$2"; shift 2 ;;
--repo) REPO="$2"; shift 2 ;;
--count) COUNT="$2"; shift 2 ;;
*) ISSUES+=("$1"); shift ;;
esac
done
if [[ ${#ISSUES[@]} -eq 0 ]]; then
echo "Usage: $0 [--agent CMD] [--repo OWNER/REPO] [--count N] ISSUE_NUM..." >&2
exit 1
fi
AGENT_FLAG=""
if [[ -n "$AGENT" ]]; then
AGENT_FLAG="--agent $AGENT"
fi
TOTAL=${#ISSUES[@]}
PASSED=0
FAILED=0
RESULTS=()
echo "━━━ Batch solve: ${TOTAL} issues ━━━"
echo ""
for i in "${!ISSUES[@]}"; do
ISSUE="${ISSUES[$i]}"
NUM=$((i + 1))
echo "┌─── [$NUM/$TOTAL] Issue #${ISSUE} ───"
# Read issue title
TITLE=$(tea issues "$ISSUE" -r "$REPO" 2>/dev/null | head -1 | sed 's/^# #[0-9]* //' | sed 's/ (.*//' || echo "unknown")
echo "│ Title: $TITLE"
# Start thread
PROMPT="Fix issue #${ISSUE} in ${REPO}. Read the issue first with 'tea issues ${ISSUE} -r ${REPO}' for full spec."
THREAD_JSON=$(uwf thread start solve-issue -p "$PROMPT" 2>&1)
THREAD_ID=$(echo "$THREAD_JSON" | python3 -c "import json,sys; print(json.load(sys.stdin)['thread'])")
echo "│ Thread: $THREAD_ID"
# Run steps
echo "│ Running (max $COUNT steps)..."
# shellcheck disable=SC2086
if STEP_OUTPUT=$(uwf thread step "$THREAD_ID" $AGENT_FLAG -c "$COUNT" 2>&1); then
# Check if done
LAST_DONE=$(echo "$STEP_OUTPUT" | python3 -c "import json,sys; lines=sys.stdin.read().strip(); data=json.loads(lines); print(data[-1].get('done', False))")
if [[ "$LAST_DONE" == "True" ]]; then
echo "│ ✅ Done!"
PASSED=$((PASSED + 1))
RESULTS+=("✅ #${ISSUE}${TITLE}")
else
echo "│ ⚠️ Ran out of steps (not done)"
FAILED=$((FAILED + 1))
RESULTS+=("⚠️ #${ISSUE}${TITLE} (incomplete)")
fi
else
echo "│ ❌ Failed"
FAILED=$((FAILED + 1))
RESULTS+=("❌ #${ISSUE}${TITLE} (error)")
fi
echo "└───"
echo ""
done
echo "━━━ Results: ${PASSED}/${TOTAL} passed, ${FAILED} failed ━━━"
for R in "${RESULTS[@]}"; do
echo " $R"
done