Compare commits

...

15 Commits

Author SHA1 Message Date
xiaoju 669af841e1 refactor: address review feedback for CLI restructure
- Extract shared module (shared.ts) — walkChain, expandDeep, etc. deduplicated
- Hide step read command (half-baked, not ready for users)
- Remove cmdThreadKill dead code
- Revert unrelated protocol type change
- Revert unrelated package.json change
- Fix unused imports (biome)

Refs #463
2026-05-24 11:32:47 +00:00
xiaoju 650313b1c2 feat(step): expand detail CAS refs by default in step list
Previously step list showed raw CAS refs for detail fields.
Now detail is recursively expanded (like output already was),
since every turn is individually hashed and walkable.

Refs #463
2026-05-24 11:12:22 +00:00
xiaoju c40007eeaf fix(agent-claude-code): add missing workflow-util dependency
The claude-code agent imports createLogger from @uncaged/workflow-util
but was missing the dependency declaration, causing test failures.
2026-05-24 11:04:02 +00:00
xiaoju 1f13b1e79c fix(cli): resolve lint errors and unused imports (#463)
Fix all lint errors flagged by biome check to ensure clean codebase.

## Changes

### Removed Unused Imports
- `packages/cli-workflow/src/commands/thread.ts`:
  - Removed `StartEntry` (moved to step.ts)
  - Removed `StepEntry` (moved to step.ts)
  - Removed `ThreadForkOutput` (moved to step.ts)
  - Removed `ThreadStepsOutput` (moved to step.ts)

- `packages/cli-workflow/src/cli.ts`:
  - Removed unused `yamlStringify` import from yaml package

### Fixed Unused Parameter
- `packages/cli-workflow/src/commands/step.ts`:
  - Prefixed unused `before` parameter with underscore in `cmdStepRead`
  - Parameter is part of the function signature for future use (awaiting #462)

### Fixed Import Order
- `packages/cli-workflow/src/__tests__/thread.test.ts`:
  - Reordered imports to follow biome's organization rules
  - Moved cmdStepShow import before cmdThreadRead imports

## Test Results
-  `bun run check` passes (typecheck + lint + log tags)
-  All 124 tests passing
-  Build completes successfully

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 10:50:49 +00:00
xiaoju 031c3aa632 docs(cli): add deprecation handlers and update documentation (#463)
Complete the CLI refactoring with deprecation error handlers, updated
help text, and comprehensive migration guide.

## Changes

### Deprecation Handlers
Add error handlers for all removed commands with helpful migration messages:
- `workflow put` → suggests `workflow add`
- `thread step` → suggests `thread exec`
- `thread steps` → suggests `step list`
- `thread step-details` → suggests `step show`
- `thread fork` → suggests `step fork`
- `thread kill` → suggests `thread stop` or `thread cancel`
- `thread running` → suggests `thread list --status running`

Error messages follow the format:
```
Error: Command 'X' has been removed.
Use 'Y' instead.

For more information, see: uwf help Y
```

### Help Documentation
Updated CLI help text to explain four-layer architecture:
- Main help shows architecture diagram with Chinese labels
- Command group descriptions reference layers:
  - `workflow` → "Workflow definitions (layer 1: templates)"
  - `thread` → "Thread execution (layer 2: instances)"
  - `step` → "Step results (layer 3: single cycle)"
- Deprecated commands appear in help with [DEPRECATED] tag

### README Updates
Comprehensive documentation updates:
- Added "Four-Layer Architecture" section with diagram
- Updated all command tables with new command names
- Added complete migration guide with:
  - Renamed commands table
  - Merged commands table
  - Split commands table
  - Moved commands table
  - Example deprecation error output
- Updated "Internal Structure" to show new step.ts module

## Testing
-  All 124 tests pass
-  Build completes successfully
-  Deprecation handlers tested manually
-  Help output verified for main, thread, and step commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 10:46:31 +00:00
xiaoju 7b50969307 refactor(cli): reorganize CLI commands into four-layer model (#463)
Implement comprehensive CLI refactoring to clarify the four-layer model:
workflow → thread → step → turn

## Breaking Changes

### Renamed Commands
- `uwf workflow put` → `uwf workflow add`
- `uwf thread step` → `uwf thread exec`

### Removed Commands
- `uwf thread running` (merged into `thread list --status running`)
- `uwf thread kill` (split into `thread stop` and `thread cancel`)

### Moved Commands
- `uwf thread steps` → `uwf step list`
- `uwf thread step-details` → `uwf step show`
- `uwf thread fork` → `uwf step fork`

## New Commands

### Thread Commands
- `uwf thread list --status <idle|running|completed>` - Filter threads by status
- `uwf thread stop <thread-id>` - Stop background execution (keep thread active)
- `uwf thread cancel <thread-id>` - Cancel thread (stop + archive to history)

### Step Command Group (New)
- `uwf step list <thread-id>` - List all steps in a thread
- `uwf step show <step-hash>` - Show step details
- `uwf step read <step-hash> [--before N]` - Read step output as markdown
- `uwf step fork <step-hash>` - Fork thread from a step

## Implementation Details

### Files Modified
- `packages/cli-workflow/src/commands/workflow.ts` - Renamed cmdWorkflowPut → cmdWorkflowAdd
- `packages/cli-workflow/src/commands/thread.ts`:
  - Renamed cmdThreadStep → cmdThreadExec
  - Added cmdThreadStop and cmdThreadCancel (split from cmdThreadKill)
  - Updated cmdThreadList to support --status filter with idle/running/completed
  - Removed cmdThreadSteps, cmdThreadStepDetails, cmdThreadFork
- `packages/cli-workflow/src/commands/step.ts` - New module with:
  - cmdStepList (moved from cmdThreadSteps)
  - cmdStepShow (moved from cmdThreadStepDetails)
  - cmdStepFork (moved from cmdThreadFork)
  - cmdStepRead (new, stub implementation pending #462)
- `packages/cli-workflow/src/cli.ts` - Updated all CLI command registrations

### Tests Updated
- `packages/cli-workflow/src/__tests__/thread-step-count.test.ts` - Updated references from "thread step" to "thread exec"
- `packages/cli-workflow/src/__tests__/thread.test.ts` - Updated imports to use cmdStepShow from step.ts

## Test Results
All 124 tests pass in cli-workflow package.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 10:40:32 +00:00
xiaoju fc6072c28c Merge pull request 'feat: use git worktree for isolated development in solve-issue workflow' (#465) from fix/464-worktree-isolation into main 2026-05-24 10:27:51 +00:00
xiaoju b0e3f4a363 feat: use git worktree for isolated development in solve-issue workflow
All roles (developer, reviewer, tester, committer) now work in
~/repos/workflow-worktrees/fix/<issue>-<slug> instead of modifying
the main working directory. Prevents self-destructive edits.

Fixes #464
2026-05-24 10:22:25 +00:00
xiaoju 38112053a0 Merge pull request 'fix(agent-kit): separate session cache per agent' (#462) from fix/461-per-agent-session-cache into main 2026-05-24 09:19:50 +00:00
xiaoju 1d174ee5c9 fix(agent-kit): separate session cache per agent
Each agent now maintains its own session cache file instead of sharing
a single agent-sessions.json. This prevents session ID conflicts when
multiple agents operate on the same thread+role pair.

Changes:
- getCachePath() now takes agentName parameter
- getCachedSessionId/setCachedSessionId require agentName as first param
- Cache files named <agent>-sessions.json (e.g., hermes-sessions.json)
- Agent wrappers inject their agent name into cache calls
- Add comprehensive tests for session cache isolation
- Handle malformed JSON gracefully (treat as empty cache)

Fixes #461
2026-05-24 09:16:06 +00:00
xiaoju 6e3b32ca34 Merge pull request 'fix(cli): replace markdown headings with XML tags in thread read output' (#460) from fix/459-xml-tag-isolation into main 2026-05-24 08:44:47 +00:00
xiaoju 932bbe5c41 fix(cli): replace markdown headings with XML tags in thread read output
Changed uwf thread read to wrap role prompts and agent outputs in XML tags
(<prompt> and <output>) instead of markdown headings (### Prompt, ### Content).
This prevents Claude Code from treating step outputs as structural headings.

- Updated formatStepPrompt to use <prompt>...</prompt> tags
- Updated formatStepContent to use <output>...</output> tags
- Added comprehensive test suite in thread-read-xml-tags.test.ts
- Updated existing tests to verify XML tag behavior

Fixes #459

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 08:04:34 +00:00
xiaomo 9440b9af82 Merge pull request 'chore: fix biome noExcessiveCognitiveComplexity warnings' (#458) from fix/444-biome-complexity-warnings into main 2026-05-24 07:30:41 +00:00
xiaoju f96d6eb7c4 refactor(agent-builtin): reduce cognitive complexity in loop.ts
Refactored runBuiltinLoop function to reduce cognitive complexity from 30 to below 15 by extracting helper functions:

- shouldInjectDeadlineWarning: checks if deadline warning should be shown
- shouldProcessToolCalls: determines if tool calls should be processed
- extractFinalText: extracts last assistant message content
- injectDeadlineWarning: injects deadline warning message
- handleTextOnlyTurn: handles text-only turn logic
- handleToolCallTurn: handles tool call turn logic
- processLoopIteration: processes a single loop iteration

Added 24 new unit tests for the extracted helper functions, bringing total test count to 41 (all passing). All existing behavior is preserved.

Fixes #444

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-24 05:53:55 +00:00
xiaomo 95102941f1 Merge pull request 'feat(cli): thread step --background + thread running' (#457) from fix/456-thread-step-background into main 2026-05-24 05:33:56 +00:00
18 changed files with 2045 additions and 512 deletions
+27 -11
View File
@@ -38,19 +38,26 @@ roles:
capabilities: capabilities:
- coding - coding
procedure: | procedure: |
Before starting any work, ensure a clean worktree: IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
1. `git checkout main && git pull` to get the latest code
2. `git checkout -b fix/<issue-number>-<short-description>` to create a fresh branch Before starting any work, set up an isolated worktree:
- If bounced back from reviewer or tester, reuse the existing branch and rebase onto latest main: 1. `cd ~/repos/workflow && git fetch origin` to get latest refs
`git checkout main && git pull && git checkout <branch> && git rebase main` 2. First time (no existing branch):
- `git worktree add ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
- `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug> && bun install`
3. If bounced back from reviewer or tester (branch already exists):
- The worktree should already exist at `~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
- `cd ~/repos/workflow-worktrees/fix/<issue-number>-<short-slug>`
- `git fetch origin && git rebase origin/main`
4. ALL subsequent work must happen inside the worktree directory.
Then implement TDD: Then implement TDD:
3. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan) 5. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
4. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing 6. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
5. Write tests first based on the spec 7. Write tests first based on the spec
6. Implement the code to make tests pass 8. Implement the code to make tests pass
7. Ensure `bun run build` passes with no errors 9. Ensure `bun run build` passes with no errors
8. Run `bun test` to verify all tests pass 10. Run `bun test` to verify all tests pass
output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)." output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
frontmatter: frontmatter:
type: object type: object
@@ -66,6 +73,8 @@ roles:
- code-review - code-review
- static-analysis - static-analysis
procedure: | procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
Before reviewing, verify the git branch: Before reviewing, verify the git branch:
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on 1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
2. If the branch doesn't correspond to the issue, flag it in your output and reject 2. If the branch doesn't correspond to the issue, flag it in your output and reject
@@ -99,6 +108,8 @@ roles:
capabilities: capabilities:
- testing - testing
procedure: | procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
1. Run `bun test` for automated test verification 1. Run `bun test` for automated test verification
2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan) 2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
3. Verify each scenario in the spec is covered and passing 3. Verify each scenario in the spec is covered and passing
@@ -119,6 +130,8 @@ roles:
goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue." goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
capabilities: [] capabilities: []
procedure: | procedure: |
First, cd into the worktree: `cd ~/repos/workflow-worktrees/fix/<issue-number>-*` (find the exact directory)
Note: You inherit the developer's worktree and branch. Do NOT create a new branch. Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Stage all changes: `git add -A` 1. Stage all changes: `git add -A`
2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"` 2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
@@ -126,6 +139,9 @@ roles:
- If push hook fails: capture the error log in your output, mark hook_failed - If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --title "..." --description "..."` 4. On push success: create a PR via `tea pr create --title "..." --description "..."`
- PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref - PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
5. After PR creation, clean up the worktree:
- `cd ~/repos/workflow`
- `git worktree remove ~/repos/workflow-worktrees/fix/<issue-number>-<slug>`
output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)." output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
frontmatter: frontmatter:
type: object type: object
+91 -13
View File
@@ -6,6 +6,18 @@
Layer 4 entry point for the workflow engine. The `uwf` binary orchestrates one step per invocation: load thread head from `threads.yaml`, run the moderator, spawn the configured agent CLI, run extract, append a CAS step node, and update the head pointer (or archive when `$END`). Layer 4 entry point for the workflow engine. The `uwf` binary orchestrates one step per invocation: load thread head from `threads.yaml`, run the moderator, spawn the configured agent CLI, run extract, append a CAS step node, and update the head pointer (or archive when `$END`).
### Four-Layer Architecture
```
workflow → thread → step → turn
模板定义 执行实例 单步结果 agent内部交互
```
- **Workflow** (layer 1): YAML template with roles and routing graph
- **Thread** (layer 2): Single workflow execution instance
- **Step** (layer 3): One moderator→agent→extract cycle
- **Turn** (layer 4): Agent-internal interactions (use `step read` or CAS to inspect)
This package has no library `src/index.ts` — it is consumed as a CLI binary only. This package has no library `src/index.ts` — it is consumed as a CLI binary only.
**Dependencies:** `@uncaged/json-cas`, `@uncaged/json-cas-fs`, `@uncaged/workflow-agent-kit`, `@uncaged/workflow-moderator`, `@uncaged/workflow-protocol`, `@uncaged/workflow-util`, `commander`, `dotenv`, `yaml` **Dependencies:** `@uncaged/json-cas`, `@uncaged/json-cas-fs`, `@uncaged/workflow-agent-kit`, `@uncaged/workflow-moderator`, `@uncaged/workflow-protocol`, `@uncaged/workflow-util`, `commander`, `dotenv`, `yaml`
@@ -30,34 +42,53 @@ bun link packages/cli-workflow
-h, --help Show help -h, --help Show help
``` ```
### Thread ### Thread (Layer 2: Execution Instances)
| Command | Description | | Command | Description |
|---------|-------------| |---------|-------------|
| `uwf thread start <workflow> -p <prompt>` | Create a thread without executing | | `uwf thread start <workflow> -p <prompt>` | Create a thread without executing |
| `uwf thread step <thread-id> [--agent <cmd>] [-c <count>]` | Execute one or more moderator→agent→extract cycles | | `uwf thread exec <thread-id> [--agent <cmd>] [-c <count>] [--background]` | Execute one or more moderator→agent→extract cycles |
| `uwf thread show <thread-id>` | Show thread head pointer | | `uwf thread show <thread-id>` | Show thread head pointer |
| `uwf thread list [--all]` | List active threads (`--all` includes archived) | | `uwf thread list [--status <idle\|running\|completed>]` | List threads, optionally filtered by status |
| `uwf thread steps <thread-id>` | List all steps chronologically |
| `uwf thread read <thread-id> [--quota N] [--before <hash>] [--start]` | Render thread as readable markdown | | `uwf thread read <thread-id> [--quota N] [--before <hash>] [--start]` | Render thread as readable markdown |
| `uwf thread fork <step-hash>` | Fork from a specific step | | `uwf thread stop <thread-id>` | Stop background execution (keep thread active) |
| `uwf thread step-details <step-hash>` | Dump full detail node as YAML | | `uwf thread cancel <thread-id>` | Cancel thread (stop + archive to history) |
| `uwf thread kill <thread-id>` | Terminate and archive |
Examples: Examples:
```bash ```bash
uwf thread start solve-issue -p "Fix the login redirect bug" uwf thread start solve-issue -p "Fix the login redirect bug"
uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV
uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV -c 3 --agent uwf-builtin
uwf thread exec 01ARZ3NDEKTSV4RRFFQ69G5FAV --background
uwf thread list --status running
uwf thread read 01ARZ3NDEKTSV4RRFFQ69G5FAV --quota 8000 uwf thread read 01ARZ3NDEKTSV4RRFFQ69G5FAV --quota 8000
uwf thread stop 01ARZ3NDEKTSV4RRFFQ69G5FAV
``` ```
### Workflow ### Step (Layer 3: Single Cycle Results)
| Command | Description | | Command | Description |
|---------|-------------| |---------|-------------|
| `uwf workflow put <file.yaml>` | Register a workflow from YAML | | `uwf step list <thread-id>` | List all steps in a thread chronologically |
| `uwf step show <step-hash>` | Show step metadata and frontmatter |
| `uwf step read <step-hash> [--before N]` | Read step output as markdown |
| `uwf step fork <step-hash>` | Fork a thread from a specific step |
Examples:
```bash
uwf step list 01ARZ3NDEKTSV4RRFFQ69G5FAV
uwf step show 32GCDE899RRQ3
uwf step read 32GCDE899RRQ3 --before 3
uwf step fork 32GCDE899RRQ3
```
### Workflow (Layer 1: Templates)
| Command | Description |
|---------|-------------|
| `uwf workflow add <file.yaml>` | Register a workflow from YAML |
| `uwf workflow show <name-or-hash>` | Show workflow definition | | `uwf workflow show <name-or-hash>` | Show workflow definition |
| `uwf workflow list` | List registered workflows | | `uwf workflow list` | List registered workflows |
@@ -99,6 +130,52 @@ Config: `~/.uncaged/workflow/config.yaml`. API keys: `~/.uncaged/workflow/.env`.
| `uwf log show [--thread <id>] [--process <pid>] [--date YYYY-MM-DD]` | Show filtered log entries | | `uwf log show [--thread <id>] [--process <pid>] [--date YYYY-MM-DD]` | Show filtered log entries |
| `uwf log clean [--before YYYY-MM-DD]` | Delete old log files | | `uwf log clean [--before YYYY-MM-DD]` | Delete old log files |
## Migration Guide
### Breaking Changes (v0.x → v1.x)
The CLI was reorganized to clarify the four-layer architecture. **No backward compatibility** — old commands have been removed.
#### Renamed Commands
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `workflow put` | `workflow add` | More intuitive verb |
| `thread step` | `thread exec` | Eliminates ambiguity with "step" noun |
| `thread list --all` | `thread list --status completed` | Unified status filtering |
#### Removed Commands (Merged)
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `thread running` | `thread list --status running` | Merged into unified list |
#### Removed Commands (Split)
| Old Command | New Commands | Notes |
|------------|-------------|-------|
| `thread kill` | `thread stop` or `thread cancel` | `stop` keeps thread active, `cancel` archives it |
#### Moved Commands
| Old Command | New Command | Notes |
|------------|-------------|-------|
| `thread steps` | `step list` | Moved to step layer |
| `thread step-details` | `step show` | Moved to step layer |
| `thread fork` | `step fork` | Moved to step layer (forks are step-based) |
#### Deprecation Errors
Old commands now show helpful error messages:
```bash
$ uwf thread step 01ARZ3NDEKTSV4RRFFQ69G5FAV
Error: Command 'thread step' has been removed.
Use 'thread exec' instead.
For more information, see: uwf help thread exec
```
## Internal Structure ## Internal Structure
``` ```
@@ -109,8 +186,9 @@ src/
├── validate.ts Workflow YAML validation ├── validate.ts Workflow YAML validation
├── schemas.ts CLI-local schema registration ├── schemas.ts CLI-local schema registration
└── commands/ └── commands/
├── thread.ts Thread lifecycle and step execution ├── thread.ts Thread lifecycle and exec
├── workflow.ts Workflow registry (put/show/list) ├── step.ts Step operations (list/show/read/fork)
├── workflow.ts Workflow registry (add/show/list)
├── cas.ts CAS inspection and schema ops ├── cas.ts CAS inspection and schema ops
├── setup.ts Interactive/non-interactive setup ├── setup.ts Interactive/non-interactive setup
├── skill.ts Built-in skill references ├── skill.ts Built-in skill references
@@ -0,0 +1,683 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdThreadRead, THREAD_READ_DEFAULT_QUOTA } from "../commands/thread.js";
import { registerUwfSchemas } from "../schemas.js";
import type { UwfStore } from "../store.js";
import { saveThreadsIndex } from "../store.js";
// ── schemas used in tests ────────────────────────────────────────────────────
const TURN_SCHEMA = {
title: "hermes-turn",
type: "object" as const,
required: ["index", "role", "content"],
properties: {
index: { type: "integer" as const },
role: { type: "string" as const },
content: { type: "string" as const },
toolCalls: {
anyOf: [
{ type: "array" as const, items: { type: "object" as const } },
{ type: "null" as const },
],
},
reasoning: { anyOf: [{ type: "string" as const }, { type: "null" as const }] },
},
additionalProperties: false,
};
const DETAIL_SCHEMA = {
title: "hermes-detail",
type: "object" as const,
required: ["sessionId", "model", "duration", "turnCount", "turns"],
properties: {
sessionId: { type: "string" as const },
model: { type: "string" as const },
duration: { type: "integer" as const },
turnCount: { type: "integer" as const },
turns: {
type: "array" as const,
items: { type: "string" as const, format: "cas_ref" },
},
},
additionalProperties: false,
};
// ── helpers ───────────────────────────────────────────────────────────────────
async function makeUwfStore(storageRoot: string): Promise<UwfStore> {
const casDir = join(storageRoot, "cas");
await mkdir(casDir, { recursive: true });
const store = createFsStore(casDir);
const schemas = await registerUwfSchemas(store);
return { storageRoot, store, schemas };
}
async function registerDetailSchemas(store: ReturnType<typeof createFsStore>) {
await bootstrap(store);
const [turn, detail] = await Promise.all([
putSchema(store, TURN_SCHEMA),
putSchema(store, DETAIL_SCHEMA),
]);
return { turn, detail };
}
// ── fixture ───────────────────────────────────────────────────────────────────
let tmpDir: string;
beforeEach(async () => {
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-test-"));
});
afterEach(async () => {
await rm(tmpDir, { recursive: true, force: true });
});
// ── thread read XML tag isolation ─────────────────────────────────────────────
describe("thread read XML tag isolation", () => {
test("scenario 1: wraps output in XML tags instead of heading", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
planner: {
description: "Planner",
goal: "You are a planning agent. Your task is to...",
capabilities: [],
procedure: "Plan the work.",
output: "Summarize the plan.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Fix issue #459",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content:
"---\nstatus: ready\nplan: CMWGHQKT58RY4\n---\n\n# Analysis Complete\n## Issue Summary\nThe issue requires XML tag isolation.",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "planner",
output: outputHash,
detail: detailHash,
agent: "uwf-claude-code",
});
const threadId = "01JTEST0000000000000001" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should wrap output in XML tags
expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
// Should not have ### Content heading
expect(markdown).not.toContain("### Content");
// Should preserve markdown headings inside output tags
expect(markdown).toContain("# Analysis Complete");
expect(markdown).toContain("## Issue Summary");
});
test("scenario 2: wraps prompt in XML tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
planner: {
description: "Planner",
goal: "You are a planning agent. Your task is to analyze and plan.",
capabilities: [],
procedure: "Plan the work.",
output: "Summarize the plan.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Fix issue",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "---\nstatus: ready\n---\n\nContent here...",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "planner",
output: outputHash,
detail: detailHash,
agent: "uwf-claude-code",
});
const threadId = "01JTEST0000000000000002" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should wrap prompt in XML tags
expect(markdown).toContain("<prompt>");
expect(markdown).toContain("</prompt>");
expect(markdown).toContain("You are a planning agent. Your task is to analyze and plan.");
// Should not have ### Prompt heading
expect(markdown).not.toContain("### Prompt");
// Should wrap output in XML tags
expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
});
test("scenario 3: same role repeated does not show prompt twice", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
writer: {
description: "Writer",
goal: "You are a writer agent.",
capabilities: [],
procedure: "Write content.",
output: "Summarize writing.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Write something",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "writer",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1 as CasRef,
role: "writer",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000003" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step2 });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should only show prompt tags once
const promptCount = (markdown.match(/<prompt>/g) ?? []).length;
expect(promptCount).toBe(1);
});
test("scenario 4: step with no detail shows no output tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
worker: {
description: "Worker",
goal: "You are a worker agent.",
capabilities: [],
procedure: "Do work.",
output: "Summarize work.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Do stuff",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000004" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should not have output tags
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
// Step header should still be displayed
expect(markdown).toContain("## Step 1: worker");
// Prompt should still be shown
expect(markdown).toContain("<prompt>");
});
test("scenario 5: empty content shows no output tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Do stuff",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
// A detail ref that doesn't exist → extractLastAssistantContent returns null
const missingDetailRef = "missingdetail0" as CasRef;
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "worker",
output: outputHash,
detail: missingDetailRef,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000005" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Should not have output tags
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
});
test("scenario 6: thread read with --start flag shows task section", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000006" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, true);
// Should include task section
expect(markdown).toContain("# Thread");
expect(markdown).toContain("## Task");
expect(markdown).toContain("Initial prompt");
// Prompts should use XML tags
expect(markdown).toContain("<prompt>");
});
test("scenario 7: thread read with --before parameter", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
roleB: {
description: "Role B",
goal: "Goal for roleB",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
roleC: {
description: "Role C",
goal: "Goal for roleC",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const step1 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step2 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step1 as CasRef,
role: "roleB",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const step3 = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: step2 as CasRef,
role: "roleC",
output: outputHash,
detail: null,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000007" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: step3 });
const markdown = await cmdThreadRead(
tmpDir,
threadId,
THREAD_READ_DEFAULT_QUOTA,
step2 as CasRef,
false,
);
// Should only show roleA
expect(markdown).toContain("roleA");
expect(markdown).not.toContain("roleB");
expect(markdown).not.toContain("roleC");
// Should use XML tags
expect(markdown).toContain("<prompt>");
});
test("scenario 9: special characters in content are preserved", async () => {
const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
writer: {
description: "Writer",
goal: "You are a writer.",
capabilities: [],
procedure: "Write content.",
output: "Summarize.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Write something",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const turnHash = await uwf.store.put(detailSchemas.turn, {
index: 0,
role: "assistant",
content: "Content with <special> & characters > like <this>",
toolCalls: null,
reasoning: null,
});
const detailHash = await uwf.store.put(detailSchemas.detail, {
sessionId: "sx",
model: "mx",
duration: 500,
turnCount: 1,
turns: [turnHash],
});
const stepHash = await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev: null,
role: "writer",
output: outputHash,
detail: detailHash,
agent: "uwf-test",
});
const threadId = "01JTEST0000000000000008" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: stepHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
// Special characters should be preserved as-is
expect(markdown).toContain("Content with <special> & characters > like <this>");
});
test("scenario 10: quota limit with XML tags", async () => {
const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
name: "test-wf",
description: "desc",
roles: {
roleA: {
description: "Role A",
goal: "Goal for roleA",
capabilities: [],
procedure: "Do stuff.",
output: "Output.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
graph: {},
});
const startHash = await uwf.store.put(uwf.schemas.startNode, {
workflow: workflowHash,
prompt: "Initial prompt",
});
const outputHash = await uwf.store.put(uwf.schemas.workflow, {
name: "out",
description: "",
roles: {},
conditions: {},
graph: {},
});
const steps: CasRef[] = [];
let prev: CasRef | null = null;
for (let i = 0; i < 5; i++) {
const step = (await uwf.store.put(uwf.schemas.stepNode, {
start: startHash,
prev,
role: "roleA",
output: outputHash,
detail: null,
agent: "uwf-test",
})) as CasRef;
steps.push(step);
prev = step;
}
const threadId = "01JTEST0000000000000009" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: steps[steps.length - 1]! });
// Use very small quota
const markdown = await cmdThreadRead(tmpDir, threadId, 1, null, false);
// Should have skip hint
expect(markdown).toContain("earlier step");
// Should have XML tags for displayed steps
if (markdown.includes("<prompt>")) {
expect(markdown).toContain("</prompt>");
}
});
});
@@ -22,48 +22,48 @@ function runCli(args: string[]): { stdout: string; stderr: string; exitCode: num
} }
} }
describe("thread step --count CLI parsing", () => { describe("thread exec --count CLI parsing", () => {
test("--help shows -c/--count option", () => { test("--help shows -c/--count option", () => {
const result = runCli(["thread", "step", "--help"]); const result = runCli(["thread", "exec", "--help"]);
expect(result.stdout).toContain("--count"); expect(result.stdout).toContain("--count");
expect(result.stdout).toContain("-c"); expect(result.stdout).toContain("-c");
}); });
test("description says 'one or more steps'", () => { test("description says 'one or more steps'", () => {
const result = runCli(["thread", "step", "--help"]); const result = runCli(["thread", "exec", "--help"]);
expect(result.stdout).toContain("one or more steps"); expect(result.stdout).toContain("one or more steps");
}); });
}); });
describe("cmdThreadStep count logic", () => { describe("cmdThreadExec count logic", () => {
test("count=0 fails with validation error", () => { test("count=0 fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]); const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "0"]);
expect(result.exitCode).not.toBe(0); expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer"); expect(result.stderr).toContain("positive integer");
}); });
test("negative count fails with validation error", () => { test("negative count fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]); const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "-1"]);
expect(result.exitCode).not.toBe(0); expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer"); expect(result.stderr).toContain("positive integer");
}); });
test("non-integer count fails with validation error", () => { test("non-integer count fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]); const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "1.5"]);
expect(result.exitCode).not.toBe(0); expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer"); expect(result.stderr).toContain("positive integer");
}); });
test("count=1 is the default (no -c flag)", () => { test("count=1 is the default (no -c flag)", () => {
// Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation) // Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
const result = runCli(["thread", "step", "FAKE_THREAD_ID"]); const result = runCli(["thread", "exec", "FAKE_THREAD_ID"]);
expect(result.exitCode).not.toBe(0); expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread lookup instead // Should NOT contain "positive integer" error — should fail on thread lookup instead
expect(result.stderr).not.toContain("positive integer"); expect(result.stderr).not.toContain("positive integer");
}); });
test("count=3 passes validation (fails on thread lookup)", () => { test("count=3 passes validation (fails on thread lookup)", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]); const result = runCli(["thread", "exec", "FAKE_THREAD_ID", "-c", "3"]);
expect(result.exitCode).not.toBe(0); expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread/storage lookup // Should NOT contain "positive integer" error — should fail on thread/storage lookup
expect(result.stderr).not.toContain("positive integer"); expect(result.stderr).not.toContain("positive integer");
@@ -5,9 +5,9 @@ import { bootstrap, putSchema } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs"; import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol"; import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest"; import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdStepShow } from "../commands/step.js";
import { import {
cmdThreadRead, cmdThreadRead,
cmdThreadStepDetails,
extractLastAssistantContent, extractLastAssistantContent,
THREAD_READ_DEFAULT_QUOTA, THREAD_READ_DEFAULT_QUOTA,
} from "../commands/thread.js"; } from "../commands/thread.js";
@@ -198,10 +198,10 @@ describe("extractLastAssistantContent", () => {
}); });
}); });
// ── cmdThreadRead: ### Content section ─────────────────────────────────────── // ── cmdThreadRead: <output> section ──────────────────────────────────────────
describe("cmdThreadRead ### Content section", () => { describe("cmdThreadRead <output> section", () => {
test("includes ### Content before ### Output when detail has assistant turns", async () => { test("includes <output> tags when detail has assistant turns", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store); const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -264,12 +264,13 @@ describe("cmdThreadRead ### Content section", () => {
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false); const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
expect(markdown).toContain("### Content"); expect(markdown).toContain("<output>");
expect(markdown).toContain("</output>");
expect(markdown).toContain("The assistant response text"); expect(markdown).toContain("The assistant response text");
expect(markdown).not.toContain("### Output"); expect(markdown).not.toContain("### Content");
}); });
test("omits ### Content when detail has no matching assistant turns", async () => { test("omits <output> tags when detail has no matching assistant turns", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const workflowHash = await uwf.store.put(uwf.schemas.workflow, { const workflowHash = await uwf.store.put(uwf.schemas.workflow, {
@@ -308,14 +309,15 @@ describe("cmdThreadRead ### Content section", () => {
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false); const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
expect(markdown).not.toContain("<output>");
expect(markdown).not.toContain("</output>");
expect(markdown).not.toContain("### Content"); expect(markdown).not.toContain("### Content");
expect(markdown).not.toContain("### Output");
}); });
}); });
// ── cmdThreadStepDetails ────────────────────────────────────────────────────── // ── cmdStepShow ───────────────────────────────────────────────────────────────
describe("cmdThreadStepDetails", () => { describe("cmdStepShow", () => {
test("returns expanded detail node with turns inlined", async () => { test("returns expanded detail node with turns inlined", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const detailSchemas = await registerDetailSchemas(uwf.store); const detailSchemas = await registerDetailSchemas(uwf.store);
@@ -363,7 +365,7 @@ describe("cmdThreadStepDetails", () => {
agent: "uwf-hermes", agent: "uwf-hermes",
}); });
const result = await cmdThreadStepDetails(tmpDir, stepHash); const result = await cmdStepShow(tmpDir, stepHash);
expect(result).toMatchObject({ expect(result).toMatchObject({
sessionId: "sess42", sessionId: "sess42",
@@ -384,9 +386,9 @@ describe("cmdThreadStepDetails", () => {
}); });
}); });
// ── cmdThreadRead: ### Prompt deduplication ─────────────────────────────────── // ── cmdThreadRead: <prompt> deduplication ───────────────────────────────────
describe("cmdThreadRead ### Prompt deduplication", () => { describe("cmdThreadRead <prompt> deduplication", () => {
async function makeThreadWithRoles(uwf: UwfStore, roles: string[]): Promise<string> { async function makeThreadWithRoles(uwf: UwfStore, roles: string[]): Promise<string> {
const roleMap: Record<string, unknown> = {}; const roleMap: Record<string, unknown> = {};
for (const r of [...new Set(roles)]) { for (const r of [...new Set(roles)]) {
@@ -434,36 +436,36 @@ describe("cmdThreadRead ### Prompt deduplication", () => {
return stepHash; return stepHash;
} }
test("same consecutive role shows ### Prompt once", async () => { test("same consecutive role shows <prompt> once", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["writer", "writer"]); const headHash = await makeThreadWithRoles(uwf, ["writer", "writer"]);
const threadId = "01JTEST0000000000000003" as ThreadId; const threadId = "01JTEST0000000000000003" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash }); await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false); const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/### Prompt/g) ?? []).length; const count = (markdown.match(/<prompt>/g) ?? []).length;
expect(count).toBe(1); expect(count).toBe(1);
}); });
test("different consecutive roles each show ### Prompt", async () => { test("different consecutive roles each show <prompt>", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["planner", "coder"]); const headHash = await makeThreadWithRoles(uwf, ["planner", "coder"]);
const threadId = "01JTEST0000000000000004" as ThreadId; const threadId = "01JTEST0000000000000004" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash }); await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false); const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/### Prompt/g) ?? []).length; const count = (markdown.match(/<prompt>/g) ?? []).length;
expect(count).toBe(2); expect(count).toBe(2);
}); });
test("non-consecutive same role shows ### Prompt twice", async () => { test("non-consecutive same role shows <prompt> twice", async () => {
const uwf = await makeUwfStore(tmpDir); const uwf = await makeUwfStore(tmpDir);
const headHash = await makeThreadWithRoles(uwf, ["roleA", "roleB", "roleA"]); const headHash = await makeThreadWithRoles(uwf, ["roleA", "roleB", "roleA"]);
const threadId = "01JTEST0000000000000005" as ThreadId; const threadId = "01JTEST0000000000000005" as ThreadId;
await saveThreadsIndex(tmpDir, { [threadId]: headHash }); await saveThreadsIndex(tmpDir, { [threadId]: headHash });
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false); const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
const count = (markdown.match(/### Prompt/g) ?? []).length; const count = (markdown.match(/<prompt>/g) ?? []).length;
expect(count).toBe(2); expect(count).toBe(2);
}); });
}); });
@@ -584,9 +586,9 @@ describe("cmdThreadRead start section / before / quota", () => {
// ── Tests that call process.exit must be last ───────────────────────────────── // ── Tests that call process.exit must be last ─────────────────────────────────
describe("cmdThreadStepDetails (process.exit tests - must be last)", () => { describe("cmdStepShow (process.exit tests - must be last)", () => {
test("throws when step hash does not exist", async () => { test("throws when step hash does not exist", async () => {
await expect(cmdThreadStepDetails(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow(); await expect(cmdStepShow(tmpDir, "nonexistenth0" as CasRef)).rejects.toThrow();
}); });
test("before with unknown hash rejects", async () => { test("before with unknown hash rejects", async () => {
+167 -47
View File
@@ -1,8 +1,7 @@
#!/usr/bin/env bun #!/usr/bin/env bun
import type { ThreadId } from "@uncaged/workflow-protocol"; import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { Command } from "commander"; import { Command } from "commander";
import { stringify as yamlStringify } from "yaml";
import { import {
cmdCasGet, cmdCasGet,
cmdCasHas, cmdCasHas,
@@ -17,20 +16,19 @@ import {
import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js"; import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js"; import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
import { cmdSkillCli } from "./commands/skill.js"; import { cmdSkillCli } from "./commands/skill.js";
import { cmdStepFork, cmdStepList, cmdStepShow } from "./commands/step.js";
import { import {
cmdThreadFork, cmdThreadCancel,
cmdThreadKill, cmdThreadExec,
cmdThreadList, cmdThreadList,
cmdThreadRead, cmdThreadRead,
cmdThreadRunning,
cmdThreadShow, cmdThreadShow,
cmdThreadStart, cmdThreadStart,
cmdThreadStep, cmdThreadStop,
cmdThreadStepDetails,
cmdThreadSteps,
THREAD_READ_DEFAULT_QUOTA, THREAD_READ_DEFAULT_QUOTA,
type ThreadStatus,
} from "./commands/thread.js"; } from "./commands/thread.js";
import { cmdWorkflowList, cmdWorkflowPut, cmdWorkflowShow } from "./commands/workflow.js"; import { cmdWorkflowAdd, cmdWorkflowList, cmdWorkflowShow } from "./commands/workflow.js";
import { formatOutput, type OutputFormat } from "./format.js"; import { formatOutput, type OutputFormat } from "./format.js";
import { resolveStorageRoot } from "./store.js"; import { resolveStorageRoot } from "./store.js";
@@ -53,20 +51,27 @@ const program = new Command();
const pkg = await import("../package.json", { with: { type: "json" } }); const pkg = await import("../package.json", { with: { type: "json" } });
program program
.name("uwf") .name("uwf")
.description("Stateless workflow CLI") .description(
"Stateless workflow CLI\n\n" +
"Four-layer architecture:\n" +
" workflow → thread → step → turn\n" +
" 模板定义 执行实例 单步结果 agent内部交互",
)
.version(pkg.default.version, "-V, --version"); .version(pkg.default.version, "-V, --version");
program.option("--format <fmt>", "Output format: json or yaml", "json"); program.option("--format <fmt>", "Output format: json or yaml", "json");
const workflow = program.command("workflow").description("Workflow registry and CAS"); const workflow = program
.command("workflow")
.description("Workflow definitions (layer 1: templates)");
workflow workflow
.command("put") .command("add")
.description("Register a workflow from YAML") .description("Register a workflow from YAML")
.argument("<file>", "Workflow YAML file") .argument("<file>", "Workflow YAML file")
.action((file: string) => { .action((file: string) => {
const storageRoot = resolveStorageRoot(); const storageRoot = resolveStorageRoot();
runAction(async () => { runAction(async () => {
const result = await cmdWorkflowPut(storageRoot, file); const result = await cmdWorkflowAdd(storageRoot, file);
writeOutput(result); writeOutput(result);
}); });
}); });
@@ -94,7 +99,7 @@ workflow
}); });
}); });
const thread = program.command("thread").description("Thread lifecycle and execution"); const thread = program.command("thread").description("Thread execution (layer 2: instances)");
thread thread
.command("start") .command("start")
@@ -110,7 +115,7 @@ thread
}); });
thread thread
.command("step") .command("exec")
.description("Execute one or more steps") .description("Execute one or more steps")
.argument("<thread-id>", "Thread ULID") .argument("<thread-id>", "Thread ULID")
.option("--agent <cmd>", "Override agent command") .option("--agent <cmd>", "Override agent command")
@@ -134,7 +139,7 @@ thread
const background = opts.background ?? false; const background = opts.background ?? false;
const backgroundWorker = opts._backgroundWorker ?? false; const backgroundWorker = opts._backgroundWorker ?? false;
const results = await cmdThreadStep( const results = await cmdThreadExec(
storageRoot, storageRoot,
threadId, threadId,
agentOverride, agentOverride,
@@ -165,47 +170,49 @@ thread
thread thread
.command("list") .command("list")
.description("List active threads") .description("List threads")
.option("--all", "Include archived threads") .option("--status <status>", "Filter by status: idle, running, or completed")
.action((opts: { all: boolean }) => { .action((opts: { status: string | undefined }) => {
const storageRoot = resolveStorageRoot(); const storageRoot = resolveStorageRoot();
runAction(async () => { runAction(async () => {
const result = await cmdThreadList(storageRoot, opts.all); const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
let statusFilter: ThreadStatus | null = null;
if (opts.status !== undefined) {
if (!validStatuses.includes(opts.status as ThreadStatus)) {
process.stderr.write(
`Invalid status: ${opts.status}. Must be one of: idle, running, completed\n`,
);
process.exit(1);
}
statusFilter = opts.status as ThreadStatus;
}
const result = await cmdThreadList(storageRoot, statusFilter);
writeOutput(result); writeOutput(result);
}); });
}); });
thread thread
.command("running") .command("stop")
.description("List threads currently executing in the background") .description("Stop background execution of a thread (keep thread active)")
.action(() => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdThreadRunning(storageRoot);
writeOutput(result);
});
});
thread
.command("kill")
.description("Terminate and archive a thread")
.argument("<thread-id>", "Thread ULID") .argument("<thread-id>", "Thread ULID")
.action((threadId: string) => { .action((threadId: string) => {
const storageRoot = resolveStorageRoot(); const storageRoot = resolveStorageRoot();
runAction(async () => { runAction(async () => {
const result = await cmdThreadKill(storageRoot, threadId); const result = await cmdThreadStop(storageRoot, threadId);
writeOutput(result); writeOutput(result);
}); });
}); });
thread thread
.command("steps") .command("cancel")
.description("List all steps in a thread") .description("Cancel a thread (stop execution and move to history)")
.argument("<thread-id>", "Thread ULID") .argument("<thread-id>", "Thread ULID")
.action((threadId: string) => { .action((threadId: string) => {
const storageRoot = resolveStorageRoot(); const storageRoot = resolveStorageRoot();
runAction(async () => { runAction(async () => {
const result = await cmdThreadSteps(storageRoot, threadId); const result = await cmdThreadCancel(storageRoot, threadId);
writeOutput(result); writeOutput(result);
}); });
}); });
@@ -239,28 +246,141 @@ thread
}, },
); );
thread const step = program.command("step").description("Step results (layer 3: single cycle)");
step
.command("list")
.description("List all steps in a thread")
.argument("<thread-id>", "Thread ULID")
.action((threadId: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdStepList(storageRoot, threadId);
writeOutput(result);
});
});
step
.command("show")
.description("Show details of a specific step")
.argument("<step-hash>", "CAS hash of the StepNode")
.action((stepHash: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const detail = await cmdStepShow(storageRoot, stepHash as CasRef);
writeOutput(detail);
});
});
// step read is not yet registered (half-baked, see step.ts cmdStepRead)
step
.command("fork") .command("fork")
.description("Fork a thread from a specific step") .description("Fork a thread from a specific step")
.argument("<step-hash>", "CAS hash of the StartNode or StepNode to fork from") .argument("<step-hash>", "CAS hash of the StartNode or StepNode to fork from")
.action((stepHash: string) => { .action((stepHash: string) => {
const storageRoot = resolveStorageRoot(); const storageRoot = resolveStorageRoot();
runAction(async () => { runAction(async () => {
const result = await cmdThreadFork(storageRoot, stepHash); const result = await cmdStepFork(storageRoot, stepHash as CasRef);
writeOutput(result); writeOutput(result);
}); });
}); });
// ── Deprecation Handlers ──────────────────────────────────────────────────────
// These commands have been removed. Show helpful error messages.
workflow
.command("put")
.description("[DEPRECATED] Use 'workflow add' instead")
.argument("<file>", "Workflow YAML file")
.action(() => {
process.stderr.write(`Error: Command 'workflow put' has been removed.
Use 'workflow add' instead.
For more information, see: uwf help workflow add
`);
process.exit(1);
});
thread
.command("step")
.description("[DEPRECATED] Use 'thread exec' instead")
.argument("<thread-id>", "Thread ULID")
.allowUnknownOption()
.action(() => {
process.stderr.write(`Error: Command 'thread step' has been removed.
Use 'thread exec' instead.
For more information, see: uwf help thread exec
`);
process.exit(1);
});
thread
.command("steps")
.description("[DEPRECATED] Use 'step list' instead")
.argument("<thread-id>", "Thread ULID")
.action(() => {
process.stderr.write(`Error: Command 'thread steps' has been removed.
Use 'step list' instead.
For more information, see: uwf help step list
`);
process.exit(1);
});
thread thread
.command("step-details") .command("step-details")
.description("Dump the full detail node of a step as YAML") .description("[DEPRECATED] Use 'step show' instead")
.argument("<step-hash>", "CAS hash of the StepNode") .argument("<step-hash>", "Step hash")
.action((stepHash: string) => { .action(() => {
const storageRoot = resolveStorageRoot(); process.stderr.write(`Error: Command 'thread step-details' has been removed.
runAction(async () => { Use 'step show' instead.
const detail = await cmdThreadStepDetails(storageRoot, stepHash);
process.stdout.write(yamlStringify(detail)); For more information, see: uwf help step show
}); `);
process.exit(1);
});
thread
.command("fork")
.description("[DEPRECATED] Use 'step fork' instead")
.argument("<step-hash>", "Step hash")
.action(() => {
process.stderr.write(`Error: Command 'thread fork' has been removed.
Use 'step fork' instead.
For more information, see: uwf help step fork
`);
process.exit(1);
});
thread
.command("kill")
.description("[DEPRECATED] Use 'thread stop' or 'thread cancel' instead")
.argument("<thread-id>", "Thread ULID")
.action(() => {
process.stderr.write(`Error: Command 'thread kill' has been removed.
Use 'thread stop' to stop background execution (keep thread active),
or 'thread cancel' to cancel and archive the thread.
For more information, see:
uwf help thread stop
uwf help thread cancel
`);
process.exit(1);
});
thread
.command("running")
.description("[DEPRECATED] Use 'thread list --status running' instead")
.action(() => {
process.stderr.write(`Error: Command 'thread running' has been removed.
Use 'thread list --status running' instead.
For more information, see: uwf help thread list
`);
process.exit(1);
}); });
const skill = program.command("skill").description("Built-in skill references for agents"); const skill = program.command("skill").description("Built-in skill references for agents");
@@ -0,0 +1,227 @@
import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
import { getSchema } from "@uncaged/json-cas";
import type {
CasRef,
StartNodePayload,
StepNodePayload,
ThreadId,
} from "@uncaged/workflow-protocol";
import { loadThreadsIndex, type UwfStore } from "../store.js";
type ChainState = {
startHash: CasRef;
start: StartNodePayload;
stepsNewestFirst: StepNodePayload[];
headIsStart: boolean;
};
type OrderedStepItem = {
hash: CasRef;
payload: StepNodePayload;
timestamp: number;
};
function fail(message: string): never {
process.stderr.write(`${message}\n`);
process.exit(1);
}
function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
const headNode = uwf.store.get(headHash);
if (headNode === null) {
fail(`CAS node not found: ${headHash}`);
}
if (headNode.type === uwf.schemas.startNode) {
return {
startHash: headHash,
start: headNode.payload as StartNodePayload,
stepsNewestFirst: [],
headIsStart: true,
};
}
if (headNode.type !== uwf.schemas.stepNode) {
fail(`head ${headHash} is not a StartNode or StepNode`);
}
const stepsNewestFirst: StepNodePayload[] = [];
let hash: CasRef | null = headHash;
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null) {
fail(`CAS node not found while walking chain: ${hash}`);
}
if (node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
stepsNewestFirst.push(payload);
hash = payload.prev;
}
const newest = stepsNewestFirst[0];
if (newest === undefined) {
fail(`empty step chain at head ${headHash}`);
}
const startNode = uwf.store.get(newest.start);
if (startNode === null || startNode.type !== uwf.schemas.startNode) {
fail(`StartNode not found: ${newest.start}`);
}
return {
startHash: newest.start,
start: startNode.payload as StartNodePayload,
stepsNewestFirst,
headIsStart: false,
};
}
function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
const node = uwf.store.get(outputRef);
if (node === null) {
return {};
}
return node.payload;
}
/**
* Recursively expand all cas_ref fields in a CAS node's payload,
* replacing hash strings with the referenced node's expanded payload.
*/
function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
const seen = visited ?? new Set<string>();
if (seen.has(hash)) return hash; // cycle guard
seen.add(hash);
const node = store.get(hash);
if (node === null) return hash;
const schema = getSchema(store, node.type);
if (schema === null) return node.payload;
return expandValue(store, schema, node.payload, seen);
}
function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
if (typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
return value;
}
function expandAnyOfField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!Array.isArray(schema.anyOf)) return value;
for (const sub of schema.anyOf as JSONSchema[]) {
if (sub.format === "cas_ref" && typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
}
return value;
}
function expandArrayField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!schema.items || !Array.isArray(value)) return value;
const itemSchema = schema.items as JSONSchema;
return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
}
function expandObjectField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
return value;
}
const props = schema.properties as Record<string, JSONSchema>;
const obj = value as Record<string, unknown>;
const result: Record<string, unknown> = {};
for (const [key, val] of Object.entries(obj)) {
const propSchema = props[key];
result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
}
return result;
}
function expandValue(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
if (schema.type === "array") return expandArrayField(store, schema, value, visited);
return expandObjectField(store, schema, value, visited);
}
function collectOrderedSteps(
uwf: UwfStore,
headHash: CasRef,
chain: ChainState,
): OrderedStepItem[] {
let hash: CasRef | null = headHash;
const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null || node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
hashToNode.set(hash, { payload, timestamp: node.timestamp });
hash = payload.prev;
}
let cur: CasRef | null = chain.headIsStart ? null : headHash;
const ordered: OrderedStepItem[] = [];
while (cur !== null) {
const entry = hashToNode.get(cur);
if (entry === undefined) {
break;
}
ordered.push({ hash: cur, ...entry });
cur = entry.payload.prev;
}
ordered.reverse();
return ordered;
}
async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise<CasRef> {
const index = await loadThreadsIndex(storageRoot);
const head = index[threadId];
if (head === undefined) {
fail(`thread not active: ${threadId}`);
}
return head;
}
export {
type ChainState,
collectOrderedSteps,
expandAnyOfField,
expandArrayField,
expandCasRefField,
expandDeep,
expandObjectField,
expandOutput,
expandValue,
fail,
type OrderedStepItem,
resolveHeadHash,
walkChain,
};
+145
View File
@@ -0,0 +1,145 @@
import type {
CasRef,
StartEntry,
StepEntry,
StepNodePayload,
ThreadForkOutput,
ThreadId,
ThreadStepsOutput,
} from "@uncaged/workflow-protocol";
import { generateUlid } from "@uncaged/workflow-util";
import { createUwfStore, loadThreadsIndex, saveThreadsIndex } from "../store.js";
import {
collectOrderedSteps,
expandDeep,
expandOutput,
fail,
resolveHeadHash,
walkChain,
} from "./shared.js";
/**
* List all steps in a thread (previously: thread steps)
*/
export async function cmdStepList(
storageRoot: string,
threadId: ThreadId,
): Promise<ThreadStepsOutput> {
const headHash = await resolveHeadHash(storageRoot, threadId);
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
const startNode = uwf.store.get(chain.startHash);
if (startNode === null) {
fail(`StartNode not found: ${chain.startHash}`);
}
const startEntry: StartEntry = {
hash: chain.startHash,
workflow: chain.start.workflow,
prompt: chain.start.prompt,
timestamp: startNode.timestamp,
};
const stepEntries: StepEntry[] = [];
const ordered = collectOrderedSteps(uwf, headHash, chain);
for (const item of ordered) {
stepEntries.push({
hash: item.hash,
role: item.payload.role,
output: expandOutput(uwf, item.payload.output),
detail: item.payload.detail ?? null,
agent: item.payload.agent,
timestamp: item.timestamp,
});
}
return {
thread: threadId,
workflow: chain.start.workflow,
steps: [startEntry, ...stepEntries],
};
}
/**
* Show details of a specific step (previously: thread step-details)
*/
export async function cmdStepShow(storageRoot: string, stepHash: CasRef): Promise<unknown> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (!payload.detail) {
fail(`step ${stepHash} has no detail`);
}
return expandDeep(uwf.store, payload.detail);
}
/**
* Fork a thread from a specific step (previously: thread fork)
*/
export async function cmdStepFork(
storageRoot: string,
stepHash: CasRef,
): Promise<ThreadForkOutput> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StartNode or StepNode`);
}
const newThreadId = generateUlid(Date.now()) as ThreadId;
const index = await loadThreadsIndex(storageRoot);
index[newThreadId] = stepHash;
await saveThreadsIndex(storageRoot, index);
return {
thread: newThreadId,
forkedFrom: {
step: stepHash,
},
};
}
/**
* Read a step's agent output as markdown (new command - requires #462)
* TODO: Implement once unified agent detail/turn schema is available
*/
export async function cmdStepRead(
storageRoot: string,
stepHash: CasRef,
_before: number | null = null,
): Promise<string> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (!payload.output) {
fail(`step ${stepHash} has no output`);
}
// TODO: Implement progressive turn reading with --before N
// For now, return a placeholder
const outputNode = uwf.store.get(payload.output);
if (outputNode === null) {
fail(`output node not found: ${payload.output}`);
}
// Return the output as JSON for now
// Once #462 is implemented, this will properly format frontmatter + markdown
return JSON.stringify(outputNode.payload, null, 2);
}
+88 -307
View File
@@ -1,8 +1,7 @@
import { execFileSync, spawn } from "node:child_process"; import { execFileSync, spawn } from "node:child_process";
import { access, readFile } from "node:fs/promises"; import { access, readFile } from "node:fs/promises";
import { dirname, isAbsolute, resolve as resolvePath } from "node:path"; import { dirname, isAbsolute, resolve as resolvePath } from "node:path";
import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas"; import { validate } from "@uncaged/json-cas";
import { getSchema, validate } from "@uncaged/json-cas";
import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-agent-kit"; import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-agent-kit";
import { evaluate } from "@uncaged/workflow-moderator"; import { evaluate } from "@uncaged/workflow-moderator";
import type { import type {
@@ -11,17 +10,13 @@ import type {
CasRef, CasRef,
ModeratorContext, ModeratorContext,
RunningThreadsOutput, RunningThreadsOutput,
StartEntry,
StartNodePayload, StartNodePayload,
StartOutput, StartOutput,
StepContext, StepContext,
StepEntry,
StepNodePayload, StepNodePayload,
StepOutput, StepOutput,
ThreadForkOutput,
ThreadId, ThreadId,
ThreadListItem, ThreadListItem,
ThreadStepsOutput,
WorkflowConfig, WorkflowConfig,
WorkflowPayload, WorkflowPayload,
} from "@uncaged/workflow-protocol"; } from "@uncaged/workflow-protocol";
@@ -47,6 +42,14 @@ import {
type UwfStore, type UwfStore,
} from "../store.js"; } from "../store.js";
import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js"; import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js";
import {
type ChainState,
collectOrderedSteps,
expandOutput,
fail,
type OrderedStepItem,
walkChain,
} from "./shared.js";
import { materializeWorkflowPayload } from "./workflow.js"; import { materializeWorkflowPayload } from "./workflow.js";
const END_ROLE = "$END"; const END_ROLE = "$END";
@@ -65,29 +68,6 @@ function failStep(plog: ProcessLogger, message: string): never {
fail(message); fail(message);
} }
type ChainState = {
startHash: CasRef;
start: StartNodePayload;
stepsNewestFirst: StepNodePayload[];
headIsStart: boolean;
};
type OrderedStepItem = {
hash: CasRef;
payload: StepNodePayload;
timestamp: number;
};
export type KillOutput = {
thread: ThreadId;
archived: boolean;
};
function fail(message: string): never {
process.stderr.write(`${message}\n`);
process.exit(1);
}
/** /**
* Check if a string looks like a file path (contains path separators or has .yaml/.yml extension). * Check if a string looks like a file path (contains path separators or has .yaml/.yml extension).
*/ */
@@ -346,226 +326,70 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
fail(`thread not found: ${threadId}`); fail(`thread not found: ${threadId}`);
} }
export type ThreadStatus = "idle" | "running" | "completed";
export type ThreadListItemWithStatus = ThreadListItem & {
status: ThreadStatus;
};
async function threadListItemFromActive( async function threadListItemFromActive(
storageRoot: string,
uwf: UwfStore, uwf: UwfStore,
threadId: ThreadId, threadId: ThreadId,
head: CasRef, head: CasRef,
): Promise<ThreadListItem | null> { ): Promise<ThreadListItemWithStatus | null> {
const workflow = resolveWorkflowFromHead(uwf, head); const workflow = resolveWorkflowFromHead(uwf, head);
if (workflow === null) { if (workflow === null) {
return null; return null;
} }
return { thread: threadId, workflow, head };
// Check if thread is currently running in background
const runningMarker = await isThreadRunning(storageRoot, threadId);
const status: ThreadStatus = runningMarker !== null ? "running" : "idle";
return { thread: threadId, workflow, head, status };
} }
export async function cmdThreadList( export async function cmdThreadList(
storageRoot: string, storageRoot: string,
includeAll: boolean, statusFilter: ThreadStatus | null,
): Promise<ThreadListItem[]> { ): Promise<ThreadListItemWithStatus[]> {
const uwf = await createUwfStore(storageRoot); const uwf = await createUwfStore(storageRoot);
const index = await loadThreadsIndex(storageRoot); const index = await loadThreadsIndex(storageRoot);
const items: ThreadListItem[] = []; const items: ThreadListItemWithStatus[] = [];
// Add active threads
for (const [threadId, head] of Object.entries(index)) { for (const [threadId, head] of Object.entries(index)) {
const item = await threadListItemFromActive(uwf, threadId as ThreadId, head); const item = await threadListItemFromActive(storageRoot, uwf, threadId as ThreadId, head);
if (item !== null) { if (item !== null) {
items.push(item); items.push(item);
} }
} }
if (!includeAll) { // Add completed threads if requested
return items; if (statusFilter === "completed" || statusFilter === null) {
const activeIds = new Set(items.map((i) => i.thread));
const history = await loadThreadHistory(storageRoot);
for (const entry of history) {
if (!activeIds.has(entry.thread)) {
items.push({
thread: entry.thread,
workflow: entry.workflow,
head: entry.head,
status: "completed",
});
}
}
} }
const activeIds = new Set(items.map((i) => i.thread)); // Apply status filter if provided
const history = await loadThreadHistory(storageRoot); if (statusFilter !== null) {
for (const entry of history) { return items.filter((item) => item.status === statusFilter);
if (!activeIds.has(entry.thread)) {
items.push({
thread: entry.thread,
workflow: entry.workflow,
head: entry.head,
});
}
} }
return items; return items;
} }
function walkChain(uwf: UwfStore, headHash: CasRef): ChainState {
const headNode = uwf.store.get(headHash);
if (headNode === null) {
fail(`CAS node not found: ${headHash}`);
}
if (headNode.type === uwf.schemas.startNode) {
return {
startHash: headHash,
start: headNode.payload as StartNodePayload,
stepsNewestFirst: [],
headIsStart: true,
};
}
if (headNode.type !== uwf.schemas.stepNode) {
fail(`head ${headHash} is not a StartNode or StepNode`);
}
const stepsNewestFirst: StepNodePayload[] = [];
let hash: CasRef | null = headHash;
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null) {
fail(`CAS node not found while walking chain: ${hash}`);
}
if (node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
stepsNewestFirst.push(payload);
hash = payload.prev;
}
const newest = stepsNewestFirst[0];
if (newest === undefined) {
fail(`empty step chain at head ${headHash}`);
}
const startNode = uwf.store.get(newest.start);
if (startNode === null || startNode.type !== uwf.schemas.startNode) {
fail(`StartNode not found: ${newest.start}`);
}
return {
startHash: newest.start,
start: startNode.payload as StartNodePayload,
stepsNewestFirst,
headIsStart: false,
};
}
function expandOutput(uwf: UwfStore, outputRef: CasRef): unknown {
const node = uwf.store.get(outputRef);
if (node === null) {
return {};
}
return node.payload;
}
/**
* Recursively expand all cas_ref fields in a CAS node's payload,
* replacing hash strings with the referenced node's expanded payload.
*/
function expandDeep(store: CasStore, hash: CasRef, visited?: Set<string>): unknown {
const seen = visited ?? new Set<string>();
if (seen.has(hash)) return hash; // cycle guard
seen.add(hash);
const node = store.get(hash);
if (node === null) return hash;
const schema = getSchema(store, node.type);
if (schema === null) return node.payload;
return expandValue(store, schema, node.payload, seen);
}
function expandCasRefField(store: CasStore, value: unknown, visited: Set<string>): unknown {
if (typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
return value;
}
function expandAnyOfField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!Array.isArray(schema.anyOf)) return value;
for (const sub of schema.anyOf as JSONSchema[]) {
if (sub.format === "cas_ref" && typeof value === "string") {
return expandDeep(store, value as CasRef, visited);
}
}
return value;
}
function expandArrayField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (!schema.items || !Array.isArray(value)) return value;
const itemSchema = schema.items as JSONSchema;
return (value as unknown[]).map((item) => expandValue(store, itemSchema, item, visited));
}
function expandObjectField(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (value === null || typeof value !== "object" || Array.isArray(value) || !schema.properties) {
return value;
}
const props = schema.properties as Record<string, JSONSchema>;
const obj = value as Record<string, unknown>;
const result: Record<string, unknown> = {};
for (const [key, val] of Object.entries(obj)) {
const propSchema = props[key];
result[key] = propSchema ? expandValue(store, propSchema, val, visited) : val;
}
return result;
}
function expandValue(
store: CasStore,
schema: JSONSchema,
value: unknown,
visited: Set<string>,
): unknown {
if (schema.format === "cas_ref") return expandCasRefField(store, value, visited);
if (Array.isArray(schema.anyOf)) return expandAnyOfField(store, schema, value, visited);
if (schema.type === "array") return expandArrayField(store, schema, value, visited);
return expandObjectField(store, schema, value, visited);
}
function collectOrderedSteps(
uwf: UwfStore,
headHash: CasRef,
chain: ChainState,
): OrderedStepItem[] {
let hash: CasRef | null = headHash;
const hashToNode = new Map<string, { payload: StepNodePayload; timestamp: number }>();
while (hash !== null) {
const node = uwf.store.get(hash);
if (node === null || node.type !== uwf.schemas.stepNode) {
break;
}
const payload = node.payload as StepNodePayload;
hashToNode.set(hash, { payload, timestamp: node.timestamp });
hash = payload.prev;
}
let cur: CasRef | null = chain.headIsStart ? null : headHash;
const ordered: OrderedStepItem[] = [];
while (cur !== null) {
const entry = hashToNode.get(cur);
if (entry === undefined) {
break;
}
ordered.push({ hash: cur, ...entry });
cur = entry.payload.prev;
}
ordered.reverse();
return ordered;
}
function formatYaml(value: unknown): string { function formatYaml(value: unknown): string {
return stringify(value, { aliasDuplicateObjects: false }).trimEnd(); return stringify(value, { aliasDuplicateObjects: false }).trimEnd();
} }
@@ -665,14 +489,14 @@ function formatStepPrompt(
): string { ): string {
if (!roleDef || shownPromptRoles.has(role)) return ""; if (!roleDef || shownPromptRoles.has(role)) return "";
shownPromptRoles.add(role); shownPromptRoles.add(role);
return ["", "", "### Prompt", "", roleDef.goal].join("\n"); return ["", "", "<prompt>", roleDef.goal, "</prompt>"].join("\n");
} }
function formatStepContent(uwf: UwfStore, item: OrderedStepItem): string { function formatStepContent(uwf: UwfStore, item: OrderedStepItem): string {
if (!item.payload.detail) return ""; if (!item.payload.detail) return "";
const content = extractLastAssistantContent(uwf, item.payload.detail); const content = extractLastAssistantContent(uwf, item.payload.detail);
if (content === null) return ""; if (content === null) return "";
return ["", "", "### Content", "", content].join("\n"); return ["", "", "<output>", content, "</output>"].join("\n");
} }
function formatStartSection(options: { function formatStartSection(options: {
@@ -857,7 +681,7 @@ async function archiveThread(
}); });
} }
export async function cmdThreadStep( export async function cmdThreadExec(
storageRoot: string, storageRoot: string,
threadId: ThreadId, threadId: ThreadId,
agentOverride: string | null, agentOverride: string | null,
@@ -953,7 +777,7 @@ async function cmdThreadStepBackground(
failStep(plog, "unable to determine script path for background execution"); failStep(plog, "unable to determine script path for background execution");
} }
const args = ["thread", "step", threadId, "--count", String(count)]; const args = ["thread", "exec", threadId, "--count", String(count)];
if (agentOverride !== null) { if (agentOverride !== null) {
args.push("--agent", agentOverride); args.push("--agent", agentOverride);
@@ -1085,47 +909,6 @@ async function resolveHeadHash(storageRoot: string, threadId: ThreadId): Promise
fail(`thread not found: ${threadId}`); fail(`thread not found: ${threadId}`);
} }
export async function cmdThreadSteps(
storageRoot: string,
threadId: ThreadId,
): Promise<ThreadStepsOutput> {
const headHash = await resolveHeadHash(storageRoot, threadId);
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
const startNode = uwf.store.get(chain.startHash);
if (startNode === null) {
fail(`StartNode not found: ${chain.startHash}`);
}
const startEntry: StartEntry = {
hash: chain.startHash,
workflow: chain.start.workflow,
prompt: chain.start.prompt,
timestamp: startNode.timestamp,
};
const stepEntries: StepEntry[] = [];
const ordered = collectOrderedSteps(uwf, headHash, chain);
for (const item of ordered) {
stepEntries.push({
hash: item.hash,
role: item.payload.role,
output: expandOutput(uwf, item.payload.output),
detail: item.payload.detail,
agent: item.payload.agent,
timestamp: item.timestamp,
});
}
return {
thread: threadId,
workflow: chain.start.workflow,
steps: [startEntry, ...stepEntries],
};
}
export async function cmdThreadRead( export async function cmdThreadRead(
storageRoot: string, storageRoot: string,
threadId: ThreadId, threadId: ThreadId,
@@ -1153,52 +936,50 @@ export async function cmdThreadRead(
}); });
} }
export async function cmdThreadFork( export type StopOutput = {
storageRoot: string, thread: ThreadId;
stepHash: CasRef, stopped: boolean;
): Promise<ThreadForkOutput> { };
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.startNode && node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StartNode or StepNode`);
}
const newThreadId = generateUlid(Date.now()) as ThreadId; export type CancelOutput = {
thread: ThreadId;
cancelled: boolean;
};
/**
* Stop background execution of a thread (but keep thread active)
*/
export async function cmdThreadStop(storageRoot: string, threadId: ThreadId): Promise<StopOutput> {
const index = await loadThreadsIndex(storageRoot); const index = await loadThreadsIndex(storageRoot);
index[newThreadId] = stepHash; const head = index[threadId];
await saveThreadsIndex(storageRoot, index); if (head === undefined) {
fail(`thread not active: ${threadId}`);
}
return { // Check if thread is running in background and terminate it
thread: newThreadId, const runningMarker = await isThreadRunning(storageRoot, threadId);
forkedFrom: { if (runningMarker === null) {
step: stepHash, process.stderr.write(`Warning: thread ${threadId} is not currently running\n`);
}, return { thread: threadId, stopped: false };
}; }
try {
process.kill(runningMarker.pid, "SIGTERM");
} catch {
// Process may have already exited, ignore error
}
await deleteMarker(storageRoot, threadId);
return { thread: threadId, stopped: true };
} }
export async function cmdThreadStepDetails( /**
* Cancel a thread (stop execution + move to history)
*/
export async function cmdThreadCancel(
storageRoot: string, storageRoot: string,
stepHash: CasRef, threadId: ThreadId,
): Promise<unknown> { ): Promise<CancelOutput> {
const uwf = await createUwfStore(storageRoot);
const node = uwf.store.get(stepHash);
if (node === null) {
fail(`CAS node not found: ${stepHash}`);
}
if (node.type !== uwf.schemas.stepNode) {
fail(`node ${stepHash} is not a StepNode`);
}
const payload = node.payload as StepNodePayload;
if (!payload.detail) {
fail(`step ${stepHash} has no detail`);
}
return expandDeep(uwf.store, payload.detail);
}
export async function cmdThreadKill(storageRoot: string, threadId: ThreadId): Promise<KillOutput> {
const index = await loadThreadsIndex(storageRoot); const index = await loadThreadsIndex(storageRoot);
const head = index[threadId]; const head = index[threadId];
if (head === undefined) { if (head === undefined) {
@@ -1233,7 +1014,7 @@ export async function cmdThreadKill(storageRoot: string, threadId: ThreadId): Pr
}; };
await appendThreadHistory(storageRoot, historyEntry); await appendThreadHistory(storageRoot, historyEntry);
return { thread: threadId, archived: true }; return { thread: threadId, cancelled: true };
} }
export async function cmdThreadRunning(storageRoot: string): Promise<RunningThreadsOutput> { export async function cmdThreadRunning(storageRoot: string): Promise<RunningThreadsOutput> {
@@ -29,7 +29,7 @@ export type WorkflowListEntry = {
origin: WorkflowOrigin; origin: WorkflowOrigin;
}; };
export type WorkflowPutOutput = { export type WorkflowAddOutput = {
name: string; name: string;
hash: CasRef; hash: CasRef;
}; };
@@ -111,10 +111,10 @@ export async function materializeWorkflowPayload(
}; };
} }
export async function cmdWorkflowPut( export async function cmdWorkflowAdd(
storageRoot: string, storageRoot: string,
filePath: string, filePath: string,
): Promise<WorkflowPutOutput> { ): Promise<WorkflowAddOutput> {
let text: string; let text: string;
try { try {
text = await readFile(filePath, "utf8"); text = await readFile(filePath, "utf8");
@@ -19,7 +19,14 @@ mock.module("../src/tools/index.js", () => ({
getBuiltinTools: () => [], getBuiltinTools: () => [],
})); }));
import { executeTurnTools, runBuiltinLoop, shouldNudge } from "../src/loop.js"; import {
executeTurnTools,
extractFinalText,
runBuiltinLoop,
shouldInjectDeadlineWarning,
shouldNudge,
shouldProcessToolCalls,
} from "../src/loop.js";
const fakeProvider = {} as any; const fakeProvider = {} as any;
const fakeToolCtx = {} as any; const fakeToolCtx = {} as any;
@@ -154,3 +161,96 @@ describe("runBuiltinLoop integration", () => {
expect(original.length).toBe(1); expect(original.length).toBe(1);
}); });
}); });
describe("shouldInjectDeadlineWarning", () => {
test("5.1 returns true when turn count reaches warning threshold and not yet warned", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
});
test("5.2 returns false when already warned", () => {
expect(shouldInjectDeadlineWarning(7, 10, true, false)).toBe(false);
});
test("5.3 returns false when noTools is true", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, true)).toBe(false);
});
test("5.4 returns false when turns remaining > DEADLINE_WARNING_TURNS", () => {
expect(shouldInjectDeadlineWarning(5, 10, false, false)).toBe(false);
});
test("5.5 returns true when exactly at warning threshold", () => {
expect(shouldInjectDeadlineWarning(7, 10, false, false)).toBe(true);
});
test("5.6 returns false when turns remaining is 0", () => {
expect(shouldInjectDeadlineWarning(10, 10, false, false)).toBe(false);
});
});
describe("shouldProcessToolCalls", () => {
test("6.1 returns true when toolCalls present and noTools=false", () => {
expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], false)).toBe(true);
});
test("6.2 returns false when toolCalls is null", () => {
expect(shouldProcessToolCalls(null, false)).toBe(false);
});
test("6.3 returns false when toolCalls is empty array", () => {
expect(shouldProcessToolCalls([], false)).toBe(false);
});
test("6.4 returns false when noTools=true", () => {
expect(shouldProcessToolCalls([{ id: "x", name: "read", arguments: "{}" }], true)).toBe(false);
});
test("6.5 returns true when multiple tool calls present", () => {
expect(
shouldProcessToolCalls(
[
{ id: "x1", name: "read", arguments: "{}" },
{ id: "x2", name: "write", arguments: "{}" },
],
false,
),
).toBe(true);
});
});
describe("extractFinalText", () => {
test("7.1 returns last assistant message content", () => {
const messages = [
{ role: "system" as const, content: "sys", tool_calls: null },
{ role: "assistant" as const, content: "first", tool_calls: null },
{ role: "assistant" as const, content: "last", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("last");
});
test("7.2 returns empty string when no assistant messages", () => {
expect(extractFinalText([{ role: "system" as const, content: "sys", tool_calls: null }])).toBe(
"",
);
});
test("7.3 skips assistant messages with null content", () => {
const messages = [
{ role: "assistant" as const, content: "first", tool_calls: null },
{
role: "assistant" as const,
content: null,
tool_calls: [{ id: "x", name: "t", arguments: "{}" }],
},
{ role: "assistant" as const, content: "second", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("second");
});
test("7.4 skips assistant messages with empty content", () => {
const messages = [
{ role: "assistant" as const, content: "first", tool_calls: null },
{ role: "assistant" as const, content: "", tool_calls: null },
{ role: "user" as const, content: "nudge", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("first");
});
test("7.5 handles empty messages array", () => {
expect(extractFinalText([])).toBe("");
});
test("7.6 handles messages with only user and system roles", () => {
const messages = [
{ role: "system" as const, content: "sys", tool_calls: null },
{ role: "user" as const, content: "query", tool_calls: null },
];
expect(extractFinalText(messages)).toBe("");
});
});
+191 -82
View File
@@ -1,7 +1,12 @@
import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit"; import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit";
import { createLogger } from "@uncaged/workflow-util"; import { createLogger } from "@uncaged/workflow-util";
import { type ChatMessage, chatCompletionWithTools, type LlmToolCall } from "./llm/index.js"; import {
type ChatMessage,
chatCompletionWithTools,
type LlmToolCall,
type OpenAiToolDefinition,
} from "./llm/index.js";
import { appendSessionTurn } from "./session.js"; import { appendSessionTurn } from "./session.js";
import { import {
builtinToolsToOpenAi, builtinToolsToOpenAi,
@@ -80,10 +85,184 @@ export type ShouldNudgeOptions = {
const MAX_NUDGES = 3; const MAX_NUDGES = 3;
const DEADLINE_WARNING_TURNS = 3; const DEADLINE_WARNING_TURNS = 3;
export function shouldInjectDeadlineWarning(
turn: number,
maxTurns: number,
alreadyWarned: boolean,
noTools: boolean,
): boolean {
const turnsRemaining = maxTurns - turn;
return (
!noTools && !alreadyWarned && turnsRemaining > 0 && turnsRemaining <= DEADLINE_WARNING_TURNS
);
}
export function shouldProcessToolCalls(toolCalls: LlmToolCall[] | null, noTools: boolean): boolean {
return !noTools && toolCalls !== null && toolCalls.length > 0;
}
export function extractFinalText(messages: ChatMessage[]): string {
for (let i = messages.length - 1; i >= 0; i--) {
const msg = messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
return msg.content;
}
}
return "";
}
function injectDeadlineWarning(messages: ChatMessage[], turnsRemaining: number): void {
log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`);
messages.push({
role: "user",
content:
`⚠️ You have ${turnsRemaining} turns remaining. ` +
"Wrap up your work and output the YAML frontmatter starting with `---`. " +
"If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
});
}
type HandleTextOnlyTurnResult = {
shouldBreak: boolean;
finalText: string;
turnCount: number;
nudgeCount: number;
turnAdjustment: number;
};
async function handleTextOnlyTurn(
text: string,
messages: ChatMessage[],
storageRoot: string,
sessionId: string,
noTools: boolean,
turn: number,
maxTurns: number,
currentNudgeCount: number,
): Promise<HandleTextOnlyTurnResult> {
await appendTurn(storageRoot, sessionId, {
role: "assistant",
content: text,
toolCalls: null,
reasoning: null,
});
const turnCount = 1;
let nudgeCount = currentNudgeCount;
let turnAdjustment = 0;
if (shouldNudge({ noTools, text, turn, maxTurns })) {
nudgeCount += 1;
log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
const nudge =
"You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
"Either continue using tools to complete your work, or output your final response starting with `---`.";
messages.push({ role: "user", content: nudge });
// Nudge doesn't consume turn budget (up to MAX_NUDGES)
if (nudgeCount <= MAX_NUDGES) {
turnAdjustment = -1;
}
return { shouldBreak: false, finalText: "", turnCount, nudgeCount, turnAdjustment };
}
return { shouldBreak: true, finalText: text, turnCount, nudgeCount, turnAdjustment };
}
async function handleToolCallTurn(
content: string,
toolCalls: LlmToolCall[],
messages: ChatMessage[],
storageRoot: string,
sessionId: string,
toolCtx: ToolContext,
): Promise<number> {
await appendTurn(storageRoot, sessionId, {
role: "assistant",
content,
toolCalls: mapToolCallsForPayload(toolCalls),
reasoning: null,
});
let turnCount = 1;
// Execute tools
turnCount += await executeTurnTools(toolCalls, toolCtx, messages, storageRoot, sessionId);
return turnCount;
}
export function shouldNudge({ noTools, text, turn, maxTurns }: ShouldNudgeOptions): boolean { export function shouldNudge({ noTools, text, turn, maxTurns }: ShouldNudgeOptions): boolean {
return !noTools && !text.trimStart().startsWith("---") && turn < maxTurns - 1; return !noTools && !text.trimStart().startsWith("---") && turn < maxTurns - 1;
} }
type ProcessLoopIterationResult = {
shouldBreak: boolean;
finalText: string;
turnCount: number;
nudgeCount: number;
turnAdjustment: number;
};
async function processLoopIteration(
options: RunBuiltinLoopOptions,
messages: ChatMessage[],
openAiTools: OpenAiToolDefinition[],
turn: number,
nudgeCount: number,
): Promise<ProcessLoopIterationResult> {
const response = await chatCompletionWithTools(
options.provider,
messages,
openAiTools.length > 0 ? openAiTools : null,
);
// When noTools is set, ignore any tool_calls the LLM might still return
const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null);
const assistantMessage: ChatMessage = {
role: "assistant",
content: response.content,
tool_calls: effectiveToolCalls,
};
messages.push(assistantMessage);
if (!shouldProcessToolCalls(effectiveToolCalls, options.noTools)) {
const text = response.content ?? "";
const result = await handleTextOnlyTurn(
text,
messages,
options.storageRoot,
options.sessionId,
options.noTools,
turn,
options.maxTurns,
nudgeCount,
);
return result;
}
// At this point, effectiveToolCalls is guaranteed to be non-null and non-empty
const turnCount = await handleToolCallTurn(
response.content ?? "",
effectiveToolCalls as LlmToolCall[],
messages,
options.storageRoot,
options.sessionId,
options.toolCtx,
);
return {
shouldBreak: false,
finalText: "",
turnCount,
nudgeCount,
turnAdjustment: 0,
};
}
/** Agent run loop: LLM ↔ tools until no tool_calls or maxTurns. */ /** Agent run loop: LLM ↔ tools until no tool_calls or maxTurns. */
export async function runBuiltinLoop( export async function runBuiltinLoop(
options: RunBuiltinLoopOptions, options: RunBuiltinLoopOptions,
@@ -99,95 +278,25 @@ export async function runBuiltinLoop(
log("8K2M4N7P", `builtin loop turn ${turn + 1}/${options.maxTurns}`); log("8K2M4N7P", `builtin loop turn ${turn + 1}/${options.maxTurns}`);
// Warn agent when approaching turn limit // Warn agent when approaching turn limit
const turnsRemaining = options.maxTurns - turn; if (shouldInjectDeadlineWarning(turn, options.maxTurns, deadlineWarned, options.noTools)) {
if (!options.noTools && !deadlineWarned && turnsRemaining <= DEADLINE_WARNING_TURNS) {
deadlineWarned = true; deadlineWarned = true;
log("4NRXW6KT", `${turnsRemaining} turns remaining, injecting deadline warning`); const turnsRemaining = options.maxTurns - turn;
messages.push({ injectDeadlineWarning(messages, turnsRemaining);
role: "user",
content:
`⚠️ You have ${turnsRemaining} turns remaining. ` +
"Wrap up your work and output the YAML frontmatter starting with `---`. " +
"If you cannot finish in time, output frontmatter with `status: failed` and describe what remains.",
});
} }
const response = await chatCompletionWithTools( const result = await processLoopIteration(options, messages, openAiTools, turn, nudgeCount);
options.provider, turnCount += result.turnCount;
messages, nudgeCount = result.nudgeCount;
openAiTools.length > 0 ? openAiTools : null, turn += result.turnAdjustment;
);
// When noTools is set, ignore any tool_calls the LLM might still return if (result.shouldBreak) {
const effectiveToolCalls = options.noTools ? null : (response.toolCalls ?? null); finalText = result.finalText;
const assistantMessage: ChatMessage = {
role: "assistant",
content: response.content,
tool_calls: effectiveToolCalls,
};
messages.push(assistantMessage);
if (effectiveToolCalls === null || effectiveToolCalls.length === 0) {
const text = response.content ?? "";
await appendTurn(options.storageRoot, options.sessionId, {
role: "assistant",
content: text,
toolCalls: null,
reasoning: null,
});
turnCount += 1;
if (shouldNudge({ noTools: options.noTools, text, turn, maxTurns: options.maxTurns })) {
nudgeCount += 1;
log("7FXQM2KN", `text-only turn without frontmatter, nudge ${nudgeCount}/${MAX_NUDGES}`);
const nudge =
"You stopped calling tools but your response does not start with the required `---` YAML frontmatter. " +
"Either continue using tools to complete your work, or output your final response starting with `---`.";
messages.push({ role: "user", content: nudge });
// Nudge doesn't consume turn budget (up to MAX_NUDGES)
if (nudgeCount <= MAX_NUDGES) {
turn -= 1;
}
continue;
}
finalText = text;
break; break;
} }
// Assistant turn with tool calls
await appendTurn(options.storageRoot, options.sessionId, {
role: "assistant",
content: response.content ?? "",
toolCalls: mapToolCallsForPayload(effectiveToolCalls),
reasoning: null,
});
turnCount += 1;
// Execute tools
turnCount += await executeTurnTools(
effectiveToolCalls,
options.toolCtx,
messages,
options.storageRoot,
options.sessionId,
);
} }
if (finalText === "" && messages.length > 0) { if (finalText === "") {
for (let i = messages.length - 1; i >= 0; i--) { finalText = extractFinalText(messages);
const msg = messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
finalText = msg.content;
break;
}
}
} }
return { finalText, messages, turnCount }; return { finalText, messages, turnCount };
@@ -146,13 +146,13 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry). // Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
if (!ctx.isFirstVisit) { if (!ctx.isFirstVisit) {
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role); const cachedSessionId = await getCachedSessionId("claude-code", ctx.threadId, ctx.role);
if (cachedSessionId !== null) { if (cachedSessionId !== null) {
try { try {
const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt); const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store); const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") { if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId); await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
} }
return result; return result;
} catch (err) { } catch (err) {
@@ -169,7 +169,7 @@ async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
const { stdout } = await spawnClaudeRun(fullPrompt); const { stdout } = await spawnClaudeRun(fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store); const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") { if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId); await setCachedSessionId("claude-code", ctx.threadId, ctx.role, result.sessionId);
} }
return result; return result;
} }
@@ -1,5 +1,22 @@
// Re-export session cache from the shared agent-kit package. // Re-export session cache from the shared agent-kit package with agent name injected.
export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";
import {
getCachedSessionId as getCachedSessionIdBase,
setCachedSessionId as setCachedSessionIdBase,
} from "@uncaged/workflow-agent-kit";
import type { ThreadId } from "@uncaged/workflow-protocol";
export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
return getCachedSessionIdBase("hermes", threadId, role);
}
export async function setCachedSessionId(
threadId: ThreadId,
role: string,
sessionId: string,
): Promise<void> {
return setCachedSessionIdBase("hermes", threadId, role, sessionId);
}
export function isResumeDisabled(): boolean { export function isResumeDisabled(): boolean {
// Hermes ACP session/resume is broken: _restore fails for custom providers // Hermes ACP session/resume is broken: _restore fails for custom providers
@@ -0,0 +1,247 @@
import { mkdir, readdir, readFile, rm, stat, writeFile } from "node:fs/promises";
import { dirname, join } from "node:path";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { getCachedSessionId, getCachePath, setCachedSessionId } from "../src/session-cache.js";
import { resolveStorageRoot } from "../src/storage.js";
describe("session-cache", () => {
let originalStorageRoot: string;
let testStorageRoot: string;
beforeEach(async () => {
// Create a temporary test storage root
originalStorageRoot = resolveStorageRoot();
testStorageRoot = join(originalStorageRoot, "test-cache", `test-${Date.now()}`);
await mkdir(testStorageRoot, { recursive: true });
// Override the storage root for testing
process.env.WORKFLOW_STORAGE_ROOT = testStorageRoot;
});
afterEach(async () => {
// Clean up test storage root
await rm(testStorageRoot, { recursive: true, force: true });
delete process.env.WORKFLOW_STORAGE_ROOT;
});
describe("getCachePath", () => {
test("returns agent-specific file path", () => {
const path = getCachePath("claude-code");
expect(path).toMatch(/\/cache\/claude-code-sessions\.json$/);
});
test("returns different paths for different agents", () => {
const pathClaudeCode = getCachePath("claude-code");
const pathHermes = getCachePath("hermes");
expect(pathClaudeCode).not.toBe(pathHermes);
expect(pathClaudeCode).toMatch(/claude-code-sessions\.json$/);
expect(pathHermes).toMatch(/hermes-sessions\.json$/);
});
test("handles agent names with special characters", () => {
const path1 = getCachePath("my-agent");
const path2 = getCachePath("my_agent");
expect(path1).toMatch(/my-agent-sessions\.json$/);
expect(path2).toMatch(/my_agent-sessions\.json$/);
});
});
describe("session isolation", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("sessions are isolated per agent", async () => {
// Cache different session IDs for each agent
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Each agent should retrieve its own session ID
const sessionCC = await getCachedSessionId("claude-code", threadId, role);
const sessionHermes = await getCachedSessionId("hermes", threadId, role);
expect(sessionCC).toBe("session-cc-001");
expect(sessionHermes).toBe("session-hermes-001");
});
test("updating one agent's cache does not affect another", async () => {
// Set initial sessions for both agents
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Update claude-code's session
await setCachedSessionId("claude-code", threadId, role, "session-cc-002");
// Hermes's session should remain unchanged
const sessionHermes = await getCachedSessionId("hermes", threadId, role);
expect(sessionHermes).toBe("session-hermes-001");
// Claude-code should have the new session
const sessionCC = await getCachedSessionId("claude-code", threadId, role);
expect(sessionCC).toBe("session-cc-002");
});
test("missing session returns null for specific agent", async () => {
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("empty session ID is treated as missing", async () => {
await setCachedSessionId("claude-code", threadId, role, "");
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
});
describe("file system operations", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("cache directory is created if missing", async () => {
const cachePath = getCachePath("claude-code");
const cacheDir = dirname(cachePath);
// Ensure cache dir doesn't exist
await rm(cacheDir, { recursive: true, force: true });
// Write a session
await setCachedSessionId("claude-code", threadId, role, "session-001");
// Cache directory should be created
const stats = await stat(cacheDir);
expect(stats.isDirectory()).toBe(true);
});
test("multiple agents create separate cache files", async () => {
// Cache sessions for multiple agents
await setCachedSessionId("claude-code", threadId, role, "session-cc-001");
await setCachedSessionId("hermes", threadId, role, "session-hermes-001");
// Separate cache files should exist
const pathCC = getCachePath("claude-code");
const pathHermes = getCachePath("hermes");
const contentCC = JSON.parse(await readFile(pathCC, "utf8")) as Record<string, string>;
const contentHermes = JSON.parse(await readFile(pathHermes, "utf8")) as Record<
string,
string
>;
expect(contentCC).toHaveProperty(`${threadId}:${role}`, "session-cc-001");
expect(contentHermes).toHaveProperty(`${threadId}:${role}`, "session-hermes-001");
});
test("atomic writes prevent partial reads", async () => {
// Write a session
await setCachedSessionId("claude-code", threadId, role, "session-001");
// The final file should exist (no .tmp files left behind)
const cachePath = getCachePath("claude-code");
const dir = dirname(cachePath);
const files = await readdir(dir);
expect(files).toContain("claude-code-sessions.json");
expect(files.every((f) => !f.endsWith(".tmp"))).toBe(true);
});
});
describe("legacy migration", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("old agent-sessions.json is ignored", async () => {
// Create old agent-sessions.json file
const oldCachePath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
await mkdir(dirname(oldCachePath), { recursive: true });
await writeFile(
oldCachePath,
JSON.stringify({
"01234567890123456789012345:developer": "old-session-001",
}),
"utf8",
);
// Query with the new per-agent cache
const session = await getCachedSessionId("claude-code", threadId, role);
// Should return null (old cache is ignored)
expect(session).toBeNull();
});
test("new per-agent cache takes precedence", async () => {
// Create both old and new cache files
const oldPath = join(resolveStorageRoot(), "cache", "agent-sessions.json");
await mkdir(dirname(oldPath), { recursive: true });
await writeFile(
oldPath,
JSON.stringify({
[`${threadId}:${role}`]: "old-session",
}),
"utf8",
);
await setCachedSessionId("claude-code", threadId, role, "new-session");
// The new per-agent cache value should be returned
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBe("new-session");
});
});
describe("error handling", () => {
const threadId = "01234567890123456789012345" as ThreadId;
const role = "developer";
test("invalid JSON in cache file returns empty cache", async () => {
// Create a corrupted cache file
const cachePath = getCachePath("claude-code");
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, "{ invalid json }", "utf8");
// Should return null (treating corrupted cache as empty)
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("non-object JSON in cache file returns empty cache", async () => {
// Create a cache file with non-object JSON
const cachePath = getCachePath("claude-code");
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, JSON.stringify(["not", "an", "object"]), "utf8");
// Should return null
const session = await getCachedSessionId("claude-code", threadId, role);
expect(session).toBeNull();
});
test("cache entries with non-string values are ignored", async () => {
// Create a cache file with mixed types
const cachePath = getCachePath("claude-code");
const cacheData = {
"thread1:role1": "valid-session",
"thread2:role2": 12345, // number
"thread3:role3": null, // null
"thread4:role4": "", // empty string
};
await mkdir(dirname(cachePath), { recursive: true });
await writeFile(cachePath, JSON.stringify(cacheData), "utf8");
// Valid string entries should be returned
const session1 = await getCachedSessionId("claude-code", "thread1" as ThreadId, "role1");
expect(session1).toBe("valid-session");
// Invalid entries should return null
const session2 = await getCachedSessionId("claude-code", "thread2" as ThreadId, "role2");
const session3 = await getCachedSessionId("claude-code", "thread3" as ThreadId, "role3");
const session4 = await getCachedSessionId("claude-code", "thread4" as ThreadId, "role4");
expect(session2).toBeNull();
expect(session3).toBeNull();
expect(session4).toBeNull(); // empty string is treated as missing
});
});
});
+1 -1
View File
@@ -12,7 +12,7 @@ export {
export type { FrontmatterFastPathResult } from "./frontmatter.js"; export type { FrontmatterFastPathResult } from "./frontmatter.js";
export { tryFrontmatterFastPath } from "./frontmatter.js"; export { tryFrontmatterFastPath } from "./frontmatter.js";
export { createAgent } from "./run.js"; export { createAgent } from "./run.js";
export { getCachedSessionId, setCachedSessionId } from "./session-cache.js"; export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js"; export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
export type { export type {
AgentContext, AgentContext,
@@ -8,8 +8,8 @@ import { resolveStorageRoot } from "./storage.js";
type SessionCache = Record<string, string>; type SessionCache = Record<string, string>;
function getCachePath(): string { export function getCachePath(agentName: string): string {
return join(resolveStorageRoot(), "cache", "agent-sessions.json"); return join(resolveStorageRoot(), "cache", `${agentName}-sessions.json`);
} }
function cacheKey(threadId: ThreadId, role: string): string { function cacheKey(threadId: ThreadId, role: string): string {
@@ -20,8 +20,8 @@ function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value); return typeof value === "object" && value !== null && !Array.isArray(value);
} }
async function readCache(): Promise<SessionCache> { async function readCache(agentName: string): Promise<SessionCache> {
const path = getCachePath(); const path = getCachePath(agentName);
try { try {
const text = await readFile(path, "utf8"); const text = await readFile(path, "utf8");
const raw = JSON.parse(text) as unknown; const raw = JSON.parse(text) as unknown;
@@ -40,36 +40,45 @@ async function readCache(): Promise<SessionCache> {
if (err.code === "ENOENT") { if (err.code === "ENOENT") {
return {}; return {};
} }
// Treat JSON parse errors as empty cache
if (err.name === "SyntaxError") {
return {};
}
throw e; throw e;
} }
} }
async function writeCache(cache: SessionCache): Promise<void> { async function writeCache(agentName: string, cache: SessionCache): Promise<void> {
const path = getCachePath(); const path = getCachePath(agentName);
const dir = dirname(path); const dir = dirname(path);
await mkdir(dir, { recursive: true }); await mkdir(dir, { recursive: true });
// Atomic write: write to temp file then rename to avoid partial reads on concurrent access. // Atomic write: write to temp file then rename to avoid partial reads on concurrent access.
// NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur. // NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur.
// This is a safety net for future parallel execution. // This is a safety net for future parallel execution.
const tmpPath = join(dir, `.agent-sessions.${randomBytes(4).toString("hex")}.tmp`); const tmpPath = join(dir, `.${agentName}-sessions.${randomBytes(4).toString("hex")}.tmp`);
await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8"); await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
await rename(tmpPath, path); await rename(tmpPath, path);
} }
/** Read the cached session ID for a thread+role pair. */ /** Read the cached session ID for a thread+role pair. */
export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> { export async function getCachedSessionId(
const cache = await readCache(); agentName: string,
threadId: ThreadId,
role: string,
): Promise<string | null> {
const cache = await readCache(agentName);
const sessionId = cache[cacheKey(threadId, role)]; const sessionId = cache[cacheKey(threadId, role)];
return sessionId ?? null; return sessionId ?? null;
} }
/** Write the session ID for a thread+role pair into the cache. */ /** Write the session ID for a thread+role pair into the cache. */
export async function setCachedSessionId( export async function setCachedSessionId(
agentName: string,
threadId: ThreadId, threadId: ThreadId,
role: string, role: string,
sessionId: string, sessionId: string,
): Promise<void> { ): Promise<void> {
const cache = await readCache(); const cache = await readCache(agentName);
cache[cacheKey(threadId, role)] = sessionId; cache[cacheKey(threadId, role)] = sessionId;
await writeCache(cache); await writeCache(agentName, cache);
} }
@@ -1,7 +1,6 @@
import path from "node:path"; import path from "node:path";
import { defineConfig } from "vitest/config"; import { defineConfig } from "vitest/config";
// biome-ignore lint/style/noDefaultExport: Vitest loads config from default export.
export default defineConfig({ export default defineConfig({
test: { test: {
environment: "node", environment: "node",