Compare commits

..

43 Commits

Author SHA1 Message Date
xiaoju 30e4e99908 feat: add refs tracking to RoleStep
- RoleOutput gains refs: string[] for CAS reference tracking
- RoleDefinition gains extractRefs: ((meta) => string[]) | null
- planner: phases.map(p => p.hash), coder: [completedPhase]
- Engine persists refs, fork preserves refs
- Backward compat: missing refs normalized to []
- 137 tests passing

Fixes #31
2026-05-07 10:44:25 +00:00
xiaoju a3c70a5041 Merge pull request 'feat: migrate CAS to global storage' (#34) from feat/30-global-cas into main 2026-05-07 10:40:41 +00:00
xiaoju 12d58a8206 feat: migrate CAS to global storage
- Add getGlobalCasDir() to storage-root.ts
- cmd-cas.ts uses global CAS dir, threadId kept for CLI compat
- thread rm no longer deletes .cas/ directories
- Rename createThreadCas → createCasStore (deprecated alias kept)
- 134 tests passing

BREAKING: CAS moves from <thread>.cas/ to <storageRoot>/cas/

Fixes #30
2026-05-07 10:40:14 +00:00
xiaoju c096f4d94e docs: add workflow-as-agent implementation plan
Covers 4 phases:
1. Global CAS migration
2. RoleStep refs tracking
3. CAS garbage collection
4. workflowAsAgent(name) factory

Refs #25
2026-05-07 10:35:26 +00:00
xiaoju 500401d93c feat(workflow): add preparer role to solve-issue workflow (#29, closes #28) 2026-05-07 10:18:05 +00:00
xiaoju 43f466eb67 style(solve-issue): fix biome formatting in test file
Refs #28
2026-05-07 10:15:10 +00:00
xiaoju fe829d9ae6 feat(workflow): add preparer role as first step in solve-issue workflow
- New package: @uncaged/workflow-role-preparer with PreparerMeta type,
  schema, system prompt, and extract prompt
- Preparer locates/clones repo, detects toolchain and conventions
- Moderator updated: preparer → planner → coder → reviewer → committer
- solve-issue template re-exports preparer types
- Tests updated for new flow (129 pass, 0 fail)

Fixes #28
2026-05-07 10:12:59 +00:00
xiaoju f80535d742 fix(planner,coder): clarify CAS CLI usage in agent prompts (#27) 2026-05-07 09:43:21 +00:00
xiaoju 0eab3b7001 chore: remove temporary bundle entry file 2026-05-07 09:43:00 +00:00
xiaoju 37c5b89c98 fix(planner,coder): clarify CAS CLI usage in agent prompts
- Planner: mandatory CAS put with complete command template, thread ID guidance
- Coder: CAS get command template with thread ID guidance
- Forbid inventing storage paths — must use uncaged-workflow cas CLI

Fixes #26
2026-05-07 09:42:52 +00:00
xiaoju 0fdf19879a Merge pull request 'test(workflow): add unit tests for validateWorkflowDescriptor' (#20) from test/19-validate-workflow-descriptor into main 2026-05-07 09:38:54 +00:00
xiaoju f73bf1e313 test(workflow): add unit tests for validateWorkflowDescriptor
Fixes #19
2026-05-07 09:38:18 +00:00
xiaoju 8c4441bf6b feat: thread-scoped CAS for phase tracking (#23) 2026-05-07 04:59:03 +00:00
xiaoju 341ff656dc feat(planner,coder,moderator): integrate CAS for phase tracking
Phase 2 of #23:
- Planner schema compact: {hash, title} only, details stored via CAS CLI
- Planner prompt instructs agent to shell out `cas put` for each phase
- Coder prompt instructs agent to `cas get` for phase details, report hash
- Moderator compares hashes instead of names
- Removed COMPLETED_PHASE_SENTINELS — hash matching eliminates ambiguity

Refs #23
2026-05-07 04:54:25 +00:00
xiaoju 4b44665c7e feat(workflow): add thread-scoped CAS (Content-Addressable Storage)
Phase 1 of #23:
- createThreadCas() core API: put/get/delete/list with XXH64 hashing
- hashString() utility for string → 13-char Crockford Base32
- CLI: uncaged-workflow cas get/put/list/rm subcommands
- thread rm now cleans up .cas/ directory
- 10 new tests for CAS operations

Refs #23
2026-05-07 04:30:19 +00:00
xiaoju 172e9b34cc feat(planner): add hash and title fields to phase schema
Each phase now carries a hash (Crockford Base32 identifier) and a
one-line title alongside the existing name/description/acceptance.
This gives agents immediate semantic context in the prompt without
needing to load full phase details from CAS.

Refs #23
2026-05-07 04:18:42 +00:00
xiaoju 96fc3e220a fix(solve-issue): handle one-shot coder completing all phases at once
The moderator now treats completedPhase matching the last planned phase name
as full completion. Also recognizes sentinel values (all-done, all_done,
complete) for robustness.

Fixes #21

小橘 🍊(NEKO Team)
2026-05-07 03:13:44 +00:00
xiaoju 7926751b01 docs: replace RFC-001 with up-to-date architecture doc
RFC-001 was severely outdated (pre-refactor types). New architecture.md
covers the current three-phase engine, pure data roles, AgentBinding,
ExtractFn, and all design decisions.

小橘 <xiaoju@shazhou.work>
2026-05-07 01:45:51 +00:00
xiaoju 43e1f82303 refactor: extractPrompt out of ExtractContext, into ExtractFn parameter
ExtractFn = (schema, prompt, ctx) => Promise<T>
extractPrompt stays in RoleDefinition (definition layer), not in context (state layer).
Callers pass their own prompt — engine uses roleDef.extractPrompt, cursor agent uses its own.

小橘 <xiaoju@shazhou.work>
2026-05-07 01:27:12 +00:00
xiaoju d472de1247 refactor: three-phase context (Moderator/Agent/Extract) + extractPrompt + unified ExtractFn
- ModeratorContext → AgentContext → ExtractContext progressive types
- ThreadContext is now alias for AgentContext
- RoleDefinition adds extractPrompt field
- ExtractFn = (schema, ctx: ExtractContext) => Promise<T>
- createWorkflow takes ExtractFn, engine loop: moderator → agent → extract
- Remove ExtractConfig, extractMetaOrThrow, extract-meta.ts

小橘 <xiaoju@shazhou.work>
2026-05-07 01:05:31 +00:00
xiaoju 99a137422c feat: add ExtractFn utility, cursor agent workspace from thread context
- New ExtractFn = <T>(schema, prompt) => (ctx) => Promise<T>
- createExtract(provider) creates an LLM-backed ExtractFn
- CursorAgent removes workdir config, uses ExtractFn to resolve workspace from ThreadContext at runtime
- buildAgentPrompt(ctx) — reads systemPrompt from ctx.currentRole

小橘 <xiaoju@shazhou.work>
2026-05-07 00:17:31 +00:00
xiaoju d351343aa8 fix: pure data role packages use 'echo no tests' instead of 'bun test'
workflow-role-coder and workflow-role-planner have no test files —
bun test exits 1 on empty. Changed to 'echo no tests' for clean CI.

小橘 <xiaoju@shazhou.work>
2026-05-06 14:33:16 +00:00
xiaoju 2482fb7e62 chore: remove all dryRun infrastructure
dryRun no longer needed — tests use mock agents + mock fetch instead.
Removes isDryRun from WorkflowFnOptions, dryRun from ExtractConfig,
dryRunMeta from RoleDefinition, --dry-run from CLI, and all related
plumbing in engine/worker/fork/extract.

小橘 <xiaoju@shazhou.work>
2026-05-06 14:25:44 +00:00
xiaoju fa9163e462 refactor: all-agentic architecture — roles as pure data, agent binding at runtime
BREAKING: Major architecture change.

- RoleDefinition = pure data (systemPrompt + schema + dryRunMeta)
- AgentFn = (ctx: ThreadContext) => Promise<string>, reads ctx.currentRole
- WorkflowDefinition decoupled from agents, bound via AgentBinding at runtime
- createWorkflow(def, binding, extract) replaces createRoleModerator
- Meta extraction moved into engine loop
- Delete workflow-util-role package (createRole, decorators, extract all gone)
- Role packages become pure data exports
- Agent packages updated to single-arg AgentFn

小橘 <xiaoju@shazhou.work>
2026-05-06 14:14:33 +00:00
xiaoju fce2bf7441 refactor: systemPrompt → pure string, threadId injected by agent layer
- CreateRoleArgs.systemPrompt simplified to string (no more ctx callback)
- buildAgentPrompt injects real ctx.threadId in Tools section
- All four roles compute prompt at construction time from config only
- Removed duplicate ctx.threadId injection from reviewer/committer prompts

小橘 <xiaoju@shazhou.work>
2026-05-06 12:25:41 +00:00
xiaoju c9cdfe37db refactor: extract planner and coder into standalone role packages
- New @uncaged/workflow-role-planner (phaseSchema, createPlannerRole)
- New @uncaged/workflow-role-coder (coderMetaSchema, createCoderRole)
- solve-issue template imports from new packages, keeps dry-run defaults

小橘 <xiaoju@shazhou.work>
2026-05-06 11:40:19 +00:00
xiaoju 45bb5af99a feat: per-role agent config + phased planner/coder in solve-issue template
- SolveIssueRolesConfig.agents allows per-role AgentFn overrides
- PlannerMeta now outputs phases (name, description, acceptance)
- CoderMeta reports completedPhase, works one phase at a time
- Moderator routes coder→coder until all phases done, then reviewer

小橘 <xiaoju@shazhou.work>
2026-05-06 11:35:45 +00:00
xiaoju c7b0beb6be refactor: unify RoleDefinition + WorkflowDefinition with description & schema
- Add RoleDefinition<Meta> = { description, run, schema } to core types
- WorkflowDefinition now carries description and RoleDefinition per role
- Add buildDescriptor() in core to derive WorkflowDescriptor from WorkflowDefinition
- Remove buildDescriptorFromRoles / RoleDescriptorInput from workflow-util-role
- Update solve-issue template, examples, and all tests

小橘 <xiaoju@shazhou.work>
2026-05-06 11:19:49 +00:00
xiaoju 79cf97e617 refactor: remove name from WorkflowDefinition, fix threadId type errors
WorkflowDefinition no longer carries 'name' — name is a registry-level
concern, not a bundle property. Also fixed all missing threadId in test
fixtures to match the updated ThreadContext type.

小橘 <xiaoju@shazhou.work>
2026-05-06 11:01:09 +00:00
xiaoju 196562c82a feat: committer distinguishes recoverable vs unrecoverable failures
CommitterMeta is now a 3-way discriminated union:
- committed: success with branch + commitSha
- recoverable: coder can fix (hook failures, lint, test, conflicts)
- unrecoverable: can't be fixed by code (auth, permissions, disk)

Moderator routes recoverable → coder for retry.
2026-05-06 10:53:17 +00:00
xiaoju 267ca73a1b fix: committer prompt — don't fix failures, just report them 2026-05-06 10:49:22 +00:00
xiaoju aee71fd2e7 refactor: simplify committer — minimal prompt, config simplified
- CommitterGitConfig → CommitterConfig { cwd } (remote dropped)
- threadId from ctx.threadId, not config
- Remove summarizeThreadContext — agent uses uncaged-workflow thread CLI
- Prompt doesn't teach git or require structured output
2026-05-06 10:39:09 +00:00
xiaoju 8ce1dd3cca refactor: remove JSON output requirement from reviewer prompt
Let extractMetaOrThrow do its job — agent outputs naturally,
LLM extract handles structuring.
2026-05-06 10:35:53 +00:00
xiaoju 4b27943871 refactor: simplify reviewer — discriminated union meta, minimal prompt
- ReviewerMeta → discriminated union: approved | rejected (with issues)
- Remove method-heavy prompt — agent has built-in code review capability
- Prompt now just says: project path, threadId for context, approve or reject
- No non-blocking suggestions (they get ignored anyway)
- ReviewerConfig simplified to just { cwd }
2026-05-06 10:29:48 +00:00
xiaoju 8d5b97c67e feat: add threadId to ThreadContext and WorkflowFnOptions
Agentic roles can now access ctx.threadId to query thread details
via uncaged-workflow CLI for richer context.
2026-05-06 10:23:25 +00:00
xiaoju 513c006ce3 refactor: rename workflow-role-llm → workflow-agent-llm
The package only contains createLlmAdapter (OpenAI chat → AgentFn),
which is an agent adapter, not a role. Aligns with workflow-agent-cursor
and workflow-agent-hermes naming.
2026-05-06 10:14:35 +00:00
xiaoju 2cd2a7d713 Merge pull request 'fix(workflow): add typecheck script and fix remaining type errors' (#18) from fix/review-feedback-and-typecheck into main 2026-05-06 10:11:47 +00:00
xingyue 94fa964b84 fix(workflow): add typecheck script and fix remaining type errors
- Add typecheck script (bunx tsc --build) to package.json
- Update check script to run typecheck before biome
- Fix mock fetch casts in test files (bun-types preconnect)
- Fix RequestInfo → Request | string | URL in llm-extract test
- Fix ThreadContext generic cast in solve-issue-template test
- Fix git-exec.ts missing return and module resolution
- Remove @types/node from workflow-role-committer
- Add exports field to workflow-util-agent/package.json
2026-05-06 18:10:25 +08:00
xiaoju 2c642b1a53 refactor: committer as pure agent role with discriminated union meta (#17)
- Remove hardcoded git commands (git-exec.ts deleted)
- CommitterMeta is now a discriminated union: committed | failed
- Agent handles git operations, committer extracts structured result
- Moderator can route on meta.status for retry logic

Closes #17
2026-05-06 10:06:27 +00:00
xiaoju c15a5554c0 refactor: replace gitExec with spawnCli from workflow-util-agent
Remove hand-rolled execFile + promisify, delegate to spawnCli.
Same throw-on-error interface, better error messages.
2026-05-06 09:51:06 +00:00
xiaoju e38852a761 Merge pull request 'fix(workflow): resolve type errors across all packages' (#16) from fix/type-errors-and-tsbuildinfo into main 2026-05-06 09:44:49 +00:00
xiaoju fd8f1f2491 refactor: move createRole to workflow-util-role
workflow-role-llm now only contains createLlmAdapter (OpenAI chat
completions → AgentFn). All role infrastructure lives in util-role.
2026-05-06 09:42:49 +00:00
xingyue 98b6153070 fix(workflow): resolve type errors across all packages and remove tsbuildinfo from tracking
- Add bun-types and @types/xxhashjs to root devDependencies
- Add types: ['bun-types'] to root and package tsconfigs
- Fix AST Node type narrowing in bundle-validator.ts with AcornNode type and narrowNode helper
- Fix generic ThreadContext variance in create-role-moderator.ts
- Add explicit parameter types in worker.ts
- Fix ChildProcess type in worker-spawn.ts to match spawn() stdio config
- Remove all tsconfig.tsbuildinfo from git tracking
- Add tsconfig.tsbuildinfo to .gitignore
2026-05-06 17:41:11 +08:00
94 changed files with 2760 additions and 2247 deletions
+2
View File
@@ -0,0 +1,2 @@
[test]
pathIgnorePatterns = ["dist/**"]
+259
View File
@@ -0,0 +1,259 @@
# @uncaged/workflow — Architecture
**Last updated:** 2026-05-06 by 小橘 🍊(NEKO Team)
---
## Overview
A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32). No daemon — processes start on demand and exit when done.
## Package Structure
| Package | npm Name | Purpose |
|---------|----------|---------|
| `workflow` | `@uncaged/workflow` | Core: types, engine, ExtractFn, hash/ULID/registry |
| `cli-workflow` | `@uncaged/cli-workflow` | CLI: `uncaged-workflow` command |
| `workflow-agent-cursor` | `@uncaged/workflow-agent-cursor` | Cursor CLI agent (extracts workspace from ctx) |
| `workflow-agent-hermes` | `@uncaged/workflow-agent-hermes` | Hermes CLI agent |
| `workflow-agent-llm` | `@uncaged/workflow-agent-llm` | OpenAI-compatible LLM agent |
| `workflow-role-planner` | `@uncaged/workflow-role-planner` | Pure data: phased planning prompt + schema |
| `workflow-role-coder` | `@uncaged/workflow-role-coder` | Pure data: coding prompt + schema |
| `workflow-role-reviewer` | `@uncaged/workflow-role-reviewer` | Pure data: code review prompt + schema |
| `workflow-role-committer` | `@uncaged/workflow-role-committer` | Pure data: git commit prompt + schema |
| `workflow-template-solve-issue` | `@uncaged/workflow-template-solve-issue` | Composes roles + moderator into a complete workflow |
| `workflow-util-agent` | `@uncaged/workflow-util-agent` | `buildAgentPrompt` + `spawnCli` utilities |
Monorepo with **bun workspace**, `workspace:*` protocol.
## Core Types
```typescript
// --- Sentinel values ---
const START = "__start__";
const END = "__end__";
// --- RoleMeta: maps role names → their meta types ---
type RoleMeta = Record<string, Record<string, unknown>>;
// --- Role Definition: pure data, no execution logic ---
type RoleDefinition<Meta> = {
description: string; // human-readable
systemPrompt: string; // given to agent
extractPrompt: string; // given to extractor
schema: z.ZodType<Meta>; // meta shape (Zod v4)
};
// --- Workflow Definition: pure data, no agent binding ---
type WorkflowDefinition<M extends RoleMeta> = {
description: string;
roles: { [K in keyof M & string]: RoleDefinition<M[K]> };
moderator: Moderator<M>;
};
// --- Agent: raw string output, reads role info from context ---
type AgentFn = (ctx: AgentContext) => Promise<string>;
// --- Agent Binding: runtime assignment ---
type AgentBinding = {
agent: AgentFn;
overrides?: Partial<Record<string, AgentFn>>;
};
// --- Extract: structured data from context ---
type ExtractFn = <T>(schema: z.ZodType<T>, prompt: string, ctx: ExtractContext) => Promise<T>;
// --- Moderator: pure routing function ---
type Moderator<M extends RoleMeta> = (ctx: ModeratorContext<M>) => (keyof M & string) | typeof END;
// --- Composition ---
// createWorkflow(def, binding, extract) => WorkflowFn
```
## Three-Phase Engine Loop
Each role execution has three distinct phases with progressive context:
```
┌─→ Phase 1: MODERATOR
│ Context: ModeratorContext { threadId, start, steps }
│ Action: moderator(ctx) → role name | END
│ Phase 2: AGENT
│ Context: AgentContext = ModeratorCtx + { currentRole: { name, systemPrompt } }
│ Action: agent(ctx) → raw string
│ Phase 3: EXTRACTOR
│ Context: ExtractContext = AgentCtx + { agentContent }
│ Action: extract(schema, extractPrompt, ctx) → typed meta
│ Merge: RoleStep { role, content, meta, timestamp }
│ Append to steps
└─────────────────────────────────────────────────────┘
```
### Context Types (progressive)
```typescript
// Phase 1: Moderator sees accumulated state only
type ModeratorContext<M> = {
threadId: string;
start: StartStep;
steps: RoleStep<M>[];
};
// Phase 2: Agent knows its identity
type AgentContext<M> = ModeratorContext<M> & {
currentRole: { name: string; systemPrompt: string };
};
// Phase 3: Extractor has agent output
type ExtractContext<M> = AgentContext<M> & {
agentContent: string;
};
// ThreadContext is an alias for AgentContext (backward compat)
type ThreadContext<M> = AgentContext<M>;
```
### Key Properties
- **Moderator is synchronous and pure** — no I/O, no state mutation
- **Agent gets context, not instructions** — reads `ctx.currentRole.systemPrompt`
- **Extractor is a general tool** — not limited to post-agent extraction; agents can use it too (e.g. Cursor agent extracts workspace path before execution)
- **extractPrompt is a call parameter**, not context state — different callers use different prompts
## Agent Information Sources
An agent has exactly three information sources:
1. **Prior knowledge** — LLM training, agent memory, agent skills
2. **Thread context**`AgentContext` (start, steps, currentRole)
3. **Derived information** — from 1 & 2 (e.g. tool calls, shell commands)
No hidden environment parameters. If an agent needs something (like a workspace path), it extracts it from context using `ExtractFn`.
## Bundle Contract
A workflow bundle is a single `.esm.js` file with two named exports:
```typescript
// Named exports (no default export)
export const descriptor: WorkflowDescriptor;
export const run: WorkflowFn;
type WorkflowFn = (
input: { prompt: string; steps: RoleOutput[] },
options: { threadId: string; maxRounds: number },
) => AsyncGenerator<RoleOutput, WorkflowResult>;
```
### Constraints
- Single `.esm.js` file
- No dynamic `import()`
- All static imports must be Node built-in modules only
- XXH64 hash (Crockford Base32) = globally unique version ID
### Why AsyncGenerator?
- Each `yield` → engine writes to `.data.jsonl`, checks abort/pause
- `return` → engine marks thread complete
- Fork = pass historical steps as `input.steps` to a new generator
- Zero injection — bundle doesn't import from the engine
## Storage Layout
```
~/.uncaged/workflow/
├── bundles/
│ ├── C9NMV6V2TQT81.esm.js # Crockford Base32 of XXH64
│ └── C9NMV6V2TQT81.yaml # Role descriptor
├── logs/ # One folder per bundle hash
│ └── C9NMV6V2TQT81/
│ ├── 01KQXKW…YG.data.jsonl # Thread state
│ └── 01KQXKW…YG.info.jsonl # Debug log
└── workflow.yaml # Registry
```
### ID Encoding: Crockford Base32
- Case-insensitive, filesystem-safe, no ambiguous chars (0/O, 1/I/L)
- Bundle hash: XXH64 → 13-char
- Thread ID: ULID → 26-char (10 timestamp + 16 random)
### Registry (`workflow.yaml`)
```yaml
workflows:
solve-issue:
hash: "C9NMV6V2TQT81"
timestamp: 1714963200000
history:
- hash: "A7BKR3M1NPQ40"
timestamp: 1714876800000
```
### Thread JSONL
**`.data.jsonl`** — Line 1: start record, Line 2+: role outputs
```jsonc
// Start record
{ "name": "solve-issue", "hash": "C9NMV6V2TQT81", "threadId": "01KQXKW…",
"parameters": { "prompt": "Fix bug #3", "options": { "maxRounds": 5 } },
"timestamp": 1714963200000 }
// Role output
{ "role": "planner", "content": "...", "meta": { "phases": [...] }, "timestamp": ... }
```
**`.info.jsonl`** — Structured debug log
```jsonc
{ "tag": "4KNMR2PX", "content": "Loading bundle...", "timestamp": ... }
```
Tags are 8-char Crockford Base32 (40-bit random), one per call site. `grep "4KNMR2PX"` → instant code location.
## Execution Model
- **No daemon.** `uncaged-workflow run <name>` starts a worker process
- Same bundle's threads share one process (memory efficiency)
- Process exits when all threads complete
- Thread termination via IPC within the process
## CLI Commands
| Priority | Command | Description |
|----------|---------|-------------|
| P1 | `add <name> <file.esm.js>` | Register a bundle |
| P1 | `list` | List registered workflows |
| P1 | `show <name>` | Show workflow details |
| P1 | `remove <name>` | Remove a workflow |
| P1 | `run <name> [--prompt] [--max-rounds]` | Start a thread |
| P1 | `threads [name]` | List threads |
| P1 | `thread <id>` | Show thread state |
| P1 | `thread rm <id>` | Delete a thread |
| P1 | `ps` | List running threads |
| P1 | `kill <thread-id>` | Terminate a running thread |
| P2 | `history <name>` | Show version history |
| P2 | `rollback <name> [hash]` | Switch to a previous version |
| P2 | `pause <thread-id>` | Pause a running thread |
| P2 | `resume <thread-id>` | Resume a paused thread |
| P3 | `fork <thread-id> [--from-role <role>]` | Fork from historical state |
All commands implemented and tested. ✅
## Design Decisions
| Decision | Rationale |
|----------|-----------|
| **Role = pure data** | Decouples definition from execution; same role with different agents |
| **Agent bound at runtime** | WorkflowDefinition is reusable; agent choice is deployment concern |
| **Three-phase context** | Each phase sees only what it needs; clean separation |
| **ExtractFn as general tool** | Agents use it for pre-execution extraction; engine uses it for meta |
| **Single-file ESM** | Hash = version, no dependency hell, self-contained |
| **No daemon** | OS handles process lifecycle; unnecessary complexity |
| **Crockford Base32** | Filesystem-safe, readable, compact |
| **No concurrency in registry** | Different workflows have different constraints; belongs at workflow/role level |
| **No dryRun** | Tests use mock agents + mock fetch; simpler architecture |
+315
View File
@@ -0,0 +1,315 @@
# Workflow-as-Agent Implementation Plan
> **For Hermes:** Use subagent-driven-development skill to implement this plan task-by-task.
**Goal:** Enable workflows to invoke other workflows as agents, backed by global CAS and refs tracking.
**Architecture:** Migrate CAS from thread-local to global (`~/.uncaged/workflow/cas/`), add `refs` to RoleStep for GC traceability, then build `workflowAsAgent(name)` factory that resolves workflow name → bundle via registry and spawns a child thread.
**Tech Stack:** TypeScript, Bun, Zod v4, monorepo with `packages/`
**Issue:** https://git.shazhou.work/uncaged/workflow/issues/25
---
## Phase 1: Global CAS Migration
Move CAS storage from `<threadDir>/<threadId>.cas/` to `~/.uncaged/workflow/cas/` (global, content-addressed, immutable). This is a **breaking change** — thread-local `.cas/` directories are abandoned.
### Task 1.1: Add `globalCasDir` helper to `storage-root.ts`
**Objective:** Provide a single function that returns the global CAS directory path.
**Files:**
- Modify: `packages/workflow/src/storage-root.ts`
- Test: `packages/workflow/__tests__/storage-root.test.ts`
**Implementation:**
```typescript
// storage-root.ts — add export
export function getGlobalCasDir(storageRoot?: string): string {
const root = storageRoot ?? getDefaultWorkflowStorageRoot();
return join(root, "cas");
}
```
Export from `packages/workflow/src/index.ts`.
### Task 1.2: Update `cmd-cas.ts` to use global CAS
**Objective:** CLI `cas get/put/list/rm` no longer needs threadId for storage location — CAS is global. But keep threadId in CLI for backward compat of planner/coder prompts (they pass threadId).
**Files:**
- Modify: `packages/cli-workflow/src/cmd-cas.ts`
**Changes:**
- `resolveCasDir` → use `getGlobalCasDir(storageRoot)` instead of deriving from thread data path
- `cmdCasPut` / `cmdCasGet` / `cmdCasList` / `cmdCasRm`: threadId is still accepted (prompts pass it) but storage goes to global dir
- Remove the `resolveThreadDataPath` dependency for CAS operations — thread doesn't need to exist to read CAS
```typescript
import { createThreadCas, getGlobalCasDir } from "@uncaged/workflow";
export async function cmdCasGet(
storageRoot: string,
_threadId: string, // kept for CLI compat, not used for path
hash: string,
): Promise<Result<string, string>> {
const cas = createThreadCas(getGlobalCasDir(storageRoot));
const content = await cas.get(hash);
if (content === null) {
return err(`cas entry not found: ${hash}`);
}
return ok(content);
}
// ... same pattern for put/list/rm
```
### Task 1.3: Update `cmd-thread.ts` — thread rm no longer deletes `.cas/`
**Objective:** Since CAS is global, `thread rm` should NOT delete CAS entries. CAS cleanup is GC's job.
**Files:**
- Modify: `packages/cli-workflow/src/cmd-thread.ts`
- Check: remove any `rmdir` / `unlink` of `<threadId>.cas/` directory
### Task 1.4: Rename `createThreadCas` → `createCasStore`
**Objective:** The name `createThreadCas` is misleading now. Rename to `createCasStore`.
**Files:**
- Modify: `packages/workflow/src/cas.ts` — rename function
- Modify: `packages/workflow/src/index.ts` — update export (keep `createThreadCas` as deprecated alias for one release)
- Modify: all consumers (`cmd-cas.ts`)
### Task 1.5: Update tests
**Objective:** All CAS-related tests use global dir instead of thread-local.
**Files:**
- Modify: `packages/cli-workflow/__tests__/commands.test.ts`
- Verify: `bun test` passes
### Task 1.6: Clean up old thread-local `.cas/` references
**Objective:** Remove dead code that creates/reads thread-local `.cas/` directories.
**Files:**
- Search all `*.ts` for `.cas` path construction patterns
- Remove orphaned helpers
---
## Phase 2: RoleStep `refs` Tracking
Add `refs: string[]` to persisted role steps so GC can trace which CAS entries are alive.
### Task 2.1: Add `refs` to `RoleOutput` and engine persistence
**Objective:** Every role step can declare which CAS hashes it produced or consumed.
**Files:**
- Modify: `packages/workflow/src/types.ts`
- Modify: `packages/workflow/src/engine.ts`
**Changes to `types.ts`:**
```typescript
export type RoleOutput = {
role: string;
content: string;
meta: Record<string, unknown>;
refs: string[]; // CAS hashes produced/consumed by this step
};
```
**Changes to `engine.ts`:**
- `appendDataLine` for role steps: include `refs` field (default `[]` if not provided)
### Task 2.2: Auto-populate refs from meta hashes
**Objective:** The engine should automatically extract CAS hashes from `meta` to populate `refs`, so roles don't need to manually track them.
**Strategy:** After meta extraction, walk the meta object and collect any string that looks like a CAS hash (Crockford Base32, 13 chars). This is a heuristic but works because CAS hashes are distinctive.
Alternative (simpler): Let each `RoleDefinition` optionally declare a `extractRefs(meta: M) => string[]` function. For planner, this returns `meta.phases.map(p => p.hash)`. For coder, `[meta.completedPhase]`.
**Recommended:** The explicit `extractRefs` approach — no magic, no false positives.
**Files:**
- Modify: `packages/workflow/src/types.ts` — add optional `extractRefs` to `RoleDefinition`
- Modify: `packages/workflow/src/create-workflow.ts` — call `extractRefs` after meta extraction, set on `RoleOutput.refs`
- Modify: `packages/workflow-role-planner/src/planner.ts` — implement `extractRefs`
- Modify: `packages/workflow-role-coder/src/coder.ts` — implement `extractRefs`
```typescript
// types.ts — RoleDefinition addition
export type RoleDefinition<Meta extends Record<string, unknown>> = {
description: string;
systemPrompt: string;
extractPrompt: string;
schema: z.ZodType<Meta>;
extractRefs?: (meta: Meta) => string[]; // CAS hashes to track
};
// planner.ts
extractRefs: (meta) => meta.phases.map(p => p.hash),
// coder.ts
extractRefs: (meta) => [meta.completedPhase],
```
### Task 2.3: Update fork logic to preserve refs
**Objective:** When forking a thread, `refs` from historical steps must be carried over.
**Files:**
- Modify: `packages/workflow/src/fork-thread.ts`
- Verify: `ForkHistoricalStep` / `PrefilledDiskStep` include `refs`
### Task 2.4: Tests for refs tracking
**Files:**
- Add: `packages/workflow/__tests__/refs-tracking.test.ts`
- Verify: refs appear in `.data.jsonl` output
---
## Phase 3: CAS Garbage Collection
### Task 3.1: Implement `gc.ts` in `@uncaged/workflow`
**Objective:** Mark-and-sweep GC — scan all thread `.data.jsonl` files, collect `refs`, delete orphaned CAS entries.
**Files:**
- Create: `packages/workflow/src/gc.ts`
- Export from: `packages/workflow/src/index.ts`
```typescript
export type GcResult = {
scannedThreads: number;
activeRefs: number;
deletedEntries: number;
deletedHashes: string[];
};
export async function garbageCollectCas(storageRoot: string): Promise<GcResult> {
// 1. Find all .data.jsonl files under storageRoot
// 2. Parse each, flatMap step.refs → Set<string>
// 3. List all CAS entries via createCasStore(globalCasDir).list()
// 4. Delete entries not in active set
// 5. Return stats
}
```
### Task 3.2: Add `uncaged-workflow gc` CLI command
**Files:**
- Create: `packages/cli-workflow/src/cmd-gc.ts`
- Modify: `packages/cli-workflow/src/cli-dispatch.ts` — add `gc` subcommand
### Task 3.3: Run GC on `thread rm`
**Files:**
- Modify: `packages/cli-workflow/src/cmd-thread.ts` — after deleting thread data, optionally run GC
### Task 3.4: Tests for GC
**Files:**
- Create: `packages/cli-workflow/__tests__/gc-cli.test.ts`
---
## Phase 4: `workflowAsAgent` Factory
### Task 4.1: Create `workflowAsAgent` in `@uncaged/workflow`
**Objective:** Factory function that takes a workflow name, resolves to bundle, returns an `AgentFn`.
**Files:**
- Create: `packages/workflow/src/workflow-as-agent.ts`
- Export from: `packages/workflow/src/index.ts`
```typescript
import type { AgentFn } from "./types.js";
export type WorkflowAsAgentOptions = {
storageRoot?: string;
};
export function workflowAsAgent(
workflowName: string,
options?: WorkflowAsAgentOptions,
): AgentFn {
return async (ctx) => {
const storageRoot = options?.storageRoot ?? getDefaultWorkflowStorageRoot();
// 1. Read registry → resolve name to bundle hash + path
const registry = await readWorkflowRegistry(storageRoot);
const entry = getRegisteredWorkflow(registry, workflowName);
if (entry === null) {
return `ERROR: workflow "${workflowName}" not found in registry`;
}
// 2. Load bundle
const bundlePath = join(storageRoot, "bundles", `${entry.hash}.esm.js`);
const bundleExports = await extractBundleExports(bundlePath);
// 3. Create child thread input from ctx.start.content (parent prompt)
const input: ThreadInput = {
prompt: ctx.start.content,
steps: [],
};
// 4. Generate child threadId
const childThreadId = generateUlid();
// 5. Execute — collect all yields, return final content
const io: ExecuteThreadIo = { ... };
const result = await executeThread(bundleExports.run, workflowName, input, ...);
// 6. Return summary as agent content
return result.summary;
};
}
```
### Task 4.2: System-level depth limit
**Objective:** Prevent infinite recursion. Track depth via thread metadata, enforce a global max (default 3, configurable in `workflow.yaml`).
**Files:**
- Modify: `packages/workflow/src/types.ts` — add `depth` to `WorkflowFnOptions`
- Modify: `packages/workflow/src/workflow-as-agent.ts` — increment depth, check limit
- Modify: registry or config types for `maxDepth` setting
### Task 4.3: Tests for workflowAsAgent
**Files:**
- Create: `packages/workflow/__tests__/workflow-as-agent.test.ts`
- Test: name resolution, depth limit, child thread execution
### Task 4.4: Integration test — nested workflow
**Objective:** Create a minimal test workflow that calls another workflow via `workflowAsAgent`.
**Files:**
- Create: `packages/workflow/__tests__/workflow-as-agent-integration.test.ts`
---
## Execution Order
```
Phase 1 (Global CAS) → Phase 2 (refs) → Phase 3 (GC) → Phase 4 (workflowAsAgent)
```
Each phase is independently mergeable. Phase 3 depends on Phase 2 (needs refs to know what's alive). Phase 4 depends on Phase 1 (global CAS for cross-thread sharing).
## Breaking Changes
- CAS storage location moves from `<thread>.cas/` to `~/.uncaged/workflow/cas/`
- `RoleOutput` gains required `refs: string[]` field
- Existing threads with thread-local CAS will lose access to old CAS data (acceptable — those are short-lived workflow artifacts)
- `createThreadCas` renamed to `createCasStore` (alias kept temporarily)
-438
View File
@@ -1,438 +0,0 @@
# RFC-001: Workflow Engine Design
**Author:** 小橘 🍊(NEKO Team)
**Date:** 2026-05-06
**Status:** Draft
---
## 1. Package Structure
| Package | npm Name | Binary |
|---------|----------|--------|
| Core lib | `@uncaged/workflow` | — |
| CLI | `@uncaged/cli-workflow` | `uncaged-workflow` |
Future: `@uncaged/cli` umbrella, invoke via `uncaged workflow <subcommand>`.
Monorepo uses **bun workspace**.
## 2. Workflow Physical Implementation
A **Workflow** is a single-file ESM module that **named-exports** an **AsyncGenerator** function as `run` and workflow metadata as `descriptor`:
```typescript
/** What each yield produces — one role's output. */
type RoleOutput = {
role: string;
content: string;
meta: Record<string, unknown>;
};
/** What the generator returns when done. */
type WorkflowResult = {
returnCode: number;
summary: string;
};
/** Input to a workflow — prompt + optional historical steps for fork/resume. */
type ThreadInput = {
prompt: string;
steps: RoleOutput[]; // [] for new thread, pre-filled for fork/resume
};
/** The bundle contract — an AsyncGenerator, not a Promise. */
type WorkflowFn = (
input: ThreadInput,
options: { isDryRun: boolean; maxRounds: number }
) => AsyncGenerator<RoleOutput, WorkflowResult>;
```
### Why AsyncGenerator?
The workflow **yields** each role output instead of writing to an injected writer or
exporting a framework-specific shape:
```typescript
// Example bundle — zero framework dependency (named exports only)
export const descriptor = {
description: "Fix auth bug",
roles: {
planner: {
description: "Plans the fix",
schema: { type: "object", properties: { files: { type: "array", items: { type: "string" } } } },
},
coder: {
description: "Implements the plan",
schema: { type: "object", properties: { diff: { type: "string" } } },
},
},
};
export const run = async function* (input, options) {
const plan = await callLLM("plan: " + input.prompt);
yield { role: "planner", content: plan, meta: { files: ["src/auth.ts"] } };
const code = await callLLM("implement: " + plan);
yield { role: "coder", content: code, meta: { diff: "..." } };
return { returnCode: 0, summary: "Fixed auth bug" };
};
```
**Engine controls the loop**, not the bundle:
- Each `yield` → engine writes to `.data.jsonl`, checks `AbortSignal`, handles pause/resume
- `return` → engine writes the final result, marks thread complete
- **Fork** = read historical steps from `.data.jsonl`, pass as `input.steps` to a new generator
- **Zero injection** — the bundle doesn't import or receive anything from the engine
### Fork/Resume via ThreadInput
When using the `createRoleModerator` helper, fork is **naturally handled**:
```typescript
// The moderator receives ThreadContext with historical steps
// It sees planner already ran → routes to coder automatically
const gen = workflow(
{ prompt: "fix bug #3", steps: [{ role: "planner", content: "...", meta: {} }] },
{ isDryRun: false, maxRounds: 10 }
);
// First yield will be coder's output, not planner's
```
No special replay logic needed — the moderator/role pattern inherently supports
resuming from any snapshot, because moderator routing is a pure function of the
accumulated steps.
This follows the **Dependency Inversion Principle**: the engine depends on the
generator protocol (a language primitive), not on a framework-specific `WorkflowDefinition`.
Bundles remain pure functions with no coupling to `@uncaged/workflow`.
### Relationship to Role/Moderator Pattern
The Role + Moderator pattern from Section 8 is one **implementation strategy** inside a
bundle, not the bundle contract itself. A helper like `createRoleModerator(roles, moderator)`
can produce the AsyncGenerator internally, but simple workflows can yield directly without
any framework types.
### Constraints
- Single `.esm.js` file
- Named exports `run` (callable AsyncGenerator workflow) and `descriptor` (metadata object)
- No default export
- No dynamic `import()`
- All static imports must be Node built-in modules only
This guarantees the file is self-contained, and its **XXH64 hash** (encoded as Crockford Base32) serves as a globally unique version identifier.
### Role Descriptor (`export const descriptor`)
The bundle **must** export a `descriptor` object describing roles for tooling/agent consumption.
Shape: `{ description: string, roles: Record<string, { description: string, schema: JSONSchema }> }`
When you register a bundle via `uncaged-workflow add`, the engine imports the module, validates `descriptor`, and writes `{hash}.yaml` next to `{hash}.esm.js` under `bundles/` (same serialized shape as below):
```yaml
description: "Workflow brief introduction"
roles:
planner:
description: "Analyzes the issue and creates a plan"
schema:
type: object
properties:
plan:
type: string
files:
type: array
items:
type: string
coder:
description: "Implements the plan"
schema:
type: object
properties:
diff:
type: string
```
Execution uses `run` only; YAML is for tooling and introspection.
## 3. Storage Layout
All data lives under `~/.uncaged/workflow/`:
```
~/.uncaged/workflow/
├── bundles/ # ESM bundles
│ ├── C9NMV6V2TQT81.esm.js # Crockford Base32 of XXH64 hash
│ └── C9NMV6V2TQT81.yaml # Role descriptor (from bundle export, at register time)
├── logs/ # Thread data, one folder per bundle hash
│ └── C9NMV6V2TQT81/
│ ├── 01KQXKW18CT8G75T53R8F4G7YG.data.jsonl
│ └── 01KQXKW18CT8G75T53R8F4G7YG.info.jsonl
└── workflow.yaml # Registry
```
**Not** a git repo. **Not** an npm package. Bundles are self-contained single files.
### ID Encoding
All IDs use **Crockford Base32**:
- Better readability than Base64
- Higher density than hex (shorter filenames)
- ULID: 10 chars timestamp (high 2 bits zero-padded for future use) + 16 chars random
## 4. Registry (`workflow.yaml`)
```yaml
workflows:
solve-issue:
hash: "C9NMV6V2TQT81"
timestamp: 1714963200000
history:
- hash: "A7BKR3M1NPQ40"
timestamp: 1714876800000
- hash: "X2FGH8J4KLM56"
timestamp: 1714790400000
```
Type:
```typescript
{
workflows: Record<string, {
hash: string; // Crockford Base32 of current XXH64
timestamp: number;
history: { hash: string; timestamp: number }[];
}>
}
```
No concurrency control or timeout settings in the registry — those belong to each workflow/role/adapter.
## 5. Thread JSONL Format
### `.data.jsonl` — Thread State
**Line 1: Start record**
```jsonc
{
"name": "solve-issue",
"hash": "C9NMV6V2TQT81",
"threadId": "01KQXKW18CT8G75T53R8F4G7YG",
"parameters": {
"prompt": "Fix the login redirect bug in #3",
"options": {
"isDryRun": false,
"maxRounds": 5
}
},
"timestamp": 1714963200000
}
```
**Line 2+: Role outputs**
```jsonc
{
"role": "planner",
"content": "Plan: modify auth middleware...",
"meta": { "plan": "...", "files": ["src/auth.ts"] },
"timestamp": 1714963201000
}
```
### `.info.jsonl` — Debug Log
```jsonc
{
"tag": "4KNMR2PX", // 40-bit random, Crockford Base32 (8 chars)
"content": "Loading workflow bundle...",
"timestamp": 1714963200500
}
```
## 6. Execution Model
- **No daemon.** `uncaged-workflow run <name>` starts a worker process.
- Same bundle's threads share one process (memory efficiency).
- Process exits automatically when all threads complete.
- Thread termination requires **IPC** within the process (not just kill PID).
## 7. CLI Requirements
### P1 (Must Have)
| Command | Description |
|---------|-------------|
| `uncaged-workflow add <name> <file.esm.js> [--types <path>]` | Register a compiled `.esm.js` bundle (descriptor extracted from `export const descriptor`) |
| `uncaged-workflow list` | List registered workflows |
| `uncaged-workflow show <name>` | Show workflow details |
| `uncaged-workflow remove <name>` | Remove a workflow |
| `uncaged-workflow run <name> [--prompt] [--dry-run] [--max-rounds]` | Start a thread |
| `uncaged-workflow threads [name]` | List threads (optionally filter by workflow) |
| `uncaged-workflow thread <id>` | Show thread state |
| `uncaged-workflow thread rm <id>` | Delete a thread |
| `uncaged-workflow ps` | List running threads |
| `uncaged-workflow kill <thread-id>` | Terminate a running thread (via IPC) |
### P2 (Should Have)
| Command | Description |
|---------|-------------|
| `uncaged-workflow history <name>` | Show version history |
| `uncaged-workflow rollback <name> [hash]` | Switch to a previous version |
| `uncaged-workflow pause <thread-id>` | Pause a running thread |
| `uncaged-workflow resume <thread-id>` | Resume a paused thread |
### P3 (Nice to Have)
| Command | Description |
|---------|-------------|
| `uncaged-workflow fork <thread-id> [--from-role <role>]` | Fork from a historical thread state |
## 8. Role/Moderator Pattern (Helper, Not Contract)
The bundle contract is the AsyncGenerator from Section 2. The Role + Moderator pattern
below is a **convenience helper** for the common case of multi-role workflows with a
routing function. It lives in `@uncaged/workflow` as an optional utility.
### Helper Function
```typescript
function createRoleModerator<M extends RoleMeta>(
def: { roles: { [K in keyof M & string]: Role<M[K]> }; moderator: Moderator<M> }
): WorkflowFn; // returns (input: ThreadInput, options) => AsyncGenerator
```
Usage in a bundle:
```typescript
import { createRoleModerator, END } from "@uncaged/workflow";
export const descriptor = {
description: "Example multi-role workflow",
roles: {
planner: { description: "Plans work", schema: {} },
coder: { description: "Writes code", schema: {} },
},
};
export const run = createRoleModerator({
roles: { planner, coder },
moderator(ctx) { return ctx.steps.length === 0 ? "planner" : END; },
});
// Accepts ThreadInput — fork with pre-filled steps works automatically
```
### Supporting Types
```typescript
/** Sentinel values for automaton control flow. */
const START = "__start__" as const;
const END = "__end__" as const;
/** Maps role names → their meta types. Single generic drives all inference. */
type RoleMeta = Record<string, Record<string, unknown>>;
/** Typed output of a Role execution. */
type RoleResult<Meta> = { content: string; meta: Meta };
/** Engine start frame: initial prompt + thread identity. */
type StartStep = {
role: START;
content: string; // the user prompt
meta: { maxRounds: number; threadId: string };
timestamp: number;
};
/** A completed role step in the thread. */
type RoleStep<M extends RoleMeta> = {
[K in keyof M & string]: { role: K; meta: M[K]; content: string; timestamp: number };
}[keyof M & string];
/** Thread-scoped context passed to roles and moderator. */
type ThreadContext<M extends RoleMeta = RoleMeta> = {
threadId: string;
start: StartStep;
steps: RoleStep<M>[];
};
/**
* A Role — receives full thread context, returns typed content + meta.
* Implementation can be an agent, LLM call, script, HTTP request, etc.
*/
type Role<Meta> = (ctx: ThreadContext) => Promise<RoleResult<Meta>>;
/**
* An Agent — raw string output interface for LLM/CLI adapters.
* Structured meta is extracted by the role's extract layer.
*/
type AgentFn = (ctx: ThreadContext, systemPrompt: string) => Promise<string>;
/**
* The Moderator — a pure routing function.
* Receives the full thread context (start + all prior steps).
* On initial call, `steps` is empty.
* Returns the next role name or END to terminate.
*/
type Moderator<M extends RoleMeta> = (ctx: ThreadContext<M>) => (keyof M & string) | END;
/** Complete workflow definition as authored by users. */
type WorkflowDefinition<M extends RoleMeta> = {
name: string;
roles: { [K in keyof M & string]: Role<M[K]> };
moderator: Moderator<M>;
};
```
### Execution Flow (when using createRoleModerator)
```
START (prompt) → Moderator → Role A → Moderator → Role B → ... → Moderator → END
```
1. Engine creates a `StartStep` with the user prompt and maxRounds
2. Moderator is called with `steps = []`, returns the first role name
3. Role executes, appends a `RoleStep` to the thread
4. Moderator is called again with updated steps, returns next role or END
5. Repeat until END or maxRounds reached
### Responsibilities
| Component | Responsibility | Purity |
|-----------|---------------|--------|
| **Moderator** | Route to next role based on thread state | Pure function, no side effects |
| **Role** | Execute a step (call LLM, run script, etc.) | Async, may have side effects |
| **AgentFn** | Low-level LLM/CLI invocation adapter | Async, side effects |
### Key Constraints
- Moderator is **synchronous and pure** — no I/O, no state mutation
- Roles receive the **full thread context** (not just the last message)
- Round count = `steps.length`; max rounds in `start.meta.maxRounds`
- The `meta` field on each step is **typed per role** via the `RoleMeta` generic
## 9. Design Decisions & Rationale
### Why single-file ESM?
- Hash = version. No ambiguity.
- No dependency hell. Self-contained.
- Simple to distribute, store, and verify.
### Why no daemon?
- Unnecessary complexity for process-per-bundle model.
- OS process management (systemd, etc.) handles restarts.
- IPC within process handles thread lifecycle.
### Why Crockford Base32?
- Case-insensitive, filesystem-safe.
- No ambiguous characters (0/O, 1/I/L).
- More compact than hex (13 chars for 64-bit vs 16).
### Why not control concurrency in registry?
- Different workflows have different constraints.
- Same workflow may allow cross-project concurrency but not intra-project.
- Concurrency belongs at workflow/role/adapter level.
+29 -9
View File
@@ -1,9 +1,14 @@
import { createRoleModerator, END, type Role } from "@uncaged/workflow";
import { createExtract, createWorkflow, END, type RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
type Roles = {
greeter: { greeting: string };
};
const greeterMetaSchema = z.object({
greeting: z.string(),
});
export const descriptor = {
description: "A simple hello world workflow",
roles: {
@@ -18,14 +23,29 @@ export const descriptor = {
},
};
const greeter: Role<Roles["greeter"]> = async (ctx) => ({
content: `Hello, ${ctx.start.content}`,
meta: { greeting: "Hello!" },
const greeter: RoleDefinition<Roles["greeter"]> = {
description: "Generates a greeting",
systemPrompt: "You greet the user briefly.",
extractPrompt: "Extract the greeting string produced for the user.",
schema: greeterMetaSchema,
extractRefs: null,
};
const extract = createExtract({
baseUrl: "http://127.0.0.1:9",
apiKey: "",
model: "",
});
export const run = createRoleModerator<Roles>({
roles: { greeter },
moderator(ctx) {
return ctx.steps.length === 0 ? "greeter" : END;
export const run = createWorkflow<Roles>(
{
roles: { greeter },
moderator(ctx) {
return ctx.steps.length === 0 ? "greeter" : END;
},
},
});
{
agent: async (ctx) => `Hello, ${ctx.start.content}`,
},
extract,
);
+2 -1
View File
@@ -3,6 +3,7 @@
"private": true,
"type": "module",
"dependencies": {
"@uncaged/workflow": "workspace:*"
"@uncaged/workflow": "workspace:*",
"zod": "^4.0.0"
}
}
@@ -3,8 +3,9 @@ import { mkdir, mkdtemp, readFile, rm, unlink, writeFile } from "node:fs/promise
import { tmpdir } from "node:os";
import { join } from "node:path";
import { getRegisteredWorkflow, readWorkflowRegistry } from "@uncaged/workflow";
import { getGlobalCasDir, getRegisteredWorkflow, readWorkflowRegistry } from "@uncaged/workflow";
import { cmdAdd } from "../src/cmd-add.js";
import { cmdCasGet, cmdCasList, cmdCasPut, cmdCasRm } from "../src/cmd-cas.js";
import { cmdHistory } from "../src/cmd-history.js";
import { cmdList, formatListLines } from "../src/cmd-list.js";
import { cmdRemove } from "../src/cmd-remove.js";
@@ -371,6 +372,37 @@ export const run = async function* (input) {
expect(bad.ok).toBe(false);
});
test("cas put/get/list/rm use global cas dir (thread id not required for storage)", async () => {
const put = await cmdCasPut(storageRoot, "nonexistent-thread-id", "phase doc");
expect(put.ok).toBe(true);
if (!put.ok) {
return;
}
const hash = put.value;
const blobPath = join(getGlobalCasDir(storageRoot), `${hash}.txt`);
expect(await readFile(blobPath, "utf8")).toBe("phase doc");
const got = await cmdCasGet(storageRoot, "other-thread", hash);
expect(got.ok).toBe(true);
if (!got.ok) {
return;
}
expect(got.value).toBe("phase doc");
const listed = await cmdCasList(storageRoot, "another-thread");
expect(listed.ok).toBe(true);
if (!listed.ok) {
return;
}
expect(listed.value).toContain(hash);
const removed = await cmdCasRm(storageRoot, "rm-thread", hash);
expect(removed.ok).toBe(true);
const missing = await cmdCasGet(storageRoot, "after-rm", hash);
expect(missing.ok).toBe(false);
});
test("rollback rejects missing bundle file for target hash", async () => {
const bundleDir = join(storageRoot, "src");
await mkdir(bundleDir, { recursive: true });
@@ -98,7 +98,7 @@ describe("cli fork", () => {
}
const hash = added.value.hash;
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -148,7 +148,7 @@ describe("cli fork", () => {
}
const hash = added.value.hash;
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -198,7 +198,7 @@ describe("cli fork", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -4,7 +4,9 @@ import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";
import { getGlobalCasDir } from "@uncaged/workflow";
import { cmdAdd } from "../src/cmd-add.js";
import { cmdCasPut } from "../src/cmd-cas.js";
import { cmdKill } from "../src/cmd-kill.js";
import { cmdPause } from "../src/cmd-pause.js";
import { cmdPs } from "../src/cmd-ps.js";
@@ -12,7 +14,7 @@ import { cmdResume } from "../src/cmd-resume.js";
import { cmdRun } from "../src/cmd-run.js";
import { cmdThreadRemove, cmdThreadShow } from "../src/cmd-thread.js";
import { cmdThreads } from "../src/cmd-threads.js";
import { pathExists } from "../src/fs-utils.js";
import { pathExists, readTextFileIfExists } from "../src/fs-utils.js";
import { addCliArgs } from "./bundle-fixture.js";
const threadFixtureDescriptor = `export const descriptor = {
@@ -138,7 +140,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -175,6 +177,55 @@ describe("cli thread commands", () => {
expect(await pathExists(dataPath)).toBe(false);
});
test("thread rm does not delete global cas blobs for that thread id", async () => {
const bundleDir = join(storageRoot, "src");
await mkdir(bundleDir, { recursive: true });
const bundlePath = join(bundleDir, "demo.esm.js");
await writeFile(bundlePath, fastBundleSource, "utf8");
const added = await cmdAdd(storageRoot, addCliArgs("solve-issue", bundlePath));
expect(added.ok).toBe(true);
if (!added.ok) {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
}
const threadId = ran.value.threadId;
let threads = await cmdThreads(storageRoot, []);
for (
let attempt = 0;
attempt < 50 && threads.ok && !threads.value.some((l) => l.includes(threadId));
attempt++
) {
await new Promise((r) => setTimeout(r, 20));
threads = await cmdThreads(storageRoot, []);
}
const dataPath = join(storageRoot, "logs", added.value.hash, `${threadId}.data.jsonl`);
const runningPath = join(dirname(dataPath), `${threadId}.running`);
await waitUntilRunningFileAbsent(runningPath, 120);
const put = await cmdCasPut(storageRoot, threadId, "keep-after-thread-rm");
expect(put.ok).toBe(true);
if (!put.ok) {
return;
}
const hash = put.value;
const casBlob = join(getGlobalCasDir(storageRoot), `${hash}.txt`);
const removed = await cmdThreadRemove(storageRoot, threadId);
expect(removed.ok).toBe(true);
const stillThere = await readTextFileIfExists(casBlob);
expect(stillThere).toBe("keep-after-thread-rm");
});
test("cli entrypoint dispatches threads / ps (spawn)", () => {
const env = { ...process.env, UNCAGED_WORKFLOW_STORAGE_ROOT: storageRoot };
const threads = spawnSync(process.execPath, [cliEntryPath, "threads"], {
@@ -199,7 +250,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -229,7 +280,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -268,7 +319,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -309,7 +360,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
@@ -338,7 +389,7 @@ describe("cli thread commands", () => {
return;
}
const ran = await cmdRun(storageRoot, "solve-issue", "hello", false, 5);
const ran = await cmdRun(storageRoot, "solve-issue", "hello", 5);
expect(ran.ok).toBe(true);
if (!ran.ok) {
return;
+96 -2
View File
@@ -1,5 +1,6 @@
import { printCliError, printCliLine, printCliWarn } from "./cli-output.js";
import { cmdAdd, formatAddSuccess, parseAddArgv } from "./cmd-add.js";
import { cmdCasGet, cmdCasList, cmdCasPut, cmdCasRm } from "./cmd-cas.js";
import { cmdFork, parseForkArgv } from "./cmd-fork.js";
import { cmdHistory } from "./cmd-history.js";
import { cmdKill } from "./cmd-kill.js";
@@ -22,7 +23,7 @@ function usage(): string {
" uncaged-workflow list",
" uncaged-workflow show <name>",
" uncaged-workflow remove <name>",
" uncaged-workflow run <name> [--prompt <text>] [--dry-run] [--max-rounds N]",
" uncaged-workflow run <name> [--prompt <text>] [--max-rounds N]",
" uncaged-workflow ps",
" uncaged-workflow kill <thread-id>",
" uncaged-workflow history <name>",
@@ -33,6 +34,10 @@ function usage(): string {
" uncaged-workflow thread <id>",
" uncaged-workflow thread rm <id>",
" uncaged-workflow fork <thread-id> [--from-role <role>]",
" uncaged-workflow cas get <thread-id> <hash>",
" uncaged-workflow cas put <thread-id> <content>",
" uncaged-workflow cas list <thread-id>",
" uncaged-workflow cas rm <thread-id> <hash>",
].join("\n");
}
@@ -111,7 +116,6 @@ async function dispatchRun(storageRoot: string, argv: string[]): Promise<number>
storageRoot,
parsed.value.name,
parsed.value.prompt,
parsed.value.dryRun,
parsed.value.maxRounds,
);
if (!result.ok) {
@@ -277,6 +281,95 @@ async function dispatchFork(storageRoot: string, argv: string[]): Promise<number
return 0;
}
async function dispatchCasGet(storageRoot: string, rest: string[]): Promise<number> {
const threadId = rest[0];
const hash = rest[1];
if (threadId === undefined || hash === undefined || rest.length > 2) {
printCliError(`${usage()}\n\nerror: cas get requires <thread-id> <hash>`);
return 1;
}
const result = await cmdCasGet(storageRoot, threadId, hash);
if (!result.ok) {
printCliError(result.error);
return 1;
}
printCliLine(result.value);
return 0;
}
async function dispatchCasPut(storageRoot: string, rest: string[]): Promise<number> {
const threadId = rest[0];
const content = rest[1];
if (threadId === undefined || content === undefined || rest.length > 2) {
printCliError(`${usage()}\n\nerror: cas put requires <thread-id> <content>`);
return 1;
}
const result = await cmdCasPut(storageRoot, threadId, content);
if (!result.ok) {
printCliError(result.error);
return 1;
}
printCliLine(result.value);
return 0;
}
async function dispatchCasList(storageRoot: string, rest: string[]): Promise<number> {
const threadId = rest[0];
if (threadId === undefined || rest.length > 1) {
printCliError(`${usage()}\n\nerror: cas list requires <thread-id>`);
return 1;
}
const result = await cmdCasList(storageRoot, threadId);
if (!result.ok) {
printCliError(result.error);
return 1;
}
for (const hash of result.value) {
printCliLine(hash);
}
return 0;
}
async function dispatchCasRm(storageRoot: string, rest: string[]): Promise<number> {
const threadId = rest[0];
const hash = rest[1];
if (threadId === undefined || hash === undefined || rest.length > 2) {
printCliError(`${usage()}\n\nerror: cas rm requires <thread-id> <hash>`);
return 1;
}
const result = await cmdCasRm(storageRoot, threadId, hash);
if (!result.ok) {
printCliError(result.error);
return 1;
}
printCliLine(`removed cas entry ${hash}`);
return 0;
}
const CAS_SUBCOMMAND_TABLE: Record<
string,
(storageRoot: string, rest: string[]) => Promise<number>
> = {
get: dispatchCasGet,
put: dispatchCasPut,
list: dispatchCasList,
rm: dispatchCasRm,
};
async function dispatchCas(storageRoot: string, argv: string[]): Promise<number> {
const sub = argv[0];
if (sub === undefined) {
printCliError(`${usage()}\n\nerror: unknown cas subcommand: (none)`);
return 1;
}
const handler = CAS_SUBCOMMAND_TABLE[sub];
if (handler === undefined) {
printCliError(`${usage()}\n\nerror: unknown cas subcommand: ${sub}`);
return 1;
}
return handler(storageRoot, argv.slice(1));
}
type DispatchFn = (storageRoot: string, argv: string[]) => Promise<number>;
const COMMAND_TABLE: Record<string, DispatchFn> = {
@@ -294,6 +387,7 @@ const COMMAND_TABLE: Record<string, DispatchFn> = {
threads: dispatchThreads,
thread: dispatchThreadBranch,
fork: dispatchFork,
cas: dispatchCas,
};
export async function runCli(storageRoot: string, argv: string[]): Promise<number> {
+43
View File
@@ -0,0 +1,43 @@
import { createCasStore, err, getGlobalCasDir, ok, type Result } from "@uncaged/workflow";
export async function cmdCasGet(
storageRoot: string,
_threadId: string,
hash: string,
): Promise<Result<string, string>> {
const cas = createCasStore(getGlobalCasDir(storageRoot));
const content = await cas.get(hash);
if (content === null) {
return err(`cas entry not found: ${hash}`);
}
return ok(content);
}
export async function cmdCasPut(
storageRoot: string,
_threadId: string,
content: string,
): Promise<Result<string, string>> {
const cas = createCasStore(getGlobalCasDir(storageRoot));
const hash = await cas.put(content);
return ok(hash);
}
export async function cmdCasList(
storageRoot: string,
_threadId: string,
): Promise<Result<string[], string>> {
const cas = createCasStore(getGlobalCasDir(storageRoot));
const hashes = await cas.list();
return ok(hashes);
}
export async function cmdCasRm(
storageRoot: string,
_threadId: string,
hash: string,
): Promise<Result<void, string>> {
const cas = createCasStore(getGlobalCasDir(storageRoot));
await cas.delete(hash);
return ok(undefined);
}
+1 -2
View File
@@ -15,7 +15,6 @@ export async function cmdRun(
storageRoot: string,
name: string,
prompt: string,
isDryRun: boolean,
maxRounds: number,
): Promise<Result<{ threadId: string }, string>> {
const nameOk = validateCliWorkflowName(name);
@@ -47,7 +46,7 @@ export async function cmdRun(
threadId,
workflowName: name,
prompt,
options: { isDryRun, maxRounds },
options: { maxRounds },
},
{ awaitResponseLine: false },
);
+2 -15
View File
@@ -3,20 +3,13 @@ import { err, ok, type Result } from "@uncaged/workflow";
export type ParsedRunArgv = {
name: string;
prompt: string;
dryRun: boolean;
maxRounds: number;
};
type FlagOk =
| { kind: "dry-run" }
| { kind: "prompt"; value: string }
| { kind: "max-rounds"; value: number };
type FlagOk = { kind: "prompt"; value: string } | { kind: "max-rounds"; value: number };
function parseFlagAt(argv: string[], index: number): Result<FlagOk, string> | null {
const flag = argv[index];
if (flag === "--dry-run") {
return ok({ kind: "dry-run" });
}
if (flag === "--prompt") {
const value = argv[index + 1];
if (value === undefined) {
@@ -41,7 +34,6 @@ function parseFlagAt(argv: string[], index: number): Result<FlagOk, string> | nu
export function parseRunArgv(argv: string[]): Result<ParsedRunArgv, string> {
let name: string | undefined;
let prompt = "";
let dryRun = false;
let maxRounds = 5;
let i = 0;
@@ -62,11 +54,6 @@ export function parseRunArgv(argv: string[]): Result<ParsedRunArgv, string> {
}
const flag = parsed.value;
if (flag.kind === "dry-run") {
dryRun = true;
i += 1;
continue;
}
if (flag.kind === "prompt") {
prompt = flag.value;
i += 2;
@@ -80,5 +67,5 @@ export function parseRunArgv(argv: string[]): Result<ParsedRunArgv, string> {
return err("run requires <name>");
}
return ok({ name, prompt, dryRun, maxRounds });
return ok({ name, prompt, maxRounds });
}
@@ -1,33 +1,41 @@
import { describe, expect, test } from "bun:test";
import type { ExtractContext, ExtractFn } from "@uncaged/workflow";
import type * as z from "zod/v4";
import { createCursorAgent, validateCursorAgentConfig } from "../src/index.js";
const testExtract: ExtractFn = async <T extends Record<string, unknown>>(
_schema: z.ZodType<T>,
_prompt: string,
_ctx: ExtractContext,
): Promise<T> => ({ workspace: "/tmp" }) as unknown as T;
describe("validateCursorAgentConfig", () => {
test("accepts valid config", () => {
const r = validateCursorAgentConfig({
workdir: "/tmp",
model: null,
timeout: null,
timeout: 0,
extract: testExtract,
});
expect(r.ok).toBe(true);
});
test("rejects empty workdir", () => {
test("rejects non-function extract", () => {
const r = validateCursorAgentConfig({
workdir: " ",
model: null,
timeout: null,
timeout: 0,
extract: null as unknown as ExtractFn,
});
expect(r.ok).toBe(false);
if (!r.ok) {
expect(r.error).toContain("workdir");
expect(r.error).toContain("extract");
}
});
test("rejects negative timeout", () => {
const r = validateCursorAgentConfig({
workdir: "/tmp",
model: null,
timeout: -1,
extract: testExtract,
});
expect(r.ok).toBe(false);
});
@@ -36,9 +44,9 @@ describe("validateCursorAgentConfig", () => {
describe("createCursorAgent", () => {
test("returns an AgentFn", () => {
const agent = createCursorAgent({
workdir: "/tmp",
model: null,
timeout: null,
timeout: 0,
extract: testExtract,
});
expect(typeof agent).toBe("function");
});
@@ -46,9 +54,9 @@ describe("createCursorAgent", () => {
test("throws on invalid config at construction", () => {
expect(() =>
createCursorAgent({
workdir: "",
model: null,
timeout: null,
timeout: -1,
extract: testExtract,
}),
).toThrow();
});
+2 -1
View File
@@ -10,6 +10,7 @@
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"@uncaged/workflow-util-agent": "workspace:*"
"@uncaged/workflow-util-agent": "workspace:*",
"zod": "^4.0.0"
}
}
+24 -6
View File
@@ -1,5 +1,6 @@
import type { AgentFn } from "@uncaged/workflow";
import type { AgentFn, ExtractContext } from "@uncaged/workflow";
import { buildAgentPrompt, type SpawnCliError, spawnCli } from "@uncaged/workflow-util-agent";
import * as z from "zod/v4";
import type { CursorAgentConfig } from "./types.js";
import { validateCursorAgentConfig } from "./validate-config.js";
@@ -8,6 +9,12 @@ export { buildAgentPrompt } from "@uncaged/workflow-util-agent";
export type { CursorAgentConfig } from "./types.js";
export { validateCursorAgentConfig } from "./validate-config.js";
const cursorWorkspaceSchema = z.object({
workspace: z
.string()
.describe("Absolute path to the project/repository directory the agent should work in"),
});
function throwCursorSpawnError(error: SpawnCliError): never {
if (error.kind === "non_zero_exit") {
throw new Error(
@@ -27,7 +34,7 @@ function resolveCursorModel(model: string | null): string {
return model === null ? "auto" : model;
}
/** Runs `cursor-agent` in {@link CursorAgentConfig.workdir} with a prompt built from context + system prompt. */
/** Runs `cursor-agent` with workspace from {@link CursorAgentConfig.extract} and prompt from context. */
export function createCursorAgent(config: CursorAgentConfig): AgentFn {
const validated = validateCursorAgentConfig(config);
if (!validated.ok) {
@@ -35,22 +42,33 @@ export function createCursorAgent(config: CursorAgentConfig): AgentFn {
}
const modelFlag = resolveCursorModel(config.model);
const timeoutMs = config.timeout;
const timeoutMs = config.timeout > 0 ? config.timeout : null;
return async (ctx, systemPrompt) => {
const fullPrompt = buildAgentPrompt(systemPrompt, ctx);
return async (ctx) => {
const extractCtx: ExtractContext = {
...ctx,
agentContent: "",
};
const { workspace } = await config.extract(
cursorWorkspaceSchema,
"From the thread context, determine the absolute filesystem path where the project/repository is located.",
extractCtx,
);
const fullPrompt = buildAgentPrompt(ctx);
const args = [
"-p",
fullPrompt,
"--model",
modelFlag,
"--workspace",
workspace,
"--output-format",
"text",
"--trust",
"--force",
];
const run = await spawnCli("cursor-agent", args, {
cwd: config.workdir,
cwd: workspace,
timeoutMs,
});
if (!run.ok) {
+4 -2
View File
@@ -1,5 +1,7 @@
import type { ExtractFn } from "@uncaged/workflow";
export type CursorAgentConfig = {
workdir: string;
model: string | null;
timeout: number | null;
timeout: number;
extract: ExtractFn;
};
@@ -3,11 +3,11 @@ import { err, ok, type Result } from "@uncaged/workflow";
import type { CursorAgentConfig } from "./types.js";
export function validateCursorAgentConfig(config: CursorAgentConfig): Result<void, string> {
if (config.workdir.trim() === "") {
return err("workdir must be a non-empty string");
if (typeof config.extract !== "function") {
return err("extract must be a function");
}
if (config.timeout !== null && config.timeout < 0) {
return err("timeout must be null or a non-negative number (milliseconds)");
if (config.timeout < 0) {
return err("timeout must be a non-negative number (milliseconds); use 0 for no limit");
}
return ok(undefined);
}
+2 -2
View File
@@ -34,8 +34,8 @@ export function createHermesAgent(config: HermesAgentConfig): AgentFn {
const timeoutMs = config.timeout;
return async (ctx, systemPrompt) => {
const fullPrompt = buildAgentPrompt(systemPrompt, ctx);
return async (ctx) => {
const fullPrompt = buildAgentPrompt(ctx);
const args = [
"chat",
"-q",
@@ -12,6 +12,8 @@ function makeCtx(userContent: string): ThreadContext {
timestamp: 1,
},
steps: [],
threadId: "01TEST000000000000000000TR",
currentRole: { name: "planner", systemPrompt: "system instructions" },
};
}
@@ -29,7 +31,7 @@ describe("createLlmAdapter", () => {
const provider = { baseUrl: "https://api.example/v1", apiKey: "k", model: "m" };
const adapter = createLlmAdapter(provider);
const out = await adapter(makeCtx("trigger text"), "system instructions");
const out = await adapter(makeCtx("trigger text"));
globalThis.fetch = originalFetch;
@@ -48,7 +50,7 @@ describe("createLlmAdapter", () => {
const provider = { baseUrl: "https://api.example/v1", apiKey: "k", model: "m" };
const adapter = createLlmAdapter(provider);
await expect(adapter(makeCtx("hi"), "sys")).rejects.toThrow("llm:");
await expect(adapter(makeCtx("hi"))).rejects.toThrow("llm:");
globalThis.fetch = originalFetch;
});
@@ -58,7 +60,7 @@ describe("createLlmAdapter", () => {
const provider = { baseUrl: "https://api.example/v1", apiKey: "k", model: "m" };
const adapter = createLlmAdapter(provider);
await expect(adapter(makeCtx("hi"), "sys")).rejects.toThrow();
await expect(adapter(makeCtx("hi"))).rejects.toThrow();
globalThis.fetch = originalFetch;
});
});
+14
View File
@@ -0,0 +1,14 @@
{
"name": "@uncaged/workflow-agent-llm",
"version": "0.1.0",
"type": "module",
"main": "src/index.ts",
"types": "src/index.ts",
"scripts": {
"build": "echo 'TODO'",
"test": "bun test"
},
"dependencies": {
"@uncaged/workflow": "workspace:*"
}
}
@@ -1,6 +1,14 @@
import { type AgentFn, err, ok, type Result, type ThreadContext } from "@uncaged/workflow";
import {
type AgentContext,
type AgentFn,
err,
type LlmProvider,
ok,
type Result,
} from "@uncaged/workflow";
import type { LlmMessage, LlmProvider } from "@uncaged/workflow-util-role";
/** OpenAI chat completion message shape (passed to `/chat/completions`). */
export type LlmMessage = { role: "system" | "user" | "assistant"; content: string };
export type LlmChatError =
| { kind: "http_error"; status: number; body: string }
@@ -89,13 +97,13 @@ export async function chatCompletionText(options: {
return parseAssistantText(res.value);
}
/** Single-turn chat adapter: system comes from `createRole` prompt; user is the thread start frame. */
/** Single-turn chat adapter: system prompt comes from {@link AgentContext.currentRole}. */
export function createLlmAdapter(provider: LlmProvider): AgentFn {
return async (ctx: ThreadContext, systemPrompt: string) => {
return async (ctx: AgentContext) => {
const result = await chatCompletionText({
provider,
messages: [
{ role: "system", content: systemPrompt },
{ role: "system", content: ctx.currentRole.systemPrompt },
{ role: "user", content: ctx.start.content },
],
});
+6
View File
@@ -0,0 +1,6 @@
export {
chatCompletionText,
createLlmAdapter,
type LlmChatError,
type LlmMessage,
} from "./create-llm-adapter.js";
@@ -1,15 +1,12 @@
{
"name": "@uncaged/workflow-util-role",
"name": "@uncaged/workflow-role-coder",
"version": "0.1.0",
"type": "module",
"main": "src/index.ts",
"types": "src/index.ts",
"exports": {
".": "./src/index.ts"
},
"scripts": {
"build": "echo 'TODO'",
"test": "bun test"
"test": "echo no tests"
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
+42
View File
@@ -0,0 +1,42 @@
import type { RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
export const coderMetaSchema = z.object({
completedPhase: z.string(),
filesChanged: z.array(z.string()),
summary: z.string(),
});
export type CoderMeta = z.infer<typeof coderMetaSchema>;
const CODER_SYSTEM = `You are a **coder**. Read the thread for the plan and work on the NEXT incomplete phase only.
## Finding the current thread ID
The thread ID is a 26-character Crockford Base32 string (e.g. \`06F03H5V6JTMDST6P3TVH42RWM\`). It appears in the first message of this conversation. If you are unsure, run:
uncaged-workflow threads
and use the ID of the active thread.
## Reading phase details
Each planner phase is identified by a content-hash and a title. To read a phase's full details (name, description, acceptance criteria), run:
uncaged-workflow cas get <THREAD_ID> <HASH>
Replace \`<THREAD_ID>\` with the actual thread ID and \`<HASH>\` with the phase hash from the plan.
## Completing a phase
Report which phase you completed using the phase **hash** (not the title). If you legitimately finish every remaining phase in this single turn, set completedPhase to the **last** phase hash in the plan (the workflow treats that as full completion). List the files you changed and summarize what you did.`;
export const coderRole: RoleDefinition<CoderMeta> = {
description:
"Implements the next incomplete planner phase and reports structured completion metadata.",
systemPrompt: CODER_SYSTEM,
extractPrompt:
"Extract completedPhase: the planner phase hash finished this round (exact hash string from the plan). If multiple phases were finished in one round, use the last finished phase hash. Extract filesChanged and a summary of the work.",
schema: coderMetaSchema,
extractRefs: (meta) => [meta.completedPhase],
};
@@ -0,0 +1 @@
export { type CoderMeta, coderMetaSchema, coderRole } from "./coder.js";
@@ -6,5 +6,5 @@
"composite": true
},
"include": ["src/**/*.ts"],
"references": [{ "path": "../workflow" }, { "path": "../workflow-util-role" }]
"references": [{ "path": "../workflow" }]
}
@@ -1,110 +1,19 @@
import { describe, expect, spyOn, test } from "bun:test";
import { execFile } from "node:child_process";
import { appendFile, mkdir, mkdtemp, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { promisify } from "node:util";
import { describe, expect, test } from "bun:test";
import type { AgentFn, ThreadContext } from "@uncaged/workflow";
import { START } from "@uncaged/workflow";
import * as utilRole from "@uncaged/workflow-util-role";
import { committerMetaSchema, committerRole } from "../src/committer.js";
import { createCommitterRole } from "../src/committer.js";
import { gitExec } from "../src/git-exec.js";
const execFileAsync = promisify(execFile);
async function git(repo: string, args: string[]): Promise<void> {
await gitExec(repo, args);
}
async function setupRepoWithRemote(): Promise<{ repo: string }> {
const base = await mkdtemp(join(tmpdir(), "wf-committer-"));
const bare = join(base, "origin.git");
const repo = join(base, "work");
await mkdir(repo, { recursive: true });
await mkdir(bare, { recursive: true });
await execFileAsync("git", ["init"], { cwd: repo, encoding: "utf8" });
await git(repo, ["config", "user.email", "t@t"]);
await git(repo, ["config", "user.name", "t"]);
await writeFile(join(repo, "README.md"), "# hi\n", "utf8");
await git(repo, ["add", "README.md"]);
await git(repo, ["commit", "-m", "init"]);
await execFileAsync("git", ["init", "--bare"], { cwd: bare, encoding: "utf8" });
await git(repo, ["remote", "add", "origin", bare]);
await git(repo, ["push", "-u", "origin", "HEAD"]);
return { repo };
}
function makeCtx(): ThreadContext {
return {
start: {
role: START,
content: "do thing",
meta: { maxRounds: 10 },
timestamp: Date.now(),
},
steps: [],
};
}
const provider = { baseUrl: "https://example.com/v1", apiKey: "k", model: "m" };
describe("createCommitterRole", () => {
test("returns committed false when working tree clean", async () => {
const { repo } = await setupRepoWithRemote();
const agent: AgentFn = async () => {
throw new Error("agent should not run");
};
const role = createCommitterRole(
agent,
{ provider, dryRun: null, dryRunMeta: { branch: "dry-run", message: "chore: dry run" } },
{ cwd: repo, remote: "origin", threadId: null },
);
const out = await role(makeCtx());
expect(out.meta.committed).toBe(false);
describe("committerRole", () => {
test("committed sample validates against schema", () => {
const parsed = committerMetaSchema.safeParse({
status: "committed" as const,
branch: "feat/example",
commitSha: "abc1234",
});
expect(parsed.success).toBe(true);
});
test("dry-run skips pipeline", async () => {
const agent: AgentFn = async () => {
throw new Error("agent should not run");
};
const role = createCommitterRole(agent, {
provider,
dryRun: true,
dryRunMeta: { branch: "dry-run", message: "chore: dry run" },
});
const out = await role(makeCtx());
expect(out.content).toBe("[dry-run] committer skipped");
expect(out.meta).toEqual({ committed: true });
});
test("commits and pushes when changes exist", async () => {
const { repo } = await setupRepoWithRemote();
await appendFile(join(repo, "README.md"), "\nmore\n", "utf8");
const spy = spyOn(utilRole, "extractMetaOrThrow").mockResolvedValue({
branch: "feat/test-commit",
message: "feat: add more",
});
const agent: AgentFn = async () => "plan text";
const role = createCommitterRole(
agent,
{ provider, dryRun: null, dryRunMeta: { branch: "dry-run", message: "chore: dry run" } },
{ cwd: repo, remote: "origin", threadId: null },
);
const out = await role(makeCtx());
expect(out.meta.committed).toBe(true);
expect(spy).toHaveBeenCalled();
const branches = await gitExec(repo, ["branch", "--list", "feat/test-commit"]);
expect(branches).toContain("feat/test-commit");
const remoteRefs = await gitExec(repo, ["ls-remote", "--heads", "origin", "feat/test-commit"]);
expect(remoteRefs.trim().length).toBeGreaterThan(0);
spy.mockRestore();
test("exposes generic committer system prompt", () => {
expect(committerRole.systemPrompt).toContain("git committer");
expect(committerRole.systemPrompt).not.toContain("project is at");
});
});
@@ -10,10 +10,6 @@
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"@uncaged/workflow-util-role": "workspace:*",
"zod": "^4.0.0"
},
"devDependencies": {
"@types/node": "^25.6.0"
}
}
+28 -146
View File
@@ -1,153 +1,35 @@
import type { AgentFn, Role, RoleResult, ThreadContext } from "@uncaged/workflow";
import {
decorateRole,
extractMetaOrThrow,
type LlmProvider,
onFail,
withDryRun,
} from "@uncaged/workflow-util-role";
import type { RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
import { gitExec } from "./git-exec.js";
export const committerMetaSchema = z.discriminatedUnion("status", [
z.object({
status: z.literal("committed"),
branch: z.string(),
commitSha: z.string(),
}),
z.object({
status: z.literal("recoverable"),
error: z.string(),
logRef: z.string().nullable(),
}),
z.object({
status: z.literal("unrecoverable"),
error: z.string(),
logRef: z.string().nullable(),
}),
]);
export const committerMetaSchema = z.object({
committed: z
.boolean()
.describe("true if branch created, changes committed, and pushed successfully"),
});
export type CommitterMeta = z.infer<typeof committerMetaSchema>;
const committerPlanSchema = z.object({
branch: z.string().describe("Feature branch name, e.g. feat/slug or fix/slug"),
message: z.string().describe("Single-line conventional commit subject"),
});
const COMMITTER_SYSTEM = `You are the git committer. Create a branch, commit the changes, and push.
Report the branch name and commit SHA. On failure, classify as recoverable or unrecoverable.
Do not attempt to fix failures yourself.`;
export type CommitterPlanMeta = z.infer<typeof committerPlanSchema>;
export type CommitterGitConfig = {
cwd: string;
remote: string;
/** When non-null, prompts mention `uncaged-workflow thread <id>` for extra context. */
threadId: string | null;
export const committerRole: RoleDefinition<CommitterMeta> = {
description: "Creates branch, commits, and pushes when review passes.",
systemPrompt: COMMITTER_SYSTEM,
extractPrompt:
"Extract the commit result: committed (with branch and SHA), recoverable failure, or unrecoverable failure. Include error details and log references if applicable.",
schema: committerMetaSchema,
extractRefs: null,
};
export const DEFAULT_COMMITTER_GIT_CONFIG: CommitterGitConfig = {
cwd: ".",
remote: "origin",
threadId: null,
};
function resolveExtractDryRun(extractDryRun: boolean | null): boolean {
return extractDryRun === true;
}
function summarizeThreadContext(ctx: ThreadContext): string {
const lines: string[] = [`Initial prompt:\n${ctx.start.content}`];
for (const step of ctx.steps) {
const snippet = step.content.length > 800 ? `${step.content.slice(0, 800)}` : step.content;
lines.push(`\n### ${step.role}\n${snippet}`);
}
return lines.join("\n");
}
function sanitizeBranch(branch: string): string {
const t = branch.trim();
if (
t === "" ||
t.includes("..") ||
t.includes(" ") ||
t.startsWith("-") ||
t.includes("\n") ||
t.includes("\t")
) {
throw new Error(`invalid branch name: ${branch}`);
}
return t;
}
function sanitizeCommitMessage(message: string): string {
const line = message.trim().split(/\r?\n/)[0] ?? "";
if (line === "") {
throw new Error("commit message is empty");
}
return line;
}
function committerPlanPrompt(ctx: ThreadContext, gitConfig: CommitterGitConfig): string {
const threadLine =
gitConfig.threadId !== null
? `Optional CLI context: run \`uncaged-workflow thread ${gitConfig.threadId}\` if available.\n`
: "";
return `You plan a git branch and a single-line conventional commit message for the following workflow thread.
${threadLine}
## Thread context
${summarizeThreadContext(ctx)}
## Your task
Infer a good branch name (\`feat/<slug>\` or \`fix/<slug>\`) and a conventional commit **subject** (one line, no body).
Reply with enough detail that a maintainer understands the change; structured extraction will read \`branch\` and \`message\` from your answer.`;
}
async function runCommitterPipeline(
ctx: ThreadContext,
agent: AgentFn,
extract: { provider: LlmProvider; dryRun: boolean | null; dryRunMeta: CommitterPlanMeta },
gitConfig: CommitterGitConfig,
): Promise<RoleResult<CommitterMeta>> {
const cwd = gitConfig.cwd;
const porcelain = await gitExec(cwd, ["status", "--porcelain"]);
if (porcelain.trim() === "") {
return {
content: "Working tree clean; nothing to commit.",
meta: { committed: false },
};
}
const prompt = committerPlanPrompt(ctx, gitConfig);
const raw = await agent(ctx, prompt);
const plan = await extractMetaOrThrow("committer-plan", raw, committerPlanSchema, {
provider: extract.provider,
dryRun: resolveExtractDryRun(extract.dryRun),
dryRunMeta: extract.dryRunMeta,
});
const branch = sanitizeBranch(plan.branch);
const message = sanitizeCommitMessage(plan.message);
await gitExec(cwd, ["checkout", "-b", branch]);
await gitExec(cwd, ["add", "-A"]);
await gitExec(cwd, ["commit", "-m", message]);
await gitExec(cwd, ["push", "-u", gitConfig.remote, branch]);
return {
content: raw,
meta: { committed: true },
};
}
/**
* Git committer role: LLM proposes branch + message; this package runs git via `child_process`.
* Decorators match nerve semantics: dry-run skips work with `committed: true`; failures yield `committed: false`.
*/
export function createCommitterRole(
adapter: AgentFn,
extract: { provider: LlmProvider; dryRun: boolean | null; dryRunMeta: CommitterPlanMeta },
gitConfig: CommitterGitConfig = DEFAULT_COMMITTER_GIT_CONFIG,
): Role<CommitterMeta> {
const inner: Role<CommitterMeta> = async (ctx) =>
runCommitterPipeline(ctx, adapter, extract, gitConfig);
return decorateRole(inner, [
withDryRun<CommitterMeta>({
label: "committer",
meta: { committed: true },
dryRun: resolveExtractDryRun(extract.dryRun),
}),
onFail<CommitterMeta>({ label: "committer", meta: { committed: false } }),
]);
}
@@ -1,26 +0,0 @@
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const execFileAsync = promisify(execFile);
/** Runs `git` with args in `cwd`; throws if git exits non-zero. */
export async function gitExec(cwd: string, args: readonly string[]): Promise<string> {
try {
const r = await execFileAsync("git", [...args], {
cwd,
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024,
});
return r.stdout;
} catch (e) {
const stderr =
typeof e === "object" &&
e !== null &&
"stderr" in e &&
typeof (e as { stderr: unknown }).stderr === "string"
? (e as { stderr: string }).stderr
: "";
const msg = e instanceof Error ? e.message : String(e);
throw new Error(`git ${args.join(" ")} failed: ${msg}${stderr ? ` (${stderr.trim()})` : ""}`);
}
}
@@ -1,9 +1 @@
export {
type CommitterGitConfig,
type CommitterMeta,
type CommitterPlanMeta,
committerMetaSchema,
createCommitterRole,
DEFAULT_COMMITTER_GIT_CONFIG,
} from "./committer.js";
export { gitExec } from "./git-exec.js";
export { type CommitterMeta, committerMetaSchema, committerRole } from "./committer.js";
@@ -3,9 +3,8 @@
"compilerOptions": {
"rootDir": "src",
"outDir": "dist",
"composite": true,
"types": ["bun-types"]
"composite": true
},
"include": ["src/**/*.ts"],
"references": [{ "path": "../workflow" }, { "path": "../workflow-util-role" }]
"references": [{ "path": "../workflow" }]
}
@@ -1,159 +0,0 @@
import { afterEach, describe, expect, mock, spyOn, test } from "bun:test";
import type { AgentFn, ThreadContext } from "@uncaged/workflow";
import { START } from "@uncaged/workflow";
import * as extractMetaModule from "@uncaged/workflow-util-role";
import * as z from "zod/v4";
import { createRole } from "../src/create-role.js";
const provider = {
baseUrl: "https://example.com/v1",
apiKey: "k",
model: "m",
};
function toolCallResponse(argsJson: string): Response {
return new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
function: {
name: "extract",
arguments: argsJson,
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
}
function makeCtx(): ThreadContext {
return {
start: {
role: START,
content: "",
meta: { maxRounds: 10 },
timestamp: Date.now(),
},
steps: [],
};
}
describe("createRole", () => {
const originalFetch = globalThis.fetch;
afterEach(() => {
globalThis.fetch = originalFetch;
mock.restore();
});
test("runs AgentFn then structured extract", async () => {
globalThis.fetch = (() =>
Promise.resolve(toolCallResponse(JSON.stringify({ n: 3 })))) as unknown as typeof fetch;
const schema = z.object({ n: z.number() });
const agent: AgentFn = async (_ctx, prompt) => prompt;
const role = createRole({
name: "test",
schema,
systemPrompt: "hello",
agent,
extract: { provider, dryRun: null, dryRunMeta: { n: 0 } },
});
const out = await role(makeCtx());
expect(out.content).toBe("hello");
expect(out.meta).toEqual({ n: 3 });
});
test("passes ThreadContext to AgentFn", async () => {
globalThis.fetch = (() =>
Promise.resolve(toolCallResponse(JSON.stringify({ n: 0 })))) as unknown as typeof fetch;
const seen: ThreadContext[] = [];
const agent: AgentFn = async (ctx, _prompt) => {
seen.push(ctx);
return "x";
};
const role = createRole({
name: "test",
schema: z.object({ n: z.number() }),
systemPrompt: "p",
agent,
extract: { provider, dryRun: null, dryRunMeta: { n: 0 } },
});
await role(makeCtx());
expect(seen).toHaveLength(1);
expect(seen[0].steps).toEqual([]);
});
test("resolves dynamic systemPrompt functions before AgentFn", async () => {
globalThis.fetch = (() =>
Promise.resolve(toolCallResponse(JSON.stringify({ n: 99 })))) as unknown as typeof fetch;
const schema = z.object({ n: z.number() });
const agent: AgentFn = async (_ctx, prompt) => prompt;
const role = createRole({
name: "test",
schema,
systemPrompt: async (ctx) => `rounds=${ctx.steps.length}`,
agent,
extract: { provider, dryRun: null, dryRunMeta: { n: 0 } },
});
const ctx = makeCtx();
const out = await role(ctx);
expect(out.content).toBe("rounds=0");
expect(out.meta).toEqual({ n: 99 });
});
test("extract dryRun null runs live extract path", async () => {
const spy = spyOn(extractMetaModule, "extractMetaOrThrow").mockResolvedValue({ n: 0 });
const agent: AgentFn = async () => "raw";
const role = createRole({
name: "r1",
schema: z.object({ n: z.number() }),
systemPrompt: "p",
agent,
extract: { provider, dryRun: null, dryRunMeta: { n: 0 } },
});
await role(makeCtx());
expect(spy).toHaveBeenCalledWith(
"r1",
"raw",
expect.anything(),
expect.objectContaining({ provider, dryRun: false, dryRunMeta: { n: 0 } }),
);
});
test("extract.dryRun true uses structured extract dry-run", async () => {
const spy = spyOn(extractMetaModule, "extractMetaOrThrow").mockResolvedValue({ n: 0 });
const agent: AgentFn = async () => "raw";
const role = createRole({
name: "r2",
schema: z.object({ n: z.number() }),
systemPrompt: "p",
agent,
extract: { provider, dryRun: true, dryRunMeta: { n: 0 } },
});
await role(makeCtx());
expect(spy).toHaveBeenCalledWith(
"r2",
"raw",
expect.anything(),
expect.objectContaining({ dryRun: true, dryRunMeta: { n: 0 } }),
);
});
});
@@ -1,35 +0,0 @@
import type { AgentFn, Role, ThreadContext } from "@uncaged/workflow";
import { extractMetaOrThrow, type LlmProvider } from "@uncaged/workflow-util-role";
import type * as z from "zod/v4";
export type CreateRoleArgs<M extends Record<string, unknown>> = {
name: string;
schema: z.ZodType<M>;
systemPrompt: string | ((ctx: ThreadContext) => Promise<string>);
agent: AgentFn;
extract: {
provider: LlmProvider;
/** When `true`, structured extract returns `dryRunMeta`. When `null`, live API extract. */
dryRun: boolean | null;
dryRunMeta: M;
};
};
function resolveExtractDryRun(extractDryRun: boolean | null): boolean {
return extractDryRun === true;
}
/** Builds a {@link Role} from an {@link AgentFn}, system prompt, Zod meta schema, and extract wiring. */
export function createRole<M extends Record<string, unknown>>(args: CreateRoleArgs<M>): Role<M> {
return async (ctx: ThreadContext) => {
const promptText =
typeof args.systemPrompt === "string" ? args.systemPrompt : await args.systemPrompt(ctx);
const raw = await args.agent(ctx, promptText);
const meta = await extractMetaOrThrow(args.name, raw, args.schema, {
provider: args.extract.provider,
dryRun: resolveExtractDryRun(args.extract.dryRun),
dryRunMeta: args.extract.dryRunMeta,
});
return { content: raw, meta };
};
}
-21
View File
@@ -1,21 +0,0 @@
export {
buildDescriptorFromRoles,
decorateRole,
extractMetaOrThrow,
type LlmError,
type LlmExtractArgs,
type LlmMessage,
type LlmProvider,
llmErrorToCause,
llmExtract,
llmExtractWithRetry,
type MetaExtractConfig,
type OnFailOptions,
onFail,
type RoleDecorator,
type RoleDescriptorInput,
type WithDryRunOptions,
withDryRun,
} from "@uncaged/workflow-util-role";
export { chatCompletionText, createLlmAdapter, type LlmChatError } from "./create-llm-adapter.js";
export { type CreateRoleArgs, createRole } from "./create-role.js";
@@ -1,19 +1,15 @@
{
"name": "@uncaged/workflow-role-llm",
"name": "@uncaged/workflow-role-planner",
"version": "0.1.0",
"type": "module",
"main": "src/index.ts",
"types": "src/index.ts",
"exports": {
".": "./src/index.ts"
},
"scripts": {
"build": "echo 'TODO'",
"test": "bun test"
"test": "echo no tests"
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"@uncaged/workflow-util-role": "workspace:*",
"zod": "^4.0.0"
}
}
@@ -0,0 +1,6 @@
export {
type PlannerMeta,
phaseSchema,
plannerMetaSchema,
plannerRole,
} from "./planner.js";
@@ -0,0 +1,53 @@
import type { RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
export const phaseSchema = z.object({
hash: z.string(),
title: z.string(),
});
export const plannerMetaSchema = z.object({
phases: z.array(phaseSchema),
});
export type PlannerMeta = z.infer<typeof plannerMetaSchema>;
const PLANNER_SYSTEM = `You are a **planner** for a software task. Break the work into **sequential phases** the coder will execute one at a time.
## Finding the current thread ID
The thread ID is a 26-character Crockford Base32 string (e.g. \`06F03H5V6JTMDST6P3TVH42RWM\`). It appears in the first message of this conversation. If you are unsure, run:
uncaged-workflow threads
and use the ID of the active thread.
## Storing phase details — MANDATORY
For each phase you MUST store its full detail text in CAS using this exact CLI command:
uncaged-workflow cas put <THREAD_ID> '# <name>
Description: <description>
Acceptance: <acceptance>'
Replace \`<THREAD_ID>\` with the actual thread ID you found above. The command prints a content-hash to stdout — use that hash as the phase identifier.
**Do NOT store phase details in any other way** (no temp files, no invented paths). The CLI command is the only supported storage mechanism.
## Output format
After storing all phases via the CLI, output compact JSON only:
{ "phases": [{ "hash": "<hash-from-cas-put>", "title": "<one-line-summary>" }] }
Order phases so earlier steps unblock later ones. Cover root cause, edge cases, and verification across the phases.`;
export const plannerRole: RoleDefinition<PlannerMeta> = {
description: "Breaks the task into sequential phases for the coder.",
systemPrompt: PLANNER_SYSTEM,
extractPrompt:
"Extract the implementation phases from the agent's output. Each phase has a hash (the CAS content-hash returned by the cas put command) and a title (one-line summary).",
schema: plannerMetaSchema,
extractRefs: (meta) => meta.phases.map((p) => p.hash),
};
@@ -0,0 +1,10 @@
{
"extends": "../../tsconfig.json",
"compilerOptions": {
"rootDir": "src",
"outDir": "dist",
"composite": true
},
"include": ["src/**/*.ts"],
"references": [{ "path": "../workflow" }]
}
@@ -0,0 +1,15 @@
{
"name": "@uncaged/workflow-role-preparer",
"version": "0.1.0",
"type": "module",
"main": "src/index.ts",
"types": "src/index.ts",
"scripts": {
"build": "echo 'TODO'",
"test": "echo no tests"
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"zod": "^4.0.0"
}
}
@@ -0,0 +1,5 @@
export {
type PreparerMeta,
preparerMetaSchema,
preparerRole,
} from "./preparer.js";
@@ -0,0 +1,51 @@
import type { RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
const toolchainSchema = z.object({
packageManager: z.union([z.string(), z.null()]),
testCommand: z.union([z.string(), z.null()]),
lintCommand: z.union([z.string(), z.null()]),
buildCommand: z.union([z.string(), z.null()]),
});
export const preparerMetaSchema = z.object({
repoPath: z.string(),
defaultBranch: z.string(),
conventions: z.union([z.string(), z.null()]),
toolchain: toolchainSchema,
});
export type PreparerMeta = z.infer<typeof preparerMetaSchema>;
const PREPARER_SYSTEM = `You are a **preparer** for a software task. Your job is to locate (or clone) the target repository locally, ensure it is up to date, and gather project context before work begins.
## Responsibilities
1. Parse the issue/task prompt to identify the target repository (URL, org/repo, or name).
2. Search for an existing local clone in these locations (in order):
- ~/Code/<repo-name>/
- ~/repos/<repo-name>/
- ~/Code/<org>/<repo-name>/
- ~/repos/<org>/<repo-name>/
3. If not found locally, \`git clone\` it into ~/repos/<repo-name>/.
4. \`git checkout main && git pull\` (or the default branch) to ensure latest.
5. Read project conventions: \`CLAUDE.md\`, \`CONTRIBUTING.md\`, \`.cursor/rules/*.mdc\`, \`CONVENTIONS.md\`.
6. Detect toolchain: package manager, test runner, linter, build system.
## Output
Report your findings as structured data:
- **repoPath**: absolute path to the local repo
- **defaultBranch**: the default branch name (e.g. "main")
- **conventions**: a summary of project conventions found, or null if none
- **toolchain**: detected commands for packageManager, testCommand, lintCommand, buildCommand (null if not detected)`;
export const preparerRole: RoleDefinition<PreparerMeta> = {
description:
"Locates or clones the target repository, ensures it is up to date, and gathers project context (conventions, toolchain).",
systemPrompt: PREPARER_SYSTEM,
extractPrompt:
"Extract repoPath (absolute path), defaultBranch, conventions (summary string or null), and toolchain (packageManager, testCommand, lintCommand, buildCommand — each string or null).",
schema: preparerMetaSchema,
extractRefs: null,
};
@@ -0,0 +1,8 @@
{
"extends": "../../tsconfig.json",
"compilerOptions": {
"rootDir": "src",
"outDir": "dist"
},
"include": ["src"]
}
@@ -1,98 +1,15 @@
import { afterEach, describe, expect, mock, test } from "bun:test";
import { describe, expect, test } from "bun:test";
import type { AgentFn, ThreadContext } from "@uncaged/workflow";
import { START } from "@uncaged/workflow";
import { reviewerMetaSchema, reviewerRole } from "../src/reviewer.js";
import { createReviewerRole, DEFAULT_REVIEWER_CONFIG } from "../src/reviewer.js";
function toolCallResponse(argsJson: string): Response {
return new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
function: {
name: "extract",
arguments: argsJson,
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
}
function makeCtx(): ThreadContext {
return {
start: {
role: START,
content: "task",
meta: { maxRounds: 10 },
timestamp: Date.now(),
},
steps: [],
};
}
const provider = { baseUrl: "https://example.com/v1", apiKey: "k", model: "m" };
describe("createReviewerRole", () => {
const originalFetch = globalThis.fetch;
afterEach(() => {
globalThis.fetch = originalFetch;
mock.restore();
describe("reviewerRole", () => {
test("approved sample validates against schema", () => {
const parsed = reviewerMetaSchema.safeParse({ status: "approved" as const });
expect(parsed.success).toBe(true);
});
test("runs reviewer extract", async () => {
globalThis.fetch = (() =>
Promise.resolve(
toolCallResponse(JSON.stringify({ approved: true })),
)) as unknown as typeof fetch;
const agent: AgentFn = async (_ctx, prompt) => {
expect(prompt).toContain("git diff");
expect(prompt).toContain(DEFAULT_REVIEWER_CONFIG.cwd);
return "review done";
};
const role = createReviewerRole(agent, {
provider,
dryRun: null,
dryRunMeta: { approved: true },
});
const out = await role(makeCtx());
expect(out.meta).toEqual({ approved: true });
});
test("includes uncaged-workflow thread hint when threadId set", async () => {
globalThis.fetch = (() =>
Promise.resolve(
toolCallResponse(JSON.stringify({ approved: false })),
)) as unknown as typeof fetch;
let seen = "";
const agent: AgentFn = async (_ctx, prompt) => {
seen = prompt;
return "x";
};
const role = createReviewerRole(
agent,
{ provider, dryRun: null, dryRunMeta: { approved: false } },
{
cwd: "/proj",
conventionsPath: null,
extraChecks: [],
threadId: "01ABCDEF234567890ABCDEFGH",
},
);
await role(makeCtx());
expect(seen).toContain("uncaged-workflow thread 01ABCDEF234567890ABCDEFGH");
test("system prompt is generic (no cwd)", () => {
expect(reviewerRole.systemPrompt).toContain("code reviewer");
expect(reviewerRole.systemPrompt).not.toContain("project is at");
});
});
@@ -10,8 +10,6 @@
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"@uncaged/workflow-role-llm": "workspace:*",
"@uncaged/workflow-util-role": "workspace:*",
"zod": "^4.0.0"
}
}
+1 -7
View File
@@ -1,7 +1 @@
export {
createReviewerRole,
DEFAULT_REVIEWER_CONFIG,
type ReviewerConfig,
type ReviewerMeta,
reviewerMetaSchema,
} from "./reviewer.js";
export { type ReviewerMeta, reviewerMetaSchema, reviewerRole } from "./reviewer.js";
+20 -103
View File
@@ -1,108 +1,25 @@
import type { AgentFn, Role, ThreadContext } from "@uncaged/workflow";
import { createRole } from "@uncaged/workflow-role-llm";
import type { LlmProvider } from "@uncaged/workflow-util-role";
import type { RoleDefinition } from "@uncaged/workflow";
import * as z from "zod/v4";
export const reviewerMetaSchema = z.object({
approved: z.boolean().describe("true if the diff is clean and ready to merge"),
});
export const reviewerMetaSchema = z.discriminatedUnion("status", [
z.object({
status: z.literal("approved"),
}),
z.object({
status: z.literal("rejected"),
issues: z.array(z.string()).describe("blocking issues that must be fixed"),
}),
]);
export type ReviewerMeta = z.infer<typeof reviewerMetaSchema>;
export type ReviewerConfig = {
cwd: string;
conventionsPath: string | null;
extraChecks: ReadonlyArray<string>;
/** When non-null, prompts reference `uncaged-workflow thread <id>`. */
threadId: string | null;
const REVIEWER_SYSTEM = `You are a code reviewer. Review the current git diff. Give a clear approve or reject verdict.
Only reject for blocking issues. End with your verdict.`;
export const reviewerRole: RoleDefinition<ReviewerMeta> = {
description: "Runs git diff checks and sets approved when the change is ready.",
systemPrompt: REVIEWER_SYSTEM,
extractPrompt:
"Extract the review verdict: approved or rejected. If rejected, list the blocking issues.",
schema: reviewerMetaSchema,
extractRefs: null,
};
export const DEFAULT_REVIEWER_CONFIG: ReviewerConfig = {
cwd: ".",
conventionsPath: "CONVENTIONS.md",
extraChecks: [],
threadId: null,
};
function summarizeThreadContext(ctx: ThreadContext): string {
const lines: string[] = [`Initial prompt:\n${ctx.start.content}`];
for (const step of ctx.steps) {
const snippet = step.content.length > 600 ? `${step.content.slice(0, 600)}` : step.content;
lines.push(`\n### ${step.role}\n${snippet}`);
}
return lines.join("\n");
}
function reviewerPrompt(config: ReviewerConfig, ctx: ThreadContext): string {
const { cwd, conventionsPath, extraChecks, threadId } = config;
const conventionsBlock =
conventionsPath !== null ? `Read project conventions: \`cat ${cwd}/${conventionsPath}\`\n` : "";
const threadBlock =
threadId !== null
? `Read the workflow thread for context: \`uncaged-workflow thread ${threadId}\`\n`
: `## Thread context (no thread id)\n\n${summarizeThreadContext(ctx)}\n`;
const extraBlock =
extraChecks.length > 0
? `\n### Project-specific checks\n${extraChecks.map((c) => `- ${c}`).join("\n")}\n`
: "";
return `You are a **code reviewer**. You run after the coder and before the tester.
**IMPORTANT: The project is at \`${cwd}\`. Always \`cd ${cwd}\` first.**
${threadBlock}
${conventionsBlock}
## Your job — static analysis of the git diff
Run these commands and analyze the output:
1. **\`cd ${cwd} && git diff --stat\`** — see what files changed
2. **\`cd ${cwd} && git diff\`** — read the actual diff
3. **\`cd ${cwd} && git status --short\`** — check for untracked files
## Checklist
### Reject (approved: false) — tell coder exactly what to fix
- **Garbage files**: build artifacts, lockfiles, IDE config that should not be committed
- **Secrets/credentials**: API keys, tokens, passwords hardcoded in the diff
- **Unrelated changes**: files modified outside the scope of the task
${
conventionsPath !== null
? `- **Convention violations**: patterns that contradict ${conventionsPath}\n`
: ""
}${extraBlock}
### Approve (approved: true) — no comment needed
- Diff is clean, focused, follows project standards
End with:
\`\`\`json
{ "approved": true }
\`\`\`
or
\`\`\`json
{ "approved": false }
\`\`\``;
}
/**
* Code review role: agent inspects git diffs; structured extract yields `approved`.
*/
export function createReviewerRole(
adapter: AgentFn,
extract: { provider: LlmProvider; dryRun: boolean | null; dryRunMeta: ReviewerMeta },
config: ReviewerConfig = DEFAULT_REVIEWER_CONFIG,
): Role<ReviewerMeta> {
return createRole({
name: "reviewer",
schema: reviewerMetaSchema,
systemPrompt: async (ctx) => reviewerPrompt(config, ctx),
agent: adapter,
extract: {
provider: extract.provider,
dryRun: extract.dryRun,
dryRunMeta: extract.dryRunMeta,
},
});
}
@@ -6,9 +6,5 @@
"composite": true
},
"include": ["src/**/*.ts"],
"references": [
{ "path": "../workflow" },
{ "path": "../workflow-role-llm" },
{ "path": "../workflow-util-role" }
]
"references": [{ "path": "../workflow" }]
}
@@ -1,17 +1,95 @@
import { describe, expect, test } from "bun:test";
import type { RoleMeta } from "@uncaged/workflow";
import { afterEach, describe, expect, test } from "bun:test";
import {
createExtract,
END,
type ModeratorContext,
type RoleStep,
START,
type ThreadContext,
validateWorkflowDescriptor,
} from "@uncaged/workflow";
import { buildSolveIssueDescriptor } from "../src/descriptor.js";
import { solveIssueModerator } from "../src/moderator.js";
import { createSolveIssueRoles, type SolveIssueMeta } from "../src/roles.js";
function makeStart(maxRounds: number): ThreadContext<SolveIssueMeta>["start"] {
import type { CoderMeta } from "@uncaged/workflow-role-coder";
import type { PlannerMeta } from "@uncaged/workflow-role-planner";
import type { PreparerMeta } from "@uncaged/workflow-role-preparer";
import { buildSolveIssueDescriptor } from "../src/descriptor.js";
import { createSolveIssueRun, solveIssueModerator } from "../src/index.js";
import type { SolveIssueMeta } from "../src/roles.js";
const DEFAULT_PHASES: PlannerMeta["phases"] = [
{
hash: "4KNMR2PX",
title: "Do the work",
},
];
const EXPECT_PLANNER_META: PlannerMeta = {
phases: [
{
hash: "7BQST3VW",
title: "placeholder phase",
},
],
};
const EXPECT_CODER_META: CoderMeta = {
completedPhase: "7BQST3VW",
filesChanged: [],
summary: "",
};
function installMockChatCompletions(sequence: ReadonlyArray<Record<string, unknown>>): () => void {
const origFetch = globalThis.fetch;
let i = 0;
const mockFetch = async (
input: Parameters<typeof fetch>[0],
init?: RequestInit,
): Promise<Response> => {
const args = sequence[i] ?? sequence[sequence.length - 1];
if (args === undefined) {
throw new Error("installMockChatCompletions: empty sequence");
}
i += 1;
void input;
const body = init?.body ? (JSON.parse(String(init.body)) as Record<string, unknown>) : {};
const tools = body.tools;
const firstTool =
Array.isArray(tools) && tools.length > 0 && tools[0] !== null && typeof tools[0] === "object"
? (tools[0] as Record<string, unknown>)
: null;
const fn =
firstTool !== null ? (firstTool.function as Record<string, unknown> | undefined) : undefined;
const toolName = typeof fn?.name === "string" ? fn.name : "extract";
return new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
type: "function",
function: {
name: toolName,
arguments: JSON.stringify(args),
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
};
globalThis.fetch = Object.assign(mockFetch, {
preconnect: origFetch.preconnect.bind(origFetch),
}) as typeof fetch;
return () => {
globalThis.fetch = origFetch;
};
}
function makeStart(maxRounds: number): ModeratorContext<SolveIssueMeta>["start"] {
return {
role: START,
content: "Fix the flaky login test",
@@ -22,28 +100,51 @@ function makeStart(maxRounds: number): ThreadContext<SolveIssueMeta>["start"] {
function makeCtx(
maxRounds: number,
steps: ThreadContext<SolveIssueMeta>["steps"],
): ThreadContext<SolveIssueMeta> {
steps: ModeratorContext<SolveIssueMeta>["steps"],
): ModeratorContext<SolveIssueMeta> {
return {
threadId: "01TEST000000000000000000TR",
start: makeStart(maxRounds),
steps,
};
}
function plannerStep(): RoleStep<SolveIssueMeta> {
function preparerStep(): RoleStep<SolveIssueMeta> {
return {
role: "preparer",
content: "prepared",
meta: {
repoPath: "/home/user/repos/test",
defaultBranch: "main",
conventions: null,
toolchain: {
packageManager: "bun",
testCommand: "bun test",
lintCommand: null,
buildCommand: "bun run build",
},
},
refs: [],
timestamp: 0,
};
}
function plannerStep(phases: PlannerMeta["phases"] = DEFAULT_PHASES): RoleStep<SolveIssueMeta> {
return {
role: "planner",
content: "plan",
meta: { plan: "do work", files: ["a.ts"], approach: "minimal fix" },
meta: { phases },
refs: phases.map((p) => p.hash),
timestamp: 1,
};
}
function coderStep(): RoleStep<SolveIssueMeta> {
function coderStep(completedPhase = "4KNMR2PX"): RoleStep<SolveIssueMeta> {
return {
role: "coder",
content: "code",
meta: { filesChanged: ["a.ts"], summary: "fixed" },
meta: { completedPhase, filesChanged: ["a.ts"], summary: "fixed" },
refs: [completedPhase],
timestamp: 2,
};
}
@@ -52,7 +153,10 @@ function reviewerStep(approved: boolean): RoleStep<SolveIssueMeta> {
return {
role: "reviewer",
content: "rev",
meta: { approved },
meta: approved
? { status: "approved" as const }
: { status: "rejected" as const, issues: ["needs fix"] },
refs: [],
timestamp: 3,
};
}
@@ -61,28 +165,46 @@ function committerStep(): RoleStep<SolveIssueMeta> {
return {
role: "committer",
content: "commit",
meta: { committed: true },
meta: { status: "committed", branch: "feat/issue-1", commitSha: "abc1234" },
refs: [],
timestamp: 4,
};
}
const stubExtract = createExtract({
baseUrl: "http://127.0.0.1:9",
apiKey: "",
model: "test",
});
describe("solveIssueModerator", () => {
test("routes planner → coder → reviewer → committer → END", () => {
expect(solveIssueModerator(makeCtx(20, []))).toBe("planner");
expect(solveIssueModerator(makeCtx(20, [plannerStep()]))).toBe("coder");
expect(solveIssueModerator(makeCtx(20, [plannerStep(), coderStep()]))).toBe("reviewer");
expect(solveIssueModerator(makeCtx(20, [plannerStep(), coderStep(), reviewerStep(true)]))).toBe(
"committer",
test("routes preparer → planner → coder → reviewer → committer → END", () => {
expect(solveIssueModerator(makeCtx(20, []))).toBe("preparer");
expect(solveIssueModerator(makeCtx(20, [preparerStep()]))).toBe("planner");
expect(solveIssueModerator(makeCtx(20, [preparerStep(), plannerStep()]))).toBe("coder");
expect(solveIssueModerator(makeCtx(20, [preparerStep(), plannerStep(), coderStep()]))).toBe(
"reviewer",
);
expect(
solveIssueModerator(
makeCtx(20, [plannerStep(), coderStep(), reviewerStep(true), committerStep()]),
makeCtx(20, [preparerStep(), plannerStep(), coderStep(), reviewerStep(true)]),
),
).toBe("committer");
expect(
solveIssueModerator(
makeCtx(20, [
preparerStep(),
plannerStep(),
coderStep(),
reviewerStep(true),
committerStep(),
]),
),
).toBe(END);
});
test("reviewer rejects → coder retry when budget allows", () => {
const steps: ThreadContext<SolveIssueMeta>["steps"] = [
const steps: ModeratorContext<SolveIssueMeta>["steps"] = [
plannerStep(),
coderStep(),
reviewerStep(false),
@@ -91,33 +213,166 @@ describe("solveIssueModerator", () => {
});
test("reviewer rejects → END when max rounds exhausted", () => {
const steps: ThreadContext<SolveIssueMeta>["steps"] = [
const steps: ModeratorContext<SolveIssueMeta>["steps"] = [
plannerStep(),
coderStep(),
reviewerStep(false),
];
expect(solveIssueModerator(makeCtx(4, steps))).toBe(END);
});
test("multiple planner phases → coder until all complete, then reviewer", () => {
const phases: PlannerMeta["phases"] = [
{
hash: "AA000001",
title: "first phase",
},
{
hash: "AA000002",
title: "second phase",
},
];
expect(solveIssueModerator(makeCtx(20, [plannerStep(phases)]))).toBe("coder");
expect(solveIssueModerator(makeCtx(20, [plannerStep(phases), coderStep("AA000001")]))).toBe(
"coder",
);
expect(
solveIssueModerator(
makeCtx(20, [plannerStep(phases), coderStep("AA000001"), coderStep("AA000002")]),
),
).toBe("reviewer");
});
test("one-shot coder reports only last phase hash → reviewer (moderator treats as all phases done)", () => {
const phases: PlannerMeta["phases"] = [
{ hash: "BB000001", title: "setup branch" },
{ hash: "BB000002", title: "write tests" },
{ hash: "BB000003", title: "verify" },
{ hash: "BB000004", title: "commit and pr" },
];
expect(solveIssueModerator(makeCtx(20, [plannerStep(phases), coderStep("BB000004")]))).toBe(
"reviewer",
);
});
test("unrecognised completedPhase hash → coder retry when budget allows", () => {
const phases: PlannerMeta["phases"] = [
{ hash: "CC000001", title: "first phase" },
{ hash: "CC000002", title: "second phase" },
];
expect(solveIssueModerator(makeCtx(20, [plannerStep(phases), coderStep("all-done")]))).toBe(
"coder",
);
});
test("incomplete phases → END when max rounds exhausted", () => {
const phases: PlannerMeta["phases"] = [
{ hash: "DD000001", title: "first phase" },
{ hash: "DD000002", title: "second phase" },
];
const steps: ModeratorContext<SolveIssueMeta>["steps"] = [
plannerStep(phases),
coderStep("DD000001"),
];
expect(solveIssueModerator(makeCtx(3, steps))).toBe(END);
});
});
describe("createSolveIssueRoles", () => {
test("returns all four role callables", async () => {
const agent = async () => '{"plan":"x","files":[],"approach":"y"}';
const roles = createSolveIssueRoles({
agent,
workdir: "/tmp/repo",
extract: null,
});
describe("createSolveIssueRun", () => {
let restoreFetch: (() => void) | null = null;
expect(typeof roles.planner).toBe("function");
expect(typeof roles.coder).toBe("function");
expect(typeof roles.reviewer).toBe("function");
expect(typeof roles.committer).toBe("function");
afterEach(() => {
restoreFetch?.();
restoreFetch = null;
});
const ctx = makeCtx(10, []) as unknown as ThreadContext<RoleMeta>;
const plannerOut = await roles.planner(ctx);
expect(plannerOut.meta.plan).toBe("");
expect(Array.isArray(plannerOut.meta.files)).toBe(true);
test("structured extraction yields preparer then planner meta from mocked chat completions", async () => {
const EXPECT_PREPARER_META: PreparerMeta = {
repoPath: "/home/user/repos/test",
defaultBranch: "main",
conventions: null,
toolchain: {
packageManager: "bun",
testCommand: "bun test",
lintCommand: null,
buildCommand: "bun run build",
},
};
restoreFetch = installMockChatCompletions([EXPECT_PREPARER_META, EXPECT_PLANNER_META]);
const run = createSolveIssueRun({ agent: async () => "" }, stubExtract);
const gen = run(
{ prompt: "task", steps: [] },
{ threadId: "01TEST000000000000000000TR", maxRounds: 20 },
);
const first = await gen.next();
expect(first.done).toBe(false);
if (first.done) {
throw new Error("expected yield");
}
expect(first.value.role).toBe("preparer");
expect(first.value.meta).toEqual(EXPECT_PREPARER_META);
const second = await gen.next();
expect(second.done).toBe(false);
if (second.done) {
throw new Error("expected yield");
}
expect(second.value.role).toBe("planner");
expect(second.value.meta).toEqual(EXPECT_PLANNER_META);
});
test("per-role agent overrides default", async () => {
const PREPARER_META: PreparerMeta = {
repoPath: "/tmp/r",
defaultBranch: "main",
conventions: null,
toolchain: { packageManager: null, testCommand: null, lintCommand: null, buildCommand: null },
};
restoreFetch = installMockChatCompletions([
PREPARER_META,
EXPECT_PLANNER_META,
EXPECT_CODER_META,
]);
const calls: string[] = [];
const run = createSolveIssueRun(
{
agent: async () => {
calls.push("default");
return "";
},
overrides: {
preparer: async () => {
calls.push("preparer");
return "";
},
planner: async () => {
calls.push("planner");
return "";
},
coder: async () => {
calls.push("coder");
return "";
},
},
},
stubExtract,
);
const gen = run(
{ prompt: "task", steps: [] },
{ threadId: "01TEST000000000000000000TR", maxRounds: 20 },
);
await gen.next();
expect(calls).toEqual(["preparer"]);
calls.length = 0;
await gen.next();
expect(calls).toEqual(["planner"]);
calls.length = 0;
await gen.next();
expect(calls).toEqual(["coder"]);
});
});
@@ -133,9 +388,10 @@ describe("buildSolveIssueDescriptor", () => {
"coder",
"committer",
"planner",
"preparer",
"reviewer",
]);
for (const key of ["planner", "coder", "reviewer", "committer"] as const) {
for (const key of ["preparer", "planner", "coder", "reviewer", "committer"] as const) {
const role = validated.value.roles[key];
expect(role).toBeDefined();
expect(typeof role.schema).toBe("object");
@@ -10,11 +10,10 @@
},
"dependencies": {
"@uncaged/workflow": "workspace:*",
"@uncaged/workflow-agent-cursor": "workspace:*",
"@uncaged/workflow-role-committer": "workspace:*",
"@uncaged/workflow-role-llm": "workspace:*",
"@uncaged/workflow-role-reviewer": "workspace:*",
"@uncaged/workflow-util-role": "workspace:*",
"zod": "^4.0.0"
"@uncaged/workflow-role-coder": "workspace:*",
"@uncaged/workflow-role-planner": "workspace:*",
"@uncaged/workflow-role-preparer": "workspace:*",
"@uncaged/workflow-role-reviewer": "workspace:*"
}
}
@@ -1,34 +1,12 @@
import { committerMetaSchema } from "@uncaged/workflow-role-committer";
import { reviewerMetaSchema } from "@uncaged/workflow-role-reviewer";
import { buildDescriptorFromRoles } from "@uncaged/workflow-util-role";
import { buildDescriptor } from "@uncaged/workflow";
import { coderMetaSchema, plannerMetaSchema } from "./roles.js";
import { solveIssueModerator } from "./moderator.js";
import { SOLVE_ISSUE_WORKFLOW_DESCRIPTION, solveIssueRoles } from "./roles.js";
export function buildSolveIssueDescriptor() {
return buildDescriptorFromRoles({
description:
"Plan, implement, review, and commit changes to resolve an issue end-to-end (planner → coder → reviewer → committer).",
roles: {
planner: {
name: "planner",
schema: plannerMetaSchema,
description: "Analyzes the issue and proposes plan, files, and approach.",
},
coder: {
name: "coder",
schema: coderMetaSchema,
description: "Implements the planner output and summarizes touched files.",
},
reviewer: {
name: "reviewer",
schema: reviewerMetaSchema,
description: "Runs git diff checks and sets approved when the change is ready.",
},
committer: {
name: "committer",
schema: committerMetaSchema,
description: "Creates branch, commits, and pushes when review passes.",
},
},
return buildDescriptor({
description: SOLVE_ISSUE_WORKFLOW_DESCRIPTION,
roles: solveIssueRoles,
moderator: solveIssueModerator,
});
}
@@ -1,29 +1,55 @@
import { createRoleModerator, type WorkflowFn } from "@uncaged/workflow";
import {
type AgentBinding,
createWorkflow,
type ExtractFn,
type WorkflowDefinition,
type WorkflowFn,
} from "@uncaged/workflow";
import { solveIssueModerator } from "./moderator.js";
import { createSolveIssueRoles, type SolveIssueMeta, type SolveIssueRolesConfig } from "./roles.js";
import { SOLVE_ISSUE_WORKFLOW_DESCRIPTION, type SolveIssueMeta, solveIssueRoles } from "./roles.js";
export { type CursorAgentConfig, createCursorAgent } from "@uncaged/workflow-agent-cursor";
export { buildSolveIssueDescriptor } from "./descriptor.js";
export { solveIssueModerator } from "./moderator.js";
export {
type CoderMeta,
coderMetaSchema,
createSolveIssueRoles,
coderRole,
} from "@uncaged/workflow-role-coder";
export {
type CommitterMeta,
committerMetaSchema,
committerRole,
} from "@uncaged/workflow-role-committer";
export {
type PlannerMeta,
phaseSchema,
plannerMetaSchema,
plannerRole,
} from "@uncaged/workflow-role-planner";
export {
type PreparerMeta,
preparerMetaSchema,
preparerRole,
} from "@uncaged/workflow-role-preparer";
export {
type ReviewerMeta,
reviewerMetaSchema,
reviewerRole,
} from "@uncaged/workflow-role-reviewer";
export { buildSolveIssueDescriptor } from "./descriptor.js";
export { solveIssueModerator } from "./moderator.js";
export {
SOLVE_ISSUE_WORKFLOW_DESCRIPTION,
type SolveIssueMeta,
type SolveIssueRoles,
type SolveIssueRolesConfig,
solveIssueRoles,
} from "./roles.js";
/**
* Factory for a {@link WorkflowFn}: supply an agent and repo paths at runtime, then pass the result
* to the bundle `run` export pattern (`createRoleModerator` is already applied).
*/
export function createSolveIssueRun(config: SolveIssueRolesConfig): WorkflowFn {
return createRoleModerator<SolveIssueMeta>({
roles: createSolveIssueRoles(config),
moderator: solveIssueModerator,
});
export const solveIssueWorkflowDefinition: WorkflowDefinition<SolveIssueMeta> = {
description: SOLVE_ISSUE_WORKFLOW_DESCRIPTION,
roles: solveIssueRoles,
moderator: solveIssueModerator,
};
export function createSolveIssueRun(binding: AgentBinding, extract: ExtractFn): WorkflowFn {
return createWorkflow(solveIssueWorkflowDefinition, binding, extract);
}
@@ -1,27 +1,72 @@
import type { Moderator } from "@uncaged/workflow";
import type { Moderator, ModeratorContext } from "@uncaged/workflow";
import { END } from "@uncaged/workflow";
import type { SolveIssueMeta } from "./roles.js";
function coderFinishedAllPlannedPhases(
phases: ReadonlyArray<{ hash: string }>,
coderCompletedPhases: ReadonlyArray<string>,
): boolean {
if (phases.length === 0) {
return true;
}
const plannedHashes = new Set(phases.map((p) => p.hash));
const lastHash = phases[phases.length - 1].hash;
const explicit = new Set(coderCompletedPhases.filter((h) => plannedHashes.has(h)));
if (phases.every((p) => explicit.has(p.hash))) {
return true;
}
if (coderCompletedPhases.some((h) => h === lastHash)) {
return true;
}
return false;
}
function nextAfterCoder(
ctx: ModeratorContext<SolveIssueMeta>,
maxRounds: number,
): (keyof SolveIssueMeta & string) | typeof END {
const plannerStep = ctx.steps.find((s) => s.role === "planner");
if (plannerStep === undefined) {
return "reviewer";
}
const phases = plannerStep.meta.phases;
const coderCompletedPhases = ctx.steps
.filter((s) => s.role === "coder")
.map((s) => s.meta.completedPhase);
const allDone = coderFinishedAllPlannedPhases(phases, coderCompletedPhases);
if (allDone) {
return "reviewer";
}
if (ctx.steps.length < maxRounds - 1) {
return "coder";
}
return END;
}
export const solveIssueModerator: Moderator<SolveIssueMeta> = (ctx) => {
const maxRounds = ctx.start.meta.maxRounds;
if (ctx.steps.length === 0) {
return "planner";
return "preparer";
}
const last = ctx.steps[ctx.steps.length - 1];
if (last.role === "preparer") {
return "planner";
}
if (last.role === "planner") {
return "coder";
}
if (last.role === "coder") {
return "reviewer";
return nextAfterCoder(ctx, maxRounds);
}
if (last.role === "reviewer") {
if (last.meta.approved === true) {
if (last.meta.status === "approved") {
return "committer";
}
if (ctx.steps.length < maxRounds - 1) {
@@ -31,6 +76,9 @@ export const solveIssueModerator: Moderator<SolveIssueMeta> = (ctx) => {
}
if (last.role === "committer") {
if (last.meta.status === "recoverable" && ctx.steps.length < maxRounds - 1) {
return "coder";
}
return END;
}
@@ -1,151 +1,29 @@
import type { AgentFn, Role } from "@uncaged/workflow";
import {
type CommitterMeta,
type CommitterPlanMeta,
createCommitterRole,
} from "@uncaged/workflow-role-committer";
import { createRole } from "@uncaged/workflow-role-llm";
import { createReviewerRole, type ReviewerMeta } from "@uncaged/workflow-role-reviewer";
import type { LlmProvider } from "@uncaged/workflow-util-role";
import * as z from "zod/v4";
import type { RoleDefinition } from "@uncaged/workflow";
import { type CoderMeta, coderRole } from "@uncaged/workflow-role-coder";
import { type CommitterMeta, committerRole } from "@uncaged/workflow-role-committer";
import { type PlannerMeta, plannerRole } from "@uncaged/workflow-role-planner";
import { type PreparerMeta, preparerRole } from "@uncaged/workflow-role-preparer";
import { type ReviewerMeta, reviewerRole } from "@uncaged/workflow-role-reviewer";
const DRY_RUN_PROVIDER: LlmProvider = {
baseUrl: "http://127.0.0.1:9",
apiKey: "",
model: "template-dry-run",
};
const PLANNER_SYSTEM = `You are a **planner** for a software task. Analyze the issue, list relevant files, and produce a clear step-by-step approach.
Focus on: root cause, edge cases, and how the implementation will be verified. Output enough detail for a coding agent to implement without guessing.`;
const CODER_SYSTEM = `You are a **coder**. The previous step produced a plan: read the thread and implement that plan in the repository.
Make focused changes, follow project conventions, and explain what you changed.`;
export const plannerMetaSchema = z.object({
plan: z.string(),
files: z.array(z.string()),
approach: z.string(),
});
export const coderMetaSchema = z.object({
filesChanged: z.array(z.string()),
summary: z.string(),
});
export type PlannerMeta = z.infer<typeof plannerMetaSchema>;
export type CoderMeta = z.infer<typeof coderMetaSchema>;
const PLANNER_DRY_RUN_META: PlannerMeta = {
plan: "",
files: [],
approach: "",
};
const CODER_DRY_RUN_META: CoderMeta = {
filesChanged: [],
summary: "",
};
const REVIEWER_DRY_RUN_META: ReviewerMeta = {
approved: true,
};
const COMMITTER_PLAN_DRY_RUN_META: CommitterPlanMeta = {
branch: "dry-run",
message: "chore: dry run",
};
export const SOLVE_ISSUE_WORKFLOW_DESCRIPTION =
"Prepare repo context, plan phases, implement incrementally, review, and commit to resolve an issue end-to-end (preparer → planner → coder [repeat per phase] → reviewer → committer).";
export type SolveIssueMeta = {
preparer: PreparerMeta;
planner: PlannerMeta;
coder: CoderMeta;
reviewer: ReviewerMeta;
committer: CommitterMeta;
};
/** Wiring for workflow-role LLM structured extraction. Use `null` for stub extract (dry-run meta from built-in placeholders). */
export type SolveIssueRolesConfig = {
agent: AgentFn;
workdir: string;
extract: { provider: LlmProvider; dryRun: boolean | null } | null;
};
function resolveExtract(config: SolveIssueRolesConfig): {
provider: LlmProvider;
dryRun: boolean | null;
} {
if (config.extract === null) {
return { provider: DRY_RUN_PROVIDER, dryRun: true };
}
return config.extract;
}
export type SolveIssueRoles = {
planner: Role<PlannerMeta>;
coder: Role<CoderMeta>;
reviewer: Role<ReviewerMeta>;
committer: Role<CommitterMeta>;
[K in keyof SolveIssueMeta]: RoleDefinition<SolveIssueMeta[K]>;
};
export function createSolveIssueRoles(config: SolveIssueRolesConfig): SolveIssueRoles {
const extract = resolveExtract(config);
const reviewerGit = {
cwd: config.workdir,
conventionsPath: null,
extraChecks: [],
threadId: null,
};
const committerGit = {
cwd: config.workdir,
remote: "origin",
threadId: null,
};
const planner: Role<PlannerMeta> = createRole({
name: "planner",
schema: plannerMetaSchema,
systemPrompt: PLANNER_SYSTEM,
agent: config.agent,
extract: {
provider: extract.provider,
dryRun: extract.dryRun,
dryRunMeta: PLANNER_DRY_RUN_META,
},
});
const coder: Role<CoderMeta> = createRole({
name: "coder",
schema: coderMetaSchema,
systemPrompt: CODER_SYSTEM,
agent: config.agent,
extract: {
provider: extract.provider,
dryRun: extract.dryRun,
dryRunMeta: CODER_DRY_RUN_META,
},
});
const reviewer: Role<ReviewerMeta> = createReviewerRole(
config.agent,
{
provider: extract.provider,
dryRun: extract.dryRun,
dryRunMeta: REVIEWER_DRY_RUN_META,
},
reviewerGit,
);
const committer: Role<CommitterMeta> = createCommitterRole(
config.agent,
{
provider: extract.provider,
dryRun: extract.dryRun,
dryRunMeta: COMMITTER_PLAN_DRY_RUN_META,
},
committerGit,
);
return { planner, coder, reviewer, committer };
}
export const solveIssueRoles: SolveIssueRoles = {
preparer: preparerRole,
planner: plannerRole,
coder: coderRole,
reviewer: reviewerRole,
committer: committerRole,
};
@@ -8,7 +8,9 @@
"include": ["src/**/*.ts"],
"references": [
{ "path": "../workflow" },
{ "path": "../workflow-role-llm" },
{ "path": "../workflow-util-role" }
{ "path": "../workflow-role-coder" },
{ "path": "../workflow-role-committer" },
{ "path": "../workflow-role-planner" },
{ "path": "../workflow-role-reviewer" }
]
}
@@ -17,8 +17,10 @@ describe("buildAgentPrompt", () => {
const ctx: ThreadContext = {
start: startTask("fix the bug"),
steps: [],
threadId: "01TEST000000000000000000TR",
currentRole: { name: START, systemPrompt: "You are an agent." },
};
const text = buildAgentPrompt("You are an agent.", ctx);
const text = buildAgentPrompt(ctx);
expect(text).toContain("You are an agent.");
expect(text).toContain("## Task");
expect(text).toContain("fix the bug");
@@ -28,44 +30,51 @@ describe("buildAgentPrompt", () => {
test("single step shows full content and meta, and includes tools", () => {
const ctx: ThreadContext = {
start: startTask("user task"),
threadId: "01TEST000000000000000000TR",
currentRole: { name: "coder", systemPrompt: "Be helpful." },
steps: [
{
role: "coder",
content: "only step full body",
meta: { files: ["a.ts"] },
refs: [],
timestamp: 2,
},
],
};
const text = buildAgentPrompt("Be helpful.", ctx);
const text = buildAgentPrompt(ctx);
expect(text).toContain("## Task");
expect(text).toContain("user task");
expect(text).toContain("## Step: coder");
expect(text).toContain("only step full body");
expect(text).toContain('Meta: {"files":["a.ts"]}');
expect(text).toContain("## Tools");
expect(text).toContain("uncaged-workflow thread <threadId>");
expect(text).toContain("uncaged-workflow thread 01TEST000000000000000000TR");
});
test("two or more steps: previous steps are meta-only; latest step is full", () => {
const ctx: ThreadContext = {
start: startTask("first message full: task content here"),
threadId: "01TEST000000000000000000TR",
currentRole: { name: "coder", systemPrompt: "System." },
steps: [
{
role: "planner",
content: "PLANNER_SECRET_FULL_TEXT",
meta: { plan: "short" },
refs: [],
timestamp: 2,
},
{
role: "coder",
content: "last step full content",
meta: { done: true },
refs: [],
timestamp: 3,
},
],
};
const text = buildAgentPrompt("System.", ctx);
const text = buildAgentPrompt(ctx);
expect(text).toContain("first message full: task content here");
expect(text).toContain("## Previous Steps");
expect(text).toContain("### Step 1: planner");
@@ -75,33 +84,39 @@ describe("buildAgentPrompt", () => {
expect(text).toContain("last step full content");
expect(text).toContain('Meta: {"done":true}');
expect(text).toContain("## Tools");
expect(text).toContain("uncaged-workflow thread 01TEST000000000000000000TR");
});
test("middle steps show meta summary only, not full content", () => {
const ctx: ThreadContext = {
start: startTask("start"),
threadId: "01TEST000000000000000000TR",
currentRole: { name: "c", systemPrompt: "S" },
steps: [
{
role: "a",
content: "HIDDEN_A",
meta: { n: 1 },
refs: [],
timestamp: 2,
},
{
role: "b",
content: "HIDDEN_B_MIDDLE",
meta: { n: 2 },
refs: [],
timestamp: 3,
},
{
role: "c",
content: "VISIBLE_LAST",
meta: { n: 3 },
refs: [],
timestamp: 4,
},
],
};
const text = buildAgentPrompt("S", ctx);
const text = buildAgentPrompt(ctx);
expect(text).not.toContain("HIDDEN_A");
expect(text).not.toContain("HIDDEN_B_MIDDLE");
expect(text).toContain('Summary: {"n":1}');
@@ -4,6 +4,12 @@
"type": "module",
"main": "src/index.ts",
"types": "src/index.ts",
"exports": {
".": {
"types": "./src/index.ts",
"default": "./src/index.ts"
}
},
"scripts": {
"build": "echo 'TODO'",
"test": "bun test"
@@ -1,9 +1,9 @@
import type { ThreadContext } from "@uncaged/workflow";
import type { AgentContext } from "@uncaged/workflow";
/** Builds the full agent prompt: system instructions plus summarized thread history. */
export function buildAgentPrompt(systemPrompt: string, ctx: ThreadContext): string {
export function buildAgentPrompt(ctx: AgentContext): string {
const lines: string[] = [];
lines.push(systemPrompt);
lines.push(ctx.currentRole.systemPrompt);
lines.push("");
lines.push("## Task");
lines.push(ctx.start.content);
@@ -41,7 +41,9 @@ export function buildAgentPrompt(systemPrompt: string, ctx: ThreadContext): stri
lines.push("");
lines.push("## Tools");
lines.push("Use `uncaged-workflow thread <threadId>` to read full details of any previous step.");
lines.push(
`Use \`uncaged-workflow thread ${ctx.threadId}\` to read full details of any previous step.`,
);
return lines.join("\n");
}
@@ -1,74 +0,0 @@
import { describe, expect, test } from "bun:test";
import { validateWorkflowDescriptor } from "@uncaged/workflow";
import * as z from "zod/v4";
import { buildDescriptorFromRoles } from "../src/build-descriptor.js";
describe("buildDescriptorFromRoles", () => {
test("produces a descriptor that validates and includes JSON schemas per role", () => {
const schema = z.object({
title: z.string(),
count: z.number(),
});
const descriptor = buildDescriptorFromRoles({
description: "Demo workflow",
roles: {
analyst: {
name: "analyst",
schema,
description: "Analyzes input",
},
},
});
const validated = validateWorkflowDescriptor(descriptor);
expect(validated.ok).toBe(true);
if (!validated.ok) {
return;
}
expect(validated.value.description).toBe("Demo workflow");
const analyst = validated.value.roles.analyst;
expect(analyst.description).toBe("Analyzes input");
expect(analyst.schema.type).toBe("object");
const props = analyst.schema.properties as Record<string, unknown>;
expect(props.title).toMatchObject({ type: "string" });
expect(props.count).toMatchObject({ type: "number" });
});
test("uses empty description when spec.description is null", () => {
const descriptor = buildDescriptorFromRoles({
description: "W",
roles: {
x: {
name: "x",
schema: z.object({ n: z.number() }),
description: null,
},
},
});
const validated = validateWorkflowDescriptor(descriptor);
expect(validated.ok).toBe(true);
if (!validated.ok) {
return;
}
expect(validated.value.roles.x.description).toBe("");
});
test("throws when role key and spec.name diverge", () => {
expect(() =>
buildDescriptorFromRoles({
description: "W",
roles: {
a: {
name: "b",
schema: z.object({ n: z.number() }),
description: null,
},
},
}),
).toThrow(/must match spec.name/);
});
});
@@ -1,100 +0,0 @@
import { describe, expect, test } from "bun:test";
import type { Role, ThreadContext } from "@uncaged/workflow";
import { START } from "@uncaged/workflow";
import { decorateRole, onFail, withDryRun } from "../src/decorators.js";
type TestMeta = Record<string, unknown> & { ok: boolean };
function fakeCtx(): ThreadContext {
return {
start: {
role: START,
content: "",
meta: {
maxRounds: 10,
},
timestamp: Date.now(),
},
steps: [],
};
}
const successRole: Role<TestMeta> = async () => ({
content: "done",
meta: { ok: true },
});
const failRole: Role<TestMeta> = async () => {
throw new Error("boom");
};
const failNonErrorRole: Role<TestMeta> = async () => {
throw "string error";
};
describe("withDryRun", () => {
test("short-circuits on dry-run", async () => {
const dec = withDryRun<TestMeta>({ label: "test", meta: { ok: true }, dryRun: true });
const role = dec(successRole);
const result = await role(fakeCtx());
expect(result.content).toBe("[dry-run] test skipped");
expect(result.meta).toEqual({ ok: true });
});
test("delegates when not dry-run", async () => {
const innerDec = withDryRun<TestMeta>({ label: "test", meta: { ok: true }, dryRun: false });
const role = innerDec(successRole);
const result = await role(fakeCtx());
expect(result.content).toBe("done");
expect(result.meta).toEqual({ ok: true });
});
});
describe("onFail", () => {
test("passes through on success", async () => {
const dec = onFail<TestMeta>({ label: "test", meta: { ok: false } });
const role = dec(successRole);
const result = await role(fakeCtx());
expect(result.content).toBe("done");
expect(result.meta).toEqual({ ok: true });
});
test("catches Error and returns structured failure", async () => {
const dec = onFail<TestMeta>({ label: "test", meta: { ok: false } });
const role = dec(failRole);
const result = await role(fakeCtx());
expect(result.content).toBe("test failed: boom");
expect(result.meta).toEqual({ ok: false });
});
test("catches non-Error throws", async () => {
const dec = onFail<TestMeta>({ label: "test", meta: { ok: false } });
const role = dec(failNonErrorRole);
const result = await role(fakeCtx());
expect(result.content).toBe("test failed: string error");
expect(result.meta).toEqual({ ok: false });
});
});
describe("decorateRole", () => {
test("applies decorators left-to-right", async () => {
const role = decorateRole(failRole, [
withDryRun<TestMeta>({ label: "x", meta: { ok: true }, dryRun: false }),
onFail<TestMeta>({ label: "x", meta: { ok: false } }),
]);
const result = await role(fakeCtx());
expect(result.content).toBe("x failed: boom");
expect(result.meta).toEqual({ ok: false });
});
test("dry-run short-circuits before onFail", async () => {
const role = decorateRole(failRole, [
withDryRun<TestMeta>({ label: "x", meta: { ok: true }, dryRun: true }),
onFail<TestMeta>({ label: "x", meta: { ok: false } }),
]);
const result = await role(fakeCtx());
expect(result.content).toBe("[dry-run] x skipped");
expect(result.meta).toEqual({ ok: true });
});
});
@@ -1,102 +0,0 @@
import { describe, expect, test } from "bun:test";
import * as z from "zod/v4";
import { extractMetaOrThrow } from "../src/extract-meta.js";
const provider = {
baseUrl: "https://example.com/v1",
apiKey: "k",
model: "m",
};
describe("extractMetaOrThrow", () => {
const originalFetch = globalThis.fetch;
test("dryRun returns dryRunMeta without calling fetch", async () => {
let calls = 0;
globalThis.fetch = (() => {
calls += 1;
return Promise.resolve(new Response("{}", { status: 200 }));
}) as unknown as typeof fetch;
const schema = z.object({ n: z.number() });
const out = await extractMetaOrThrow("r", "raw", schema, {
provider,
dryRun: true,
dryRunMeta: { n: 7 },
});
globalThis.fetch = originalFetch;
expect(calls).toBe(0);
expect(out).toEqual({ n: 7 });
});
test("throws when extraction fails after retry", async () => {
globalThis.fetch = (() =>
Promise.resolve(
new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{ function: { name: "extract", arguments: JSON.stringify({ n: "bad" }) } },
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
),
)) as unknown as typeof fetch;
const schema = z.object({ n: z.number() });
await expect(
extractMetaOrThrow("plan", "text", schema, { provider, dryRun: false, dryRunMeta: { n: 0 } }),
).rejects.toThrow(/structured extraction failed after retry/);
globalThis.fetch = originalFetch;
});
test("returns validated meta on successful tool call", async () => {
globalThis.fetch = (() =>
Promise.resolve(
new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
function: {
name: "extract",
arguments: JSON.stringify({ branch: "feat/x", message: "feat: y" }),
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
),
)) as unknown as typeof fetch;
const schema = z.object({
branch: z.string(),
message: z.string(),
});
const out = await extractMetaOrThrow("committer-plan", "plan text", schema, {
provider,
dryRun: false,
dryRunMeta: { branch: "", message: "" },
});
globalThis.fetch = originalFetch;
expect(out).toEqual({ branch: "feat/x", message: "feat: y" });
});
});
@@ -1,146 +0,0 @@
import { describe, expect, test } from "bun:test";
import * as z from "zod/v4";
import { llmExtract } from "../src/llm-extract.js";
describe("llmExtract", () => {
const originalFetch = globalThis.fetch;
test("parses tool call arguments and validates with the zod schema", async () => {
const schema = z
.object({
name: z.string(),
description: z.string(),
})
.describe("Extract sense metadata from plan");
let capturedUrl = "";
let capturedInit: RequestInit = {};
globalThis.fetch = ((input: Request | string | URL, init?: RequestInit) => {
capturedUrl = typeof input === "string" ? input : input.toString();
capturedInit = init ?? {};
return Promise.resolve(
new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
function: {
name: "extract",
arguments: JSON.stringify({
name: "cpu-usage",
description: "CPU load",
}),
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
),
);
}) as unknown as typeof fetch;
const result = await llmExtract({
text: "some plan",
schema,
provider: {
baseUrl: "https://example.com/v1",
apiKey: "k",
model: "m",
},
dryRun: false,
dryRunMeta: { name: "", description: "" },
});
globalThis.fetch = originalFetch;
expect(result.ok).toBe(true);
if (!result.ok) {
return;
}
expect(result.value).toEqual({ name: "cpu-usage", description: "CPU load" });
expect(capturedUrl).toBe("https://example.com/v1/chat/completions");
expect(capturedInit.method).toBe("POST");
expect(capturedInit.headers).toMatchObject({
Authorization: "Bearer k",
"Content-Type": "application/json",
});
const body = JSON.parse(capturedInit.body as string) as {
model: string;
tool_choice: { function: { name: string } };
};
expect(body.model).toBe("m");
expect(body.tool_choice.function.name).toBeDefined();
});
test("returns schema_validation_failed when arguments do not match the schema", async () => {
const schema = z.object({ n: z.number() });
globalThis.fetch = (() =>
Promise.resolve(
new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{ function: { name: "extract", arguments: JSON.stringify({ n: "oops" }) } },
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
),
)) as unknown as typeof fetch;
const result = await llmExtract({
text: "x",
schema,
provider: { baseUrl: "https://example.com", apiKey: "k", model: "m" },
dryRun: false,
dryRunMeta: { n: 0 },
});
globalThis.fetch = originalFetch;
expect(result.ok).toBe(false);
if (result.ok) {
return;
}
expect(result.error.kind).toBe("schema_validation_failed");
});
test("dryRun skips fetch and returns dryRunMeta", async () => {
let calls = 0;
globalThis.fetch = (() => {
calls += 1;
return Promise.resolve(new Response("{}", { status: 200 }));
}) as unknown as typeof fetch;
const schema = z.object({ n: z.number() });
const result = await llmExtract({
text: "ignored",
schema,
provider: { baseUrl: "https://example.com", apiKey: "k", model: "m" },
dryRun: true,
dryRunMeta: { n: 42 },
});
globalThis.fetch = originalFetch;
expect(calls).toBe(0);
expect(result.ok).toBe(true);
if (!result.ok) {
return;
}
expect(result.value).toEqual({ n: 42 });
});
});
@@ -1,38 +0,0 @@
import type { WorkflowDescriptor, WorkflowRoleSchema } from "@uncaged/workflow";
import * as z from "zod/v4";
export type RoleDescriptorInput<M extends Record<string, unknown> = Record<string, unknown>> = {
name: string;
schema: z.ZodType<M>;
/** Human-readable role description; use empty string when unknown. */
description: string | null;
};
function stripJsonSchemaMeta(json: Record<string, unknown>): WorkflowRoleSchema {
const { $schema: _drop, ...rest } = json;
return rest as WorkflowRoleSchema;
}
/**
* Builds a {@link WorkflowDescriptor} from role specs, emitting JSON Schema per role via
* `z.toJSONSchema`.
*/
export function buildDescriptorFromRoles(args: {
description: string;
roles: Record<string, RoleDescriptorInput>;
}): WorkflowDescriptor {
const roles: WorkflowDescriptor["roles"] = {};
for (const [key, spec] of Object.entries(args.roles)) {
if (spec.name !== key) {
throw new Error(
`buildDescriptorFromRoles: role key "${key}" must match spec.name "${spec.name}"`,
);
}
const rawJsonSchema = z.toJSONSchema(spec.schema) as Record<string, unknown>;
roles[key] = {
description: spec.description === null ? "" : spec.description,
schema: stripJsonSchemaMeta(rawJsonSchema),
};
}
return { description: args.description, roles };
}
@@ -1,63 +0,0 @@
import type { Role, ThreadContext } from "@uncaged/workflow";
/** A role decorator: takes a role, returns an enhanced role. */
export type RoleDecorator<M extends Record<string, unknown>> = (role: Role<M>) => Role<M>;
/**
* Apply an ordered list of decorators to a role.
* Decorators are applied left-to-right (first in list wraps innermost).
*/
export function decorateRole<M extends Record<string, unknown>>(
role: Role<M>,
decorators: RoleDecorator<M>[],
): Role<M> {
return decorators.reduce((r, dec) => dec(r), role);
}
export type WithDryRunOptions<M extends Record<string, unknown>> = {
/** Used in skip message (e.g. "committer", "publish"). */
label: string;
/** Meta returned when dry-run skips execution. */
meta: M;
/** Adapter-level dry-run flag (e.g. from extract / wiring config). */
dryRun: boolean;
};
/** Short-circuits with a stable result when `dryRun` is true. */
export function withDryRun<M extends Record<string, unknown>>(
opts: WithDryRunOptions<M>,
): RoleDecorator<M> {
return (role) => async (ctx: ThreadContext) => {
if (opts.dryRun) {
return {
content: `[dry-run] ${opts.label} skipped`,
meta: opts.meta,
};
}
return role(ctx);
};
}
export type OnFailOptions<M extends Record<string, unknown>> = {
/** Used in failure message (e.g. "committer", "publish"). */
label: string;
/** Meta returned when the inner role throws. */
meta: M;
};
/** Catches thrown errors and converts them into a structured {@link Role} result instead of propagating. */
export function onFail<M extends Record<string, unknown>>(
opts: OnFailOptions<M>,
): RoleDecorator<M> {
return (role) => async (ctx: ThreadContext) => {
try {
return await role(ctx);
} catch (e) {
const msg = e instanceof Error ? e.message : String(e);
return {
content: `${opts.label} failed: ${msg}`,
meta: opts.meta,
};
}
};
}
@@ -1,24 +0,0 @@
import type * as z from "zod/v4";
import { llmExtractWithRetry } from "./llm-extract.js";
import type { LlmProvider } from "./types.js";
export async function extractMetaOrThrow<T extends Record<string, unknown>>(
roleName: string,
raw: string,
schema: z.ZodType<T>,
options: { provider: LlmProvider; dryRun: boolean; dryRunMeta: T },
): Promise<T> {
const result = await llmExtractWithRetry({
text: raw,
schema,
provider: options.provider,
dryRun: options.dryRun,
dryRunMeta: options.dryRunMeta,
});
if (!result.ok) {
throw new Error(
`Role "${roleName}": structured extraction failed after retry: ${JSON.stringify(result.error)}`,
);
}
return result.value;
}
-18
View File
@@ -1,18 +0,0 @@
export { buildDescriptorFromRoles, type RoleDescriptorInput } from "./build-descriptor.js";
export {
decorateRole,
type OnFailOptions,
onFail,
type RoleDecorator,
type WithDryRunOptions,
withDryRun,
} from "./decorators.js";
export { extractMetaOrThrow } from "./extract-meta.js";
export {
type LlmError,
type LlmExtractArgs,
llmErrorToCause,
llmExtract,
llmExtractWithRetry,
} from "./llm-extract.js";
export type { LlmMessage, LlmProvider, MetaExtractConfig } from "./types.js";
-15
View File
@@ -1,15 +0,0 @@
import type * as z from "zod/v4";
export type LlmProvider = {
baseUrl: string;
apiKey: string;
model: string;
};
export type LlmMessage = { role: "system" | "user" | "assistant"; content: string };
/** Pairs an OpenAI-compatible provider with the Zod meta schema used for structured extraction. */
export type MetaExtractConfig<T> = {
provider: LlmProvider;
schema: z.ZodType<T>;
};
@@ -0,0 +1,45 @@
import { describe, expect, test } from "bun:test";
import * as z from "zod/v4";
import { buildDescriptor } from "../src/build-descriptor.js";
import { END } from "../src/types.js";
import { validateWorkflowDescriptor } from "../src/workflow-descriptor.js";
describe("buildDescriptor", () => {
test("produces a descriptor that validates and includes JSON schemas per role", () => {
const schema = z.object({
title: z.string(),
count: z.number(),
});
type M = { analyst: z.infer<typeof schema> };
const descriptor = buildDescriptor<M>({
description: "Demo workflow",
roles: {
analyst: {
description: "Analyzes input",
systemPrompt: "You are an analyst.",
extractPrompt: "Extract title and count from the analysis.",
schema,
extractRefs: null,
},
},
moderator: () => END,
});
const validated = validateWorkflowDescriptor(descriptor);
expect(validated.ok).toBe(true);
if (!validated.ok) {
return;
}
expect(validated.value.description).toBe("Demo workflow");
const analyst = validated.value.roles.analyst;
expect(analyst.description).toBe("Analyzes input");
expect(analyst.schema.type).toBe("object");
const props = analyst.schema.properties as Record<string, unknown>;
expect(props.title).toMatchObject({ type: "string" });
expect(props.count).toMatchObject({ type: "number" });
});
});
+98
View File
@@ -0,0 +1,98 @@
import { afterEach, beforeEach, describe, expect, test } from "bun:test";
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { createCasStore, createThreadCas } from "../src/cas.js";
import { hashString } from "../src/hash.js";
describe("cas module exports", () => {
test("createThreadCas is a deprecated alias of createCasStore", () => {
expect(createThreadCas).toBe(createCasStore);
});
});
describe("createCasStore", () => {
let casDir: string;
beforeEach(async () => {
casDir = await mkdtemp(join(tmpdir(), "cas-test-"));
});
afterEach(async () => {
await rm(casDir, { recursive: true, force: true });
});
test("put returns consistent hash for same content", async () => {
const cas = createCasStore(casDir);
const h1 = await cas.put("hello world");
const h2 = await cas.put("hello world");
expect(h1).toBe(h2);
expect(h1).toHaveLength(13);
});
test("put returns hash matching hashString", async () => {
const cas = createCasStore(casDir);
const content = "some content to store";
const h = await cas.put(content);
expect(h).toBe(hashString(content));
});
test("get returns stored content", async () => {
const cas = createCasStore(casDir);
const content = "line1\nline2\nline3";
const h = await cas.put(content);
const retrieved = await cas.get(h);
expect(retrieved).toBe(content);
});
test("get returns null for missing hash", async () => {
const cas = createCasStore(casDir);
const result = await cas.get("0000000000000");
expect(result).toBeNull();
});
test("delete removes entry", async () => {
const cas = createCasStore(casDir);
const h = await cas.put("to be deleted");
await cas.delete(h);
const result = await cas.get(h);
expect(result).toBeNull();
});
test("delete on missing hash does not throw", async () => {
const cas = createCasStore(casDir);
await cas.delete("0000000000000");
});
test("list returns all stored hashes", async () => {
const cas = createCasStore(casDir);
const h1 = await cas.put("aaa");
const h2 = await cas.put("bbb");
const h3 = await cas.put("ccc");
const hashes = await cas.list();
expect(hashes.sort()).toEqual([h1, h2, h3].sort());
});
test("list returns empty array when cas dir does not exist", async () => {
const cas = createCasStore(join(casDir, "nonexistent"));
const hashes = await cas.list();
expect(hashes).toEqual([]);
});
test("put is idempotent — same content written twice causes no error", async () => {
const cas = createCasStore(casDir);
const h1 = await cas.put("idempotent");
const h2 = await cas.put("idempotent");
expect(h1).toBe(h2);
const content = await cas.get(h1);
expect(content).toBe("idempotent");
});
test("different content produces different hashes", async () => {
const cas = createCasStore(casDir);
const h1 = await cas.put("alpha");
const h2 = await cas.put("beta");
expect(h1).not.toBe(h2);
});
});
+131 -31
View File
@@ -1,42 +1,138 @@
import { describe, expect, test } from "bun:test";
import { afterEach, describe, expect, test } from "bun:test";
import { mkdir, mkdtemp, readFile, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import * as z from "zod/v4";
import { createRoleModerator } from "../src/create-role-moderator.js";
import { createWorkflow } from "../src/create-workflow.js";
import { executeThread } from "../src/engine.js";
import { createExtract } from "../src/extract-fn.js";
import { createLogger } from "../src/logger.js";
import { END } from "../src/types.js";
type DemoMeta = {
planner: Record<string, unknown>;
coder: Record<string, unknown>;
};
const demoWorkflow = createRoleModerator<DemoMeta>({
roles: {
planner: async () => ({
content: "plan-body",
meta: { plan: "do-it", files: ["a.ts"] },
}),
coder: async () => ({
content: "code-body",
meta: { diff: "+ok" },
}),
},
moderator: (ctx) => {
if (ctx.steps.length === 0) {
return "planner";
}
if (ctx.steps.length === 1) {
return "coder";
}
return END;
},
const plannerMetaSchema = z.object({
plan: z.string(),
files: z.array(z.string()),
});
const coderMetaSchema = z.object({
diff: z.string(),
});
type DemoMeta = {
planner: z.infer<typeof plannerMetaSchema>;
coder: z.infer<typeof coderMetaSchema>;
};
function installMockChatCompletions(sequence: ReadonlyArray<Record<string, unknown>>): () => void {
const origFetch = globalThis.fetch;
let i = 0;
const mockFetch = async (
input: Parameters<typeof fetch>[0],
init?: RequestInit,
): Promise<Response> => {
const args = sequence[i] ?? sequence[sequence.length - 1];
if (args === undefined) {
throw new Error("installMockChatCompletions: empty sequence");
}
i += 1;
void input;
const body = init?.body ? (JSON.parse(String(init.body)) as Record<string, unknown>) : {};
const tools = body.tools;
const firstTool =
Array.isArray(tools) && tools.length > 0 && tools[0] !== null && typeof tools[0] === "object"
? (tools[0] as Record<string, unknown>)
: null;
const fn =
firstTool !== null ? (firstTool.function as Record<string, unknown> | undefined) : undefined;
const toolName = typeof fn?.name === "string" ? fn.name : "extract";
return new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
type: "function",
function: {
name: toolName,
arguments: JSON.stringify(args),
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
};
globalThis.fetch = Object.assign(mockFetch, {
preconnect: origFetch.preconnect.bind(origFetch),
}) as typeof fetch;
return () => {
globalThis.fetch = origFetch;
};
}
const demoExtract = createExtract({
baseUrl: "http://127.0.0.1:9",
apiKey: "test",
model: "test",
});
const demoWorkflow = createWorkflow<DemoMeta>(
{
roles: {
planner: {
description: "Demo planner",
systemPrompt: "You are a planner.",
extractPrompt: "Extract plan text and affected files list.",
schema: plannerMetaSchema,
extractRefs: null,
},
coder: {
description: "Demo coder",
systemPrompt: "You are a coder.",
extractPrompt: "Extract the code diff summary.",
schema: coderMetaSchema,
extractRefs: null,
},
},
moderator: (ctx) => {
if (ctx.steps.length === 0) {
return "planner";
}
if (ctx.steps.length === 1) {
return "coder";
}
return END;
},
},
{
agent: async () => "unused",
overrides: {
planner: async () => "plan-body",
coder: async () => "code-body",
},
},
demoExtract,
);
describe("executeThread", () => {
let restoreFetch: (() => void) | null = null;
afterEach(() => {
restoreFetch?.();
restoreFetch = null;
});
test("writes RFC-001 `.data.jsonl` start + role records and `.info.jsonl` logs", async () => {
restoreFetch = installMockChatCompletions([
{ plan: "do-it", files: ["a.ts"] },
{ diff: "+ok" },
]);
const root = await mkdtemp(join(tmpdir(), "wf-engine-"));
try {
const threadId = "01KQXKW18CT8G75T53R8F4G7YG";
@@ -53,7 +149,6 @@ describe("executeThread", () => {
"demo-flow",
{ prompt: "Fix the login redirect bug in #3", steps: [] },
{
isDryRun: false,
maxRounds: 5,
signal: ac.signal,
awaitAfterEachYield: async () => {},
@@ -82,17 +177,19 @@ describe("executeThread", () => {
const params = start.parameters as Record<string, unknown>;
expect(params.prompt).toBe("Fix the login redirect bug in #3");
const opts = params.options as Record<string, unknown>;
expect(opts.isDryRun).toBe(false);
expect(opts.maxRounds).toBe(5);
expect(Object.keys(opts).sort()).toEqual(["maxRounds"]);
const role1 = JSON.parse(lines[1] ?? "{}") as Record<string, unknown>;
expect(role1.role).toBe("planner");
expect(role1.content).toBe("plan-body");
expect(role1.meta).toEqual({ plan: "do-it", files: ["a.ts"] });
expect(role1.refs).toEqual([]);
expect(typeof role1.timestamp).toBe("number");
const role2 = JSON.parse(lines[2] ?? "{}") as Record<string, unknown>;
expect(role2.role).toBe("coder");
expect(role2.refs).toEqual([]);
const infoText = await readFile(infoPath, "utf8");
const infoLines = infoText
@@ -111,6 +208,8 @@ describe("executeThread", () => {
});
test("pre-filled ThreadInput.steps skips roles already present", async () => {
restoreFetch = installMockChatCompletions([{ diff: "+ok" }]);
const root = await mkdtemp(join(tmpdir(), "wf-engine-fork-"));
try {
const threadId = "01KQXKW18CT8G75T53R8F4G7YG";
@@ -133,11 +232,11 @@ describe("executeThread", () => {
role: "planner",
content: "plan-body",
meta: { plan: "do-it", files: ["a.ts"] },
refs: ["CAS111AAAAAAA"],
},
],
},
{
isDryRun: false,
maxRounds: 5,
signal: ac.signal,
awaitAfterEachYield: async () => {},
@@ -147,6 +246,7 @@ describe("executeThread", () => {
role: "planner",
content: "plan-body",
meta: { plan: "do-it", files: ["a.ts"] },
refs: ["CAS111AAAAAAA"],
timestamp: histTs,
},
],
@@ -170,6 +270,7 @@ describe("executeThread", () => {
const role0 = JSON.parse(lines[1] ?? "{}") as Record<string, unknown>;
expect(role0.role).toBe("planner");
expect(role0.timestamp).toBe(histTs);
expect(role0.refs).toEqual(["CAS111AAAAAAA"]);
const role1 = JSON.parse(lines[2] ?? "{}") as Record<string, unknown>;
expect(role1.role).toBe("coder");
@@ -196,7 +297,6 @@ describe("executeThread", () => {
"demo-flow",
{ prompt: "hello", steps: [] },
{
isDryRun: false,
maxRounds: 0,
signal: ac.signal,
awaitAfterEachYield: async () => {},
@@ -6,7 +6,7 @@ import {
selectForkHistoricalSteps,
} from "../src/fork-thread.js";
const sampleDataJsonl = `{"name":"demo","hash":"C9NMV6V2TQT81","threadId":"01AAA1111111111111111111","parameters":{"prompt":"hi","options":{"isDryRun":false,"maxRounds":5}},"timestamp":100}
const sampleDataJsonl = `{"name":"demo","hash":"C9NMV6V2TQT81","threadId":"01AAA1111111111111111111","parameters":{"prompt":"hi","options":{"maxRounds":5}},"timestamp":100}
{"role":"planner","content":"p","meta":{},"timestamp":101}
{"role":"coder","content":"c","meta":{},"timestamp":102}
{"role":"reviewer","content":"r","meta":{},"timestamp":103}
@@ -23,6 +23,7 @@ describe("fork-thread", () => {
expect(r.value.start.hash).toBe("C9NMV6V2TQT81");
expect(r.value.start.threadId).toBe("01AAA1111111111111111111");
expect(r.value.start.prompt).toBe("hi");
expect(r.value.start.maxRounds).toBe(5);
expect(r.value.roleSteps.length).toBe(3);
expect(r.value.roleSteps[0]?.role).toBe("planner");
});
@@ -82,5 +83,6 @@ describe("fork-thread", () => {
expect(r.value.workflowName).toBe("demo");
expect(r.value.historicalSteps.length).toBe(1);
expect(r.value.historicalSteps[0]?.timestamp).toBe(101);
expect(r.value.runOptions).toEqual({ maxRounds: 5 });
});
});
@@ -0,0 +1,191 @@
import { afterEach, describe, expect, test } from "bun:test";
import { mkdir, mkdtemp, readFile, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import * as z from "zod/v4";
import { createWorkflow } from "../src/create-workflow.js";
import { executeThread } from "../src/engine.js";
import { createExtract } from "../src/extract-fn.js";
import { buildForkPlan, parseThreadDataJsonl } from "../src/fork-thread.js";
import { createLogger } from "../src/logger.js";
import { END } from "../src/types.js";
const phaseSchema = z.object({
hash: z.string(),
title: z.string(),
});
const plannerMetaSchema = z.object({
phases: z.array(phaseSchema),
});
type RefsDemoMeta = {
planner: z.infer<typeof plannerMetaSchema>;
};
function installMockChatCompletions(sequence: ReadonlyArray<Record<string, unknown>>): () => void {
const origFetch = globalThis.fetch;
let i = 0;
const mockFetch = async (
input: Parameters<typeof fetch>[0],
init?: RequestInit,
): Promise<Response> => {
const args = sequence[i] ?? sequence[sequence.length - 1];
if (args === undefined) {
throw new Error("installMockChatCompletions: empty sequence");
}
i += 1;
void input;
const body = init?.body ? (JSON.parse(String(init.body)) as Record<string, unknown>) : {};
const tools = body.tools;
const firstTool =
Array.isArray(tools) && tools.length > 0 && tools[0] !== null && typeof tools[0] === "object"
? (tools[0] as Record<string, unknown>)
: null;
const fn =
firstTool !== null ? (firstTool.function as Record<string, unknown> | undefined) : undefined;
const toolName = typeof fn?.name === "string" ? fn.name : "extract";
return new Response(
JSON.stringify({
choices: [
{
message: {
tool_calls: [
{
type: "function",
function: {
name: toolName,
arguments: JSON.stringify(args),
},
},
],
},
},
],
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
};
globalThis.fetch = Object.assign(mockFetch, {
preconnect: origFetch.preconnect.bind(origFetch),
}) as typeof fetch;
return () => {
globalThis.fetch = origFetch;
};
}
const refsDemoExtract = createExtract({
baseUrl: "http://127.0.0.1:9",
apiKey: "test",
model: "test",
});
const refsDemoWorkflow = createWorkflow<RefsDemoMeta>(
{
roles: {
planner: {
description: "Planner with phase hashes",
systemPrompt: "Plan.",
extractPrompt: "Extract phases with CAS hashes.",
schema: plannerMetaSchema,
extractRefs: (meta) => meta.phases.map((p) => p.hash),
},
},
moderator: (ctx) => (ctx.steps.length === 0 ? "planner" : END),
},
{
agent: async () => "plan-output",
},
refsDemoExtract,
);
describe("RoleStep refs tracking", () => {
let restoreFetch: (() => void) | null = null;
afterEach(() => {
restoreFetch?.();
restoreFetch = null;
});
test("parseThreadDataJsonl reads refs and defaults missing refs to []", () => {
const text = `{"name":"demo","hash":"C9NMV6V2TQT81","threadId":"01AAA1111111111111111111","parameters":{"prompt":"hi","options":{"maxRounds":5}},"timestamp":100}
{"role":"planner","content":"p","meta":{},"refs":["H111AAAAAAAAA","H222AAAAAAAAA"],"timestamp":101}
{"role":"coder","content":"c","meta":{},"timestamp":102}
`;
const r = parseThreadDataJsonl(text);
expect(r.ok).toBe(true);
if (!r.ok) {
return;
}
expect(r.value.roleSteps[0]?.refs).toEqual(["H111AAAAAAAAA", "H222AAAAAAAAA"]);
expect(r.value.roleSteps[1]?.refs).toEqual([]);
});
test("executeThread persists refs from extractRefs on role yields", async () => {
restoreFetch = installMockChatCompletions([
{
phases: [
{ hash: "C9NMV6V2TQT81", title: "phase-a" },
{ hash: "C9NMV6V2TQT82", title: "phase-b" },
],
},
]);
const root = await mkdtemp(join(tmpdir(), "wf-refs-"));
try {
const threadId = "01KQXKW18CT8G75T53R8F4G7YG";
const hash = "C9NMV6V2TQT81";
const dataPath = join(root, "logs", hash, `${threadId}.data.jsonl`);
const infoPath = join(root, "logs", hash, `${threadId}.info.jsonl`);
await mkdir(join(root, "logs", hash), { recursive: true });
const logger = createLogger({ sink: { kind: "file", path: infoPath } });
const ac = new AbortController();
const result = await executeThread(
refsDemoWorkflow,
"refs-demo",
{ prompt: "task", steps: [] },
{
maxRounds: 5,
signal: ac.signal,
awaitAfterEachYield: async () => {},
forkSourceThreadId: null,
prefilledDiskSteps: null,
},
{ threadId, hash, dataJsonlPath: dataPath, infoJsonlPath: infoPath },
logger,
);
expect(result.returnCode).toBe(0);
const dataText = await readFile(dataPath, "utf8");
const lines = dataText
.trim()
.split("\n")
.filter((l) => l !== "");
expect(lines.length).toBe(2);
const role1 = JSON.parse(lines[1] ?? "{}") as Record<string, unknown>;
expect(role1.role).toBe("planner");
expect(role1.refs).toEqual(["C9NMV6V2TQT81", "C9NMV6V2TQT82"]);
} finally {
await rm(root, { recursive: true, force: true });
}
});
test("buildForkPlan carries refs on historical steps", () => {
const text = `{"name":"demo","hash":"C9NMV6V2TQT81","threadId":"01AAA1111111111111111111","parameters":{"prompt":"hi","options":{"maxRounds":5}},"timestamp":100}
{"role":"planner","content":"p","meta":{},"refs":["KEEPREFAAAAAA"],"timestamp":101}
{"role":"coder","content":"c","meta":{},"refs":["CODERHASHAAAA"],"timestamp":102}
`;
const plan = buildForkPlan(text, null);
expect(plan.ok).toBe(true);
if (!plan.ok) {
return;
}
expect(plan.value.historicalSteps.length).toBe(1);
expect(plan.value.historicalSteps[0]?.refs).toEqual(["KEEPREFAAAAAA"]);
});
});
@@ -0,0 +1,14 @@
import { describe, expect, test } from "bun:test";
import { join } from "node:path";
import { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "../src/storage-root.js";
describe("getGlobalCasDir", () => {
test("joins cas segment under explicit storage root", () => {
expect(getGlobalCasDir("/tmp/wf-root")).toBe(join("/tmp/wf-root", "cas"));
});
test("defaults to default workflow root when storage root is undefined", () => {
expect(getGlobalCasDir(undefined)).toBe(join(getDefaultWorkflowStorageRoot(), "cas"));
});
});
@@ -9,7 +9,6 @@ describe("RFC-001 thread JSONL shapes", () => {
parameters: {
prompt: "Fix the login redirect bug in #3",
options: {
isDryRun: false,
maxRounds: 5,
},
},
@@ -20,13 +19,16 @@ describe("RFC-001 thread JSONL shapes", () => {
role: "planner",
content: "Plan: modify auth middleware...",
meta: { plan: "...", files: ["src/auth.ts"] },
refs: [] as string[],
timestamp: 1714963201000,
};
expect(Object.keys(startRecord).sort()).toEqual(
["hash", "name", "parameters", "threadId", "timestamp"].sort(),
);
expect(Object.keys(roleRecord).sort()).toEqual(["content", "meta", "role", "timestamp"].sort());
expect(Object.keys(roleRecord).sort()).toEqual(
["content", "meta", "refs", "role", "timestamp"].sort(),
);
});
test("documents the `.info.jsonl` debug record keys", () => {
+2 -2
View File
@@ -102,7 +102,7 @@ describe("worker process", () => {
threadId,
workflowName: "demo-flow",
prompt: "hello",
options: { isDryRun: false, maxRounds: 5 },
options: { maxRounds: 5 },
});
const exitCode: number = await new Promise((resolve) => {
@@ -150,7 +150,7 @@ describe("worker process", () => {
threadId,
workflowName: "demo-flow",
prompt: "hello",
options: { isDryRun: false, maxRounds: 5 },
options: { maxRounds: 5 },
steps: [
{
role: "planner",
@@ -0,0 +1,196 @@
import { describe, expect, test } from "bun:test";
import { validateWorkflowDescriptor } from "../src/workflow-descriptor.js";
describe("validateWorkflowDescriptor", () => {
// 1. Valid minimal descriptor
test("accepts a minimal descriptor with empty roles", () => {
const result = validateWorkflowDescriptor({ description: "x", roles: {} });
expect(result.ok).toBe(true);
if (result.ok) {
expect(result.value.description).toBe("x");
expect(result.value.roles).toEqual({});
}
});
// 2. Valid descriptor with one role
test("accepts a descriptor with one role", () => {
const result = validateWorkflowDescriptor({
description: "workflow",
roles: {
solver: { description: "solves things", schema: { type: "object" } },
},
});
expect(result.ok).toBe(true);
if (result.ok) {
expect(result.value.description).toBe("workflow");
expect(result.value.roles.solver.description).toBe("solves things");
expect(result.value.roles.solver.schema).toEqual({ type: "object" });
}
});
// 3. Valid descriptor with multiple roles
test("accepts a descriptor with multiple roles", () => {
const result = validateWorkflowDescriptor({
description: "multi",
roles: {
a: { description: "role a", schema: {} },
b: { description: "role b", schema: { type: "string" } },
},
});
expect(result.ok).toBe(true);
if (result.ok) {
expect(Object.keys(result.value.roles)).toEqual(["a", "b"]);
}
});
// 4-6. Root is null / array / string / number / undefined
test("rejects null", () => {
const result = validateWorkflowDescriptor(null);
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor must be a non-array object");
});
test("rejects an array", () => {
const result = validateWorkflowDescriptor([]);
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor must be a non-array object");
});
test("rejects a string", () => {
const result = validateWorkflowDescriptor("hello");
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor must be a non-array object");
});
test("rejects a number", () => {
const result = validateWorkflowDescriptor(42);
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor must be a non-array object");
});
test("rejects undefined", () => {
const result = validateWorkflowDescriptor(undefined);
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor must be a non-array object");
});
// 7-8. Missing or non-string description
test("rejects missing description", () => {
const result = validateWorkflowDescriptor({ roles: {} });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.description must be a string");
});
test("rejects numeric description", () => {
const result = validateWorkflowDescriptor({ description: 123, roles: {} });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.description must be a string");
});
test("rejects null description", () => {
const result = validateWorkflowDescriptor({ description: null, roles: {} });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.description must be a string");
});
test("rejects boolean description", () => {
const result = validateWorkflowDescriptor({ description: true, roles: {} });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.description must be a string");
});
// 9-11. Missing / null / array roles
test("rejects missing roles", () => {
const result = validateWorkflowDescriptor({ description: "x" });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles must be a non-array object");
});
test("rejects null roles", () => {
const result = validateWorkflowDescriptor({ description: "x", roles: null });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles must be a non-array object");
});
test("rejects array roles", () => {
const result = validateWorkflowDescriptor({ description: "x", roles: [] });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles must be a non-array object");
});
// 12-13. Role entry is null / array
test("rejects null role entry", () => {
const result = validateWorkflowDescriptor({ description: "x", roles: { bad: null } });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles.bad must be a non-array object");
});
test("rejects array role entry", () => {
const result = validateWorkflowDescriptor({ description: "x", roles: { bad: [] } });
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles.bad must be a non-array object");
});
// 14-15. Role missing description / non-string description
test("rejects role with missing description", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: { r: { schema: {} } },
});
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles.r.description must be a string");
});
test("rejects role with non-string description", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: { r: { description: 99, schema: {} } },
});
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles.r.description must be a string");
});
// 16-18. Role schema null / array / missing
test("rejects role with null schema", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: { r: { description: "d", schema: null } },
});
expect(result.ok).toBe(false);
if (!result.ok)
expect(result.error).toBe("descriptor.roles.r.schema must be a non-array object");
});
test("rejects role with array schema", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: { r: { description: "d", schema: [] } },
});
expect(result.ok).toBe(false);
if (!result.ok)
expect(result.error).toBe("descriptor.roles.r.schema must be a non-array object");
});
test("rejects role with missing schema", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: { r: { description: "d" } },
});
expect(result.ok).toBe(false);
if (!result.ok)
expect(result.error).toBe("descriptor.roles.r.schema must be a non-array object");
});
// 19. First role valid, second role invalid
test("rejects at first invalid role when earlier roles are valid", () => {
const result = validateWorkflowDescriptor({
description: "x",
roles: {
good: { description: "ok", schema: {} },
bad: { description: 123, schema: {} },
},
});
expect(result.ok).toBe(false);
if (!result.ok) expect(result.error).toBe("descriptor.roles.bad.description must be a string");
});
});
+5 -1
View File
@@ -13,7 +13,11 @@
"xxhashjs": "^0.2.2",
"yaml": "^2.8.4"
},
"peerDependencies": {
"zod": "^4.0.0"
},
"devDependencies": {
"@types/acorn": "^6.0.4"
"@types/acorn": "^6.0.4",
"zod": "^4.0.0"
}
}
+25
View File
@@ -0,0 +1,25 @@
import * as z from "zod/v4";
import type { RoleMeta, WorkflowDefinition } from "./types.js";
import type { WorkflowDescriptor, WorkflowRoleSchema } from "./workflow-descriptor.js";
function stripJsonSchemaMeta(json: Record<string, unknown>): WorkflowRoleSchema {
const { $schema: _drop, ...rest } = json;
return rest as WorkflowRoleSchema;
}
export function buildDescriptor<M extends RoleMeta>(
def: WorkflowDefinition<M>,
): WorkflowDescriptor {
const roles: WorkflowDescriptor["roles"] = {};
for (const [key, roleDef] of Object.entries(def.roles) as Array<
[string, { description: string; schema: z.ZodType }]
>) {
const rawJsonSchema = z.toJSONSchema(roleDef.schema) as Record<string, unknown>;
roles[key] = {
description: roleDef.description,
schema: stripJsonSchemaMeta(rawJsonSchema),
};
}
return { description: def.description, roles };
}
+73
View File
@@ -0,0 +1,73 @@
import { mkdir, readdir, readFile, rename, unlink, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { hashString } from "./hash.js";
export type CasStore = {
put(content: string): Promise<string>;
get(hash: string): Promise<string | null>;
delete(hash: string): Promise<void>;
list(): Promise<string[]>;
};
export function createCasStore(casDir: string): CasStore {
async function ensureDir(): Promise<void> {
await mkdir(casDir, { recursive: true });
}
function filePath(hash: string): string {
return join(casDir, `${hash}.txt`);
}
return {
async put(content: string): Promise<string> {
const hash = hashString(content);
await ensureDir();
const target = filePath(hash);
const tmp = `${target}.tmp.${Date.now()}`;
await writeFile(tmp, content, "utf8");
await rename(tmp, target);
return hash;
},
async get(hash: string): Promise<string | null> {
try {
return await readFile(filePath(hash), "utf8");
} catch (e) {
const errObj = e as NodeJS.ErrnoException;
if (errObj.code === "ENOENT") {
return null;
}
throw e;
}
},
async delete(hash: string): Promise<void> {
try {
await unlink(filePath(hash));
} catch (e) {
const errObj = e as NodeJS.ErrnoException;
if (errObj.code === "ENOENT") {
return;
}
throw e;
}
},
async list(): Promise<string[]> {
try {
const entries = await readdir(casDir);
return entries.filter((name) => name.endsWith(".txt")).map((name) => name.slice(0, -4));
} catch (e) {
const errObj = e as NodeJS.ErrnoException;
if (errObj.code === "ENOENT") {
return [];
}
throw e;
}
},
};
}
/** @deprecated Use {@link createCasStore} — CAS is global, not per-thread. */
export const createThreadCas = createCasStore;
@@ -1,90 +0,0 @@
import {
END,
type RoleMeta,
type RoleOutput,
type RoleStep,
START,
type ThreadContext,
type ThreadInput,
type WorkflowDefinition,
type WorkflowFn,
type WorkflowFnOptions,
type WorkflowResult,
} from "./types.js";
function isRoleNext<M extends RoleMeta>(
next: (keyof M & string) | typeof END,
): next is keyof M & string {
return next !== END;
}
/**
* Role + Moderator pattern as an optional helper: returns a {@link WorkflowFn} that runs the
* moderator loop and yields each {@link RoleOutput}. Assign with `export const run = createRoleModerator(...)`.
*/
export function createRoleModerator<M extends RoleMeta>(
def: Pick<WorkflowDefinition<M>, "roles" | "moderator">,
): WorkflowFn {
return async function* roleModeratorWorkflow(
input: ThreadInput,
options: WorkflowFnOptions,
): AsyncGenerator<RoleOutput, WorkflowResult> {
const nowMs = Date.now();
const start: ThreadContext<M>["start"] = {
role: START,
content: input.prompt,
meta: { maxRounds: options.maxRounds },
timestamp: nowMs,
};
const baseTs = Date.now();
let steps: RoleStep<M>[] = input.steps.map((out, i) => ({
role: out.role,
content: out.content,
meta: out.meta,
timestamp: baseTs + i,
})) as RoleStep<M>[];
while (true) {
if (steps.length >= options.maxRounds) {
return {
returnCode: 0,
summary: `completed: reached maxRounds (${options.maxRounds})`,
};
}
const ctx: ThreadContext<M> = {
start,
steps,
};
const next = def.moderator(ctx);
if (!isRoleNext(next)) {
return { returnCode: 0, summary: "completed: moderator returned END" };
}
const roleFn = def.roles[next];
if (roleFn === undefined) {
return { returnCode: 1, summary: `unknown role: ${next}` };
}
const result = await roleFn(ctx as unknown as ThreadContext);
const ts = Date.now();
const step = {
role: next,
content: result.content,
meta: result.meta,
timestamp: ts,
} as RoleStep<M>;
yield {
role: step.role,
content: step.content,
meta: step.meta,
};
steps = [...steps, step];
}
};
}
+131
View File
@@ -0,0 +1,131 @@
import type { ExtractFn } from "./extract-fn.js";
import {
type AgentBinding,
type AgentContext,
END,
type ExtractContext,
type ModeratorContext,
type RoleDefinition,
type RoleMeta,
type RoleOutput,
type RoleStep,
START,
type ThreadInput,
type WorkflowDefinition,
type WorkflowFn,
type WorkflowFnOptions,
type WorkflowResult,
} from "./types.js";
function isRoleNext<M extends RoleMeta>(
next: (keyof M & string) | typeof END,
): next is keyof M & string {
return next !== END;
}
function resolveExtractedRefs(
roleDef: RoleDefinition<Record<string, unknown>>,
meta: unknown,
): string[] {
const extractRefsFn = roleDef.extractRefs;
if (extractRefsFn === null || typeof extractRefsFn !== "function") {
return [];
}
return extractRefsFn(meta as Record<string, unknown>);
}
/**
* Binds pure role definitions + moderator to runtime agents and structured extraction.
* Assign with `export const run = createWorkflow(def, binding, extract)`.
*/
export function createWorkflow<M extends RoleMeta>(
def: Pick<WorkflowDefinition<M>, "roles" | "moderator">,
binding: AgentBinding,
extract: ExtractFn,
): WorkflowFn {
return async function* workflowLoop(
input: ThreadInput,
options: WorkflowFnOptions,
): AsyncGenerator<RoleOutput, WorkflowResult> {
const nowMs = Date.now();
const start: ModeratorContext<M>["start"] = {
role: START,
content: input.prompt,
meta: { maxRounds: options.maxRounds },
timestamp: nowMs,
};
const baseTs = Date.now();
let steps: RoleStep<M>[] = input.steps.map((out, i) => ({
role: out.role,
content: out.content,
meta: out.meta,
refs: out.refs,
timestamp: baseTs + i,
})) as RoleStep<M>[];
while (true) {
if (steps.length >= options.maxRounds) {
return {
returnCode: 0,
summary: `completed: reached maxRounds (${options.maxRounds})`,
};
}
const modCtx: ModeratorContext<M> = {
threadId: options.threadId,
start,
steps,
};
const next = def.moderator(modCtx);
if (!isRoleNext(next)) {
return { returnCode: 0, summary: "completed: moderator returned END" };
}
const roleDef = def.roles[next];
if (roleDef === undefined) {
return { returnCode: 1, summary: `unknown role: ${next}` };
}
const agentCtx: AgentContext<M> = {
...modCtx,
currentRole: { name: next, systemPrompt: roleDef.systemPrompt },
};
const agent = binding.overrides?.[next] ?? binding.agent;
const raw = await agent(agentCtx as unknown as AgentContext);
const extractCtx: ExtractContext<M> = {
...agentCtx,
agentContent: raw,
};
const meta = await extract(
roleDef.schema,
roleDef.extractPrompt,
extractCtx as unknown as ExtractContext,
);
const refs = resolveExtractedRefs(
roleDef as unknown as RoleDefinition<Record<string, unknown>>,
meta,
);
const ts = Date.now();
const step = {
role: next,
content: raw,
meta,
refs,
timestamp: ts,
} as RoleStep<M>;
yield { role: step.role, content: step.content, meta: step.meta, refs: step.refs };
steps = [...steps, step];
}
};
}
+5 -3
View File
@@ -2,6 +2,7 @@ import { appendFile, mkdir } from "node:fs/promises";
import { dirname } from "node:path";
import type { LogFn } from "./logger.js";
import { normalizeRefsField } from "./refs-field.js";
import type { ThreadInput, WorkflowFn, WorkflowFnOptions, WorkflowResult } from "./types.js";
export type ExecuteThreadIo = {
@@ -16,11 +17,11 @@ export type PrefilledDiskStep = {
role: string;
content: string;
meta: Record<string, unknown>;
refs: string[];
timestamp: number;
};
export type ExecuteThreadOptions = {
isDryRun: boolean;
maxRounds: number;
signal: AbortSignal;
/** Invoked after each successful yield (and outer-loop checks); used for pause/resume. */
@@ -80,6 +81,7 @@ async function driveWorkflowGenerator(params: {
role: step.role,
content: step.content,
meta: step.meta,
refs: normalizeRefsField(step.refs),
timestamp: ts,
});
@@ -133,7 +135,6 @@ export async function executeThread(
parameters: {
prompt: input.prompt,
options: {
isDryRun: options.isDryRun,
maxRounds: options.maxRounds,
},
},
@@ -153,6 +154,7 @@ export async function executeThread(
role: row.role,
content: row.content,
meta: row.meta,
refs: normalizeRefsField(row.refs),
timestamp: row.timestamp,
});
}
@@ -167,7 +169,7 @@ export async function executeThread(
}
const bundleOptions: WorkflowFnOptions = {
isDryRun: options.isDryRun,
threadId: io.threadId,
maxRounds: options.maxRounds,
};
+51
View File
@@ -0,0 +1,51 @@
import type * as z from "zod/v4";
import { llmExtractWithRetry } from "./llm-extract.js";
import type { ExtractContext, LlmProvider } from "./types.js";
export type ExtractFn = <T extends Record<string, unknown>>(
schema: z.ZodType<T>,
prompt: string,
ctx: ExtractContext,
) => Promise<T>;
/**
* Create an ExtractFn backed by an LLM provider.
* Builds prompt text from {@link ExtractContext} plus `prompt` and calls structured extraction.
*/
export function createExtract(provider: LlmProvider): ExtractFn {
return async <T extends Record<string, unknown>>(
schema: z.ZodType<T>,
prompt: string,
ctx: ExtractContext,
): Promise<T> => {
const lines: string[] = [];
lines.push(`## Role: ${ctx.currentRole.name}`);
lines.push(ctx.currentRole.systemPrompt);
lines.push("");
lines.push("## Task");
lines.push(ctx.start.content);
lines.push("");
if (ctx.steps.length > 0) {
lines.push("## Thread History");
for (const step of ctx.steps) {
lines.push(`### ${step.role}`);
lines.push(step.content);
lines.push(`Meta: ${JSON.stringify(step.meta)}`);
lines.push("");
}
}
lines.push("## Agent Output");
lines.push(ctx.agentContent);
lines.push("");
lines.push("## Extraction Instruction");
lines.push(prompt);
const text = lines.join("\n");
const result = await llmExtractWithRetry({ text, schema, provider });
if (!result.ok) {
throw new Error(`extract failed: ${JSON.stringify(result.error)}`);
}
return result.value;
};
}
+6 -7
View File
@@ -1,3 +1,4 @@
import { normalizeRefsField } from "./refs-field.js";
import { err, ok, type Result } from "./result.js";
import type { RoleOutput } from "./types.js";
@@ -9,7 +10,6 @@ export type ParsedThreadStartRecord = {
hash: string;
threadId: string;
prompt: string;
isDryRun: boolean;
maxRounds: number;
};
@@ -37,6 +37,7 @@ function parseRoleLine(
role,
content,
meta: meta as Record<string, unknown>,
refs: normalizeRefsField(obj.refs),
timestamp,
});
}
@@ -72,10 +73,9 @@ function parseStartRecordLine(firstLine: string): Result<ParsedThreadStartRecord
return err("start record missing parameters.options");
}
const optRec = options as Record<string, unknown>;
const isDryRun = optRec.isDryRun;
const maxRounds = optRec.maxRounds;
if (typeof isDryRun !== "boolean" || typeof maxRounds !== "number") {
return err("start record missing parameters.options.isDryRun or maxRounds");
if (typeof maxRounds !== "number") {
return err("start record missing parameters.options.maxRounds");
}
return ok({
@@ -83,7 +83,6 @@ function parseStartRecordLine(firstLine: string): Result<ParsedThreadStartRecord
hash,
threadId,
prompt,
isDryRun,
maxRounds,
});
}
@@ -197,7 +196,7 @@ export type ForkPlan = {
hash: string;
sourceThreadId: string;
prompt: string;
runOptions: { isDryRun: boolean; maxRounds: number };
runOptions: { maxRounds: number };
historicalSteps: ForkHistoricalStep[];
};
@@ -222,7 +221,7 @@ export function buildForkPlan(
hash: start.hash,
sourceThreadId: start.threadId,
prompt: start.prompt,
runOptions: { isDryRun: start.isDryRun, maxRounds: start.maxRounds },
runOptions: { maxRounds: start.maxRounds },
historicalSteps: selected.value,
});
}
+7
View File
@@ -15,3 +15,10 @@ export function hashWorkflowBundleBytes(data: Uint8Array): string {
const digest = XXH.h64(0).update(buf).digest();
return encodeUint64AsCrockford(digestToUint64(digest));
}
/** XXH64 (seed 0) over a UTF-8 string, encoded as 13-char Crockford Base32. */
export function hashString(content: string): string {
const buf = Buffer.from(content, "utf8");
const digest = XXH.h64(0).update(buf).digest();
return encodeUint64AsCrockford(digestToUint64(digest));
}
+18 -5
View File
@@ -5,8 +5,10 @@ export {
encodeCrockfordBase32Bits,
encodeUint64AsCrockford,
} from "./base32.js";
export { buildDescriptor } from "./build-descriptor.js";
export { validateWorkflowBundle, type WorkflowBundleValidationInput } from "./bundle-validator.js";
export { createRoleModerator } from "./create-role-moderator.js";
export { type CasStore, createCasStore, createThreadCas } from "./cas.js";
export { createWorkflow } from "./create-workflow.js";
export {
type ExecuteThreadIo,
type ExecuteThreadOptions,
@@ -14,6 +16,7 @@ export {
type PrefilledDiskStep,
} from "./engine.js";
export { type ExtractedBundleExports, extractBundleExports } from "./extract-bundle-exports.js";
export { createExtract, type ExtractFn } from "./extract-fn.js";
export {
buildForkPlan,
type ForkHistoricalStep,
@@ -23,7 +26,13 @@ export {
selectForkHistoricalSteps,
} from "./fork-thread.js";
export { stringifyWorkflowDescriptor } from "./generate-descriptor.js";
export { hashWorkflowBundleBytes } from "./hash.js";
export { hashString, hashWorkflowBundleBytes } from "./hash.js";
export {
type LlmError,
llmErrorToCause,
llmExtract,
llmExtractWithRetry,
} from "./llm-extract.js";
export {
type CreateLoggerOptions,
createLogger,
@@ -46,16 +55,20 @@ export {
writeWorkflowRegistry,
} from "./registry.js";
export { err, ok, type Result } from "./result.js";
export { getDefaultWorkflowStorageRoot } from "./storage-root.js";
export { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "./storage-root.js";
export { createThreadPauseGate, type ThreadPauseGate } from "./thread-pause-gate.js";
export {
type AgentBinding,
type AgentContext,
type AgentFn,
END,
type ExtractContext,
type LlmProvider,
type Moderator,
type Role,
type ModeratorContext,
type RoleDefinition,
type RoleMeta,
type RoleOutput,
type RoleResult,
type RoleStep,
START,
type StartStep,
@@ -1,14 +1,12 @@
import { err, ok, type Result } from "@uncaged/workflow";
import * as z from "zod/v4";
import { err, ok, type Result } from "./result.js";
import type { LlmProvider } from "./types.js";
export type LlmExtractArgs<T> = {
text: string;
schema: z.ZodType<T>;
provider: LlmProvider;
dryRun: boolean;
/** Returned when `dryRun` is true (ignored for live extract). */
dryRunMeta: T;
};
export type LlmError =
@@ -126,10 +124,6 @@ export function llmErrorToCause(error: LlmError): Error {
async function performLlmExtract<T>(
options: LlmExtractArgs<T> & { userContent: string },
): Promise<Result<T, LlmError>> {
if (options.dryRun) {
return ok(options.dryRunMeta);
}
const rawJsonSchema = z.toJSONSchema(options.schema) as Record<string, unknown>;
const parameters = stripJsonSchemaMeta(rawJsonSchema);
const toolName = readToolName(parameters);
+13
View File
@@ -0,0 +1,13 @@
/** Normalize `refs` from persisted JSONL or IPC payloads (missing or invalid → []). */
export function normalizeRefsField(value: unknown): string[] {
if (!Array.isArray(value)) {
return [];
}
const out: string[] = [];
for (const x of value) {
if (typeof x === "string") {
out.push(x);
}
}
return out;
}
+6
View File
@@ -5,3 +5,9 @@ import { join } from "node:path";
export function getDefaultWorkflowStorageRoot(): string {
return join(homedir(), ".uncaged", "workflow");
}
/** Global content-addressed store directory under the workflow storage root (`<root>/cas`). */
export function getGlobalCasDir(storageRoot: string | undefined): string {
const root = storageRoot ?? getDefaultWorkflowStorageRoot();
return join(root, "cas");
}
+58 -25
View File
@@ -1,3 +1,5 @@
import type * as z from "zod/v4";
/** Sentinel values for automaton control flow. */
export const START = "__start__" as const;
export const END = "__end__" as const;
@@ -5,11 +7,20 @@ export const END = "__end__" as const;
/** Maps role names → their meta types. Single generic drives all inference. */
export type RoleMeta = Record<string, Record<string, unknown>>;
/** OpenAI-compatible LLM endpoint used for structured meta extraction. */
export type LlmProvider = {
baseUrl: string;
apiKey: string;
model: string;
};
/** What each generator yield produces — one role's output (engine adds `timestamp` when persisting). */
export type RoleOutput = {
role: string;
content: string;
meta: Record<string, unknown>;
/** CAS hashes produced or consumed by this step (for GC traceability). */
refs: string[];
};
/** What the workflow AsyncGenerator returns when done. */
@@ -26,7 +37,7 @@ export type ThreadInput = {
/** Options passed to a workflow bundle's `run` export (engine-provided). */
export type WorkflowFnOptions = {
isDryRun: boolean;
threadId: string;
maxRounds: number;
};
@@ -36,12 +47,6 @@ export type WorkflowFn = (
options: WorkflowFnOptions,
) => AsyncGenerator<RoleOutput, WorkflowResult>;
/** Typed output of a Role execution. */
export type RoleResult<Meta extends Record<string, unknown>> = {
content: string;
meta: Meta;
};
/** Engine start frame: initial prompt + thread identity. */
export type StartStep = {
role: typeof START;
@@ -52,28 +57,56 @@ export type StartStep = {
/** A completed role step in the thread. */
export type RoleStep<M extends RoleMeta> = {
[K in keyof M & string]: { role: K; meta: M[K]; content: string; timestamp: number };
[K in keyof M & string]: {
role: K;
meta: M[K];
content: string;
refs: string[];
timestamp: number;
};
}[keyof M & string];
/** Thread-scoped context passed to roles and moderator. */
export type ThreadContext<M extends RoleMeta = RoleMeta> = {
/** Phase 1: Moderator decides next role. */
export type ModeratorContext<M extends RoleMeta = RoleMeta> = {
threadId: string;
start: StartStep;
steps: RoleStep<M>[];
};
/**
* A Role — receives full thread context, returns typed content + meta.
* Implementation can be an agent, LLM call, script, HTTP request, etc.
*/
export type Role<Meta extends Record<string, unknown>> = (
ctx: ThreadContext,
) => Promise<RoleResult<Meta>>;
/** Phase 2: Agent executes — knows its role and prompt. */
export type AgentContext<M extends RoleMeta = RoleMeta> = ModeratorContext<M> & {
currentRole: {
name: string;
systemPrompt: string;
};
};
/**
* An Agent — raw string output interface for LLM/CLI adapters.
* Structured meta is extracted by the role's extract layer.
*/
export type AgentFn = (ctx: ThreadContext, systemPrompt: string) => Promise<string>;
/** Phase 3: Extractor runs — has agent output; the extraction instruction is a separate argument to the extract function. */
export type ExtractContext<M extends RoleMeta = RoleMeta> = AgentContext<M> & {
agentContent: string;
};
/** Alias — most external consumers see the agent-phase context. */
export type ThreadContext<M extends RoleMeta = RoleMeta> = AgentContext<M>;
/** Raw string output from an LLM/CLI adapter; meta is extracted by the engine. */
export type AgentFn = (ctx: AgentContext) => Promise<string>;
/** Runtime agent assignment (optional per-role overrides). */
export type AgentBinding = {
agent: AgentFn;
overrides?: Partial<Record<string, AgentFn>>;
};
/** Role wiring: prompts, schema, and human-readable description. */
export type RoleDefinition<Meta extends Record<string, unknown>> = {
description: string;
systemPrompt: string;
extractPrompt: string;
schema: z.ZodType<Meta>;
/** When non-null, produces CAS hashes to persist on this role's steps (see `RoleOutput.refs`). */
extractRefs: ((meta: Meta) => string[]) | null;
};
/**
* The Moderator — a pure routing function.
@@ -82,12 +115,12 @@ export type AgentFn = (ctx: ThreadContext, systemPrompt: string) => Promise<stri
* Returns the next role name or END to terminate.
*/
export type Moderator<M extends RoleMeta> = (
ctx: ThreadContext<M>,
ctx: ModeratorContext<M>,
) => (keyof M & string) | typeof END;
/** Complete workflow definition as authored by users. */
export type WorkflowDefinition<M extends RoleMeta> = {
name: string;
roles: { [K in keyof M & string]: Role<M[K]> };
description: string;
roles: { [K in keyof M & string]: RoleDefinition<M[K]> };
moderator: Moderator<M>;
};
+11 -5
View File
@@ -5,6 +5,7 @@ import { pathToFileURL } from "node:url";
import type { PrefilledDiskStep } from "./engine.js";
import { type ExecuteThreadIo, executeThread } from "./engine.js";
import { createLogger } from "./logger.js";
import { normalizeRefsField } from "./refs-field.js";
import { err, ok, type Result } from "./result.js";
import { createThreadPauseGate, type ThreadPauseGate } from "./thread-pause-gate.js";
import type { RoleOutput, WorkflowFn } from "./types.js";
@@ -16,7 +17,7 @@ type RunCommand = {
threadId: string;
workflowName: string;
prompt: string;
options: { isDryRun: boolean; maxRounds: number };
options: { maxRounds: number };
steps: RoleOutput[];
/** Timestamps aligned with `steps` for `.data.jsonl` replay; length must match `steps` when non-null. */
stepTimestamps: number[] | null;
@@ -55,7 +56,12 @@ function parseRoleOutputRecord(obj: Record<string, unknown>): RoleOutput | null
if (meta === null || typeof meta !== "object") {
return null;
}
return { role, content, meta: meta as Record<string, unknown> };
return {
role,
content,
meta: meta as Record<string, unknown>,
refs: normalizeRefsField(obj.refs),
};
}
function parseRunStepsPayload(rec: Record<string, unknown>): {
@@ -114,9 +120,8 @@ function parseRunControlPayload(rec: Record<string, unknown>): RunCommand | null
return null;
}
const optRec = options as Record<string, unknown>;
const isDryRun = optRec.isDryRun;
const maxRounds = optRec.maxRounds;
if (typeof isDryRun !== "boolean" || typeof maxRounds !== "number") {
if (typeof maxRounds !== "number") {
return null;
}
const parsedSteps = parseRunStepsPayload(rec);
@@ -136,7 +141,7 @@ function parseRunControlPayload(rec: Record<string, unknown>): RunCommand | null
threadId,
workflowName,
prompt,
options: { isDryRun, maxRounds },
options: { maxRounds },
steps: parsedSteps.steps,
stepTimestamps: parsedSteps.stepTimestamps,
forkSourceThreadId,
@@ -383,6 +388,7 @@ async function main(): Promise<void> {
role: step.role,
content: step.content,
meta: step.meta,
refs: normalizeRefsField(step.refs),
timestamp: typeof ts === "number" && ts > 0 ? ts : baseTs + i,
};
});
+3 -2
View File
@@ -18,9 +18,10 @@
},
"references": [
{ "path": "packages/workflow" },
{ "path": "packages/workflow-util-role" },
{ "path": "packages/workflow-role-llm" },
{ "path": "packages/workflow-agent-llm" },
{ "path": "packages/workflow-role-committer" },
{ "path": "packages/workflow-role-coder" },
{ "path": "packages/workflow-role-planner" },
{ "path": "packages/workflow-role-reviewer" },
{ "path": "packages/workflow-agent-cursor" },
{ "path": "packages/workflow-agent-hermes" },