Adds 'uwf skill user' command for agents/humans using the uwf CLI.
Covers setup, workflow management, thread lifecycle, step operations,
CAS queries, logging, and global options with a Quick Start guide.
Refs #538
Adds 'uwf skill actor' command for agents executing workflow roles.
Covers the two things an actor needs to know:
1. Frontmatter output protocol (status field, schema-defined fields)
2. CAS operations (put, get, refs, walk, merkle DAG pattern)
Refs #540
The HermesAcpClient integration tests require a live Hermes agent
process and always timeout (3 × 120s) in CI containers, causing
every CI run to fail for ~6 minutes before reporting failure.
Switch from `bun run test` to `bun run test:ci` which was already
defined in all testable packages — workflow-agent-hermes's test:ci
runs only unit tests (__tests__/*.test.ts), skipping integration/.
- e2e-walkthrough.yaml: examples/ → .workflows/ (project workflows, not examples)
- .gitea/workflows/ci.yml: bun test → bun run test (avoid legacy-packages)
- .plan/: removed stale test spec from #335
小橘 🍊
- Remove unnecessary Promise.resolve() wrappers (sync function)
- Use try/finally for db.close() instead of manual close at each exit
- Flatten nested try/catch
Follow-up to #535 review nits.
小橘 🍊
When sessions.write_json_snapshots is disabled, Hermes only writes to
state.db (SQLite). loadHermesSession now falls back to reading from
~/.hermes/state.db when the JSON file is missing.
- Add getHermesDbPath() and loadHermesSessionFromDb() functions
- Use bun:sqlite with readonly mode, try-catch for graceful errors
- JSON file still takes priority (fast path)
- Filter messages to user/assistant/tool roles
- Convert unix timestamps to ISO 8601 strings
Runs full uwf CLI walkthrough inside a Docker container with isolated
storage root. Tests: setup, workflow add/list/show, thread start/exec/
cancel/fork, step list/show, CAS operations, config get/set, logs.
Approach: mount host $HOME into node:22-bookworm container, override
UNCAGED_WORKFLOW_STORAGE_ROOT with tmpdir. No mock LLM — real agents.
Known issues documented in header comments (jq quoting, apt-get
startup time, lockfile conflicts).
小橘 🍊(NEKO Team)
This commit addresses three related issues in the CLI config and setup commands:
1. Issue #531: Fix config list apiKey masking
- maskApiKeys() now checks for 'apiKey' instead of 'apiKeyEnv'
- Updated tests to use apiKey field throughout
2. Issue #532: Add config set key validation
- Reject unknown top-level keys with helpful error messages
- Reject unknown nested fields in providers/models/agents
- Reject incomplete paths and nested paths on scalar keys
- Added VALID_CONFIG_KEYS schema and validateConfigKey() function
3. Issue #533: Fix agent name double-prefix in setup
- mergeConfig() now uses _agentNameFromBinary() to normalize agent names
- 'uwf-hermes' input now produces 'hermes' key with 'uwf-hermes' command
- Added tests for prefixed agent names
All tests passing, no regressions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update all documentation files that contained outdated apiKeyEnv
references to use the new apiKey approach.
## Changes
- docs/architecture.md: Update config example to use apiKey field
- docs/wf-stateless-design.md: Update config examples and type
definitions to use apiKey instead of apiKeyEnv
- docs/builtin-agent-research.md: Update ProviderConfig type
definition and code examples
All documentation now consistent with the code implementation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Breaking change: Store API keys directly in config.yaml instead of
environment variable names.
## Changes
### @uncaged/workflow-protocol
- Change ProviderConfig.apiKeyEnv: string → apiKey: string
- Update README to reflect new type
### @uncaged/workflow-util-agent
- extract.ts: Remove dotenv loading, use providerEntry.apiKey directly
- storage.ts: Update normalizeProviders to validate apiKey field
- Update error messages to reference apiKey instead of apiKeyEnv
### @uncaged/cli-workflow
- setup.ts: Write actual API key to config.yaml, not .env
- Remove apiKeyEnvName(), loadEnvFile(), saveEnvFile() functions
- Remove getEnvPath() function
- Update cmdSetup to not return envPath in result
- Update README to reflect config.yaml stores API keys
- Fix setup-validate.test.ts to not expect envPath in result
## Verification
- ✅ bun run build passes
- ✅ All tests pass (260/260 in cli-workflow, 55/55 in util-agent)
- ✅ bun run check passes (only pre-existing warnings)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add configuration management commands to uwf CLI:
- uwf config list: display all config values (masks API keys)
- uwf config get <key>: retrieve specific value using dot notation
- uwf config set <key> <value>: update config value with auto-creation
Implementation:
- New file packages/cli-workflow/src/commands/config.ts with helper functions
- Comprehensive test coverage (32 tests) in config.test.ts
- Supports nested path navigation via dot notation
- Auto-creates intermediate objects when setting new paths
- Masks apiKeyEnv values in list output for security
Resolves#526
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- moderator-reference: use nested map graph format matching evaluate.ts
- yaml-reference: use goal/procedure/output/capabilities/frontmatter fields
matching actual WorkflowPayload, not fabricated system/outputSchema
- skill.test.ts: replace hardcoded absolute path with __dirname-relative
- skill.test.ts: assert 'frontmatter' instead of 'outputSchema'
Closes#519
The ACP protocol's tool_call updates only carry a display title (not a
structured tool name) and omit rawInput for polished tools, making the
reconstructed messages unusable for step read/show.
Changes:
- hermes.ts: storePromptResult reads ~/.hermes/sessions/session_{id}.json
via loadHermesSession() instead of using ACP-reconstructed messages
- acp-client.ts: strip message/tool-call collection logic, keep only
text chunk accumulation for final response extraction
- step.ts: TurnData gains role + toolCalls fields; formatTurnBody
renders them in step read markdown output
- README: document sessions.write_json_snapshots requirement
Two fixes for 'uwf thread start solve-issue' failures:
1. json-cas 0.5.2 (npm) was missing oneOf in ALLOWED_SCHEMA_KEYS.
Published json-cas 0.5.3 with the fix, bumped all packages to ^0.5.3.
2. Semantic validator only recognized oneOf-based multi-exit schemas.
Roles using $status with enum (e.g. enum: [approved, rejected]) were
incorrectly treated as single-exit. Added isEnumMultiExit() support.
Changes:
- validate-semantic.ts: isEnumMultiExit(), getEnumStatuses(), checkSingleExitMustache()
- All package.json: @uncaged/json-cas ^0.5.2 → ^0.5.3
- validate-semantic.test.ts: 5 new enum multi-exit tests (Suite 3b)
- solve-issue-tea-worktree.test.ts: updated for current workflow structure
小橘 🍊
- Add agent discovery step to cmdSetupInteractive flow
- _promptAgentSelection: discover uwf-* binaries, auto-select if only one,
prompt user to choose if multiple, show install hints if none found
- mergeConfig: always write selected agent entry, update defaultAgent
- Known agent labels for hermes, claude-code, cursor, builtin
- 10 new tests for _agentNameFromBinary, _printAgentMenu, cmdSetup agent config
Fixes#424
The test used a fake CasRef string as frontmatter, which fails
putSchema validation when loading from YAML files. Replace with
a proper JSON Schema object.
Fixes pre-existing failures in workflow-resolution, cas-exit-code,
and thread-step-count tests.
- Move hermes ACP integration tests to __tests__/integration/
- Add test:ci script to all packages (excludes integration/)
- CI workflow uses test:ci instead of test
- bun test still runs everything (unit + integration)
Refs #510
- Remove hardcoded ~/repos/workflow paths from procedure text
- Use .worktrees/ relative to repo root instead of global path
- Add developer failed → $END exit for unrecoverable situations
- Add worktree field to reviewer rejected variant
- Fix test workflowPath to use import.meta.dirname
Refs #506
BREAKING CHANGE: StepRecord now requires startedAtMs and completedAtMs fields.
StepEntry now requires durationMs field. Old CAS data without these fields is invalid.
- Add startedAtMs/completedAtMs to StepRecord and StepNodePayload
- Add durationMs to StepEntry (computed: completedAtMs - startedAtMs)
- Update STEP_NODE_SCHEMA to require timing fields as integers
- Record Date.now() before/after agent execution in createAgent
- Show duration in thread read headers (formatStepHeader)
- Update existing test fixtures with timing fields
buildOutputFormatInstruction now detects discriminated union schemas
(oneOf with shared const/ property) and renders separate YAML
example blocks per variant, so agents see exactly which fields belong
to which outcome instead of a flat merge.
Non-discriminated oneOf/anyOf schemas fall back to the existing flat
merge behavior.
Refs #502
- evaluate() reads $status instead of status, defaults to _ when missing
- Update all YAML examples and .workflows to use $status
- Update cli-workflow resolveEvaluateArgs to use $status
- 10 moderator tests pass including new default _ test
- Single-exit roles no longer need to declare status field
Phase 1 of #499 (closes#500)
Replace all JSONata/ConditionDefinition/ConditionalEdge references with
status-based routing terminology across 8 files:
- README.md, CLAUDE.md: moderator description, key terms
- docs/architecture.md: dependency jsonata→mustache, evaluate signature
- docs/wf-stateless-design.md: type definitions, routing context
- packages/workflow-moderator/README.md: full rewrite for new API
- packages/workflow-protocol/README.md: Target type, remove Transition
- packages/workflow-dashboard/context.md: StatusEdge, graph type
- docs/builtin-agent-research.md: stale JSONata references
- Rename ConditionalEdge → StatusEdge, condition → status throughout
- Rename conditional.tsx → status.tsx, edge label shows status value
- Update trans-in/trans-out to use status field instead of condition
- Update validate to check status edges
- Align server/workflow.ts with new WorkflowPayload.graph format
- 20 dashboard tests pass
Phase 3 of #490 (closes#493)
- Migrate solve-issue.yaml, analyze-topic.yaml, debate.yaml to new format
- Add status enum field to all role frontmatter schemas
- Use {{{ }}} (triple mustache) for prompt templates with user content
- Disable mustache HTML escaping globally (prompts are plain text, not HTML)
- Add 2 new tests for HTML escape behavior
- 9 moderator tests pass
Phase 2 of #490 (closes#492)
Extract four helper functions from cmdStepRead to reduce cognitive
complexity from 27 to ≤15:
- loadStepDetail: Load and validate step detail node
- loadTurnData: Load all turn nodes and extract content
- selectTurnsForQuota: Select turns within quota (≥1 always shown)
- formatStepMarkdown: Assemble final markdown output
All 6 existing tests pass. Zero Biome warnings. CLAUDE.md compliant.
Fixes#487
Replace custom buildHistorySummary with shared buildContinuationPrompt
from workflow-agent-kit. This aligns claude-code agent with hermes agent:
- First visit: includes step content (within 32k quota)
- Re-entry: shows only steps since last visit (meta only, session has context)
Previously developer could not see reviewer's detailed feedback on
re-entry, only {"approved":false}. Now gets full review text.
Fixes#486
Implements the `uwf step read` command to render a single step's turns
as human-readable markdown with quota enforcement.
Changes:
- Implement cmdStepRead() in step.ts with quota enforcement
- Renders step metadata (hash, role, agent)
- Loads and formats turns from detail node
- Enforces quota by selecting most recent turns
- Always shows at least one turn even if it exceeds quota
- Gracefully handles steps with no detail or no turns
- Register `step read` command in cli.ts with --quota flag (default 4000)
- Add comprehensive test suite in step-read.test.ts (6 tests covering
basic functionality, quota enforcement, edge cases, and special chars)
- Update README.md CLI Reference table to include `step read`
- Update package-level README.md with command documentation and example
Closes#484
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Issue #480: The --quota flag on 'uwf thread read' was not properly
limiting output size due to an off-by-one error in selectByQuota().
Root cause:
- Items were added to selected array BEFORE checking if they would
exceed the quota
- This meant the last item that exceeded quota was still included
- Prompt deduplication tracking was mutated during quota calculation,
causing prompts to not render in final output
Fix:
- Check quota BEFORE adding items to selected array
- Always include at least one step even if it exceeds quota
- Calculate step lengths using actual rendering format
- Account for start section and separators in quota calculation
- Use temporary Set during length calculation to avoid mutating
the prompt deduplication tracking
Tests:
- Added comprehensive test suite (thread-read-quota.test.ts)
- Covers quota enforcement, boundary conditions, edge cases
- Tests interaction with --before and --start flags
- All tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed the exit behavior of 'uwf cas has' to return exit code 1 when
a hash doesn't exist, while preserving the JSON output {exists:false}.
This enables proper use in shell conditionals like 'if uwf cas has $HASH'.
Fixes#481
Developer procedure now requires running lint/build checks before committing.
Reviewer procedure clarified: hard checks (build/lint) must pass, style-only
suggestions should not block approval.
Fixes#477
Fixed issue #469 where `uwf step list`, `uwf step show`, and `uwf thread read`
failed with "thread not active" error when called on completed threads.
The root cause was that resolveHeadHash() in shared.ts only checked threads.yaml
(active threads index) but never fell back to history.jsonl (completed threads log).
Changes:
- Updated resolveHeadHash() in shared.ts to check history.jsonl as fallback
- Changed error message from "thread not active" to "thread not found"
- Added comprehensive test coverage:
- Unit tests for resolveHeadHash() with active/completed/missing threads
- Integration tests for cmdStepList() with completed threads
- Integration tests for cmdStepShow() with completed threads
- Regression tests for cmdThreadRead() with completed threads
All commands now work identically for active and completed threads.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a role participates for the first time (e.g. committer), it previously
only received the system prompt + last step output, missing the full thread
history. This caused hallucination as the role had to guess what happened.
Changes:
- build-continuation-prompt.ts: detect first-time roles and include all
steps' meta + content for last 2-3 steps (within quota)
- context.ts: add isFirstVisit detection helper
- types.ts: add isFirstVisit field to AgentContext
- hermes.ts: pass isFirstVisit through to prompt builder
Fixes#473
Implements enhanced filtering and pagination for the `uwf thread list` command
to support workflows with large numbers of threads.
Changes:
- Add --page, --page-size parameters for pagination (default: page 1, size 20)
- Add --since, --until time filters supporting multiple formats (ISO8601, relative like "2h", "1d")
- Add --workflow filter to show threads for specific workflow
- Add --sort parameter (newest-first, oldest-first, alphabetical)
- Add pagination metadata in JSON output (page, pageSize, totalThreads, totalPages, hasMore)
- Implement parseRelativeTime() for human-friendly time expressions (1h, 30m, 2d, 1w)
- Add comprehensive unit tests for filters, pagination, and time parsing
- Update CLI help text with new parameters and examples
Fixes#471
- Add `content: string | null` to RoleStep type
- Resolve contentHash → text for the last step when building ThreadContext
- Update buildAgentPrompt to include <output> tag with step content
- Add 16k content quota with truncation
- Update tests
Previously step list showed raw CAS refs for detail fields.
Now detail is recursively expanded (like output already was),
since every turn is individually hashed and walkable.
Refs #463
All roles (developer, reviewer, tester, committer) now work in
~/repos/workflow-worktrees/fix/<issue>-<slug> instead of modifying
the main working directory. Prevents self-destructive edits.
Fixes#464
Each agent now maintains its own session cache file instead of sharing
a single agent-sessions.json. This prevents session ID conflicts when
multiple agents operate on the same thread+role pair.
Changes:
- getCachePath() now takes agentName parameter
- getCachedSessionId/setCachedSessionId require agentName as first param
- Cache files named <agent>-sessions.json (e.g., hermes-sessions.json)
- Agent wrappers inject their agent name into cache calls
- Add comprehensive tests for session cache isolation
- Handle malformed JSON gracefully (treat as empty cache)
Fixes#461
Changed uwf thread read to wrap role prompts and agent outputs in XML tags
(<prompt> and <output>) instead of markdown headings (### Prompt, ### Content).
This prevents Claude Code from treating step outputs as structural headings.
- Updated formatStepPrompt to use <prompt>...</prompt> tags
- Updated formatStepContent to use <output>...</output> tags
- Added comprehensive test suite in thread-read-xml-tags.test.ts
- Updated existing tests to verify XML tag behavior
Fixes#459
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactored runBuiltinLoop function to reduce cognitive complexity from 30 to below 15 by extracting helper functions:
- shouldInjectDeadlineWarning: checks if deadline warning should be shown
- shouldProcessToolCalls: determines if tool calls should be processed
- extractFinalText: extracts last assistant message content
- injectDeadlineWarning: injects deadline warning message
- handleTextOnlyTurn: handles text-only turn logic
- handleToolCallTurn: handles tool call turn logic
- processLoopIteration: processes a single loop iteration
Added 24 new unit tests for the extracted helper functions, bringing total test count to 41 (all passing). All existing behavior is preserved.
Fixes#444
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit implements issue #456, adding two related capabilities to the uwf CLI:
1. **Background execution mode** for `uwf thread step` (via `--background` flag)
- Spawns agent execution in a detached background process
- Returns immediately with thread ID and background status
- Maintains marker files to track running processes
- Supports `--count` option to run multiple steps in background
- Prevents concurrent execution of the same thread
2. **Running threads query** command (`uwf thread running`)
- Lists all threads currently executing in background
- Returns thread ID, workflow, current role, PID, and start time
- Automatically filters out stale markers (dead processes)
- Empty list when no threads are running
**Key changes:**
- **workflow-protocol**: Added `RunningThreadItem`, `RunningThreadsOutput` types
Updated `StepOutput` to include `background: boolean | null` field
- **cli-workflow/background**: New module for process management
- Marker file creation/deletion (atomic operations)
- PID liveness checking
- Stale marker cleanup
- Running threads query
- **cli-workflow/commands/thread**:
- Updated `cmdThreadStep` to support `--background` and `--_background-worker` flags
- Added `cmdThreadStepBackground` for spawning detached processes
- Added `cmdThreadRunning` to list running threads
- Updated `cmdThreadKill` to terminate background processes
- **cli-workflow/cli**: Added CLI routing for new commands and flags
**Integration:**
- `uwf thread kill` now terminates background processes before archiving
- Foreground execution checks for existing background process and fails if found
- Background worker creates/cleans up marker files automatically
- Marker files stored in `~/.uncaged/workflow/running/*.json`
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The turns array items in CLAUDE_CODE_DETAIL_SCHEMA were missing
format: 'cas_ref', so expandDeep in step-details couldn't resolve
turn hashes to their payloads. Hermes schema already had this.
小橘 <xiaoju@shazhou.work>
Per CLAUDE.md convention, use `string | null` instead of `?:` in the
isFirstConditionalSibling helper function parameter types.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Extract helper functions (resolveThreadId, getThreadHead, listThreadSteps,
displayStepDetails, displayThreadRead) to reduce nesting and improve
readability. Also adds test coverage for the refactored functions.
Fixes#446
- Root README: add all 9 packages to table, update architecture diagram,
refresh CLI reference from uwf --help
- New READMEs for 8 packages (cli-workflow, workflow-protocol,
workflow-moderator, workflow-agent-kit, workflow-agent-hermes,
workflow-agent-builtin, workflow-agent-claude-code, workflow-dashboard)
- Updated workflow-util README to match current exports
- All API sections verified against src/index.ts exports
小橘 🍊(NEKO Team)
- Nudge turns don't consume turn budget (up to MAX_NUDGES=3), prevents
wasting agent work capacity on bookkeeping
- Inject deadline warning when 3 turns remain, telling agent to wrap up
- Agent can use status:failed to gracefully exit if it can't finish
- Inject user message when 3 turns remain, telling agent to wrap up
- Prompt tells agent to use status:failed if it can't finish in time
- Prevents wasting all turns without producing any frontmatter output
- Remove stale test file from dogfood agent run
copilot-api returns tool_calls even when tools field is omitted from
the request (infers from message history). Now the loop explicitly
nullifies tool_calls when noTools=true.
When the agent's first run output fails frontmatter extraction, the
retry loop (via options.continue) would replace agentResult entirely,
causing the 1-turn continuation detail to overwrite the original
multi-turn detail containing all tool-call history.
Now we capture primaryDetailHash from the first run and always use it
for the persisted StepNode, regardless of how many retries occur.
Fixes#439
LLM sometimes emits plain text (e.g. 'Now I'll write the tests...')
without calling tools, which the loop treated as final output. Now
the loop detects this and injects a user message nudging the LLM
to either continue using tools or output frontmatter with ---.
Agent was using all continue turns to keep calling tools instead of
outputting the required frontmatter. Now continue runs with noTools=true,
forcing LLM to emit text-only response.
Also supports null tools in chatCompletionWithTools to omit tools from
the API request entirely.
Agent was wasting turns exploring the filesystem because it didn't
know its working directory. Now the system prompt includes:
'Your working directory is: /path/to/cwd'
- Add stripPreamble() to handle LLM output with text before ---
- Strengthen system prompt: CRITICAL instruction for --- at position 0
- Fixes frontmatter parsing failures on first output turn
Previously runBuiltinWithMessages deleted the session jsonl after each
run/continue call. This meant the createAgent retry mechanism (which
calls continue on frontmatter validation failure) would lose all
previous turn data — each continue started with an empty jsonl.
Now the session jsonl accumulates across run + continue calls, so the
final storeBuiltinDetail captures all turns. The jsonl file is left
behind for debugging; it's small and can be cleaned up on next startup.
Also add a workflow hint to the system prompt reminding the LLM to use
tools before outputting frontmatter, preventing premature text-only
responses on the first turn.
- Add 4-strategy resolution priority: CAS hash → file path → local discovery → global registry
- Add helper functions: isFilePath, workflowFileExists, findWorkflowInDir, findWorkflowInParents
- Refactor resolveWorkflowCasRef to support direct hash, explicit paths, and parent traversal
- Add comprehensive test suite with 24 tests covering all strategies and edge cases
- Support .workflow/ and .workflows/ directories with .yaml/.yml extensions
- All 60 tests pass across 5 test files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Disable aliasDuplicateObjects in YAML stringify to prevent &a1/*a1
anchors when multiple steps have identical output
- Fix unused discoverAgents function (prefixed with _) and format issue
in setup.ts
Skip acp-client 'prompt() collects structured messages' and
resume-e2e 'resume() after close' — both require live LLM calls
and fail intermittently in CI.
Each turn (assistant response / tool result) is appended to a JSONL file
at ~/.uncaged/workflow/sessions/<sessionId>.jsonl during the loop.
On completion, the JSONL is read back, each turn is stored as a CAS node,
and the detail payload references them as a flat turns[] array in
chronological order. The session file is then deleted.
Benefits:
- Real-time observability: tail -f the JSONL to watch loop progress
- Crash recovery: partial JSONL survives process death
- Zero write contention: one file per session
- Detail stays a flat array for easy consumption by CLI/dashboard
Changes:
- New session.ts: initSessionDir, appendSessionTurn, readSessionTurns, removeSession
- loop.ts: append JSONL each turn instead of accumulating in-memory
- detail.ts: reads session JSONL → persists turns to CAS → stores detail
- agent.ts: passes storageRoot/sessionId to loop, cleans up session on completion
- types.ts: remove index from TurnPayload (order is implicit in JSONL/array)
- schemas.ts: sync with type changes
Ref: #433
buildClaudeCodePrompt was dropping ctx.edgePrompt entirely — the graph
transition instruction (e.g. 'Implement the plan') never reached the agent.
Now appended as '## Current Instruction' at the end of the prompt.
- StepRecord adds edgePrompt field (backward compat: defaults to "")
- StepNode CAS schema includes edgePrompt
- writeStepNode persists ctx.edgePrompt
- buildHistory exposes edgePrompt in StepContext
- buildBuiltinMessages reconstructs multi-turn moderator↔agent conversation:
system = role prompt + output format (stable prefix)
per prior visit: user (edgePrompt + inter-step summary) + assistant (output)
current: user (edgePrompt + recent summary)
- Zero extra persistence — pure function of CAS chain
- Stable prefix for LLM prompt cache hits
- 10 builtin tests pass, all other package tests pass
System message = agent identity (role prompt + output format instruction)
User message = moderator speech (task + edge prompt + history)
This reflects the workflow's core model: moderator speaks to agent
via the graph's edge prompt. Previously all content was in a single
system message with no user message, causing Claude API 400 errors.
- buildBuiltinPrompt now returns { system, user } instead of string
- agent.ts sends system + user as separate messages
- Tests updated accordingly
Switch from --output-format json to stream-json --verbose to capture
per-turn data. Detail now includes:
- model name
- usage (input/output/cache tokens)
- stopReason
- turns[] as individual CAS nodes with role, content, tool calls
Also addresses PR #421 review fixes:
- sessionId guard: skip cache write when sessionId is empty/undefined
- silent catch: log resume failures with debug tag 5VKR8N3Q
- atomic write: session cache uses temp+rename for crash safety
Closes#422
Two-role debate (against/for) with up to 3 rounds per side.
Each role re-enters with session resume, making this an ideal
integration test for cross-process session continuity.
Supports early termination via concession (conceded=true in frontmatter).
Refs #418
- Replace resolvePathInWorkspace with simple resolvePath (no boundary check)
- Remove UWF_BUILTIN_ALLOW_SHELL env gate from run_command
- Update tests accordingly
Per review: sandbox was false security with shell=true, and path
restrictions are unnecessary for a trusted agent environment.
Move getCachedSessionId/setCachedSessionId from workflow-agent-hermes
into workflow-agent-kit so all agent adapters can share the same
session cache logic.
Add cross-process session resume to workflow-agent-claude-code:
on re-entry (isFirstVisit=false), look up the cached sessionId and
use 'claude --resume' to continue with full conversation history.
Cache file renamed from hermes-sessions.json to agent-sessions.json
to reflect its shared nature.
Refs #418
Built-in role agent that uses workflow config models directly,
with its own tool-calling run loop. No external agent dependency.
- OpenAI-compatible chat completion client with tool_calls support
- P0 toolkit: read_file, write_file, run_command
- Integrates via createAgent factory from workflow-agent-kit
- CAS detail recording for each turn
- Path sandboxing and shell opt-in (UWF_BUILTIN_ALLOW_SHELL)
Hermes ACP _restore fails for custom providers — resolve_runtime_provider
throws and base_url/api_mode are lost, causing resume to silently create a
new session with no history. Prompt then returns empty text or refusal.
Disable resume by default. Set UWF_HERMES_RESUME=1 to opt back in.
Includes investigation notes in docs/investigations/.
Refs #418
- uwf log list: list log files with sizes
- uwf log show --thread <id>: filter by thread ID
- uwf log show --process <pid>: filter by process ID
- uwf log clean --before <date>: delete old log files
- Tests: 12 new tests covering all subcommands
Implemented by solve-issue workflow, biome fixes applied manually.
Closes#413
Refs #411, #410
- cmdLogList: list log files with sizes, sorted by date descending
- cmdLogShow: filter entries by thread, process, and/or date
- cmdLogClean: delete log files older than given date
- 12 tests covering all functions and edge cases
Fixes#413
- Add isFirstVisit: boolean to AgentContext
- Compute from steps history: !steps.some(s => s.role === role)
- hermes.ts: use isFirstVisit for first-entry vs re-entry logic
- buildInitialPrompt: always append edgePrompt as Moderator Instruction
- edgePrompt is never blanked — always the real moderator instruction
- New tests for first-visit, re-entry, and fallback scenarios
Refs #405, #407, #404
Two bugs fixed:
1. request_permission messages (JSON-RPC requests with both id+method) were
silently swallowed by the response handler, causing hermes to hang waiting
for permission approval. Now properly distinguish responses (id only) from
server requests (id+method).
2. uwf-hermes process never exited after completing because the hermes ACP
subprocess was still alive. Now explicitly close the ACP client after
agent completion so the subprocess terminates.
小橘 <xiaoju@shazhou.work>
The official TS SDK's ndJsonStream hangs indefinitely on prompt()
for sessions with 20+ messages (solve-issue planner). Root cause
appears to be a stream backpressure issue in the SDK's ReadableStream
adapter.
Switch back to readline-based line parsing which reliably receives
all JSON-RPC responses. Also handle session/request_permission
inline (auto-approve, yolo mode equivalent).
Ref #398
UwfAcpClient now tracks all session/update events:
- agent_message_chunk → assistant message content
- agent_thought_chunk → assistant reasoning
- tool_call → pending tool invocation (name + rawInput)
- tool_call_update (completed/failed) → assistant tool_call + tool result
Messages are accumulated across prompts (same session) and stored
via storeHermesSessionDetail, restoring the full structured detail
(turns with tool calls, reasoning) that was lost in the initial ACP
migration.
Ref #398
The client is now created once in createHermesAgent() and shared by
runHermes and continueHermes closures. This preserves conversation
context during frontmatter retry loops — continue() sends a follow-up
prompt on the same ACP session instead of starting a new one.
Client is cleaned up via process.on('exit').
Ref #398
Replace 250-line custom ACP client with official TypeScript SDK.
Uses ClientSideConnection + ndJsonStream for stdio transport.
Same public API (connect/prompt/close), 115 lines, zero custom protocol code.
Ref #398
Remove spawnHermes, spawnHermesChat, spawnHermesResume, parseSessionId,
and buildResultFromSession. runHermes and continueHermes now use
HermesAcpClient for structured JSON-RPC communication.
Session ID comes directly from ACP protocol, eliminating #380 race
condition. Agent output collected via streaming chunks instead of
session file loading.
Phase 2 of RFC #398Fixes#380
Implements JSON-RPC client that communicates with `hermes acp` via
stdin/stdout. Replaces fragile stdout/stderr parsing with structured
protocol: initialize → session/new → session/prompt → collect chunks.
Session ID comes directly from protocol response, eliminating the
race condition in #380.
Phase 1 of RFC #398
buildOutputFormatInstruction now includes explicit language telling agents to
output ONLY schema-defined fields and to focus on their role's deliverable.
Fixes#394
Claude Code CLI adapter for the workflow engine, mirroring
workflow-agent-hermes architecture. Spawns `claude -p` with
`--output-format json` for structured output parsing.
Refs #391
Send a test completion request after configuration to verify the model
is reachable. If validation fails, warn the user and suggest trying a
different model or checking their settings.
Fixes#335
Replace hardcoded 5-field example with schema-driven generation.
Now shows actual enum values, types, and required markers for
each role's frontmatter schema.
Fixes#389
小橘 <xiaoju@shazhou.work>
Replace hardcoded 5-field candidate with schema-driven extraction.
Now reads outputSchema properties and picks matching fields from
parsed frontmatter, supporting role-specific fields like plan,
approved, success.
Falls back to standard 5 fields when schema has no properties.
Fixes#388
小橘 <xiaoju@shazhou.work>
Agent CLI outputs plain CAS hash (not JSON), engine parses plain hash.
StepOutput no longer carries sessionId — session info is already in CAS detail.
Keeps the valuable parts of #385: sessionId in AgentRunResult (process-internal),
continue support, and frontmatter retry loop.
Breaking changes:
- AgentRunResult now requires sessionId field
- AgentOptions now requires continue function
- Agent CLI outputs JSON {stepHash, sessionId} instead of plain CAS hash
- Engine parses JSON output (with legacy CAS hash fallback)
New features:
- Frontmatter validation retry: if agent output lacks valid frontmatter,
engine calls agent.continue() up to 2 times with correction message
- Session tracking: sessionId flows from agent → engine → StepOutput
- Hermes agent: session parse failure is now a hard error (no raw text fallback)
- Hermes agent: supports --resume for continue sessions
Closes#384
Agent output must contain valid YAML frontmatter matching the role schema.
If frontmatter parsing fails, the step fails immediately with a clear error
instead of falling back to an LLM extraction that can fabricate values.
The extract module remains as a public API export but is no longer used
in the agent run loop.
Breaking change: agents that relied on LLM extraction to produce valid
output will now fail. They must output proper frontmatter.
Add --count/-c flag to 'uwf thread step' for running N steps in one
invocation, stopping early if $END is reached.
- cmdThreadStep now loops up to count times, delegates to cmdThreadStepOnce
- CLI parses -c/--count, defaults to 1 (backward compatible single output)
- Validation rejects 0, negative, and non-integer counts
- 7 new tests covering CLI parsing and count validation
Fixes#373
Co-authored-by: uwf-hermes (solve-issue workflow)
Fallback transitions (last entry in graph node) omit the condition
field in YAML, resulting in undefined instead of null. The validator
and materializer now handle this:
- validate.ts: accept undefined as valid condition value
- workflow.ts: normalizeGraph() coerces undefined → null before CAS put
This was broken by the graph fallback pattern introduced in #370.
Register custom $first(role) and $last(role) functions in the JSONata
evaluator. These search the steps array and return the matching role's
frontmatter (output) directly, replacing verbose steps[-1].output.x
expressions with semantic $last('role').field syntax.
- workflow-moderator: register functions via expr.registerFunction()
- Updated all condition expressions in .workflows/ and examples/
- Added tests for $last, $first, and unmatched role (undefined)
Fixes#376
- Last transition in each graph node is now the fallback (no condition)
- Remove redundant positive conditions (ready, devDone, approved, passed, pushSuccess)
- notApproved → rejected (positive naming)
- Move generateCliReference() to @uncaged/workflow-util
- buildRolePrompt inlines CLI reference directly (no agent tool call)
- Fix Role terminology to use new field names
- Add maintenance comment in cli-reference.ts
- Fix test assertions
- Add legacy-packages/ and scripts/ to biome ignore
- Allow noDefaultExport in vitest.config.* and .d.ts
- Allow console in cli.ts and setup.ts (CLI user output)
- Fix unused imports in cas.ts and setup.ts
- Add .workflows/*.yaml scanning from project root (cwd)
- Resolution: project-local first, then global registry
- On-the-fly CAS materialization for local workflows
- Filename/name consistency check
- uwf workflow list shows origin (local/global)
Fixes#365
Breaking change per review:
- Remove systemPrompt from RoleDefinition entirely
- identity/prepare/execute/report are now required (string, not nullable)
- Remove all legacy fallback logic in buildRolePrompt
- Simplify validate.ts, workflow.ts materialize
- Migrate all test fixtures and example workflows
Refs #359
- New buildRolePrompt() in workflow-agent-kit: four-phase prompt assembly
with fallback to systemPrompt
- Export from agent-kit index
- Update uwf-hermes to use buildRolePrompt instead of raw systemPrompt
- Add tests for all modes: four-phase, legacy, mixed
Refs #359, #362
- Extend RoleDefinition with identity, prepare, execute, report fields
- Make systemPrompt optional (nullable) for four-phase workflows
- Update ROLE_DEFINITION JSON Schema (all new fields optional)
- Update validate.ts to accept new fields
- Update workflow.ts to strip null fields before CAS storage
- Update thread read to prefer identity over systemPrompt
- Add --version flag to uwf CLI
- Bump all packages to 0.5.0
Refs #359
Remove all references to ESM bundles, old packages, old CLI name.
Update to reflect YAML workflow definitions, uwf CLI, 6 active packages,
frontmatter markdown output format, and stateless single-step execution.
小橘 🍊(NEKO Team)
Inline Result type and ok/err helpers into workflow-util to break
dependency on the now-archived workflow-protocol package.
Also add explicit @uncaged/json-cas dep to uwf-protocol (was only
available as transitive dep via json-cas-fs).
小橘 🍊(NEKO Team)
buildHermesPrompt was ignoring ctx.outputFormatInstruction — the
deliverable format and scope constraint were injected into context
but never passed to the agent.
Now prepends it before systemPrompt (deliverable-first principle).
Refs #355
- buildOutputFormatInstruction(schema): auto-generates frontmatter
format guide from Zod schema, injected at top of system prompt
- Adapter prepends deliverable format before role's systemPrompt
- buildThreadInput reordered: Task → Steps → Parent → Tools
- Scope reminder: 'Focus exclusively on YOUR role's deliverable'
- 8 tests for buildOutputFormatInstruction
Refs #351
Phase 2 of RFC #351 — adapter tries frontmatter first (zero LLM cost),
falls back to runtime.extract() when frontmatter is missing/invalid.
- tryFrontmatterMeta(): parse → validate → schema.safeParse
- Happy path stores body (no frontmatter) in CAS
- Fallback stores full raw in CAS + LLM extract
- 5 tests covering both paths
Refs #351
Phase 1 of RFC #351 — define AgentFrontmatter type, parseFrontmatterMarkdown()
and validateFrontmatter() with 45 tests.
- Built-in minimal YAML parser (no new deps)
- Never throws on malformed input — degrades gracefully
- All fields use T | null (no optional properties)
Refs #351
--detail now uses expandDeep to recursively resolve all cas_ref
fields in the detail merkle tree, showing full turn content
instead of raw hashes.
Refs #349
- Outputs markdown directly (not JSON/YAML)
- --quota <chars>: character budget, loads steps backward until exceeded (default 4000)
- --before <step-hash>: load steps before this hash (exclusive), omits start
- --start: force include start section even with --before
- --detail: expand detail CAS node content for each step
- Skip hint with uwf thread read command for pagination
- Reuses walkChain/collectOrderedSteps/expandOutput
Closes#349
hermes --quiet outputs session_id to stderr and AI response to stdout.
The agent was only parsing stdout, so session_id was never found and
detail always fell back to raw output.
Now checks stderr first, then stdout as fallback.
--quiet mode outputs session_id on the first line, not the last.
The old code only checked the last non-empty line and broke immediately
if it didn't match, causing session detail to always be empty.
Fixes the empty detail bug when hermes agent runs in quiet mode.
Remove thread-id argument — CAS node is self-contained, no need to
specify which thread it belongs to. Just verify the hash is a valid
StartNode or StepNode.
Refs #342
- uwf thread steps <thread-id>: walk CAS chain, list all steps chronologically
- uwf thread fork <thread-id> <step-hash>: create new thread from history point
- New types: StartEntry, StepEntry, ThreadStepsOutput, ThreadForkOutput
- Supports both active and archived threads
Refs #342
AgentContext now extends ModeratorContext (start + steps) with threadId, role, store, and expanded workflow. Hermes and mock-agent read prompt/steps/systemPrompt from the new shape.
Co-authored-by: Cursor <cursoragent@cursor.com>
Expose the store created during context build on AgentContext so agents
reuse the same in-memory cache instead of opening a second store.
Co-authored-by: Cursor <cursoragent@cursor.com>
Parse session_id from Hermes stdout, store hermes-turn leaves and
hermes-detail root in CAS with cas_ref turns; fall back to raw stdout
when the session file is missing.
Co-authored-by: Cursor <cursoragent@cursor.com>
- Change AgentRunFn to return { output, detailHash } instead of raw string
- Remove agent-kit detail CAS write; agents store their own detail nodes
- Hermes stores raw output as typed hermes-raw-output CAS node
Co-authored-by: Cursor <cursoragent@cursor.com>
Add program-level --format option (default: json) inherited by all
subcommands. json output unchanged, yaml via yaml package, table
renders aligned columns for arrays, falls back to yaml for objects.
Closes#328
小橘 🍊(NEKO Team)
Add package.json, tsconfig.json, and placeholder src/index.ts for
@uncaged/workflow-agent-docx-diff; append reference in root tsconfig.json.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add package.json, tsconfig.json, and placeholder src/index.ts for the
@uncaged/workflow-template-document package; register it in root tsconfig.json references.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All cas subcommands now output JSON via writeJson(), consistent with
other uwf commands. schema list includes meta-schema. Removed --json
flag and --format tree (tree is human-only, not machine-friendly).
Refs #319
小橘 🍊(NEKO Team)
- Remove CAS ref string support from workflow YAML outputSchema
- Simplify validate.ts: no string check for outputSchema
- Auto-set title from role name (workflow.role format)
Refs #319
小橘 🍊(NEKO Team)
When uwf workflow put processes inline JSON Schema for a role,
auto-inject title=roleName if not already set. Makes uwf cas schema list
show meaningful names like 'planner', 'coder' instead of (unnamed).
小橘 🍊(NEKO Team)
- Remove uwf cas list (CAS grows unbounded, listing all hashes is useless)
- Add title to Workflow/StartNode/StepNode schemas so schema list shows names
小橘 🍊(NEKO Team)
1. threads.yaml race condition: reload threads index after agent subprocess
completes before updating head pointer (cli-uwf/commands/thread.ts)
2. evaluateJsonata not awaited: jsonata evaluate() returns Promise for async
expressions — now properly awaited (uwf-moderator/evaluate.ts)
3. resolveWorkflowHash dead code: function always returns a value, removed
impossible null return type and dead null-check branches at call sites
(cli-uwf/store.ts, commands/thread.ts, commands/workflow.ts)
The agent subprocess writes StepNode to CAS on disk, but the parent
process had an in-memory cache from createFsStore init. Fix: re-create
store after agent spawn to pick up new nodes.
Also centralized JSON Schemas in uwf-protocol so cli-uwf and agent-kit
produce identical type hashes.
E2E smoke test passing: workflow put → thread start → 3x step → done
Refs #309
- thread start: ULID generation, StartNode to CAS, threads.yaml
- thread show: active (done:false) or archived (done:true)
- thread list: active threads, --all includes history
- thread kill: archive to history.jsonl
Refs #309, #313
Refactor dashboard graph/schema helpers and descriptor role validation
into smaller functions so bun run check passes without warnings.
Co-authored-by: Cursor <cursoragent@cursor.com>
Make AgentFn<Opt> always take a mandatory options argument, removing
the void conditional overload. Simplify createAgentAdapter, restore
exports needed by tests, and fix CLI test bundles to use cas.put
instead of disallowed @uncaged/* imports.
Co-authored-by: Cursor <cursoragent@cursor.com>
Now that bundles are fully self-contained (no external @uncaged/* imports),
the symlink mechanism is no longer needed.
- Delete ensure-uncaged-workflow-symlink.ts
- Remove ensureUncagedWorkflowSymlink from all imports/exports
- Remove ExtractBundleExportsOptions type (storageRoot param)
- Simplify extractBundleExports to single-arg signature
- Clean up stale comments
- Remove uncagedWorkflowExternals() from scaffold build script
- Remove --external from Bun.build config
- Tighten bundle validator: only Node built-ins allowed, all deps must be inlined
- Update skill.ts documentation
Bundles are now deterministic — same Node/Bun version = same behavior,
no dependency resolution at runtime.
Bundles must run without env vars — env vars are overrides, not requirements.
Single function: env(name, fallback) always returns string with a default.
- Removed requireEnv and optionalEnv
- Updated bundle entries, tests, and skill docs
小橘 🍊
requireEnv causes silent worker crash when env vars are missing —
thread shows 0 steps with no error. Use optionalEnv + sensible defaults.
Also added pitfall guidance in skill author docs.
小橘 🍊
The previous computeLayers used a reachability-based relation (a « b)
with depth tiebreaker to define a < b, then grouped nodes by == into
equivalence classes. However == is NOT an equivalence relation (fails
transitivity), making the grouping order-dependent and incorrect for
graphs with parallel branches.
Replace with standard Sugiyama longest-path layering:
1. DFS to detect and remove back-edges (break cycles)
2. Kahn's topological sort on the resulting DAG
3. rank(n) = max(rank(pred) + 1) for longest-path assignment
4. Group nodes by rank into layers
Also removes the experimental dagre layout strategy that was added
for comparison — longest-path produces better results for our
workflow graphs.
Replace linear spine walk with a proper partial order:
- a « b = a ~> b AND NOT b ~> a (strict precedence)
- a ~ b = incomparable under «
- depth tiebreaker for incomparable nodes
- Equivalent nodes (same layer) placed side-by-side horizontally
- scripts/publish-all.mjs: pins workspace:^ before npm publish, restores after
- Workaround for bun publish workspace:^ resolution bug in pre mode
小橘 🍊
- Add systemPrompt to WorkflowRoleDescriptor (protocol)
- Propagate systemPrompt through buildDescriptor and validateWorkflowDescriptor
- Display system prompt as collapsible <details> in RoleCard
Two fixes:
1. cmdThreadRemove: always call both removeThreadEntry and
removeThreadHistoryEntries regardless of resolved source,
preventing race where thread moves from active to history
between resolve and delete.
2. Test: add waitUntilRunningFileAbsent before thread show/rm,
matching the pattern used by adjacent test cases.
Verified 5x consecutive runs with 0 failures.
Closes#265
- Workflow list is now a simple clickable list (no expand/collapse)
- Clicking a workflow navigates to dedicated detail page (#client/workflows/name)
- Detail page: fixed graph sidebar on left, scrollable role cards on right
- Back button returns to workflow list
- Route: added workflowName to hash routing
- Nested object: expand properties with └─ indentation
- object[]: show type as 'object[]', expand items.properties
- string[]/number[]: show type directly, no expansion
- oneOf/discriminatedUnion: variant headers with ├/└ connectors
- Auto-detect discriminator field (const values)
- Skip discriminator in variant field list
- Recursive to arbitrary depth
Phase 1+2 of #258
Left sidebar: compact workflow graph with clickable nodes for navigation.
Right panel: workflow overview card + per-role cards with meta schema tables.
Clicking a node in the graph scrolls to the corresponding role card.
All nodes are lit in static view (workflow definition, not a thread).
- Reduce feedback edge offset (140→80) for tighter layout
- Terminal nodes (start/end) now clickable when lit
- Unlit nodes have no cursor-pointer and ignore clicks
- Role nodes cycle through occurrences on repeated clicks
- Start node click scrolls to thread prompt
- End node click scrolls to bottom
- All records get data-record-index for scroll targeting
Ref: #247
What: Feedback (back) edges now connect from the left/right side of nodes
instead of top/bottom, making the routing visually clearer.
Changes:
- role-node.tsx: add Left/Right handles for feedback edge connections
- use-layout.ts: set sourceHandle/targetHandle for feedback edges
- condition-edge.tsx + use-layout.ts: increase FEEDBACK_OFFSET_X to 140
Ref: #247
What: Feedback (back) edges now alternate between left and right sides
instead of all routing to the right.
Why: Multiple feedback edges targeting the same node (e.g. reviewer→coder
and tester→coder) were overlapping on the right side.
Changes:
- types.ts: add feedbackSide field to ConditionEdgeData
- use-layout.ts: track feedback count per target, alternate sides
- condition-edge.tsx: feedbackPath() accepts side param, mirrors path for left
Ref: #247, closes#249
- Remove llmProvider and workspace from CursorAgentConfig (now just command/model/timeout)
- extractWorkspacePath uses runtime.extract + runtime.cas instead of standalone reactor
- TextProducerFn signature gains runtime parameter: (ctx, prompt, runtime)
- develop-entry.ts hardcodes cursor-agent path, no more env var dependency
- Drop @uncaged/workflow-reactor dep from workflow-agent-cursor
- Update tests for simplified config
小橘 <xiaoju@shazhou.work>
- wrangler.toml: keep first migration as AgentSocket (already applied),
second migration handles the rename
- index.ts: pathAfterAgent → pathAfterClient
- WS client receives app.fetch function instead of localPort
- Gateway mode no longer starts a local HTTP server
- Local-only mode (no secret) still starts HTTP server as before
- Removes unnecessary HTTP round-trip for gateway requests
- Delete tunnel.ts (startTunnel/cloudflared), rename to gateway.ts
- Remove --no-tunnel, --tunnel-url flags
- ServeOptions: drop noTunnel, tunnelUrl fields
- Two modes: gateway (with WORKFLOW_GATEWAY_SECRET) or local-only
- WS reverse connection is the only gateway transport
- PlannerMeta is now a discriminated union: planned | aborted
- Moderator routes aborted planner → END (no coder invocation)
- System prompt requires absolute workspace path, instructs abort if missing
- extractRefs handles both variants
- Test: 'planner aborted → END'
Signed-off-by: 小橘 <xiaoju@shazhou.work>
Planner, coder, reviewer, and tester system prompts now explicitly
instruct the agent to keep responses short and avoid pasting diffs,
code blocks, or full build logs. This reduces CAS storage and token
waste when downstream roles read the thread.
Signed-off-by: 小橘 <xiaoju@shazhou.work>
- Remove Gitea npm registry, use npmjs.org
- changeset publish works natively with npmjs
- release script: build + test + changeset publish
- Remove custom release.sh, all via changesets
- All @uncaged/* packages → 0.4.0
- Internal deps: workspace:* → workspace:^ (resolves to ^0.4.0 on publish)
- Fix exports: add 'bun' condition for local dev (src), 'import' for consumers (dist)
- Remove stale 'main: src/index.ts' from 6 packages
- Fix publish.sh topo sort to match workspace:^ prefix
星月 <xingyue@shazhou.work>
- Topological sort from publish-all.sh replaces hardcoded order
- bun publish directly (no manual workspace:* replacement/restoration)
- bun run build + bun test (not npm run)
- --dry-run support (skips git commit/push)
- Delete publish-all.sh
- Update package.json scripts
Closes#237
Signed-off-by: 星月 <xingyue@shazhou.work>
- Use bun pm pack for workspace:* resolution (no manual replace/restore)
- Topological sort replaces hardcoded PUBLISH_ORDER
- Registry unified to uncaged org
- Delete scripts/publish-all.sh
- Add --dry-run flag support
Closes#237
What: Replace hardcoded PUBLISH_ORDER with auto-discovery of all
non-private packages, sorted by topological dependency order (Kahn's).
Add a test gate (npm test) after build, before publish.
Why: The manual list was missing workflow-gateway and workflow-agent-react,
causing them to never get published. Any future package additions would
have the same problem.
Changes:
- scripts/publish.sh: Replace static PUBLISH_ORDER array with node script
that reads all packages/*/package.json, filters out private, and
topologically sorts by @uncaged/* internal dependencies
- scripts/publish.sh: Add npm test step between build and publish,
aborting on failure
What: Replace ELK layout engine with a hand-written spine layout that
topologically sorts nodes into a vertical main path with feedback edges
routed to the right side.
Why: ELK's layered algorithm spreads the graph too wide when handling
feedback (back) edges, causing fitView to shrink nodes until text is
unreadable. Our workflow graphs are predominantly linear pipelines with
feedback loops — a custom layout handles this topology much better.
Changes:
- packages/workflow-dashboard/src/components/workflow-graph/use-layout.ts:
rewrite from async ELK to synchronous spine layout — topo-sort extracts
main path, nodes stack vertically, feedback edges get right-side routing
- packages/workflow-dashboard/src/components/workflow-graph/condition-edge.tsx:
add custom SVG path for feedback edges (right-side arc with Q curves),
use typed isFeedback/isSelfLoop fields from ConditionEdgeData
- packages/workflow-dashboard/src/components/workflow-graph/types.ts:
rename elkLabelX/Y to labelX/Y, add isFeedback and isSelfLoop fields
- packages/workflow-dashboard/src/components/workflow-graph/workflow-graph.tsx:
remove ReactFlowProvider/useReactFlow/useEffect fitView workaround
(no longer needed — layout is synchronous), simplify component
- packages/workflow-dashboard/package.json: remove elkjs and dagre deps
The actual issue was that 'files' only included dist/, so src/ was
excluded from the published package. bun can run .ts natively — no
need to point bin at compiled dist/cli.js.
Fix: add 'src' to files array so it ships with the package.
Packages without exports.types pointed main/types to src/ which
doesn't exist in published tarballs. Now all packages have:
exports."." = { types: dist/index.d.ts, import: src/index.ts }
Bump to 0.3.18.
What: Replace Dagre layout engine with ELK (Eclipse Layout Kernel) for
workflow graph visualization in the dashboard.
Why: Dagre lacks support for edge label placement and orthogonal edge
routing, causing condition labels to overlap with nodes. ELK provides
proper label positioning, better edge routing, and more compact layouts.
Changes:
- packages/workflow-dashboard/package.json: add elkjs dependency
- packages/workflow-dashboard/src/components/workflow-graph/use-layout.ts:
rewrite layout from Dagre to async ELK with layered algorithm,
orthogonal routing, reduced spacing for compactness
- packages/workflow-dashboard/src/components/workflow-graph/condition-edge.tsx:
use ELK-computed label positions, show all labels including FALLBACK,
switch to getSmoothStepPath for all edges
- packages/workflow-dashboard/src/components/workflow-graph/workflow-graph.tsx:
wrap in ReactFlowProvider, add fitView on async layout change,
key-based remount for layout stability
- packages/workflow-dashboard/src/components/workflow-list.tsx:
left-right layout (info left, graph right), fix toggleExpanded
React 18 batching bug, increase graph container height
Add ModeratorTable syntax, AdapterFn/AdapterBinding types, lazy init
pattern, bundle import restrictions, and descriptor requirements.
Knowledge from smoke test discoveries — these are the most common
mistakes when writing workflow bundles.
小橘 <xiaoju@shazhou.work>
After bumping versions, bun pm pack reads the old bun.lock and resolves
workspace:* to stale versions. Now deletes bun.lock and runs bun install
before the pack loop to ensure correct resolution.
小橘 <xiaoju@shazhou.work>
Root cause: executeThread awaited gen.next() without racing against
the abort signal. When a workflow bundle awaited a long setTimeout
between yields, the engine could not respond to kill until the
Promise resolved — causing the kill test to flake when the thread
completed before kill arrived.
Fix: Promise.race gen.next() with an abort listener so kill takes
effect immediately, even mid-yield. Also move the bundle's delay
to after the first yield (between planner and coder) to ensure the
thread is killable while running.
Closes#209
- Add AgentSocket Durable Object (holds one WS per agent name)
- Add /ws/connect route with GATEWAY_SECRET auth
- Add ws-client.ts with auto-reconnect (exponential backoff 1s-30s)
- serve defaults to WS mode (no cloudflared needed)
- Keep --tunnel-url and --no-tunnel as fallback options
- Endpoints list merges KV heartbeat + DO WebSocket status
Testing: #211
- Add RoleFn<T>, AdapterFn, AdapterBinding types to workflow-protocol
- Mark AgentFn, AgentFnResult, AgentBinding as @deprecated
- Refactor createWorkflow to accept AdapterBinding instead of AgentBinding
- Adapter returns typed meta directly — no more extract call in workflow loop
- Add buildThreadInput (ThreadContext-based), keep buildAgentPrompt as deprecated wrapper
- Update template bundle-entries to wrap AgentFn as AdapterFn
- Update solve-issue tests to use AdapterFn directly
description:"End-to-end walkthrough of uwf CLI. Dogfooding: uwf tests uwf. Each role validates a phase of the CLI surface inside an isolated Docker container."
roles:
bootstrap:
description:"Start Docker container with isolated storage, verify uwf is runnable"
goal:"You are an E2E test runner. Set up an isolated Docker environment and verify basic uwf functionality."
capabilities:
- docker
- shell
procedure:|
1. Start a Docker container with isolated storage:
3. Verify it appears in completed list: `uwf thread list --status completed`
Fork:
4. Fork from the first thread's last step: `uwf step fork <lastStepHash>`
5. Verify fork creates a new thread with a different ID
Logs:
6. `uwf log list` — verify output (may be empty)
7. `uwf log show --thread <threadId>` — verify runs without error
Report results with summary.
output:"Report test results with summary. Set $status to pass or fail."
frontmatter:
oneOf:
- properties:
$status:{const:"pass"}
containerName:{type:string }
summary:{type:string }
required:[$status, containerName, summary]
- properties:
$status:{const:"fail"}
error:{type:string }
containerName:{type:string }
required:[$status, error, containerName]
cleanup:
description:"Remove Docker container"
goal:"You are an E2E test runner. Clean up the Docker container used for testing."
capabilities:
- docker
- shell
procedure:|
Remove the Docker container (containerName is in your prompt):
1. `docker rm -f <containerName>`
2. Verify the container is gone: `docker ps -a --filter name=<containerName> --format '{{.Names}}'` should return empty
Report cleanup result.
output:"Report cleanup result. Set $status to pass or fail."
frontmatter:
oneOf:
- properties:
$status:{const:"pass"}
summary:{type:string }
required:[$status, summary]
- properties:
$status:{const:"fail"}
error:{type:string }
required:[$status, error]
graph:
$START:
_:{role:"bootstrap", prompt:"Set up the Docker container and verify uwf is runnable."}
bootstrap:
pass:{role:"config-and-registry", prompt:"Container {{{containerName}}} is ready. Validate config and workflow registration."}
fail:{role:"$END", prompt:"Bootstrap failed: {{{error}}}. No container was created."}
config-and-registry:
pass:{role:"thread-ops", prompt:"Config and registry OK. Workflow '{{{workflowName}}}' registered. Container: {{{containerName}}}. Now test thread operations."}
fail:{role:"cleanup", prompt:"Config/registry failed: {{{error}}}. Clean up container {{{containerName}}}."}
thread-ops:
pass:{role:"inspect", prompt:"Thread ops OK. threadId={{{threadId}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test inspect operations."}
fail:{role:"cleanup", prompt:"Thread ops failed: {{{error}}}. Clean up container {{{containerName}}}."}
inspect:
pass:{role:"cancel-and-fork", prompt:"Inspect OK. threadId={{{threadId}}}, lastStepHash={{{lastStepHash}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test cancel, fork, and logs."}
fail:{role:"cleanup", prompt:"Inspect failed: {{{error}}}. Clean up container {{{containerName}}}."}
cancel-and-fork:
pass:{role:"cleanup", prompt:"All tests passed! {{{summary}}}. Clean up container {{{containerName}}}."}
fail:{role:"cleanup", prompt:"Cancel/fork failed: {{{error}}}. Clean up container {{{containerName}}}."}
description:"TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
roles:
planner:
description:"Analyzes issue and outputs a TDD test spec"
goal:"You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
capabilities:
- issue-analysis
- planning
procedure:|
On first run (no previous steps):
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
3. Assess whether the issue has enough information to produce a test spec
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
On subsequent runs (bounced back by tester with fix_spec):
1. Read the tester's output from the previous step to understand what's wrong with the spec
2. Revise the test spec accordingly
After producing the test spec:
1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
2. Put the hash in frontmatter.plan (required when $status=ready)
3. Set repoPath to the absolute path of the repository root
output:"Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
frontmatter:
oneOf:
- properties:
$status:{const:"ready"}
plan:{type:string }
repoPath:{type:string }
required:[$status, plan, repoPath]
- properties:
$status:{const:"insufficient_info"}
required:[$status]
developer:
description:"TDD implementation per test spec"
goal:"You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
capabilities:
- coding
procedure:|
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
The repo path and other details are provided in your task prompt.
Before starting any work, set up an isolated worktree:
1. cd into the repo path provided in your task prompt
This monorepo implements a workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file with an XXH64 hash as its version identifier. Shared types live in `@uncaged/workflow-protocol`; bundle authors typically depend on `@uncaged/workflow-runtime`.
This monorepo implements a stateless workflow engine driven by a single-step CLI (`uwf`). Workflows are **YAML definitions** stored as CAS nodes; threads are immutable chains of CAS-linked step nodes. No daemon — each `uwf thread step` invocation runs one moderator→agent→extract cycle and exits.
### Key Terms
| Concept | What it is |
|---------|-----------|
| **Workflow** | A single-file ESM module that exports `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash (Crockford Base32). |
| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
| **Thread** | A single execution of a workflow, identified by a ULID. State lives in CAS (linked nodes); active threads indexed in `threads.json`; completed rows in `history/*.jsonl`. Debug logs use `.info.jsonl`. |
| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. |
| **Workflow** | A YAML definition (`WorkflowPayload`) with roles, status-based routing, and a directed graph. Stored as a CAS node, identified by its XXH64 hash. |
| **Thread** | A single execution of a workflow, identified by a ULID. State is an immutable CAS chain; active threads indexed in `threads.yaml`; completed threads in `history.jsonl`. |
| **Role** | A named actor within a workflow. Each role has a system prompt and a JSON Schema `outputSchema`. |
| **Moderator** | Status-based graph evaluator — determines the next role (or `$END`) with zero LLM cost. |
| **Agent** | An external CLI command (`uwf-hermes`, etc.) spawned by `uwf thread step`. Produces frontmatter markdown output. |
| **CAS** | Content-Addressed Storage via `@uncaged/json-cas` — all workflow definitions, thread nodes, and outputs are immutable CAS nodes. |
| **Registry** | `~/.uncaged/workflow/registry.yaml` — maps workflow names to current CAS hashes. |
### Monorepo Structure
```
workflow/
packages/
workflow-protocol/ # @uncaged/workflow-protocol — shared types + Result
workflow-runtime/ # @uncaged/workflow-runtime — createWorkflow, type re-exports
- One module = one responsibility, filename = purpose
Workflow bundles (`.esm.js`) follow the same rule: export `const run` and `const descriptor`, not `export default`.
### Folder Module Discipline
Every folder under `src/` is a **module boundary**. Four rules:
@@ -135,10 +128,10 @@ export { createCasStore } from "../cas/cas.js";
// ❌ Bad — types defined in index.ts
// in cas/index.ts:
exporttypeCasStore={...};// should be in cas/types.ts
exporttypeCasStore={...};// should be in cas/types.ts
```
**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`, `workflow-as-agent.ts`) are not inside a folder module and follow normal rules.
**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`) are not inside a folder module and follow normal rules.
## Naming
@@ -159,7 +152,7 @@ Workflow names use **verb-first** kebab-case:
| **vitest** | Test runner (`cli-workflow` uses vitest; other packages use `bun test`) |
### Commands
### Development Workflow
```bash
bun run check # tsc --build + biome check
bun run format # biome format --write
bun test# run tests
# ── Setup ──
bun install # install all workspace dependencies
# ── Daily development ──
bun run build # tsc --build (all packages, dependency order)
bun run check # tsc --build + biome check + lint-log-tags
bun run format # biome format --write
bun test# run tests across all packages
# ── Before committing ──
bun run check # must pass — typecheck + lint + log tag validation
bun test# must pass — all package tests
```
### Publishing to Gitea npm Registry
### Publishing
All public `@uncaged/*` packages are published to the Gitea npm registry at `git.shazhou.work`. Workflow workspaces consume packages from this registry via `bunfig.toml`.
All public `@uncaged/*` packages are published to **npmjs.org** with **fixed mode** (all packages share the same version number).
```bash
# Publish all packages (bun pm pack resolves workspace:* → actual versions)
bun run publish:gitea
# 1. Add a changeset describing the change
bun changeset
# Dry run — see what would be published
bun run publish:gitea:dry
# 2. Bump all package versions + generate CHANGELOGs
bun version
# 3. Build, test, and publish (runs scripts/publish-all.mjs)
bun release
# Or publish manually with a tag:
node scripts/publish-all.mjs --tag alpha
node scripts/publish-all.mjs --dry-run # preview without publishing
```
Prerequisites: `.npmrc` in monorepo root with Gitea auth token (`//git.shazhou.work/api/packages/shazhou/npm/:_authToken=<token>`).
-`workspace:^` dependencies resolve to `^x.y.z` on publish
- Publish order defined in `scripts/publish-all.mjs` (dependency order)
- Changesets config: `.changeset/config.json` (fixed mode, public access)
### Workflow Workspace Setup
External workflow repos (e.g. `xingyue-workflows`) use the Gitea registry for `@uncaged/*` packages. Add a `bunfig.toml`:
moderator → agent → extract — one step per invocation, repeat until $END
```
1.**Monorepo changes** → `bun run publish:gitea` (packages auto-discovered from `packages/*/`, topologically sorted, `workspace:*` resolved to real versions)
2.**Workspace**→`bun install` fetches latest from Gitea, `bun install` is safe to run anytime
3.**Build**→ produces single-file ESM bundle with `@uncaged/*` as externals
4.**Register & Run** → `uncaged-workflow workflow add <name> <bundle>` then `uncaged-workflow run <name>`
1.**Author** — write a workflow YAML file with roles, conditions, and graph
2.**Register**—`uwf workflow put <file.yaml>` parses YAML, registers output schemas, stores `WorkflowPayload` in CAS
3.**Run**— `uwf thread start` creates a thread, `uwf thread step` executes one cycle per invocation
-`docs(protocol): update StepNode field descriptions`
## Pull Request Process
1.**Branch** from `main`: `git checkout -b feat/123-short-description`
2.**Implement** your change with tests
3.**Run checks**: `bun run check && bun test`
4.**Commit** with a descriptive message referencing the issue: `Fixes #123`
5.**Push** and open a PR
### PR Description Template
```
## What
What this PR does.
## Why
Why the change is needed.
## Changes
- `path/to/file.ts` — what changed and why
## Ref
Fixes #N
```
## Adding a Changeset
For any user-facing change (feat, fix, breaking change), add a changeset:
```bash
bun changeset
```
This creates a markdown file in `.changeset/` describing the change. It will be consumed on the next release to bump versions and generate CHANGELOG entries.
A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32).
A stateless workflow engine driven by a single-step CLI. Workflows are YAML definitions with roles, status-based routing, and a directed graph. Threads are immutable CAS-linked chains — each `uwf thread step` runs one moderator→agent→extract cycle and exits.
| Concept | Description |
|---------|-------------|
| **Workflow** | A single-file ESM module exporting `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash. |
| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
| **Thread** | A single execution of a workflow, identified by a ULID. CAS-backed chain plus `threads.json` / `history/*.jsonl`; `.info.jsonl` for debug logs. |
| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. Roles live inside template packages (`src/roles/`). |
| **CAS** | Content-Addressed Storage — bundles are immutable and addressed by hash. |
## Overview
## Monorepo Packages
This monorepo implements **uwf**, a workflow engine with no long-running daemon. You register YAML workflow definitions in a content-addressed store (CAS), start a thread with an initial prompt, then invoke `uwf thread step` repeatedly until the moderator routes to `$END`. Each step is a complete process: the moderator evaluates status-based routing to pick the next role, an external agent CLI produces frontmatter markdown output, and an extract pipeline validates or structures that output against the role's JSON Schema.
Workflow state lives entirely on disk under `~/.uncaged/workflow/`: CAS nodes for definitions and step payloads, `registry.yaml` for workflow name→hash mappings, and `threads.yaml` for active thread head pointers. Completed threads are archived to `history.jsonl`. Because there is no server process, workflows are easy to debug, fork, and inspect with ordinary CLI tools.
Agents are pluggable CLI binaries (`uwf-hermes`, `uwf-builtin`, `uwf-claude-code`, or custom commands). The engine spawns the configured agent with `<thread-id>` and `<role>`, sets `UWF_EDGE_PROMPT` from the graph transition, and captures both the agent's markdown output and a detail CAS node for session replay.
## Install
```bash
npm install -g @uncaged/cli-workflow
```
Managed with **bun workspace** using the `workspace:*` protocol.
Requires [Bun](https://bun.sh/) runtime (used internally for TypeScript execution).
See [docs/architecture.md](docs/architecture.md) for the full design — three-phase engine loop, CAS node types, storage layout, agent CLI protocol, and design decisions.
A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32). No daemon — processes start on demand and exit when done.
A stateless workflow engine driven by a single-step CLI. Workflows are YAML definitions stored as CAS nodes; threads are immutable chains of CAS-linked step nodes. No daemon — each `uwf thread step` invocation runs one moderator→agent→extract cycle and exits.
The implementation lives in **15**Bun workspace packages under `packages/`, using the `workspace:*` protocol.
The implementation lives in **5**active packages under `packages/`, plus two external CAS packages (`@uncaged/json-cas`, `@uncaged/json-cas-fs`). Legacy packages reside in `legacy-packages/` and are not part of the active stack.
## Package map
Grouped by responsibility (npm name → folder).
| Layer | Package | One-line role |
|-------|---------|----------------|
| Contract | `@uncaged/workflow-protocol` → `workflow-protocol` | Shared TypeScript types and `Result` helpers; peer`zod` only — no other workspace deps. |
| Author API | `@uncaged/workflow-runtime` → `workflow-runtime` | `createWorkflow` and re-exports of protocol workflow types for bundle authors. |
| Shared infra | `@uncaged/workflow-util` → `workflow-util` | Base32/ULID, logger, storage root paths, global CAS dir, ref-field helpers. |
| LLM plumbing | `@uncaged/workflow-reactor` → `workflow-reactor` | `createLlmFn`, `createThreadReactor`, and related tool-call types for threaded LLM invocation. |
| CAS | `@uncaged/workflow-cas` → `workflow-cas` | `CasStore` implementation, XXH64 hashing, Merkle helpers over CAS payloads. |
-`@uncaged/workflow-template-solve-issue` → `@uncaged/workflow-register`, `@uncaged/workflow-runtime`, `zod` (dev-only workspace deps: `@uncaged/workflow-cas`, `@uncaged/workflow-execute` for tests/tooling per `package.json`)
Workflows are **YAML files** (not ESM bundles). `uwf workflow put <file.yaml>` parses the YAML, registers output schemas as JSON Schema CAS nodes, and stores the `WorkflowPayload` as a CAS node.
## Package roles (detail)
Example (`examples/solve-issue.yaml`):
- **`workflow-protocol`** — Pure types (`WorkflowFn`, contexts, `CasStore` interface, descriptor shapes), `START` / `END`, `ok` / `err`. Depends only on peer `zod` for schema-related types in signatures.
- **`workflow-runtime`** — Workflow author surface: `createWorkflow` from `src/create-workflow.js`, re-exports protocol types/constants used when authoring bundles.
- **`workflow-util-agent`** — Shared prompt assembly and subprocess spawning for CLI agents.
- **`workflow-template-*`** — Concrete `WorkflowDefinition` graphs + Zod role schemas + descriptor builders for publishing bundles.
- **`workflow-dashboard`** — Standalone React UI; no published library entry matching `src/index.ts`.
```yaml
name:"solve-issue"
description:"End-to-end issue resolution"
roles:
planner:
description:"Creates implementation plan"
goal:"You are a planning agent. Analyze the issue and create a step-by-step plan."
capabilities:
- issue-analysis
- planning
procedure:"Analyze the issue and create a detailed, actionable implementation plan."
output:"Output the plan summary and list of concrete steps."
meta:
type:object
properties:
plan:{type:string }
steps:{type: array, items:{type:string } }
required:[plan, steps]
developer:
description:"Implements code changes"
goal:"You are a developer agent. Implement the plan."
capabilities:
- file-edit
- shell
procedure:"Implement the plan. Write code, tests, and ensure existing tests pass."
output:"List all files changed and provide a summary of the implementation."
meta:
type:object
properties:
filesChanged:{type: array, items:{type:string } }
summary:{type:string }
required:[filesChanged, summary]
reviewer:
description:"Reviews code changes"
goal:"You are a code reviewer. Review the implementation."
capabilities:
- code-review
procedure:"Review the implementation against the plan."
output:"Approve or reject with detailed comments."
meta:
type:object
properties:
approved:{type:boolean }
comments:{type:string }
required:[approved, comments]
conditions:
notApproved:
description:"Reviewer rejected the implementation"
expression:"steps[-1].output.approved = false"
graph:
$START:
- role:"planner"
condition:null
planner:
- role:"developer"
condition:null
developer:
- role:"reviewer"
condition:null
reviewer:
- role:"developer"
condition:"notApproved"
- role:"$END"
condition:null
```
Key properties:
- **`roles`** — inline role definitions; each `meta` is a JSON Schema (stored as its own CAS node on registration)
- **`graph`** — `Record<Role | "$START", Record<Status, Target>>` — status-based routing; each role maps statuses to targets
- **No agent binding** — agent selection is a deployment concern, configured in `config.yaml`
- **No Zod** — all schemas are JSON Schema, validated through `@uncaged/json-cas`
## Three-phase engine loop
Each role round is implemented in `packages/workflow-runtime/src/create-workflow.ts` (`advanceOneRound`): moderator → agent → extractor, with progressive context types from `@uncaged/workflow-protocol`.
Each `uwf thread step` runs exactly one cycle: moderator → agent → extract. The CLI orchestrates this in `packages/cli-workflow/src/commands/thread.ts` (`cmdThreadStep`).
- **Moderator is synchronous and pure** — no I/O, no state mutation inside `createWorkflow`’s moderator call path.
- **Agent receives `AgentContext`** — reads `ctx.currentRole.systemPrompt`; raw output becomes `agentContent` for extract.
- **Extractor is `WorkflowRuntime.extract`** — supplied by the engine from registry-resolved LLM config (`workflow-execute`); stores agent body in CAS and yields `contentHash` + `refs` on each step (`create-workflow.ts`).
- **`extractPrompt` is a call parameter** on `RoleDefinition`, not implicit context state.
- **Moderator** — pure status-based map lookup; no LLM call, no I/O beyond CAS reads. Looks up `graph[lastRole][lastOutput.status]` to get the next target.
- **Agent** — receives `AgentContext` with thread history + rolesystem prompt + output format instruction. Raw output is frontmatter markdown.
- **Extractor** — two-layer: tries frontmatter fast-path first (zero LLM cost), falls back to LLM extract if frontmatter is absent or invalid.
- **Stateless** — each `uwf thread step` is an atomic, self-contained operation. No in-memory state between steps.
## Agent information sources
## Agent CLI protocol
An agent has exactly three information sources:
Each agent is an external command invoked by `uwf thread step`:
No hidden environment parameters. If an agent needs something (like a workspace path), it obtains it via `ExtractFn` (e.g. Cursor agent).
## Bundle contract
A workflow bundle is a single `.esm.js` file with two named exports (see `WorkflowFn` / `WorkflowDescriptor` in `packages/workflow-protocol/src/types.ts`):
```typescript
exportconstdescriptor: WorkflowDescriptor;
exportconstrun: WorkflowFn;
typeWorkflowFn=(
thread: ThreadContext,
runtime: WorkflowRuntime,
)=>AsyncGenerator<RoleOutput,WorkflowCompletion>;
```bash
<agent-cmd> <thread-id> <role>
```
`RoleOutput` carries `contentHash`, `meta`, and `refs` (agent text lives in CAS, addressed by hash).
Contract:
1.`uwf thread step` determines the next role via the moderator
2. Agent CLI is spawned with `(thread-id, role)` as positional args
3.`workflow-util-agent` (`createAgent`) handles the boilerplate:
- Parses argv
- Loads `.env` from storage root
- Builds `AgentContext` by walking the CAS chain from `threads.yaml` head
- Resolves the role's `meta` schema and builds `outputFormatInstruction`
- Calls the agent's `run` function
- Runs two-layer extract on the raw output
- Writes `StepNode` to CAS (output + detail + prev link)
- Prints the new `StepNode` CAS hash to stdout
4.`uwf thread step` reads stdout, updates `threads.yaml` head pointer, re-evaluates moderator for `done`
Agents produce **frontmatter markdown** — YAML frontmatter for structured meta, followed by a markdown body for content:
- Each `yield` lets `workflow-execute` persist state, CAS rows, and enforce pause/abort
-`return` supplies `WorkflowCompletion`
- Fork replays historical steps into a new thread context
- Bundle does not import the engine — only protocol/runtime types at build time
```markdown
---
status: done
next: reviewer
confidence: 0.9
artifacts:
- src/auth.ts
scope: role
---
## Implementation
Fixed the login redirect by updating the auth middleware...
```
The `outputFormatInstruction` (built by `buildOutputFormatInstruction` in `workflow-util-agent`) is prepended to the role's system prompt, so the deliverable format is the first thing the agent sees. It lists the expected frontmatter fields derived from the role's `meta` JSON Schema.
## Two-layer extract
Structured output extraction uses a two-layer strategy (`workflow-util-agent`):
### Layer 1: frontmatter fast path (`frontmatter.ts`)
1. Parse YAML frontmatter from raw agent output (`parseFrontmatterMarkdown`)
3. Build a candidate object from frontmatter fields (`status`, `next`, `confidence`, `artifacts`, `scope`)
4.`store.put()` the candidate against the role's `meta` schema
5. Validate with `json-cas` schema validation
6. If valid → return `outputHash` (zero LLM cost)
### Layer 2: LLM extract fallback (`extract.ts`)
If the fast path returns `null` (no frontmatter, invalid, or doesn't satisfy schema):
1. Resolve extract model alias from config (`modelOverrides.extract` → `models.extract` → `defaultModel`)
2. Call OpenAI-compatible chat completion with JSON mode
3. System prompt: "Extract structured data matching this JSON Schema: ..."
4. User message: the raw agent output
5. Parse response, `store.put()`, validate
6. Return `outputHash`
## Prompt injection
`workflow-util-agent` prepends two pieces of context to the agent's system prompt:
1.**Deliverable format instruction** — generated from the role's `meta` schema, tells the agent exactly what frontmatter fields to produce and the expected format
2.**Scope constraint** — "Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope."
This ensures agents produce parseable frontmatter output without requiring per-agent format knowledge.
## CAS node types
### Workflow
```yaml
type:<workflow-schema-hash>
payload:
name:"solve-issue"
description:"End-to-end issue resolution"
roles:
planner:
description:"Creates implementation plan"
goal:"You are a planning agent..."
capabilities:[planning, issue-analysis]
procedure:"Analyze the issue and create a plan."
output:"Output the plan summary."
meta:"5GWKR8TN1V3JA"# cas_ref → JSON Schema node
conditions:
notApproved:
description:"Reviewer rejected"
expression:"steps[-1].output.approved = false"
graph:
$START:
- role:"planner"
condition:null
```
### StartNode
```yaml
type:<start-node-schema-hash>
payload:
workflow:"4KNM2PXR3B1QW"# cas_ref → Workflow
prompt:"Fix the login bug..."
```
### StepNode
```yaml
type:<step-node-schema-hash>
payload:
start:"4TNVW8KR2B3MA"# cas_ref → StartNode
prev:"2MXBG6PN4A8JR"# cas_ref → previous StepNode (null for first step)
role:"developer"
output:"9KRVW3TN5F1QA"# cas_ref → structured output (validated against meta schema)
Managed by `@uncaged/workflow-register` (`readWorkflowRegistry`, `writeWorkflowRegistry`, …). Shape includes workflow entries and a top-level `config` section used for extract/supervisor model resolution.
```yaml
providers:
openrouter:
baseUrl:"https://openrouter.ai/api/v1"
apiKey:"sk-..."
### Thread storage (CAS + index)
models:
sonnet:
provider:"openrouter"
name:"anthropic/claude-sonnet-4"
gpt4o-mini:
provider:"openai"
name:"gpt-4o-mini"
Thread execution state is a chain of immutable CAS nodes (`StartNode`, `StateNode`, content Merkle blobs). Per bundle:
agents:
hermes:
command:"uwf-hermes"
args:[]
cursor:
command:"uwf-cursor"
args:[]
- **`threads.json`** — only in-flight threads (`head`, `start`, `updatedAt`).
| **YAML workflow definitions** | Human-readable, versionable, no build step required. JSON Schema inline in YAML, registered as CAS nodes on `workflow put`. |
| **Stateless single-step CLI** | Each `uwf thread step` is atomic — no in-memory state, no daemon, no long-running process. OS handles lifecycle. |
| **CAS-backed thread state** | Immutable linked nodes enable fork, replay, and GC without copying data. Content-addressed deduplication across threads. |
| **Status-based moderator** | Status-based map routing — `graph[role][status]` lookup against last output. No LLM cost for routing decisions. |
| **Frontmatter markdown output** | Agents produce structured meta (YAML frontmatter) alongside free-form content (markdown body). Enables zero-cost extraction when frontmatter is well-formed. |
| **Two-layer extract** | Fast path avoids LLM calls when agents follow the format; LLM fallback handles messy output gracefully. |
| **Prompt injection for format** | Output format instruction prepended to system prompt ensures agents produce parseable output without per-agent configuration. |
| **JSON Schema (not Zod)** | Schemas are CAS-native data — storable, hashable, validatable through `json-cas`. No code generation, no runtime library dependency. |
| **Agent as external command** | Agents are independent CLI binaries (`uwf-hermes`, `uwf-cursor`). Swappable per workflow/role via config. No tight coupling to the engine. |
| **No daemon** | Process starts, does one step, exits. Simpler failure model, no connection management. |
description:"TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
roles:
planner:
description:"Analyzes issue and outputs a TDD test spec"
goal:"You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
capabilities:
- issue-analysis
- planning
procedure:|
On first run (no previous steps):
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
3. Assess whether the issue has enough information to produce a test spec
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
On subsequent runs (bounced back by tester with fix_spec):
1. Read the tester's output from the previous step to understand what's wrong with the spec
2. Revise the test spec accordingly
After producing the test spec:
1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
2. Put the hash in frontmatter.plan (required when $status=ready)
3. Set repoPath to the absolute path of the repository root
output:"Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
frontmatter:
oneOf:
- properties:
$status:{const:"ready"}
plan:{type:string }
repoPath:{type:string }
required:[$status, plan, repoPath]
- properties:
$status:{const:"insufficient_info"}
required:[$status]
developer:
description:"TDD implementation per test spec"
goal:"You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
capabilities:
- coding
procedure:|
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
The repo path and other details are provided in your task prompt.
Before starting any work, set up an isolated worktree:
1. cd into the repo path provided in your task prompt
Command-line interface for the Uncaged workflow engine (`uncaged-workflow`).
The CLI reads and writes the workflow registry, starts and inspects threads, manages CAS blobs, and prints agent-oriented reference docs via `skill`. It uses the same storage layout as `@uncaged/workflow` (default `~/.uncaged/workflow`).
## Install
```bash
bun add @uncaged/cli-workflow
```
In this monorepo: `"@uncaged/cli-workflow": "workspace:*"`. Depends on `"@uncaged/workflow": "workspace:*"`.
## Usage
```bash
uncaged-workflow workflow list
uncaged-workflow run <name> --prompt "Your task"
uncaged-workflow thread show <id>
uncaged-workflow skill
```
Invoking the CLI with no command (or from this repo: `bun packages/cli-workflow/src/cli.ts`) prints:
```
uncaged-workflow — workflow engine CLI
Workflow registry:
workflow add <name> <file.esm.js> [--types <path>] Register a workflow bundle in the registry
workflow list List all registered workflows
workflow show <name> Show details of a registered workflow
workflow rm <name> Remove a workflow from the registry
workflow history <name> Show version history of a workflow
workflow rollback <name> [hash] Rollback a workflow to a previous version
Thread execution:
thread run <name> [--prompt <text>] [--max-rounds N] Start a new thread executing a workflow
thread list [name] List threads, optionally filtered by workflow name
thread show <id> Show thread details and state
thread rm <id> Remove a thread
thread fork <thread-id> [--from-role <role>] Fork a thread, optionally from a specific role
thread ps List running threads
thread kill <thread-id> Kill a running thread
thread live <thread-id> | --latest [--debug] [--role <name>] Attach to a thread and stream output live
thread pause <thread-id> Pause a running thread
thread resume <thread-id> Resume a paused thread
Content-addressable storage:
cas get <hash> Retrieve content by hash from CAS
cas put <content> Store content in CAS, prints hash
cas list List all hashes in CAS
cas rm <hash> Remove a CAS entry by hash
cas gc Garbage-collect unreferenced CAS entries
Development:
init workspace <name> Initialize a new workflow workspace
init template <name> Initialize a new workflow template
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.