Compare commits

...

101 Commits

Author SHA1 Message Date
xingyue 44147da419 fix(builtin): split prompt into system/user messages
System message = agent identity (role prompt + output format instruction)
User message = moderator speech (task + edge prompt + history)

This reflects the workflow's core model: moderator speaks to agent
via the graph's edge prompt. Previously all content was in a single
system message with no user message, causing Claude API 400 errors.

- buildBuiltinPrompt now returns { system, user } instead of string
- agent.ts sends system + user as separate messages
- Tests updated accordingly
2026-05-23 17:15:23 +08:00
xiaoju 0e5b494e12 chore(debate): remove round limit, let step control drive pacing 2026-05-23 08:31:07 +00:00
xiaomo 747b318cc5 Merge pull request 'feat(claude-code): enrich step details with per-turn breakdown' (#423) from feat/422-claude-code-detail-enrichment into main 2026-05-23 08:19:20 +00:00
xiaoju d16ce44bc3 feat(claude-code): enrich step details with per-turn breakdown
Switch from --output-format json to stream-json --verbose to capture
per-turn data. Detail now includes:
- model name
- usage (input/output/cache tokens)
- stopReason
- turns[] as individual CAS nodes with role, content, tool calls

Also addresses PR #421 review fixes:
- sessionId guard: skip cache write when sessionId is empty/undefined
- silent catch: log resume failures with debug tag 5VKR8N3Q
- atomic write: session cache uses temp+rename for crash safety

Closes #422
2026-05-23 08:16:47 +00:00
xiaomo 45122bc458 Merge pull request 'fix: disable hermes resume, add claude-code resume support, debate workflow' (#421) from test/418-resume-e2e-repro into main 2026-05-23 07:59:53 +00:00
xiaomo 3183b4c879 Merge pull request 'feat: add @uncaged/workflow-agent-builtin package' (#420) from feat/builtin-agent into main 2026-05-23 07:57:44 +00:00
xiaoju 03eacbabb2 feat: add debate workflow for resume integration testing
Two-role debate (against/for) with up to 3 rounds per side.
Each role re-enters with session resume, making this an ideal
integration test for cross-process session continuity.

Supports early termination via concession (conceded=true in frontmatter).

Refs #418
2026-05-23 07:50:38 +00:00
xingyue cef4db9a87 refactor: remove workspace path sandbox and shell gate
- Replace resolvePathInWorkspace with simple resolvePath (no boundary check)
- Remove UWF_BUILTIN_ALLOW_SHELL env gate from run_command
- Update tests accordingly

Per review: sandbox was false security with shell=true, and path
restrictions are unnecessary for a trusted agent environment.
2026-05-23 15:50:30 +08:00
xiaoju 1afaeacd57 feat: extract session cache to agent-kit, add resume to claude-code agent
Move getCachedSessionId/setCachedSessionId from workflow-agent-hermes
into workflow-agent-kit so all agent adapters can share the same
session cache logic.

Add cross-process session resume to workflow-agent-claude-code:
on re-entry (isFirstVisit=false), look up the cached sessionId and
use 'claude --resume' to continue with full conversation history.

Cache file renamed from hermes-sessions.json to agent-sessions.json
to reflect its shared nature.

Refs #418
2026-05-23 07:44:02 +00:00
xingyue deac2336b6 feat: add @uncaged/workflow-agent-builtin package
Built-in role agent that uses workflow config models directly,
with its own tool-calling run loop. No external agent dependency.

- OpenAI-compatible chat completion client with tool_calls support
- P0 toolkit: read_file, write_file, run_command
- Integrates via createAgent factory from workflow-agent-kit
- CAS detail recording for each turn
- Path sandboxing and shell opt-in (UWF_BUILTIN_ALLOW_SHELL)
2026-05-23 15:29:55 +08:00
xiaoju aad2792754 fix(hermes): disable ACP session/resume by default
Hermes ACP _restore fails for custom providers — resolve_runtime_provider
throws and base_url/api_mode are lost, causing resume to silently create a
new session with no history. Prompt then returns empty text or refusal.

Disable resume by default. Set UWF_HERMES_RESUME=1 to opt back in.

Includes investigation notes in docs/investigations/.

Refs #418
2026-05-23 07:23:14 +00:00
scottwei 10642fdc45 Merge pull request 'test: failing e2e test for session resume bug (#418)' (#419) from test/418-resume-e2e-repro into main
Reviewed-on: #419
2026-05-23 06:49:54 +00:00
xiaomo 92020d2d78 Merge pull request 'docs: sync cli-reference with recent CLI additions' (#417) from chore/update-cli-reference into main 2026-05-23 06:48:20 +00:00
xiaomo cd0a79d72b chore: remove accidental pnpm-lock.yaml 2026-05-23 06:47:25 +00:00
xiaoju 3b6aa6525f test: add failing e2e test for session resume bug (#418)
Cross-process resume returns empty text on subsequent prompt.
This test documents the bug — expected to fail until #418 is fixed.
2026-05-23 06:43:47 +00:00
xiaomo 54631c43c7 docs: update cli-reference with log commands, --count flag, edge prompt concept 2026-05-23 06:32:27 +00:00
xiaomo 655b57c4b5 Merge pull request 'feat: add uwf log subcommands (list, show, clean)' (#415) from fix/413-log-subcommands into main 2026-05-23 06:27:15 +00:00
xiaoju 7faa8184ae feat: add uwf log subcommands (list, show, clean)
- uwf log list: list log files with sizes
- uwf log show --thread <id>: filter by thread ID
- uwf log show --process <pid>: filter by process ID
- uwf log clean --before <date>: delete old log files
- Tests: 12 new tests covering all subcommands

Implemented by solve-issue workflow, biome fixes applied manually.

Closes #413
Refs #411, #410
2026-05-23 06:23:56 +00:00
xiaoju 816137315e feat: add uwf log subcommands (list, show, clean)
- cmdLogList: list log files with sizes, sorted by date descending
- cmdLogShow: filter entries by thread, process, and/or date
- cmdLogClean: delete log files older than given date
- 12 tests covering all functions and edge cases

Fixes #413
2026-05-23 06:21:06 +00:00
xiaoju 9a111d16c7 fix: invalid Crockford Base32 char 'L' in log tag PL_AGENT_DONE
Fixes runtime crash on uwf thread step.
2026-05-23 06:13:29 +00:00
xiaoju ea6ceafe51 merge: resolve conflict in process-logger test (use null 3rd arg) 2026-05-23 06:10:53 +00:00
xiaoju d0dc7b5a19 feat: add process-level debug logger (Phase 1)
- New ProcessLogger in workflow-util: process-scoped JSONL logger
- Entry schema: {ts, pid, tag, msg, thread, workflow}
- Storage: ~/.uncaged/workflow/logs/YYYY-MM-DD.jsonl
- Auto logs process init info (argv, node version, context)
- cli-workflow thread commands fully instrumented:
  - thread start/step, moderator evaluate, agent spawn/done
  - thread archived, error paths

Refs #411, #412, #410
2026-05-23 06:10:05 +00:00
xiaomo 3b81521e9d Merge pull request 'feat: add process-level debug logger (Phase 1)' (#414) from feat/411-process-logger into main 2026-05-23 06:09:15 +00:00
xiaoju aa0a23293f feat: add process-level debug logger (Phase 1)
- New ProcessLogger in workflow-util: process-scoped JSONL logger
- Entry schema: {ts, pid, tag, msg, thread, workflow}
- Storage: ~/.uncaged/workflow/logs/YYYY-MM-DD.jsonl
- Auto logs process init info (argv, node version, context)
- cli-workflow thread commands fully instrumented:
  - thread start/step, moderator evaluate, agent spawn/done
  - thread archived, error paths

Refs #411, #412, #410
2026-05-23 06:07:45 +00:00
xiaomo 187dd036e5 Merge pull request 'feat: replace edgePrompt null check with isFirstVisit (Phase 2)' (#409) from feat/405-phase2-find-last-role-index into main 2026-05-23 04:55:23 +00:00
xiaoju 4b45f4e6d1 feat: replace edgePrompt null check with isFirstVisit (Phase 2)
- Add isFirstVisit: boolean to AgentContext
- Compute from steps history: !steps.some(s => s.role === role)
- hermes.ts: use isFirstVisit for first-entry vs re-entry logic
- buildInitialPrompt: always append edgePrompt as Moderator Instruction
- edgePrompt is never blanked — always the real moderator instruction
- New tests for first-visit, re-entry, and fallback scenarios

Refs #405, #407, #404
2026-05-23 04:54:11 +00:00
xiaomo 2a6bce4918 Merge pull request 'feat: make edge prompt required (Phase 1)' (#408) from feat/405-edge-prompt-required into main 2026-05-23 04:36:53 +00:00
xiaoju 3d6399c0e3 feat: make edge prompt required (Phase 1)
- Transition.prompt: string | null → string
- EvaluateResult.prompt: string | null → string
- AgentContext.edgePrompt: string | null → string
- CLI YAML validation rejects missing prompt
- All tests updated

Phase 2 will replace edgePrompt === null checks with findLastRoleIndex.

Refs #405, #406, #404
2026-05-23 04:28:58 +00:00
xiaomo b9258f84a5 Merge pull request 'feat: edge prompt + session resume (#402)' (#403) from feat/402-edge-prompt-session-resume into main 2026-05-23 04:00:24 +00:00
xiaoju 638329a562 feat: edge prompt + session resume implementation (#402)
- buildContinuationPrompt: incremental prompt for role re-entry
- buildHermesPrompt: dual-mode (initial vs continuation)
- session-cache: thread:role → hermes sessionId mapping
- HermesAcpClient.resume(): session/resume JSON-RPC
- Fallback: cache miss or resume fail → initial prompt
- UWF_NO_RESUME env to skip cache
- solve-issue.yaml: reviewer→developer edge prompt
- Tests updated for EvaluateResult + continuation prompt

Refs #402
2026-05-23 03:57:04 +00:00
xiaoju 1a06e014f5 feat(protocol): add edge prompt to Transition + EvaluateResult (#402)
- Transition type gains prompt: string | null
- evaluate() returns EvaluateResult { role, prompt } instead of string
- normalizeGraph coerces prompt: undefined → null
- spawnAgent passes edge prompt via UWF_EDGE_PROMPT env
- AgentContext gains edgePrompt field

Refs #402
2026-05-23 03:49:15 +00:00
xiaoju d5d05334f5 fix: ACP client permission handling and process cleanup
Two bugs fixed:
1. request_permission messages (JSON-RPC requests with both id+method) were
   silently swallowed by the response handler, causing hermes to hang waiting
   for permission approval. Now properly distinguish responses (id only) from
   server requests (id+method).
2. uwf-hermes process never exited after completing because the hermes ACP
   subprocess was still alive. Now explicitly close the ACP client after
   agent completion so the subprocess terminates.

小橘 <xiaoju@shazhou.work>
2026-05-22 14:51:43 +00:00
xiaoju 844f5438fe fix: replace @agentclientprotocol/sdk with readline-based JSON-RPC
The official TS SDK's ndJsonStream hangs indefinitely on prompt()
for sessions with 20+ messages (solve-issue planner). Root cause
appears to be a stream backpressure issue in the SDK's ReadableStream
adapter.

Switch back to readline-based line parsing which reliably receives
all JSON-RPC responses. Also handle session/request_permission
inline (auto-approve, yolo mode equivalent).

Ref #398
2026-05-22 14:34:27 +00:00
xiaomo e329d74ec0 Merge pull request 'refactor: migrate hermes agent from stdout parsing to ACP protocol' (#401) from feat/398-hermes-acp-client into main 2026-05-22 13:16:46 +00:00
xiaoju f90614a622 feat: collect structured turns from ACP session updates
UwfAcpClient now tracks all session/update events:
- agent_message_chunk → assistant message content
- agent_thought_chunk → assistant reasoning
- tool_call → pending tool invocation (name + rawInput)
- tool_call_update (completed/failed) → assistant tool_call + tool result

Messages are accumulated across prompts (same session) and stored
via storeHermesSessionDetail, restoring the full structured detail
(turns with tool calls, reasoning) that was lost in the initial ACP
migration.

Ref #398
2026-05-22 13:13:02 +00:00
xiaoju 68af555313 fix: share ACP client across run/continue for session continuity
The client is now created once in createHermesAgent() and shared by
runHermes and continueHermes closures. This preserves conversation
context during frontmatter retry loops — continue() sends a follow-up
prompt on the same ACP session instead of starting a new one.

Client is cleaned up via process.on('exit').

Ref #398
2026-05-22 13:06:14 +00:00
xiaoju 025695dbe9 refactor: use @agentclientprotocol/sdk instead of hand-rolled JSON-RPC
Replace 250-line custom ACP client with official TypeScript SDK.
Uses ClientSideConnection + ndJsonStream for stdio transport.
Same public API (connect/prompt/close), 115 lines, zero custom protocol code.

Ref #398
2026-05-22 12:58:55 +00:00
xiaoju 96584e481f refactor: replace spawnHermes with HermesAcpClient
Remove spawnHermes, spawnHermesChat, spawnHermesResume, parseSessionId,
and buildResultFromSession. runHermes and continueHermes now use
HermesAcpClient for structured JSON-RPC communication.

Session ID comes directly from ACP protocol, eliminating #380 race
condition. Agent output collected via streaming chunks instead of
session file loading.

Phase 2 of RFC #398
Fixes #380
2026-05-22 12:18:14 +00:00
xiaoju 766ec7ddc2 feat: add HermesAcpClient for structured ACP communication
Implements JSON-RPC client that communicates with `hermes acp` via
stdin/stdout. Replaces fragile stdout/stderr parsing with structured
protocol: initialize → session/new → session/prompt → collect chunks.

Session ID comes directly from protocol response, eliminating the
race condition in #380.

Phase 1 of RFC #398
2026-05-22 12:15:09 +00:00
xiaoju aeb7180e9d chore: fix meta.plan → frontmatter.plan in workflow procedures
小橘 <xiaoju@shazhou.work>
2026-05-22 11:22:34 +00:00
xiaomo 9b56f7b75e Merge pull request 'fix: add git worktree hygiene to solve-issue workflow' (#397) from fix/395-worktree-hygiene into main 2026-05-22 11:20:58 +00:00
xiaoju c9507b8dc1 fix: add git worktree hygiene to solve-issue workflow
Developer: checkout main + create fresh branch before coding.
Reviewer: verify branch matches issue before reviewing.

Fixes #395

小橘 <xiaoju@shazhou.work>
2026-05-22 10:59:08 +00:00
xiaomo baa2edfa38 Merge pull request 'feat: workflow-agent-claude-code' (#393) from feat/391-workflow-agent-claude-code into main 2026-05-22 10:58:18 +00:00
xingyue 4dff320d5c fix: throw on non-JSON Claude Code output instead of fallback 2026-05-22 18:57:07 +08:00
scottwei d8863ceda2 Merge pull request 'init workflow dashboard' (#387) from jshang/workflow-dashboard into main
Reviewed-on: #387
2026-05-22 10:54:14 +00:00
scottwei c9fcb15384 Merge branch 'main' into jshang/workflow-dashboard 2026-05-22 10:52:53 +00:00
xiaomo 5e868a2977 Merge pull request 'fix: explicitly forbid extra frontmatter fields in output format instruction' (#396) from fix/394-forbid-extra-frontmatter-fields into main 2026-05-22 10:51:52 +00:00
xiaoju 76fab22827 fix: explicitly forbid extra frontmatter fields in output format instruction
buildOutputFormatInstruction now includes explicit language telling agents to
output ONLY schema-defined fields and to focus on their role's deliverable.

Fixes #394
2026-05-22 10:49:04 +00:00
xingyue 176844d7f5 fix: add sessionId to raw fallback, fix test meta→frontmatter+description 2026-05-22 18:42:27 +08:00
xingyue 31695e89a8 feat: add workflow-agent-claude-code package
Claude Code CLI adapter for the workflow engine, mirroring
workflow-agent-hermes architecture. Spawns `claude -p` with
`--output-format json` for structured output parsing.

Refs #391
2026-05-22 18:38:18 +08:00
xiaomo 669875fb46 Merge pull request 'feat: validate model connectivity during uwf setup' (#392) from feat/335-setup-validate-model into main 2026-05-22 10:32:01 +00:00
xiaoju 6d94be34a9 feat: validate model connectivity during uwf setup
Send a test completion request after configuration to verify the model
is reachable. If validation fails, warn the user and suggest trying a
different model or checking their settings.

Fixes #335
2026-05-22 10:30:39 +00:00
xiaomo d95fe45a3d Merge pull request 'feat: add --count/-c flag to uwf thread step' (#390) from feat/373-thread-step-count into main 2026-05-22 10:11:13 +00:00
xiaoju b9252b5ce2 fix: dynamic frontmatter instruction from role schema (closes #389) 2026-05-22 10:03:56 +00:00
xiaoju 4d47effd39 fix: generate frontmatter instruction dynamically from role schema
Replace hardcoded 5-field example with schema-driven generation.
Now shows actual enum values, types, and required markers for
each role's frontmatter schema.

Fixes #389

小橘 <xiaoju@shazhou.work>
2026-05-22 10:03:45 +00:00
xiaoju 7b93ce8f3e fix: dynamic frontmatter field extraction from role schema (closes #388) 2026-05-22 09:57:45 +00:00
xiaoju 67870392ab fix: dynamic frontmatter field extraction from role schema
Replace hardcoded 5-field candidate with schema-driven extraction.
Now reads outputSchema properties and picks matching fields from
parsed frontmatter, supporting role-specific fields like plan,
approved, success.

Falls back to standard 5 fields when schema has no properties.

Fixes #388

小橘 <xiaoju@shazhou.work>
2026-05-22 09:57:30 +00:00
jiashuang 9316b843f6 init workflow service 2026-05-22 17:46:53 +08:00
xiaomo 6b9ff9781d Merge pull request 'fix: revert unnecessary output protocol changes from #385' (#386) from fix/385-revert-output-protocol into main 2026-05-22 09:40:33 +00:00
xiaoju 487c48effa fix: revert output protocol changes from #385
Agent CLI outputs plain CAS hash (not JSON), engine parses plain hash.
StepOutput no longer carries sessionId — session info is already in CAS detail.
Keeps the valuable parts of #385: sessionId in AgentRunResult (process-internal),
continue support, and frontmatter retry loop.
2026-05-22 09:39:36 +00:00
xiaomo 4eca2d533c Merge pull request 'feat: agent session protocol — sessionId, continue, frontmatter retry' (#385) from feat/384-agent-session-protocol into main 2026-05-22 09:20:35 +00:00
xiaoju f0f840e6e0 fix: StepOutput.sessionId → string | null, legacy fallback → null 2026-05-22 09:16:13 +00:00
xiaoju 7ff90cef4f feat: agent session protocol — sessionId in result, continue support, frontmatter retry
Breaking changes:
- AgentRunResult now requires sessionId field
- AgentOptions now requires continue function
- Agent CLI outputs JSON {stepHash, sessionId} instead of plain CAS hash
- Engine parses JSON output (with legacy CAS hash fallback)

New features:
- Frontmatter validation retry: if agent output lacks valid frontmatter,
  engine calls agent.continue() up to 2 times with correction message
- Session tracking: sessionId flows from agent → engine → StepOutput
- Hermes agent: session parse failure is now a hard error (no raw text fallback)
- Hermes agent: supports --resume for continue sessions

Closes #384
2026-05-22 09:13:05 +00:00
xiaoju e62d51d845 Merge remote-tracking branch 'origin/feat/remove-llm-extract' into feat/384-agent-session-protocol 2026-05-22 09:06:24 +00:00
xiaoju a803fcb4fc fix: solve-issue.yaml meta.plan → frontmatter.plan
Follows #375 rename.
2026-05-22 09:04:34 +00:00
xiaomo d00c93fc19 Merge pull request 'feat: uwf cas put-text for storing plain text in CAS' (#382) from feat/cas-put-text into main 2026-05-22 09:02:09 +00:00
xiaoju 99a2890be2 feat: remove LLM extract fallback, require YAML frontmatter
Agent output must contain valid YAML frontmatter matching the role schema.
If frontmatter parsing fails, the step fails immediately with a clear error
instead of falling back to an LLM extraction that can fabricate values.

The extract module remains as a public API export but is no longer used
in the agent run loop.

Breaking change: agents that relied on LLM extraction to produce valid
output will now fail. They must output proper frontmatter.
2026-05-22 08:58:01 +00:00
xiaoju 3b7d0564bb feat: uwf cas put-text for storing plain text in CAS
- Register built-in text schema ({type: 'string'}) alongside workflow schemas
- Add cmdCasPutText command: uwf cas put-text <text>
- Update CLI reference in workflow-util
- Update solve-issue.yaml procedure to use put-text

Refs #380
2026-05-22 08:53:27 +00:00
xiaoju 45dacf540b feat: thread step --count/-c <number> to run multiple steps
Add --count/-c flag to 'uwf thread step' for running N steps in one
invocation, stopping early if $END is reached.

- cmdThreadStep now loops up to count times, delegates to cmdThreadStepOnce
- CLI parses -c/--count, defaults to 1 (backward compatible single output)
- Validation rejects 0, negative, and non-integer counts
- 7 new tests covering CLI parsing and count validation

Fixes #373

Co-authored-by: uwf-hermes (solve-issue workflow)
2026-05-22 08:06:26 +00:00
xiaomo 2eb5ee0666 Merge pull request 'fix: accept omitted condition in fallback transitions' (#378) from fix/fallback-transition-validation into main 2026-05-22 07:56:18 +00:00
xiaoju e67932c83c fix: accept omitted condition in fallback transitions
Fallback transitions (last entry in graph node) omit the condition
field in YAML, resulting in undefined instead of null. The validator
and materializer now handle this:

- validate.ts: accept undefined as valid condition value
- workflow.ts: normalizeGraph() coerces undefined → null before CAS put

This was broken by the graph fallback pattern introduced in #370.
2026-05-22 07:38:24 +00:00
xiaomo 04a12231c3 Merge pull request 'feat: register $first/$last JSONata functions in moderator' (#377) from feat/376-first-last-jsonata into main 2026-05-22 07:32:17 +00:00
xiaoju e5ae9a134c feat: register $first/$last JSONata functions in moderator
Register custom $first(role) and $last(role) functions in the JSONata
evaluator. These search the steps array and return the matching role's
frontmatter (output) directly, replacing verbose steps[-1].output.x
expressions with semantic $last('role').field syntax.

- workflow-moderator: register functions via expr.registerFunction()
- Updated all condition expressions in .workflows/ and examples/
- Added tests for $last, $first, and unmatched role (undefined)

Fixes #376
2026-05-22 06:29:56 +00:00
xiaomo bdafaf3aa1 Merge pull request 'refactor!: rename RoleDefinition.meta → frontmatter' (#375) from refactor/374-meta-to-frontmatter into main 2026-05-22 06:06:06 +00:00
xiaoju 02f7f0b708 refactor!: rename RoleDefinition.meta → frontmatter
BREAKING CHANGE: All workflow YAML files must use 'frontmatter' instead of 'meta'.

- workflow-protocol: RoleDefinition.meta → frontmatter, schema updated
- cli-workflow: validate.ts, workflow.ts — resolveMetaRef → resolveFrontmatterRef
- workflow-agent-kit: run.ts — metaSchema → frontmatterSchema
- All YAML files updated (examples/, .workflows/)

Fixes #374
2026-05-22 06:05:07 +00:00
xiaoju 8ea554bb5e Merge pull request 'feat: create .workflows/solve-issue.yaml' (#372) from feat/370-solve-issue-workflow into main 2026-05-22 06:02:15 +00:00
xiaoju 8a425521da fix: output instructions now specify required frontmatter meta fields 2026-05-22 05:42:17 +00:00
xiaoju f174f2fd0a fix: remove redundant condition null from $START 2026-05-22 05:33:39 +00:00
xiaoju 355594d074 refactor: graph fallback pattern + positive condition names
- Last transition in each graph node is now the fallback (no condition)
- Remove redundant positive conditions (ready, devDone, approved, passed, pushSuccess)
- notApproved → rejected (positive naming)
2026-05-22 05:31:43 +00:00
xiaoju fd7609fe90 fix: address review feedback from xingyue
1. npm/npx → bun/bunx (project standard)
2. Fix tea CLI usage (tea comment + -r flag)
3. cursor-agent → coding (abstract capability)
4. Clarify committer inherits developer's worktree
5. Mark meta.plan required when status=ready
6. PR description must follow What/Why/Changes/Ref template
7. Note maxRounds loop protection in description
2026-05-22 05:27:21 +00:00
xiaoju dacecfbbb7 feat: create .workflows/solve-issue.yaml
TDD-driven issue resolution workflow with 5 roles:
- planner: analyzes issue, outputs TDD test spec (stored in CAS)
- developer: implements code following TDD
- reviewer: code standards compliance check (not functionality)
- tester: functional correctness verification
- committer: commits and creates PR

Graph handles bounce-backs: reviewer→developer, tester→developer,
tester→planner (fix_spec), committer→developer (hook_failed).

Refs #370
2026-05-22 05:21:19 +00:00
xiaomo 3238eaeddf Merge pull request 'feat: add uwf skill cli command and Prepare section' (#371) from feat/369-uwf-skill-cli into main 2026-05-22 04:50:12 +00:00
xiaoju 995f273fa5 address review: move CLI reference to workflow-util, inline in prompt
- Move generateCliReference() to @uncaged/workflow-util
- buildRolePrompt inlines CLI reference directly (no agent tool call)
- Fix Role terminology to use new field names
- Add maintenance comment in cli-reference.ts
- Fix test assertions
2026-05-22 03:29:01 +00:00
xiaoju 866154ad73 feat: add uwf skill cli command and Prepare section in role prompt
- Add 'uwf skill cli' command that prints markdown CLI reference
- buildRolePrompt now generates ## Prepare section:
  - Always prompts agent to run 'uwf skill cli' (explicit skill)
  - Renders capabilities as keyword hints for implicit skill loading

Fixes #369
2026-05-22 03:20:04 +00:00
xiaomo 8efc5050cb Merge pull request 'chore: exclude legacy code from biome check' (#368) from chore/ignore-legacy-biome into main 2026-05-22 02:10:20 +00:00
xiaoju 3fb60ee649 chore: exclude legacy-packages and scripts from biome check
- Add legacy-packages/ and scripts/ to biome ignore
- Allow noDefaultExport in vitest.config.* and .d.ts
- Allow console in cli.ts and setup.ts (CLI user output)
- Fix unused imports in cas.ts and setup.ts
2026-05-22 02:09:18 +00:00
xiaomo e181f67a2d Merge pull request 'feat: support project-local workflow discovery' (#367) from feat/365-project-local-workflows into main 2026-05-22 02:07:33 +00:00
xiaoju a3114bf840 chore: apply biome formatting across codebase 2026-05-22 02:06:05 +00:00
xiaoju e59ae9aca1 feat: support project-local workflow discovery
- Add .workflows/*.yaml scanning from project root (cwd)
- Resolution: project-local first, then global registry
- On-the-fly CAS materialization for local workflows
- Filename/name consistency check
- uwf workflow list shows origin (local/global)

Fixes #365
2026-05-22 01:01:45 +00:00
xiaomo c050a38f38 Merge pull request 'refactor: rename RoleDefinition fields for clarity' (#366) from refactor/364-rename-role-fields into main 2026-05-22 00:48:23 +00:00
xiaoju c60c310074 refactor: rename RoleDefinition fields for clarity
- identity → goal
- prepare → capabilities (string[])
- execute → procedure
- report → output
- outputSchema → meta

Fixes #364
2026-05-22 00:46:06 +00:00
xiaomo fe035c065d Merge pull request 'feat: Role 四段式描述 (identity/prepare/execute/report)' (#361) from feat/359-role-four-phase into main 2026-05-21 03:11:00 +00:00
xiaoju 192ad656a4 refactor: remove systemPrompt, make four-phase fields required
Breaking change per review:
- Remove systemPrompt from RoleDefinition entirely
- identity/prepare/execute/report are now required (string, not nullable)
- Remove all legacy fallback logic in buildRolePrompt
- Simplify validate.ts, workflow.ts materialize
- Migrate all test fixtures and example workflows

Refs #359
2026-05-21 03:07:56 +00:00
xiaoju c0c8d6499e feat: add four-phase example workflow (analyze-topic)
Refs #359, #363
2026-05-21 02:56:11 +00:00
xiaoju 505f85e3c4 feat: add buildRolePrompt in agent-kit, integrate with uwf-hermes
- New buildRolePrompt() in workflow-agent-kit: four-phase prompt assembly
  with fallback to systemPrompt
- Export from agent-kit index
- Update uwf-hermes to use buildRolePrompt instead of raw systemPrompt
- Add tests for all modes: four-phase, legacy, mixed

Refs #359, #362
2026-05-21 02:31:56 +00:00
xiaoju fc7d482b4f feat: add four-phase role description (identity/prepare/execute/report)
- Extend RoleDefinition with identity, prepare, execute, report fields
- Make systemPrompt optional (nullable) for four-phase workflows
- Update ROLE_DEFINITION JSON Schema (all new fields optional)
- Update validate.ts to accept new fields
- Update workflow.ts to strip null fields before CAS storage
- Update thread read to prefer identity over systemPrompt
- Add --version flag to uwf CLI
- Bump all packages to 0.5.0

Refs #359
2026-05-21 01:41:20 +00:00
xiaoju f9979c3c89 chore: upgrade json-cas to 0.4.x, fix Store → BootstrapCapableStore
- @uncaged/json-cas ^0.3.0 → ^0.4.0
- @uncaged/json-cas-fs ^0.3.0 → ^0.4.0 (now publishes .d.ts + .js)
- UwfStore.store typed as BootstrapCapableStore
- tsc --build now clean (no more node_modules type errors)

小橘 🍊(NEKO Team)
2026-05-19 10:29:57 +00:00
xiaoju 46def2945a chore: update dev workflow — fix publish script, remove deploy.sh, update CLAUDE.md
- scripts/publish-all.mjs: update to 6 active packages only
- scripts/deploy.sh: removed (dashboard/gateway in legacy)
- package.json: release script uses publish-all.mjs directly
- CLAUDE.md: add complete dev workflow section (setup, build, check, test, publish)

小橘 🍊(NEKO Team)
2026-05-19 08:07:45 +00:00
xiaoju 4e89508246 docs: rewrite README.md and CLAUDE.md for current architecture
Remove all references to ESM bundles, old packages, old CLI name.
Update to reflect YAML workflow definitions, uwf CLI, 6 active packages,
frontmatter markdown output format, and stateless single-step execution.

小橘 🍊(NEKO Team)
2026-05-19 08:03:13 +00:00
xiaoju 77d799d458 chore: remove obsolete .env.example, config via uwf setup
小橘 🍊(NEKO Team)
2026-05-19 07:58:50 +00:00
xiaoju 6c14259184 chore: remove pnpm-lock.yaml files, bun only
小橘 🍊(NEKO Team)
2026-05-19 07:58:24 +00:00
181 changed files with 11157 additions and 1684 deletions
-40
View File
@@ -1,40 +0,0 @@
# ──────────────────────────────────────────────
# Workflow Engine — Environment Variables
# ──────────────────────────────────────────────
# Copy this file to .env and fill in the values.
# ── Cursor Agent ──
# CLI command to invoke the Cursor agent (required for develop workflow)
WORKFLOW_CURSOR_COMMAND=
# Model override for Cursor agent
WORKFLOW_CURSOR_MODEL=
# Timeout in milliseconds for Cursor agent operations
WORKFLOW_CURSOR_TIMEOUT=
# ── Hermes Agent (used by develop tester/committer + solve-issue) ──
# CLI command to invoke the Hermes agent (absolute path required)
WORKFLOW_HERMES_COMMAND=
# Model override for Hermes agent
WORKFLOW_HERMES_MODEL=
# Timeout in milliseconds for Hermes agent operations
WORKFLOW_HERMES_TIMEOUT=
# ── Storage ──
# Override the workflow storage root directory
# Default: ~/.uncaged/workflow
WORKFLOW_STORAGE_ROOT=
# Gateway secret for the serve command
WORKFLOW_DASHBOARD_SECRET=
# ── Display ──
# Set to any value to disable colored output
# NO_COLOR=1
+2
View File
@@ -11,3 +11,5 @@ solve-issue-entry.ts
packages/workflow-template-develop/develop.esm.js
.DS_Store
*.py
.claude
tmp
+83
View File
@@ -0,0 +1,83 @@
# Test Spec: uwf setup model connectivity validation (#335)
## Context
File: `packages/cli-workflow/src/commands/setup.ts`
Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
## Implementation Notes
- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
- Use `AbortSignal.timeout(15_000)` for the request
- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
## Test Cases (vitest)
### 1. `validateModel` — success path
- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
- Call `validateModel(baseUrl, apiKey, model)`
- Assert returns `{ ok: true, value: undefined }`
- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
### 2. `validateModel` — HTTP error (401 unauthorized)
- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
- Call `validateModel(baseUrl, apiKey, model)`
- Assert returns `{ ok: false, error: <string containing "401"> }`
### 3. `validateModel` — HTTP error (404 model not found)
- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
- Assert returns `{ ok: false, error: <string containing "404"> }`
### 4. `validateModel` — network timeout
- Mock `fetch` to throw `DOMException` with name `AbortError`
- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
### 5. `validateModel` — network error (DNS failure, connection refused)
- Mock `fetch` to throw `TypeError("fetch failed")`
- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
### 6. `cmdSetup` — includes validation result on success
- Mock global `fetch` for `/chat/completions` to succeed
- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
- Assert returned object has `validation: { ok: true, value: undefined }`
- Assert config files are still written (existing behavior preserved)
### 7. `cmdSetup` — includes validation result on failure (config still saved)
- Mock global `fetch` for `/chat/completions` to return 401
- Call `cmdSetup({ ... })`
- Assert returned object has `validation: { ok: false, error: ... }`
- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
### 8. `cmdSetupInteractive` — prints success message on validation pass
- Mock `fetch` for both `/models` and `/chat/completions` to succeed
- Mock stdin to provide valid selections
- Capture console output
- Assert output contains a success message like "Model verified" or "✓"
### 9. `cmdSetupInteractive` — prints warning on validation failure
- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
- Mock stdin for valid selections
- Capture console output
- Assert output contains a warning about model not being reachable and suggests trying a different model
### 10. `validateModel` — request body correctness
- Mock `fetch` to capture the request body
- Call `validateModel(baseUrl, apiKey, "test-model")`
- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
## Export Requirements
- `validateModel` must be exported (for direct unit testing)
- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
## Files to Create/Modify
- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
+196
View File
@@ -0,0 +1,196 @@
name: "solve-issue"
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
roles:
planner:
description: "Analyzes issue and outputs a TDD test spec"
goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
capabilities:
- issue-analysis
- planning
procedure: |
On first run (no previous steps):
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
2. Read CLAUDE.md (or equivalent project conventions file) to understand coding standards
3. Assess whether the issue has enough information to produce a test spec
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output status=insufficient_info and terminate
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
On subsequent runs (bounced back by tester with fix_spec):
1. Read the tester's output from the previous step to understand what's wrong with the spec
2. Revise the test spec accordingly
After producing the test spec:
1. Store it via `uwf cas put-text "<markdown content>"` and capture the returned hash
2. Put the hash in frontmatter.plan (required when status=ready)
output: "Output a brief summary of the test spec. Frontmatter must include: status (ready or insufficient_info) and plan (CAS hash of the test spec, required when status=ready)."
frontmatter:
type: object
properties:
status:
type: string
enum: [ready, insufficient_info]
plan:
type: string
required: [status]
developer:
description: "TDD implementation per test spec"
goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
capabilities:
- coding
procedure: |
Before starting any work, ensure a clean worktree:
1. `git checkout main && git pull` to get the latest code
2. `git checkout -b fix/<issue-number>-<short-description>` to create a fresh branch
- If bounced back from reviewer or tester, reuse the existing branch instead
Then implement TDD:
3. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
4. If bounced back from reviewer or tester: read the previous role's output to understand what needs fixing
5. Write tests first based on the spec
6. Implement the code to make tests pass
7. Ensure `bun run build` passes with no errors
8. Run `bun test` to verify all tests pass
output: "List all files changed and provide a summary. Frontmatter must include: status (done or failed)."
frontmatter:
type: object
properties:
status:
type: string
enum: [done, failed]
required: [status]
reviewer:
description: "Code standards compliance check"
goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
capabilities:
- code-review
- static-analysis
procedure: |
Before reviewing, verify the git branch:
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
2. If the branch doesn't correspond to the issue, flag it in your output and reject
Then perform code review:
Hard checks (must all pass):
3. `bun run build` — no build errors
4. `bunx biome check` — no lint violations
5. TypeScript strict mode — no type errors
Soft checks (review against CLAUDE.md conventions):
- Functional-first: `function` + `type`, not `class` + `interface`
- No optional properties (`?:`) — use `T | null`
- Naming conventions (kebab-case files, PascalCase types, camelCase functions)
- Module boundary discipline (folder exports via index.ts)
- No `console.log` (use structured logger)
- No dynamic imports in production code
Only review standards compliance. Do NOT test functionality.
If rejecting, you MUST explain the specific reason in your output.
output: "Explain your decision with specific file/line references. Frontmatter must include: approved (true or false)."
frontmatter:
type: object
properties:
approved:
type: boolean
required: [approved]
tester:
description: "Functional correctness verification"
goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
capabilities:
- testing
procedure: |
1. Run `bun test` for automated test verification
2. Read the test spec from CAS: `uwf cas get <plan hash>` (find the hash from the latest planner step's frontmatter.plan)
3. Verify each scenario in the spec is covered and passing
4. Determine outcome:
- passed: all scenarios verified, tests pass
- fix_code: tests fail or implementation doesn't match spec → send back to developer
- fix_spec: the spec itself is wrong or incomplete → send back to planner
output: "Report test results per scenario. Frontmatter must include: status (passed, fix_code, or fix_spec)."
frontmatter:
type: object
properties:
status:
type: string
enum: [passed, fix_code, fix_spec]
required: [status]
committer:
description: "Commits and creates PR"
goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
capabilities: []
procedure: |
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Stage all changes: `git add -A`
2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
3. Push the branch: `git push -u origin <branch-name>`
- If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --title "..." --description "..."`
- PR description must follow the project template: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
output: "Include PR URL on success or error log on failure. Frontmatter must include: success (true or false)."
frontmatter:
type: object
properties:
success:
type: boolean
required: [success]
conditions:
insufficientInfo:
description: "Planner determined there's not enough info to proceed"
expression: "$last('planner').status = 'insufficient_info'"
devFailed:
description: "Developer failed to implement"
expression: "$last('developer').status = 'failed'"
rejected:
description: "Reviewer rejected the implementation"
expression: "$last('reviewer').approved = false"
fixCode:
description: "Tester found code issues"
expression: "$last('tester').status = 'fix_code'"
fixSpec:
description: "Tester found spec issues"
expression: "$last('tester').status = 'fix_spec'"
hookFailed:
description: "Push hook failed"
expression: "$last('committer').success = false"
graph:
$START:
- role: "planner"
condition: null
prompt: "Analyze the issue and produce an implementation plan."
planner:
- role: "$END"
condition: "insufficientInfo"
prompt: "Insufficient information to proceed; end the workflow."
- role: "developer"
condition: null
prompt: "Implement the plan from the planner."
developer:
- role: "$END"
condition: "devFailed"
prompt: "Development failed; end the workflow."
- role: "reviewer"
condition: null
prompt: "Send the implementation to the reviewer."
reviewer:
- role: "developer"
condition: "rejected"
prompt: "Reviewer rejected the implementation; fix the issues."
- role: "tester"
condition: null
prompt: "Review passed; run tests on the implementation."
tester:
- role: "developer"
condition: "fixCode"
prompt: "Tests found code issues; return to developer."
- role: "planner"
condition: "fixSpec"
prompt: "Tests found spec issues; return to planner."
- role: "committer"
condition: null
prompt: "Tests passed; commit and push the changes."
committer:
- role: "developer"
condition: "hookFailed"
prompt: "Push hook failed; return to developer to fix."
- role: "$END"
condition: null
prompt: "Commit succeeded; complete the workflow."
+64 -69
View File
@@ -2,46 +2,41 @@
## Project Overview
This monorepo implements a workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file with an XXH64 hash as its version identifier. Shared types live in `@uncaged/workflow-protocol`; bundle authors typically depend on `@uncaged/workflow-runtime`.
This monorepo implements a stateless workflow engine driven by a single-step CLI (`uwf`). Workflows are **YAML definitions** stored as CAS nodes; threads are immutable chains of CAS-linked step nodes. No daemon — each `uwf thread step` invocation runs one moderator→agent→extract cycle and exits.
### Key Terms
| Concept | What it is |
|---------|-----------|
| **Workflow** | A single-file ESM module that exports `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash (Crockford Base32). |
| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
| **Thread** | A single execution of a workflow, identified by a ULID. State lives in CAS (linked nodes); active threads indexed in `threads.json`; completed rows in `history/*.jsonl`. Debug logs use `.info.jsonl`. |
| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. |
| **Registry** | `workflow.yaml` — maps workflow names to current/historical bundle hashes. |
| **Workflow** | A YAML definition (`WorkflowPayload`) with roles, conditions, and a routing graph. Stored as a CAS node, identified by its XXH64 hash. |
| **Thread** | A single execution of a workflow, identified by a ULID. State is an immutable CAS chain; active threads indexed in `threads.yaml`; completed threads in `history.jsonl`. |
| **Role** | A named actor within a workflow. Each role has a system prompt and a JSON Schema `outputSchema`. |
| **Moderator** | JSONata-based graph evaluator — determines the next role (or `$END`) with zero LLM cost. |
| **Agent** | An external CLI command (`uwf-hermes`, etc.) spawned by `uwf thread step`. Produces frontmatter markdown output. |
| **CAS** | Content-Addressed Storage via `@uncaged/json-cas` — all workflow definitions, thread nodes, and outputs are immutable CAS nodes. |
| **Registry** | `~/.uncaged/workflow/registry.yaml` — maps workflow names to current CAS hashes. |
### Monorepo Structure
```
workflow/
packages/
workflow-protocol/ # @uncaged/workflow-protocol — shared types + Result
workflow-runtime/ # @uncaged/workflow-runtime — createWorkflow, type re-exports
workflow-util/ # @uncaged/workflow-util — Base32, ULID, logger, storage paths, refs helpers
workflow-reactor/ # @uncaged/workflow-reactor — LLM fn + thread reactor (tool calls)
workflow-cas/ # @uncaged/workflow-cas — CAS store, hash, Merkle
workflow-register/ # @uncaged/workflow-register — bundle validation, registry YAML, model resolution
workflow-execute/ # @uncaged/workflow-execute — engine, extract, fork, GC, workflowAsAgent
cli-workflow/ # @uncaged/cli-workflow — uncaged-workflow CLI
workflow-agent-cursor/ # @uncaged/workflow-agent-cursor
workflow-agent-hermes/ # @uncaged/workflow-agent-hermes
workflow-agent-llm/ # @uncaged/workflow-agent-llm
workflow-agent-react/ # @uncaged/workflow-agent-react
workflow-util-agent/ # @uncaged/workflow-util-agent — buildAgentPrompt, spawnCli
workflow-template-develop/ # @uncaged/workflow-template-develop
workflow-template-solve-issue/ # @uncaged/workflow-template-solve-issue
workflow-dashboard/ # @uncaged/workflow-dashboard — React dashboard (private app)
docs/ # RFCs, conventions
biome.json # root Biome config
tsconfig.json # root TypeScript config
workflow-protocol/ # @uncaged/workflow-protocol — shared types (WorkflowPayload, StepNodePayload, WorkflowConfig, etc.)
workflow-util/ # @uncaged/workflow-util — Crockford Base32, ULID, logger, frontmatter parsing/validation
workflow-moderator/ # @uncaged/workflow-moderator — JSONata graph evaluator
workflow-agent-kit/ # @uncaged/workflow-agent-kit — createAgent factory, context builder, extract pipeline
workflow-agent-hermes/ # @uncaged/workflow-agent-hermes — uwf-hermes CLI binary (spawns hermes chat)
cli-workflow/ # @uncaged/cli-workflow — uwf CLI binary
legacy-packages/ # Archived packages (preserved for reference, not active)
examples/ # Workflow YAML examples (solve-issue.yaml)
docs/ # Architecture docs
biome.json # root Biome config
tsconfig.json # root TypeScript config
```
- Execution stack layers: `workflow-protocol` → (`workflow-runtime`, `workflow-util`, `workflow-reactor`) → (`workflow-cas`, `workflow-register`)`workflow-execute` `cli-workflow`
- Dependency layers: `workflow-protocol` → (`workflow-util`, `workflow-moderator`) → `workflow-agent-kit``workflow-agent-hermes` / `cli-workflow`
- Packages use `workspace:^` protocol (resolves to `^x.y.z` on publish)
- External CAS: `@uncaged/json-cas` (store API, hashing, schema validation) + `@uncaged/json-cas-fs` (filesystem backend)
## Language & Paradigm
@@ -109,8 +104,6 @@ type WorkflowEntry = {
- Always named exports, never default exports
- One module = one responsibility, filename = purpose
Workflow bundles (`.esm.js`) follow the same rule: export `const run` and `const descriptor`, not `export default`.
### Folder Module Discipline
Every folder under `src/` is a **module boundary**. Four rules:
@@ -136,10 +129,10 @@ export { createCasStore } from "../cas/cas.js";
// ❌ Bad — types defined in index.ts
// in cas/index.ts:
export type CasStore = { ... }; // should be in cas/types.ts
export type CasStore = { ... }; // should be in cas/types.ts
```
**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`, `workflow-as-agent.ts`) are not inside a folder module and follow normal rules.
**Exception**: The package-level `src/index.ts` is the public API surface and re-exports from folder `index.ts` files. Files that remain at `src/` root (e.g. `types.ts`) are not inside a folder module and follow normal rules.
## Naming
@@ -160,7 +153,7 @@ Workflow names use **verb-first** kebab-case:
### ID Encoding
All IDs use **Crockford Base32**:
- Bundle hash: XXH64 → 13-char Crockford Base32
- CAS hash: XXH64 → 13-char Crockford Base32
- Thread ID: ULID → 26-char Crockford Base32 (10 timestamp + 16 random)
## Error Handling
@@ -189,7 +182,7 @@ import { createLogger } from "@uncaged/workflow-util";
const log = createLogger();
// Each call site has a fixed 8-char Crockford Base32 tag
log("4KNMR2PX", "Loading workflow bundle...");
log("4KNMR2PX", "Loading workflow...");
log("7BQST3VW", `Role ${role} started`);
```
@@ -204,7 +197,7 @@ log("7BQST3VW", `Role ${role} started`);
### Why fixed tags?
- `grep "4KNMR2PX"` in `.info.jsonl` → instant code location
- `grep "4KNMR2PX"` in logs → instant code location
- No need for file/line info in the log — tag is the locator
- Survives refactoring (tag stays the same when code moves)
@@ -221,74 +214,76 @@ console.log(result);
Do NOT use `await import()` in production code. Always use static top-level `import`.
**Exception**: The bundle loader and `extractBundleExports` dynamically import user workflow files at runtime.
```ts
// Dynamic import required: user bundle path resolved at runtime
const mod = await import(bundlePath);
```
Test files (`__tests__/**`) are exempt.
## Toolchain
| Tool | Purpose |
|------|---------|
| **bun** | Package manager + runtime + test runner |
| **bun** | Package manager + runtime |
| **TypeScript** | Type checking (strict mode) |
| **Biome** | Lint + format (replaces ESLint + Prettier) |
| **vitest** | Test runner (`cli-workflow` uses vitest; other packages use `bun test`) |
### Commands
### Development Workflow
```bash
bun run check # tsc --build + biome check
bun run format # biome format --write
bun test # run tests
# ── Setup ──
bun install # install all workspace dependencies
# ── Daily development ──
bun run build # tsc --build (all packages, dependency order)
bun run check # tsc --build + biome check + lint-log-tags
bun run format # biome format --write
bun test # run tests across all packages
# ── Before committing ──
bun run check # must pass — typecheck + lint + log tag validation
bun test # must pass — all package tests
```
### Version Management & Publishing
### Publishing
All public `@uncaged/*` packages are published to **npmjs.org** via `@changesets/cli` with **fixed mode** (all packages share the same version number). `workflow-dashboard` is private and excluded.
All public `@uncaged/*` packages are published to **npmjs.org** with **fixed mode** (all packages share the same version number).
```bash
# 1. After making changes, add a changeset describing the change
# 1. Add a changeset describing the change
bun changeset
# 2. Before release, bump all package versions + generate CHANGELOGs
# 2. Bump all package versions + generate CHANGELOGs
bun version
# 3. Build, test, and publish to npmjs
# 3. Build, test, and publish (runs scripts/publish-all.mjs)
bun release
# Or publish manually with a tag:
node scripts/publish-all.mjs --tag alpha
node scripts/publish-all.mjs --dry-run # preview without publishing
```
- `workspace:^` dependencies resolve to `^x.y.z` on publish
- Publish order defined in `scripts/publish-all.mjs` (dependency order)
- Changesets config: `.changeset/config.json` (fixed mode, public access)
- Each package has auto-generated `CHANGELOG.md`
### Consuming @uncaged/* Packages
External workflow repos just `bun install` — packages come from npmjs like any other dependency. No special registry config needed.
### End-to-end: Monorepo → Registry → Workspace → Bundle
### End-to-end: Author → Register → Run
```
workflow/ (monorepo) — engine, runtime, templates, agents
bun release — build + test + changeset publish
examples/solve-issue.yaml — write a workflow YAML definition
uwf workflow put
npmjs.org — @uncaged/* scoped packages (public)
│ bun install
~/.uncaged/workflow/cas/ — Workflow stored as CAS node
~/.uncaged/workflow/registry.yaml — name → hash mapping updated
│ uwf thread start <name> -p "..."
my-workflows/ (workspace) — normal package.json
bun run build:develop — bun build → single .esm.js
~/.uncaged/workflow/threads.yaml — new thread head pointer
uwf thread step <thread-id>
uncaged-workflow workflow add — register bundle locally
uncaged-workflow run — execute workflow
moderator → agent → extract — one step per invocation, repeat until $END
```
1. **Monorepo changes**`bun changeset` (describe change) → `bun version` (bump) → `bun release` (publish)
2. **Workspace** `bun install` fetches latest from npmjs
3. **Build** → produces single-file ESM bundle with `@uncaged/*` as externals
4. **Register & Run**`uncaged-workflow workflow add <name> <bundle>` then `uncaged-workflow run <name>`
1. **Author** — write a workflow YAML file with roles, conditions, and graph
2. **Register** `uwf workflow put <file.yaml>` parses YAML, registers output schemas, stores `WorkflowPayload` in CAS
3. **Run** `uwf thread start` creates a thread, `uwf thread step` executes one cycle per invocation
## Commit Convention
@@ -296,5 +291,5 @@ uncaged-workflow run — execute workflow
<type>(<scope>): <description>
type: feat | fix | refactor | docs | chore | test
scope: workflow | cli | rfc-001 | ...
scope: workflow | cli | moderator | agent-kit | hermes | util | protocol | ...
```
+69 -47
View File
@@ -1,71 +1,93 @@
# @uncaged/workflow
A workflow engine that executes single-file ESM bundles. Each workflow is a self-contained `.esm.js` file identified by its XXH64 hash (Crockford Base32).
A stateless workflow engine driven by a single-step CLI. Workflows are YAML definitions with roles, JSONata routing conditions, and a directed graph. Threads are immutable CAS-linked chains — each `uwf thread step` runs one moderator→agent→extract cycle and exits.
## Core Concepts
## Package Map
| Concept | Description |
|---------|-------------|
| **Workflow** | A single-file ESM module exporting `run` (workflow function) and `descriptor` (metadata). Identified by its XXH64 hash. |
| **Bundle** | The physical `.esm.js` file stored in `~/.uncaged/workflow/bundles/`. |
| **Thread** | A single execution of a workflow, identified by a ULID. CAS-backed chain plus `threads.json` / `history/*.jsonl`; `.info.jsonl` for debug logs. |
| **Role** | A named actor within a workflow. Each role produces output with typed `meta`. Roles live inside template packages (`src/roles/`). |
| **Registry** | `workflow.yaml` — maps workflow names to current/historical bundle hashes. |
| **CAS** | Content-Addressed Storage — bundles are immutable and addressed by hash. |
| Package | npm | Role |
|---------|-----|------|
| `cli-workflow` | `@uncaged/cli-workflow` | `uwf` CLI binary — thread lifecycle, workflow registry, CAS inspection, setup |
| `workflow-protocol` | `@uncaged/workflow-protocol` | Shared TypeScript types (`WorkflowPayload`, `StepNodePayload`, `WorkflowConfig`, etc.) |
| `workflow-moderator` | `@uncaged/workflow-moderator` | JSONata graph evaluator — determines next role or `$END` |
| `workflow-agent-kit` | `@uncaged/workflow-agent-kit` | `createAgent` factory, context builder, two-layer extract pipeline |
| `workflow-agent-hermes` | `@uncaged/workflow-agent-hermes` | `uwf-hermes` agent — spawns Hermes chat, captures session |
| `workflow-util` | `@uncaged/workflow-util` | Crockford Base32, ULID, logger, frontmatter parsing |
## Monorepo Packages
```
packages/
workflow/ # @uncaged/workflow — core lib (types, engine, hash, ULID, registry)
cli-workflow/ # @uncaged/cli-workflow — CLI (`uncaged-workflow` command)
workflow-template-develop/ # @uncaged/workflow-template-develop — develop workflow template (includes roles)
workflow-template-solve-issue/ # @uncaged/workflow-template-solve-issue — solve-issue workflow template (includes roles)
workflow-agent-hermes/ # @uncaged/workflow-agent-hermes — Hermes agent adapter
workflow-agent-cursor/ # @uncaged/workflow-agent-cursor — Cursor agent adapter
workflow-agent-llm/ # @uncaged/workflow-agent-llm — LLM agent adapter
workflow-util-agent/ # @uncaged/workflow-util-agent — agent utilities (buildAgentPrompt, spawnCli)
```
Managed with **bun workspace** using the `workspace:*` protocol.
External: [`@uncaged/json-cas`](https://www.npmjs.com/package/@uncaged/json-cas) (CAS store + JSON Schema validation) + `@uncaged/json-cas-fs` (filesystem backend).
## Quick Start
```bash
# Install dependencies
bun install
# 1. Configure provider and model
uwf setup
# Build all packages
bun run build
# 2. Register a workflow from YAML
uwf workflow put examples/solve-issue.yaml
# Register a workflow bundle
uncaged-workflow workflow add solve-issue dist/packages/workflow-template-solve-issue/solve-issue.esm.js
# 3. Start a thread
uwf thread start solve-issue -p "Fix the login redirect bug"
# Run a workflow
uncaged-workflow run solve-issue --prompt "Fix bug #42"
# 4. Execute steps (one at a time, until done)
uwf thread step <thread-id>
```
## CLI Usage
## CLI Commands
```bash
uncaged-workflow # Print full command usage (exits with status 1)
uncaged-workflow workflow list # List registered workflows
uncaged-workflow run <name> # Start a workflow thread
uncaged-workflow thread list # List all threads
uncaged-workflow thread show <id> # Inspect a thread
uncaged-workflow skill # Agent-consumable reference docs
```
### Thread
Run `uncaged-workflow` with no arguments to print usage, or `uncaged-workflow skill cli` for the full CLI skill reference.
| Command | Description |
|---------|-------------|
| `uwf thread start <workflow> -p <prompt>` | Create a thread (no execution) |
| `uwf thread step <thread-id> [--agent <cmd>]` | Execute one moderator→agent→extract cycle |
| `uwf thread show <thread-id>` | Show head pointer and done status |
| `uwf thread list [--all]` | List threads (`--all` includes archived) |
| `uwf thread steps <thread-id>` | List all steps chronologically |
| `uwf thread read <thread-id> [--quota N]` | Render thread as readable markdown |
| `uwf thread fork <step-hash>` | Fork from a specific step |
| `uwf thread step-details <step-hash>` | Dump full detail node |
| `uwf thread kill <thread-id>` | Terminate and archive |
### Workflow
| Command | Description |
|---------|-------------|
| `uwf workflow put <file.yaml>` | Register a workflow from YAML |
| `uwf workflow show <name-or-hash>` | Show workflow definition |
| `uwf workflow list` | List registered workflows |
### CAS
| Command | Description |
|---------|-------------|
| `uwf cas get <hash>` | Read a CAS node |
| `uwf cas put <type-hash> <data>` | Store a node |
| `uwf cas has <hash>` | Check existence |
| `uwf cas refs <hash>` | List direct references |
| `uwf cas walk <hash>` | Recursive traversal |
| `uwf cas reindex` | Rebuild type index |
| `uwf cas schema list` | List schemas |
| `uwf cas schema get <hash>` | Show a schema |
### Setup
| Command | Description |
|---------|-------------|
| `uwf setup` | Interactive provider/model/agent configuration |
| `uwf setup --provider ... --base-url ... --api-key ... --model ...` | Non-interactive setup |
Config stored in `~/.uncaged/workflow/config.yaml`. API keys in `~/.uncaged/workflow/.env`.
## Development
```bash
bun run check # Biome lint + format check
bun run format # Auto-format with Biome
bun test # Run tests
bun install --no-cache # Install dependencies
bun run check # tsc + biome + lint-log-tags
bun run format # Auto-format with Biome
bun test # Run all tests
```
Managed with **bun workspace**. See [CLAUDE.md](CLAUDE.md) for coding conventions.
## Architecture
See [docs/architecture.md](docs/architecture.md) for the full design — three-phase engine loop, bundle contract, storage layout, and design decisions.
See [docs/architecture.md](docs/architecture.md) for the full design — three-phase engine loop, CAS node types, storage layout, agent CLI protocol, and design decisions.
+13 -1
View File
@@ -5,6 +5,8 @@
"**",
"!**/dist",
"!**/node_modules",
"!**/legacy-packages",
"!scripts",
"!packages/workflow/workflow",
"!xiaoju/scripts/bundle.ts"
]
@@ -36,7 +38,7 @@
}
},
{
"includes": ["**/*.d.ts"],
"includes": ["**/*.d.ts", "**/vitest.config.*"],
"linter": {
"rules": {
"style": {
@@ -44,6 +46,16 @@
}
}
}
},
{
"includes": ["**/cli.ts", "**/setup.ts"],
"linter": {
"rules": {
"suspicious": {
"noConsole": "off"
}
}
}
}
],
"linter": {
+32 -15
View File
@@ -85,8 +85,13 @@ description: "End-to-end issue resolution"
roles:
planner:
description: "Creates implementation plan"
systemPrompt: "You are a planning agent. Analyze the issue and create a step-by-step plan."
outputSchema:
goal: "You are a planning agent. Analyze the issue and create a step-by-step plan."
capabilities:
- issue-analysis
- planning
procedure: "Analyze the issue and create a detailed, actionable implementation plan."
output: "Output the plan summary and list of concrete steps."
meta:
type: object
properties:
plan: { type: string }
@@ -94,8 +99,13 @@ roles:
required: [plan, steps]
developer:
description: "Implements code changes"
systemPrompt: "You are a developer agent. Implement the plan."
outputSchema:
goal: "You are a developer agent. Implement the plan."
capabilities:
- file-edit
- shell
procedure: "Implement the plan. Write code, tests, and ensure existing tests pass."
output: "List all files changed and provide a summary of the implementation."
meta:
type: object
properties:
filesChanged: { type: array, items: { type: string } }
@@ -103,8 +113,12 @@ roles:
required: [filesChanged, summary]
reviewer:
description: "Reviews code changes"
systemPrompt: "You are a code reviewer. Review the implementation."
outputSchema:
goal: "You are a code reviewer. Review the implementation."
capabilities:
- code-review
procedure: "Review the implementation against the plan."
output: "Approve or reject with detailed comments."
meta:
type: object
properties:
approved: { type: boolean }
@@ -133,7 +147,7 @@ graph:
Key properties:
- **`roles`** — inline role definitions; each `outputSchema` is a JSON Schema (stored as its own CAS node on registration)
- **`roles`** — inline role definitions; each `meta` is a JSON Schema (stored as its own CAS node on registration)
- **`conditions`** — named JSONata expressions evaluated against the `ModeratorContext`
- **`graph`** — `Record<Role | "$START", Transition[]>` — first matching transition wins; `condition: null` = fallback
- **No agent binding** — agent selection is a deployment concern, configured in `config.yaml`
@@ -156,7 +170,7 @@ Each `uwf thread step` runs exactly one cycle: moderator → agent → extract.
│ Output: raw string (frontmatter markdown)
│ Phase 3: EXTRACT
│ Input: raw agent output + role's outputSchema
│ Input: raw agent output + role's meta schema
│ Engine: two-layer extract (frontmatter fast path → LLM fallback)
│ Output: CasRef to structured output node
@@ -213,7 +227,7 @@ Contract:
- Parses argv
- Loads `.env` from storage root
- Builds `AgentContext` by walking the CAS chain from `threads.yaml` head
- Resolves the role's `outputSchema` and builds `outputFormatInstruction`
- Resolves the role's `meta` schema and builds `outputFormatInstruction`
- Calls the agent's `run` function
- Runs two-layer extract on the raw output
- Writes `StepNode` to CAS (output + detail + prev link)
@@ -242,7 +256,7 @@ scope: role
Fixed the login redirect by updating the auth middleware...
```
The `outputFormatInstruction` (built by `buildOutputFormatInstruction` in `workflow-agent-kit`) is prepended to the role's system prompt, so the deliverable format is the first thing the agent sees. It lists the expected frontmatter fields derived from the role's JSON Schema.
The `outputFormatInstruction` (built by `buildOutputFormatInstruction` in `workflow-agent-kit`) is prepended to the role's system prompt, so the deliverable format is the first thing the agent sees. It lists the expected frontmatter fields derived from the role's `meta` JSON Schema.
## Two-layer extract
@@ -253,7 +267,7 @@ Structured output extraction uses a two-layer strategy (`workflow-agent-kit`):
1. Parse YAML frontmatter from raw agent output (`parseFrontmatterMarkdown`)
2. Validate required fields (`validateFrontmatter`)
3. Build a candidate object from frontmatter fields (`status`, `next`, `confidence`, `artifacts`, `scope`)
4. `store.put()` the candidate against the role's `outputSchema`
4. `store.put()` the candidate against the role's `meta` schema
5. Validate with `json-cas` schema validation
6. If valid → return `outputHash` (zero LLM cost)
@@ -272,7 +286,7 @@ If the fast path returns `null` (no frontmatter, invalid, or doesn't satisfy sch
`workflow-agent-kit` prepends two pieces of context to the agent's system prompt:
1. **Deliverable format instruction** — generated from the role's `outputSchema`, tells the agent exactly what frontmatter fields to produce and the expected format
1. **Deliverable format instruction** — generated from the role's `meta` schema, tells the agent exactly what frontmatter fields to produce and the expected format
2. **Scope constraint** — "Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope."
This ensures agents produce parseable frontmatter output without requiring per-agent format knowledge.
@@ -289,8 +303,11 @@ payload:
roles:
planner:
description: "Creates implementation plan"
systemPrompt: "You are a planning agent..."
outputSchema: "5GWKR8TN1V3JA" # cas_ref → JSON Schema node
goal: "You are a planning agent..."
capabilities: [planning, issue-analysis]
procedure: "Analyze the issue and create a plan."
output: "Output the plan summary."
meta: "5GWKR8TN1V3JA" # cas_ref → JSON Schema node
conditions:
notApproved:
description: "Reviewer rejected"
@@ -318,7 +335,7 @@ payload:
start: "4TNVW8KR2B3MA" # cas_ref → StartNode
prev: "2MXBG6PN4A8JR" # cas_ref → previous StepNode (null for first step)
role: "developer"
output: "9KRVW3TN5F1QA" # cas_ref → structured output (validated against outputSchema)
output: "9KRVW3TN5F1QA" # cas_ref → structured output (validated against meta schema)
detail: "7BQST3VW9F2MA" # cas_ref → execution detail (raw turns, session data)
agent: "uwf-hermes" # agent command used (plain string)
```
+779
View File
@@ -0,0 +1,779 @@
# Built-in Role Agent 调研
## 目标
实现一个内置的 role agent(暂称 `uwf-builtin`),不依赖 hermes/openclaw 等外部 agent 进程。
直接使用 workflow config 中配置的 model,自己实现 agent run loop 和关键 toolkit。
---
## 关键问题
### Q1: Agent 接口协议
现有 agent 是怎么被 CLI 调用的?输入(argv、环境变量)和输出(stdout、CAS)格式是什么?
**调研要点:**
- `cli-workflow``spawnAgent` 的完整实现
- AgentConfig 类型定义
- agent 进程的 exit code 约定
- 环境变量传递(UWF_STORAGE_ROOT 等)
**答案:**
#### 调用链
`uwf thread step``cmdThreadStepOnce` → moderator 求值下一 role → `resolveAgentConfig``spawnAgent`
#### AgentConfig 类型
```146:149:packages/workflow-protocol/src/types.ts
export type AgentConfig = {
command: string;
args: string[];
};
```
在 `config.yaml` 的 `agents` 段注册,例如 `hermes: { command: "uwf-hermes", args: [] }`。
#### spawnAgent 行为
```627:653:packages/cli-workflow/src/commands/thread.ts
function spawnAgent(agent: AgentConfig, threadId: ThreadId, role: string): CasRef {
const argv = [...agent.args, threadId, role];
let stdout: string;
try {
stdout = execFileSync(agent.command, argv, {
encoding: "utf8",
env: process.env,
stdio: ["ignore", "pipe", "pipe"],
});
} catch (e) {
// ... stderr 拼进 fail 消息
}
const line = stdout.trim().split("\n").pop()?.trim() ?? "";
if (!isCasRef(line)) {
fail(`agent stdout is not a valid CAS hash: ${line || "(empty)"}`);
}
return line;
}
```
| 项目 | 约定 |
|------|------|
| **argv** | `[...agent.args, <thread-id>, <role>]`,即 `process.argv[2]`=threadId,`process.argv[3]`=role(与 `createAgent` 的 `parseArgv` 一致) |
| **stdin** | 忽略 |
| **stdout** | 纯文本,**最后一行**必须是新 `StepNode` 的 CAS hash(13 字符 Crockford Base32) |
| **stderr** | 失败时 CLI 会附带 stderr;成功时无约定 |
| **exit code** | `0` = 成功;非 0 时 `execFileSync` 抛错,step 失败 |
| **环境变量** | 继承父进程 `process.env`(含 storage root、API key 等) |
| **链头更新** | **不由 agent 负责**;agent 只写 CAS StepNode,CLI 在拿到 stdout hash 后更新 `threads.yaml` |
Agent 解析优先级(`resolveAgentConfig`):
1. CLI `--agent` override(整段 command + args 字符串)
2. `config.agentOverrides[workflow.name][role]`
3. `config.defaultAgent`
#### 环境变量:Storage Root
文档中写的 `UWF_STORAGE_ROOT` **在当前代码中不存在**。实际优先级(`workflow-agent-kit` / `cli-workflow` 一致):
```33:43:packages/workflow-agent-kit/src/storage.ts
export function resolveStorageRoot(): string {
const internal = process.env.UNCAGED_WORKFLOW_STORAGE_ROOT;
if (internal !== undefined && internal !== "") {
return internal;
}
const userOverride = process.env.WORKFLOW_STORAGE_ROOT;
if (userOverride !== undefined && userOverride !== "") {
return userOverride;
}
return getDefaultStorageRoot();
}
```
Agent 子进程通过继承的 `process.env` 与父 CLI 共享同一 storage root;`createAgent` 内还会 `loadDotenv({ path: getEnvPath(storageRoot) })` 加载 `~/.uncaged/workflow/.env`。
#### Agent 侧职责(设计文档 + 实现)
- 读 `threads.yaml` 链头,构建 context,执行 role
- 将 `StepNode` 写入 CAS(`output` / `detail` / `agent` / `prev` / `start`)
- stdout 打印 step hash
- **不**更新 `threads.yaml`
---
### Q2: createAgent 工厂
workflow-agent-kit 的 `createAgent` 做了什么?它的完整生命周期是什么?
**调研要点:**
- `AgentOptions` 类型的 `run` 和 `continue` 回调签名
- `AgentRunResult` 的完整定义
- retry 逻辑(frontmatter 校验失败后的重试机制)
- `persistStep` 写入 CAS 的 StepNode 结构
**答案:**
#### 类型定义
```4:35:packages/workflow-agent-kit/src/types.ts
export type AgentContext = ModeratorContext & {
threadId: ThreadId;
role: string;
store: Store;
workflow: WorkflowPayload;
outputFormatInstruction: string;
};
export type AgentRunResult = {
output: string;
detailHash: CasRef;
sessionId: string;
};
export type AgentContinueFn = (
sessionId: string,
message: string,
store: AgentContext["store"],
) => Promise<AgentRunResult>;
export type AgentRunFn = (ctx: AgentContext) => Promise<AgentRunResult>;
export type AgentOptions = {
name: string;
run: AgentRunFn;
continue: AgentContinueFn;
};
```
- **`run(ctx)`**:首次执行,返回原始 agent 文本 `output`、审计用 `detailHash`、用于续聊的 `sessionId`。
- **`continue(sessionId, message, store)`**:在同一 session 上追加用户消息(用于 frontmatter 纠错),再次返回 `AgentRunResult`。
`createAgent(options)` 返回 `() => Promise<void>`,作为 agent CLI 的 `main`(见 `uwf-hermes` 的 `cli.ts`)。
#### 生命周期(按执行顺序)
```101:152:packages/workflow-agent-kit/src/run.ts
export function createAgent(options: AgentOptions): () => Promise<void> {
return async function main(): Promise<void> {
const { threadId, role } = parseArgv(process.argv);
const storageRoot = resolveStorageRoot();
loadDotenv({ path: getEnvPath(storageRoot) });
const ctx = await buildContextWithMeta(threadId, role);
// 1. 校验 role 存在
// 2. 从 CAS 取 frontmatter JSON Schema → buildOutputFormatInstruction → ctx.outputFormatInstruction
let agentResult = await options.run(ctx);
let outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
for (let retry = 0; retry < MAX_FRONTMATTER_RETRIES && outputHash === null; retry++) {
const correctionMessage = "Your previous response did not contain valid YAML frontmatter...";
agentResult = await options.continue(agentResult.sessionId, correctionMessage, ctx.meta.store);
outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
}
if (outputHash === null) { fail(...); }
const stepHash = await persistStep({ ctx, outputHash, detailHash: agentResult.detailHash, agentName });
process.stdout.write(`${stepHash}\n`);
};
}
```
| 阶段 | 行为 |
|------|------|
| 解析 argv | `argv[2]=threadId`, `argv[3]=role`,缺失则 `stderr` + `exit(1)` |
| Context | `buildContextWithMeta` + 可选 `outputFormatInstruction` |
| Run | `options.run(ctx)` |
| Extract | **仅** `tryFrontmatterFastPath`(见 Q4);**不**调用 `extract()` LLM fallback |
| Retry | 最多 `MAX_FRONTMATTER_RETRIES = 2` 次 `continue` + 再试 fast-path |
| Persist | `persistStep` → `writeStepNode` |
| 输出 | stdout 一行 step CAS hash |
#### StepNode 写入结构
```44:68:packages/workflow-agent-kit/src/run.ts
async function writeStepNode(options: {
store: AgentStore["store"];
schemas: AgentStore["schemas"];
startHash: CasRef;
prevHash: CasRef | null;
role: string;
outputHash: CasRef;
detailHash: CasRef;
agentName: string;
}): Promise<CasRef> {
const payload: StepNodePayload = {
start: options.startHash,
prev: options.prevHash,
role: options.role,
output: options.outputHash,
detail: options.detailHash,
agent: options.agentName,
};
// store.put(stepNode schema) + validate
}
```
`agentName` 经 `agentLabel(name)` 规范化:已有 `uwf-` 前缀则原样,否则加 `uwf-`(如 `hermes` → `uwf-hermes`)。
`prevHash`:若链头仍是 `StartNode` 则为 `null`,否则为当前 head step hash。
---
### Q3: Context Builder
`buildContextWithMeta` 构建了什么上下文给 agent?
**调研要点:**
- `AgentContext` 完整类型定义(所有字段)
- context 构建过程(CAS chain walk)
- `outputFormatInstruction` 怎么生成的
- role definition 怎么获取(从 workflow YAML)
**答案:**
#### AgentContext 字段
继承 `ModeratorContext`:
```60:68:packages/workflow-protocol/src/types.ts
export type ModeratorContext = {
start: StartNodePayload;
steps: StepContext[];
};
```
```48:51:packages/workflow-protocol/src/types.ts
export type StartNodePayload = {
workflow: CasRef;
prompt: string;
};
```
```61:63:packages/workflow-protocol/src/types.ts
export type StepContext = Omit<StepRecord, "output"> & {
output: unknown;
};
```
`AgentContext` 额外字段:
| 字段 | 类型 | 含义 |
|------|------|------|
| `threadId` | `ThreadId` | 当前线程 |
| `role` | `string` | 本步要执行的角色名 |
| `store` | `Store` | CAS store(读写节点) |
| `workflow` | `WorkflowPayload` | 已从 CAS 加载的 workflow 定义 |
| `outputFormatInstruction` | `string` | 由 `createAgent` 根据 role 的 frontmatter schema 生成;`buildContext*` 初始为 `""` |
`buildContextWithMeta` 还返回 `meta`:
```148:154:packages/workflow-agent-kit/src/context.ts
export type BuildContextMeta = {
storageRoot: string;
store: Store;
schemas: AgentStore["schemas"];
headHash: CasRef;
chain: ChainState;
};
```
#### CAS chain walk
1. 从 `threads.yaml[threadId]` 取 `headHash`
2. `walkChain`:若 head 是 `StartNode`,`stepsNewestFirst=[]`;否则沿 `prev` 收集所有 `StepNode`, newest-first
3. `buildHistory`:反转为时间序,`expandOutput` 把每步 `output` CasRef 展开为 JSON payload(供 prompt / JSONata 使用)
4. `loadWorkflow`:从 `start.workflow` CasRef 加载 `WorkflowPayload`
#### Role definition 来源
- 作者写在 workflow YAML 的 `roles.<name>`(`goal`, `capabilities`, `procedure`, `output`, `frontmatter` 等)
- `uwf workflow put` 时 `frontmatter` 内联 JSON Schema 经 `putSchema` 存入 CAS,workflow 里存的是 **CasRef**
- Agent 运行时:`ctx.workflow.roles[ctx.role]` → `RoleDefinition`
#### outputFormatInstruction
在 `createAgent` 中,若 `getSchema(store, roleDef.frontmatter)` 非空,则:
```typescript
ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
```
`buildOutputFormatInstruction` 根据 JSON Schema 的 `properties` 生成「必须以 `---` YAML frontmatter 开头」的说明和示例字段列表(见 `build-output-format-instruction.ts`)。
各 agent 实现(Hermes / Claude Code)在组装 prompt 时把该块放在最前,再接 `buildRolePrompt(roleDef)`。
---
### Q4: Extract Pipeline
agent 输出怎么被处理成结构化数据?
**调研要点:**
- frontmatter fast-path 的完整逻辑
- LLM extract fallback 的实现(`extract.ts`)
- frontmatter schema 从哪里来(role 定义里的 `frontmatter` 字段)
- 校验失败时的 correction prompt 是什么
**答案:**
#### Schema 来源
Workflow YAML 中每个 role 的 `frontmatter:` 段是 JSON Schema 对象;注册时:
```66:76:packages/cli-workflow/src/commands/workflow.ts
async function resolveFrontmatterRef(..., frontmatter: unknown): Promise<CasRef> {
// 校验为 JSON Schema → putSchema → 返回 CasRef
}
```
运行时 `roleDef.frontmatter` 即该 schema 的 CAS hash;structured `output` 节点用**同一 schema** 写入 CAS。
#### Frontmatter fast-path(createAgent 实际使用的路径)
```148:195:packages/workflow-agent-kit/src/frontmatter.ts
export async function tryFrontmatterFastPath(
raw: string,
outputSchema: CasRef,
store: Store,
): Promise<FrontmatterFastPathResult | null>
```
流程:
1. `parseFrontmatterMarkdown(raw)` → 标准 agent 字段(`status`, `next`, `confidence`, `artifacts`, `scope`)+ body
2. `validateFrontmatter` 失败 → `null`
3. `getSchema(store, outputSchema)` + `extractSchemaFields` 得到 role 需要的属性名
4. `buildCandidate`:从标准 frontmatter + YAML 原始字段拼出符合 schema 的对象
5. `store.put(outputSchema, candidate)` + `validate` → 成功则 `{ body, outputHash }`
**永不抛错**,失败返回 `null`。
#### LLM extract fallback(已实现但未接入 createAgent)
```135:181:packages/workflow-agent-kit/src/extract.ts
export async function extract(
rawOutput: string,
outputSchema: CasRef,
config: WorkflowConfig,
): Promise<ExtractResult>
```
- 模型:`resolveExtractModelAlias(config)` → `modelOverrides.extract` → `models.extract` → `models.default` → `defaultModel`
- HTTP:`POST {baseUrl}/chat/completions`,`response_format: { type: "json_object" }`
- System:要求按 JSON Schema 从 agent 输出提取单个 JSON 对象
- 校验通过后 `store.put(outputSchema, structured)`
**重要:`createAgent` 当前未调用 `extract()`**。fast-path 失败且 2 次 `continue` 仍失败则直接 `fail()`。builtin agent 若希望无 frontmatter 也能跑,需在 kit 或 builtin 层显式接入 `extract()`。
#### Correction prompt(retry)
```125:128:packages/workflow-agent-kit/src/run.ts
const correctionMessage =
"Your previous response did not contain valid YAML frontmatter matching the role schema.\n" +
"You MUST begin your response with a YAML frontmatter block (--- delimited).\n" +
"Please output ONLY the corrected frontmatter block followed by your work.";
```
通过 `options.continue(sessionId, correctionMessage, store)` 发给外部 agent;builtin 需在自有 message 历史里 append 同等语义的 user 消息。
---
### Q5: Model 配置与 LLM 调用
workflow 怎么配置和使用 model?
**调研要点:**
- `WorkflowConfig` 中 providers/models/defaultModel/modelOverrides 的完整定义
- `resolveModel` 函数的实现
- `chatCompletionText` 的实现(OpenAI 兼容 HTTP 客户端)
- 有没有 streaming 支持?tool calling 支持?
**答案:**
#### WorkflowConfig
```136:160:packages/workflow-protocol/src/types.ts
export type ProviderConfig = {
baseUrl: string;
apiKeyEnv: string;
};
export type ModelConfig = {
provider: ProviderAlias;
name: string;
};
export type WorkflowConfig = {
providers: Record<ProviderAlias, ProviderConfig>;
models: Record<ModelAlias, ModelConfig>;
agents: Record<AgentAlias, AgentConfig>;
defaultAgent: AgentAlias;
agentOverrides: Record<WorkflowName, Record<RoleName, AgentAlias>> | null;
defaultModel: ModelAlias;
modelOverrides: Record<Scenario, ModelAlias> | null;
};
```
示例见 `docs/architecture.md`(`providers` / `models` / `defaultModel` / `modelOverrides.extract`)。
#### resolveModel
```32:50:packages/workflow-agent-kit/src/extract.ts
export function resolveModel(config: WorkflowConfig, alias: ModelAlias): ResolvedLlmProvider {
const modelEntry = config.models[alias];
const providerEntry = config.providers[modelEntry.provider];
const apiKey = process.env[providerEntry.apiKeyEnv];
return { baseUrl: providerEntry.baseUrl, apiKey, model: modelEntry.name };
}
```
`ResolvedLlmProvider = { baseUrl, apiKey, model }`。
Extract 专用别名解析:
```18:30:packages/workflow-agent-kit/src/extract.ts
export function resolveExtractModelAlias(config: WorkflowConfig): ModelAlias {
return config.modelOverrides?.extract ?? (config.models.extract ? "extract" : config.models.default ? "default" : config.defaultModel);
}
```
**尚无** `modelOverrides` 按 role/workflow 解析 agent 主模型的函数;builtin 首版可用 `config.defaultModel`,扩展时可加 `modelOverrides.agent` 或与 `agentOverrides` 对称的表。
#### chatCompletionText
```87:124:packages/workflow-agent-kit/src/extract.ts
async function chatCompletionText(
provider: ResolvedLlmProvider,
messages: Array<{ role: "system" | "user"; content: string }>,
): Promise<string>
```
| 能力 | 现状 |
|------|------|
| 协议 | OpenAI 兼容 `POST /chat/completions` |
| Streaming | **无**(一次性 `response.text()`) |
| Tool calling | **无**(无 `tools` / `tool_calls` 字段) |
| 多模态 | **无**(仅 text `content`) |
| Extract 专用 | `response_format: { type: "json_object" }` |
builtin agent 的 run loop 需要**新写**带 `tools` 的 completion 客户端(可放在 `workflow-agent-builtin` 或扩展 `workflow-agent-kit` 的 `llm/` 模块),不能复用当前 `chatCompletionText` 而不改。
---
### Q6: Hermes Agent 参考实现
`uwf-hermes` 是怎么实现 `run` 和 `continue` 的?
**调研要点:**
- prompt 怎么组装的(outputFormatInstruction + rolePrompt + task + history)
- hermes CLI 的调用参数
- session management(resume)
- 输出怎么捕获
**答案:**
#### Prompt 组装
```40:53:packages/workflow-agent-hermes/src/hermes.ts
export function buildHermesPrompt(ctx: AgentContext): string {
const roleDef = ctx.workflow.roles[ctx.role];
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
const parts: string[] = [];
if (ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
parts.push("", historyBlock);
}
return parts.join("\n");
}
```
`buildRolePrompt` 生成 `## Goal` / `## Capabilities` / `## Prepare`(含 `generateCliReference()`)/ `## Procedure` / `## Output`。
`buildHistorySummary`:每步 `role`、`JSON.stringify(step.output)`、`agent`。
Hermes 把**整段 prompt 作为单条 user 消息**传给 `hermes chat -q`(无独立 system channel)。
#### Hermes CLI 参数
首次:
```88:97:packages/workflow-agent-hermes/src/hermes.ts
spawnHermes(["chat", "-q", prompt, "--yolo", "--max-turns", "90", "--quiet"]);
```
续聊:
```100:114:packages/workflow-agent-hermes/src/hermes.ts
spawnHermes(["chat", "--resume", sessionId, "-q", message, "--yolo", "--max-turns", "90", "--quiet"]);
```
#### Session
- stdout/stderr 中解析 `session_id: <id>`(`parseSessionIdFromStdout`)
- 会话文件:`~/.hermes/sessions/session_<id>.json`
- `loadHermesSession` → `storeHermesSessionDetail`:每 assistant/tool 消息写成 CAS turn 节点,汇总为 `detail`;**output 文本** = 最后一条非空 `assistant` 的 `content`
#### 与 createAgent 的衔接
```157:164:packages/workflow-agent-hermes/src/hermes.ts
export function createHermesAgent(): () => Promise<void> {
return createAgent({ name: "hermes", run: runHermes, continue: continueHermes });
}
```
`uwf-hermes` 入口:`createHermesAgent()` 即 main。
Claude Code 包(`workflow-agent-claude-code`)结构相同:`buildClaudeCodePrompt` 同构,`claude -p` + `--resume` + JSON stdout 解析。
---
### Q7: Toolkit 需求分析
要实现一个自给自足的 agent,最少需要哪些 tool?
**调研要点:**
- 现有 workflow example(solve-issue.yaml)里 role 都做什么任务
- hermes agent 在 workflow 场景下常用哪些 tool
- 哪些 tool 是 agent loop 必须的(如 file read/write、shell exec、web fetch)
**答案:**
#### solve-issue.yaml 角色能力
| Role | capabilities | 隐含需求 |
|------|----------------|----------|
| planner | issue-analysis, planning | 读上下文/仓库、总结,通常不需写代码 |
| developer | file-edit, shell, testing | **读文件、写文件、执行命令** |
| reviewer | code-review, static-analysis | 读 diff/文件、静态分析(可读+可选 shell) |
#### Hermes 侧
Hermes 自带完整 agent runtime(`--yolo`、max-turns),tool 集由 Hermes 项目定义,workflow 不配置。从 session JSON 可见 `tool_calls` 被记入 detail,常见包括文件与 shell 类工具。
#### Builtin 最小 toolkit 建议
| 优先级 | Tool | 用途 |
|--------|------|------|
| P0 | `read_file` | 读仓库/配置/issue 上下文 |
| P0 | `write_file` / `edit_file` | developer 改代码 |
| P0 | `run_command` | 测试、构建、git(需 cwd + timeout + 输出截断) |
| P1 | `list_dir` / `glob` | 导航代码库 |
| P1 | `grep` | 搜索符号/引用 |
| P2 | `fetch_url` | 查文档(planner 偶尔需要) |
**不需要**在 builtin 里实现 moderator / workflow 路由工具——仍由 `uwf thread step` + JSONata 负责。
#### Agent loop 必须能力
1. 多轮 LLM 调用 + **OpenAI-style tool_calls** 解析与执行
2. 将 tool 结果 append 回 messages
3. 终止条件:模型不再请求 tool,或达到 `maxTurns`
4. 最终响应须含合法 YAML frontmatter(满足 Q4),供 `createAgent` fast-path
---
## 方案草案
(调研完成后基于以上答案撰写)
### 架构设计
```mermaid
flowchart TB
subgraph cli ["cli-workflow"]
Step["uwf thread step"]
Spawn["spawnAgent(uwf-builtin, threadId, role)"]
Step --> Spawn
end
subgraph builtin_pkg ["@uncaged/workflow-agent-builtin"]
Main["createBuiltinAgent() = createAgent({...})"]
Prompt["buildBuiltinPrompt(ctx)"]
Loop["runBuiltinLoop(provider, messages, tools)"]
Tools["Toolkit: read/write/exec/..."]
Detail["storeBuiltinDetail(turns)"]
Main --> Prompt
Main --> Loop
Loop --> Tools
Loop --> Detail
end
subgraph kit ["workflow-agent-kit"]
Ctx["buildContextWithMeta"]
FM["tryFrontmatterFastPath"]
Persist["persistStep"]
Ctx --> Main
Main --> FM
FM --> Persist
end
subgraph cas ["CAS / config"]
Config["config.yaml models/providers"]
CAS["cas/ + threads.yaml"]
end
Spawn --> Main
Config --> Loop
CAS --> Ctx
Persist --> CAS
Spawn -->|"stdout: step hash"| Step
```
**新包**:`packages/workflow-agent-builtin`,bin `uwf-builtin`,仅依赖 `workflow-agent-kit`、`workflow-protocol`、`workflow-util`(可选 `@uncaged/json-cas` 写 detail schema)。
**分层**:
| 层 | 职责 |
|----|------|
| `createAgent`(kit) | argv、context、frontmatter extract、StepNode、stdout 协议 — **不变** |
| `builtin/agent.ts` | `run` / `continue` 实现 |
| `builtin/llm.ts` | OpenAI 兼容 chat + tools(可后续抽到 kit) |
| `builtin/tools/*.ts` | 各 tool 的 JSON Schema + handler |
| `builtin/prompt.ts` | 复用 Hermes 的 prompt 拼接逻辑(或抽到 kit 的 `buildAgentPrompt`) |
| `builtin/detail.ts` | 类似 Hermes:每轮 assistant/tool 写入 CAS detail |
**配置集成**:
```yaml
agents:
builtin:
command: "uwf-builtin"
args: []
defaultAgent: "builtin" # 或 agentOverrides 按 role 指定
```
模型:首版 `resolveModel(config, config.defaultModel)`;后续可增加 `modelOverrides.agent` 或 per-role 映射。
---
### Agent Run Loop
伪代码(单次 `run(ctx)`):
```
1. provider ← resolveModel(loadWorkflowConfig(), defaultModel)
2. system ← buildBuiltinPrompt(ctx) // outputFormatInstruction + buildRolePrompt + Task + History
3. messages ← [{ role: "system", content: system }]
4. sessionId ← newULID() // 内存或临时目录,供 continue 使用
5. turns ← []
6. for turn in 1..MAX_TURNS:
response ← chatCompletionWithTools(provider, messages, TOOL_DEFINITIONS)
record assistant message + tool_calls in turns
if response has no tool_calls:
finalText ← response.content
break
for each tool_call:
result ← executeTool(tool_call, { cwd: process.cwd() })
messages.push tool result
record in turns
7. if no finalText with valid frontmatter after loop:
optionally one-shot "finalize" message without tools
8. detailHash ← storeBuiltinDetail(store, sessionId, turns, metadata)
9. return { output: finalText, detailHash, sessionId }
```
**`continue(sessionId, message, store)`**:
- 从内存/磁盘恢复 `messages` + `turns`
- `messages.push({ role: "user", content: message })`(correction 或续聊)
- 从步骤 6 继续,步数上限可单独设小一点(如 3)
- 返回新的 `AgentRunResult`
**与 frontmatter 的配合**:
- system prompt 已含 `outputFormatInstruction`;最后一轮可强制 user:`Now output your final answer with YAML frontmatter only if you have not yet.`
- 仍依赖 `createAgent` 的 fast-path + 最多 2 次 continue
**安全**:
- `run_command`:白名单或需 `UWF_BUILTIN_ALLOW_SHELL=1`,默认工作区限定在 `process.cwd()` 或 `start` 中将来扩展的 `workspace` 字段
- 路径:禁止 `..` 逃逸出 workspace root
---
### Toolkit 设计
统一注册表:
```typescript
type BuiltinTool = {
name: string;
description: string;
parameters: JSONSchema; // object type
execute: (args: unknown, ctx: ToolContext) => Promise<string>;
};
type ToolContext = {
cwd: string;
storageRoot: string;
};
```
| Tool name | OpenAI function | 行为摘要 |
|-----------|-----------------|----------|
| `read_file` | `read_file` | `{ path }` → UTF-8 文本,大小上限 |
| `write_file` | `write_file` | `{ path, content }` → 写盘,返回确认 |
| `edit_file` | 可选 | search/replace 块,减少 token |
| `run_command` | `run_command` | `{ command, cwd? }` → stdout/stderr 截断 |
| `list_dir` | `list_dir` | `{ path }` → 条目列表 |
| `grep` | `grep` | `{ pattern, path? }` → 匹配行 |
**LLM 请求形状**(扩展 extract 客户端):
```json
{
"model": "...",
"messages": [...],
"tools": [{ "type": "function", "function": { "name", "description", "parameters" } }],
"tool_choice": "auto"
}
```
解析 `choices[0].message.tool_calls`,执行后以 `{ role: "tool", tool_call_id, content }` 回传。
**不提供** streaming 首版;detail CAS 记录每轮 tool 名/参数/结果摘要供 `uwf thread step-details` 调试。
---
### 与现有架构的集成
| 集成点 | 方式 |
|--------|------|
| CLI 协议 | 实现标准 agent CLI:`uwf-builtin <thread-id> <role>`,stdout 一行 step hash,exit 0/1 |
| 工厂 | `export function createBuiltinAgent()` → `createAgent({ name: "builtin", run, continue })` |
| Context / Prompt | 复用 `buildContextWithMeta`、`buildRolePrompt`、`buildOutputFormatInstruction`;prompt 布局对齐 `buildHermesPrompt` |
| 结构化输出 | 优先 YAML frontmatter fast-path;可选后续在 `createAgent` 增加 `extract()` fallback 开关 |
| 配置 | `config.yaml` 增加 `agents.builtin`;`uwf setup` 可选默认 agent |
| 存储 | `resolveStorageRoot()` + `loadWorkflowConfig` + `getEnvPath`;与 Hermes 相同,**不**改 `threads.yaml` 写入方 |
| 测试 | 单元测试:tool handlers、prompt 组装、mock LLM tool loop;集成测试:临时 storage root + fake provider |
| 发布 | 新包 `@uncaged/workflow-agent-builtin`,bin `uwf-builtin`,加入 `scripts/publish-all.mjs` |
**明确不做**:
- 不替代 moderator / 不在 agent 内调用 `uwf thread step`
- 不依赖 Hermes/OpenClaw/Claude Code 二进制
- 首版不实现 streaming、不实现 MCP
**建议实现顺序**:
1. `llm.ts`:tool calling HTTP 客户端 + 单测
2. P0 tools + `runBuiltinLoop`
3. `createBuiltinAgent` + detail CAS
4. `config` / docs / `examples` 可选 `agentOverrides` 演示
5. (可选)`createAgent` 接入 `extract()` fallback
@@ -0,0 +1,73 @@
# Issue #418: ACP session/resume 返回空文本
## 调研日期: 2026-05-23
## 根因
`session/resume` 在 restore 路径下 `_make_agent()` 失败,异常被静默吞掉。
### 完整调用链
```
resume_session(sid)
→ update_cwd(sid)
→ get_session(sid) → _restore(sid)
→ _make_agent()
→ resolve_runtime_provider("custom") 失败(line 548-561)
→ AIAgent() 抛出 "No LLM provider configured"(line 564)
→ except Exception 静默吞掉(line 482-484)→ return None
→ return None
→ state is None → fallback: create_session()(新 sid,无历史)
```
### 关键代码位置(acp_adapter/session.py)
- `_restore()` line 426-498: 从 DB 恢复 session,但 except 太宽泛
- `_make_agent()` line 520-568: provider 解析在 restore 路径下不完整
- Line 548-561: `resolve_runtime_provider("custom")` 失败后,`base_url` 虽然从 DB 取到了但没传给 AIAgent
### 实测行为
1. Phase 1: `session/new` + `prompt` → 正常,有 `agent_message_chunk`
2. Phase 2: `session/resume` + `prompt`
- resume 返回成功,但 `available_commands_update` 里 sessionId 是新的(create_session fallback)
- 用原始 sid 发 prompt → `stopReason: "refusal"`(session 不在内存中)
- 用新 sid 发 prompt → 能跑但无历史(agent 回答"不知道 secret code")
### 验证脚本
```python
# 直接调用 _restore 验证
cd ~/.hermes/hermes-agent
python3 -c "
import sys; sys.path.insert(0, '.')
from acp_adapter.session import SessionManager
sm = SessionManager()
result = sm._restore('SESSION_ID_HERE')
print(result) # None — _make_agent 抛异常被吞掉
"
```
### 两个 bug
1. **`_make_agent` provider fallback 不完整**: restore 时 DB 里有 `base_url``api_mode`,但 `resolve_runtime_provider` 失败后这些值没被正确传递给 AIAgent
2. **`_restore` 的 except 太宽泛**: 静默吞掉所有异常,连 warning 都只在 debug 级别,导致 resume 失败完全无感知
### Hermes 版本
- v0.10.0 (2026.4.16) — 初始测试
- v0.14.0 (2026.5.16) — 更新后重新测试,bug 仍在
- 代码路径: ~/.hermes/hermes-agent/acp_adapter/session.py
### v0.14.0 测试结果 (2026-05-23)
- `_restore` 仍因 `custom` provider 解析失败返回 None
- 日志更清晰了:`WARNING: Failed to recreate agent for ACP session ...`
- resume fallback 创建新 session(新 sid),但 agent 居然能回答之前的问题(可能通过 memory/session search)
- 核心问题不变:sessionId 变了,client 用旧 sid 发 prompt → refusal
### 上游 Issue
- https://github.com/NousResearch/hermes-agent/issues/13489 — 已评论根因分析
- https://github.com/NousResearch/hermes-agent/issues/8083 — resume 静默创建新 session
- https://github.com/NousResearch/hermes-agent/issues/18452 — _make_agent fallback 不完整
+27 -15
View File
@@ -112,8 +112,8 @@ uwf-hermes <thread-id> <role>
**约定:**
- `uwf step` 负责 moderator 决策,将 role 传给 agent CLI
- agent-kit 根据 thread + role 从 CAS 读 systemPrompt / outputSchema
- agent-kit 组装完整 prompt(role systemPrompt + thread context + user prompt from StartNode)
- agent-kit 根据 thread + role 从 CAS 读 goal / capabilities / procedure / output / meta
- agent-kit 组装完整 prompt(role goal/capabilities/procedure/output + thread context + user prompt from StartNode)
- agent 执行实际逻辑,agent-kit 负责 extract
- agent 将 StepNode 写入 CAS(含 output、detail、agent、prev),但**不挪链头指针**
- stdout 输出新 StepNode 的 CAS hash(纯文本,一行)
@@ -143,7 +143,7 @@ uwf-hermes <thread-id> <role>
#### `Workflow`
Roles 和 moderator 内联在 Workflow 中,只有 outputSchema 独立为 CAS 节点(方便 json-cas 校验)。
Roles 和 moderator 内联在 Workflow 中,只有 meta 独立为 CAS 节点(方便 json-cas 校验)。
```yaml
type: <workflow-schema-hash>
@@ -153,16 +153,25 @@ payload:
roles:
planner:
description: "Creates implementation plan"
systemPrompt: "You are a planning agent..."
outputSchema: "5GWKR8TN1V3JA" # cas_ref → JSON Schema 节点(json-cas 内置)
goal: "You are a planning agent..."
capabilities: [planning, issue-analysis]
procedure: "Analyze the issue and create a plan."
output: "Output the plan summary."
meta: "5GWKR8TN1V3JA" # cas_ref → JSON Schema 节点(json-cas 内置)
developer:
description: "Implements code changes"
systemPrompt: "You are a developer agent..."
outputSchema: "8CNWT4KR6D1HV" # cas_ref → JSON Schema 节点
goal: "You are a developer agent..."
capabilities: [file-edit, shell]
procedure: "Implement the plan."
output: "List all files changed."
meta: "8CNWT4KR6D1HV" # cas_ref → JSON Schema 节点
reviewer:
description: "Reviews code changes"
systemPrompt: "You are a code reviewer..."
outputSchema: "1VPBG9SM5E7WK" # cas_ref → JSON Schema 节点
goal: "You are a code reviewer..."
capabilities: [code-review]
procedure: "Review the implementation."
output: "Approve or reject with comments."
meta: "1VPBG9SM5E7WK" # cas_ref → JSON Schema 节点
conditions:
needsClarification:
description: "Planner requests clarification from user"
@@ -189,7 +198,7 @@ payload:
condition: null
```
- `roles` — 内联定义,每个 role 的 `outputSchema` 是独立的 cas_ref(指向 json-cas 内置 JSON Schema 节点)
- `roles` — 内联定义,每个 role 的 `meta` 是独立的 cas_ref(指向 json-cas 内置 JSON Schema 节点)
- `conditions``Record<Name, JSONata>`,命名条件,方便画图描述
- `graph``Record<Role | "$START", Transition[]>`,每个 Transition = `{ role, condition }`
- `condition` 引用 conditions 中的 key,`null` = fallback
@@ -234,14 +243,14 @@ payload:
start: "4TNVW8KR2B3MA" # cas_ref → StartNode(每个 step 都引用)
prev: "2MXBG6PN4A8JR" # cas_ref → 前一个 StepNode,第一步为 null
role: "developer"
output: "9KRVW3TN5F1QA" # cas_ref → 结构化输出节点(符合 role 的 outputSchema)
output: "9KRVW3TN5F1QA" # cas_ref → 结构化输出节点(符合 role 的 meta schema)
detail: "7BQST3VW9F2MA" # cas_ref → 执行详情(content node / 子 workflow terminal StepNode / ...)
agent: "uwf-cursor" # 实际使用的 agent 命令(纯字符串)
```
- `start` — 每个 StepNode 都直接引用 StartNode,方便随机访问
- `prev` — 前一个 StepNode 的 cas_ref,第一步为 `null`(不指向 StartNode)
- `output` — cas_ref,指向符合 role outputSchema 的 CAS 节点,可用 json-cas 校验
- `output` — cas_ref,指向符合 role meta schema 的 CAS 节点,可用 json-cas 校验
- `detail` — cas_ref,指向执行详情。可以是原始 agent 输出(content node),也可以是子 workflow thread 的 terminal StepNode(workflowAsAgent 场景)
- `agent` — 纯字符串,不是 CAS 节点
@@ -372,7 +381,7 @@ type ThreadId = string;
/** 一个 step 的核心数据,被 StepNode payload 和 JSONata 上下文共享 */
type StepRecord = {
role: string;
output: CasRef; // cas_ref → 结构化输出节点(符合 role outputSchema)
output: CasRef; // cas_ref → 结构化输出节点(符合 role meta schema)
detail: CasRef; // cas_ref → 执行详情(content node / 子 workflow terminal StepNode)
agent: string; // 实际使用的 agent 命令(纯字符串)
};
@@ -383,8 +392,11 @@ type StepRecord = {
```typescript
type RoleDefinition = {
description: string;
systemPrompt: string;
outputSchema: CasRef; // cas_ref → json-cas 内置 JSON Schema 节点
goal: string;
capabilities: string[];
procedure: string;
output: string;
meta: CasRef; // cas_ref → json-cas 内置 JSON Schema 节点
};
type Transition = {
+43
View File
@@ -0,0 +1,43 @@
name: "analyze-topic"
description: "Single-role topic analysis using four-phase role description"
roles:
analyst:
description: "Analyzes a given topic and produces a structured summary"
goal: |
You are a research analyst with expertise in breaking down complex topics
into clear, structured summaries. You think critically and cite key points.
capabilities:
- research
- critical-thinking
- structured-writing
procedure: |
Analyze the topic by:
1. Identifying the main thesis or question
2. Listing 3-5 key points with brief explanations
3. Noting any counterarguments or caveats
Keep your analysis concise (under 500 words).
output: |
Provide your analysis as markdown under the frontmatter.
The frontmatter must include your structured findings.
frontmatter:
type: object
properties:
thesis:
type: string
keyPoints:
type: array
items:
type: string
caveats:
type: string
required: [thesis, keyPoints]
conditions: {}
graph:
$START:
- role: "analyst"
condition: null
prompt: "Analyze the topic in the task and produce a structured summary with key points."
analyst:
- role: "$END"
condition: null
prompt: "Analysis complete. Finish the workflow."
+77
View File
@@ -0,0 +1,77 @@
name: "debate"
description: "Structured debate between two sides. Tests cross-process session resume."
roles:
against:
description: "Argues against the proposition"
goal: |
You are a skilled debater arguing AGAINST the proposition.
Be logical, cite evidence, and directly address your opponent's points.
Keep each argument concise (under 200 words).
capabilities:
- argumentation
- critical-thinking
procedure: |
1. If this is the opening, present your strongest argument against the proposition.
2. If responding to the other side, directly counter their points with evidence and logic.
3. If you find yourself genuinely convinced by the other side, you may concede.
output: |
Provide your argument in the frontmatter.
Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
frontmatter:
type: object
properties:
argument:
type: string
conceded:
type: boolean
required: [argument, conceded]
for:
description: "Argues for the proposition"
goal: |
You are a skilled debater arguing FOR the proposition.
Be logical, cite evidence, and directly address your opponent's points.
Keep each argument concise (under 200 words).
capabilities:
- argumentation
- critical-thinking
procedure: |
1. Read the opposing side's latest argument carefully.
2. Counter their points with evidence and logic.
3. If you find yourself genuinely convinced by the other side, you may concede.
output: |
Provide your argument in the frontmatter.
Set conceded to true ONLY if you are genuinely convinced and wish to stop debating.
frontmatter:
type: object
properties:
argument:
type: string
conceded:
type: boolean
required: [argument, conceded]
conditions:
againstConceded:
description: "The against side conceded"
expression: "$last('against').conceded = true"
forConceded:
description: "The for side conceded"
expression: "$last('for').conceded = true"
graph:
$START:
- role: "against"
condition: null
prompt: "Present your opening argument against the proposition."
against:
- role: "$END"
condition: "againstConceded"
prompt: "The against side conceded. Debate over."
- role: "for"
condition: null
prompt: "Counter the opposing argument. Address their points directly."
for:
- role: "$END"
condition: "forConceded"
prompt: "The for side conceded. Debate over."
- role: "against"
condition: null
prompt: "Counter the opposing argument. Address their points directly."
+28 -7
View File
@@ -3,8 +3,13 @@ description: "End-to-end issue resolution"
roles:
planner:
description: "Creates implementation plan"
systemPrompt: "You are a planning agent. Analyze the issue and create a step-by-step plan."
outputSchema:
goal: "You are a planning agent. You analyze issues and create step-by-step plans."
capabilities:
- issue-analysis
- planning
procedure: "Analyze the issue and create a detailed, actionable implementation plan."
output: "Output the plan summary and list of concrete steps."
frontmatter:
type: object
properties:
plan:
@@ -16,8 +21,14 @@ roles:
required: [plan, steps]
developer:
description: "Implements code changes"
systemPrompt: "You are a developer agent. Implement the plan."
outputSchema:
goal: "You are a developer agent. You implement code changes according to plans."
capabilities:
- file-edit
- shell
- testing
procedure: "Implement the plan. Write code, tests, and ensure existing tests pass."
output: "List all files changed and provide a summary of the implementation."
frontmatter:
type: object
properties:
filesChanged:
@@ -29,8 +40,13 @@ roles:
required: [filesChanged, summary]
reviewer:
description: "Reviews code changes"
systemPrompt: "You are a code reviewer. Review the implementation."
outputSchema:
goal: "You are a code reviewer. You review implementations for correctness and quality."
capabilities:
- code-review
- static-analysis
procedure: "Review the implementation against the plan. Check for bugs, edge cases, and style."
output: "Approve or reject with detailed comments explaining your decision."
frontmatter:
type: object
properties:
approved:
@@ -41,19 +57,24 @@ roles:
conditions:
notApproved:
description: "Reviewer rejected the implementation"
expression: "steps[-1].output.approved = false"
expression: "$last('reviewer').approved = false"
graph:
$START:
- role: "planner"
condition: null
prompt: "Analyze the issue described in the task and produce a detailed implementation plan."
planner:
- role: "developer"
condition: null
prompt: "Implement the plan from the planner. Write code, tests, and ensure existing tests pass."
developer:
- role: "reviewer"
condition: null
prompt: "Review the developer's implementation against the plan for correctness and quality."
reviewer:
- role: "developer"
condition: "notApproved"
prompt: "The reviewer rejected your implementation. Read their feedback and fix the issues."
- role: "$END"
condition: null
prompt: "The review passed. Complete the workflow."
@@ -1,6 +1,6 @@
import { describe, expect, test } from "bun:test";
import { packageDescriptor } from "../src/package-descriptor.js";
import { createDocxDiffAgent } from "../src/agent.js";
import { packageDescriptor } from "../src/package-descriptor.js";
describe("createDocxDiffAgent", () => {
test("returns an AdapterFn (function)", () => {
@@ -1,8 +1,8 @@
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { describe, expect, mock, test } from "bun:test";
import { ok, err } from "@uncaged/workflow-util";
import { mkdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { err, ok } from "@uncaged/workflow-util";
import type { SpawnCliConfig } from "@uncaged/workflow-util-agent";
import { runDocxDiff } from "../src/runner.js";
@@ -74,7 +74,12 @@ describe("runDocxDiff", () => {
test("exit 2: throws error", async () => {
const dir = tempDir();
const spawnFn = makeSpawn(
err({ kind: "non_zero_exit", exitCode: 2, stdout: "", stderr: "fatal error" }) as MockSpawnResult,
err({
kind: "non_zero_exit",
exitCode: 2,
stdout: "",
stderr: "fatal error",
}) as MockSpawnResult,
);
await expect(
@@ -1,7 +1,11 @@
{
"name": "@uncaged/workflow-agent-docx-diff",
"version": "0.1.0",
"files": ["src", "dist", "package.json"],
"files": [
"src",
"dist",
"package.json"
],
"type": "module",
"types": "src/index.ts",
"exports": {
@@ -1,7 +1,12 @@
import * as z from "zod/v4";
import { dirname, join } from "node:path";
import type { AdapterFn, RoleResult, ThreadContext, WorkflowRuntime } from "@uncaged/workflow-runtime";
import type {
AdapterFn,
RoleResult,
ThreadContext,
WorkflowRuntime,
} from "@uncaged/workflow-runtime";
import type { WriterMeta } from "@uncaged/workflow-template-document";
import type * as z from "zod/v4";
import { runDocxDiff } from "./runner.js";
import type { DocxDiffAgentConfig } from "./types.js";
@@ -12,16 +17,10 @@ export function createDocxDiffAgent(config: DocxDiffAgentConfig): AdapterFn {
if (writerStep === undefined) throw new Error("differ: no writer step found");
const writerMeta = writerStep.meta as WriterMeta;
if (writerMeta.mode !== "edit")
throw new Error("differ: writer did not run in edit mode");
if (writerMeta.mode !== "edit") throw new Error("differ: writer did not run in edit mode");
const diffDocx = join(dirname(writerMeta.outputDocx), "diff.docx");
const raw = await runDocxDiff(
config,
writerMeta.sourceDocx,
writerMeta.outputDocx,
diffDocx,
);
const raw = await runDocxDiff(config, writerMeta.sourceDocx, writerMeta.outputDocx, diffDocx);
const meta = schema.parse(JSON.parse(raw)) as T;
return { meta, childThread: null };
@@ -1,6 +1,6 @@
import { stat } from "node:fs/promises";
import { spawnCli } from "@uncaged/workflow-util-agent";
import type { SpawnCliError } from "@uncaged/workflow-util-agent";
import { spawnCli } from "@uncaged/workflow-util-agent";
import type { DocxDiffAgentConfig } from "./types.js";
type SpawnCliFn = typeof spawnCli;
@@ -8,8 +8,7 @@ type SpawnCliFn = typeof spawnCli;
function throwSpawnError(e: SpawnCliError): never {
if (e.kind === "non_zero_exit")
throw new Error(`docx-diff failed (exit ${e.exitCode}): ${e.stderr}`);
if (e.kind === "timeout")
throw new Error("docx-diff: timed out");
if (e.kind === "timeout") throw new Error("docx-diff: timed out");
throw new Error(`docx-diff: spawn failed: ${e.message}`);
}
@@ -1,6 +1,6 @@
import { describe, expect, test } from "bun:test";
import { packageDescriptor } from "../src/package-descriptor.js";
import { createOfficeAgent } from "../src/agent.js";
import { packageDescriptor } from "../src/package-descriptor.js";
describe("createOfficeAgent", () => {
test("returns an AdapterFn (function)", () => {
@@ -1,8 +1,8 @@
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { describe, expect, mock, test } from "bun:test";
import { ok, err } from "@uncaged/workflow-util";
import { mkdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { err, ok } from "@uncaged/workflow-util";
import type { SpawnCliConfig } from "@uncaged/workflow-util-agent";
import { editDocument, generateDocument } from "../src/runner.js";
@@ -123,7 +123,13 @@ describe("editDocument", () => {
);
await expect(
editDocument({ outputDir: base, command: null, timeout: null }, "te2", "edit", inputFile, spawnFn),
editDocument(
{ outputDir: base, command: null, timeout: null },
"te2",
"edit",
inputFile,
spawnFn,
),
).rejects.toThrow("spawn failed");
});
});
@@ -1,7 +1,11 @@
{
"name": "@uncaged/workflow-agent-office",
"version": "0.1.0",
"files": ["src", "dist", "package.json"],
"files": [
"src",
"dist",
"package.json"
],
"type": "module",
"types": "src/index.ts",
"exports": {
@@ -1,6 +1,11 @@
import * as z from "zod/v4";
import type { AdapterFn, RoleResult, ThreadContext, WorkflowRuntime } from "@uncaged/workflow-runtime";
import type {
AdapterFn,
RoleResult,
ThreadContext,
WorkflowRuntime,
} from "@uncaged/workflow-runtime";
import { createLogger } from "@uncaged/workflow-util";
import type * as z from "zod/v4";
import { editDocument, generateDocument } from "./runner.js";
import type { OfficeAgentConfig } from "./types.js";
@@ -27,7 +32,10 @@ export function createOfficeAgent(config: OfficeAgentConfig): AdapterFn {
return <T>(_systemPrompt: string, schema: z.ZodType<T>) =>
async (ctx: ThreadContext, _runtime: WorkflowRuntime): Promise<RoleResult<T>> => {
const { prompt, inputDocx } = parseStartInput(ctx.start.content);
log("8FQKP3NV", `office-agent: mode=${inputDocx === null ? "generate" : "edit"} thread=${ctx.threadId}`);
log(
"8FQKP3NV",
`office-agent: mode=${inputDocx === null ? "generate" : "edit"} thread=${ctx.threadId}`,
);
let raw: string;
if (inputDocx === null) {
@@ -35,7 +43,11 @@ export function createOfficeAgent(config: OfficeAgentConfig): AdapterFn {
raw = JSON.stringify({ mode: "generate", outputDocx: result.outputDocx, sourceDocx: null });
} else {
const result = await editDocument(config, ctx.threadId, prompt, inputDocx);
raw = JSON.stringify({ mode: "edit", outputDocx: result.outputDocx, sourceDocx: result.sourceDocx });
raw = JSON.stringify({
mode: "edit",
outputDocx: result.outputDocx,
sourceDocx: result.sourceDocx,
});
}
const meta = schema.parse(JSON.parse(raw)) as T;
@@ -1,7 +1,7 @@
import { copyFile, mkdir, stat } from "node:fs/promises";
import { join } from "node:path";
import { spawnCli } from "@uncaged/workflow-util-agent";
import type { SpawnCliError } from "@uncaged/workflow-util-agent";
import { spawnCli } from "@uncaged/workflow-util-agent";
import type { OfficeAgentConfig } from "./types.js";
type SpawnCliFn = typeof spawnCli;
@@ -9,8 +9,7 @@ type SpawnCliFn = typeof spawnCli;
function throwSpawnError(e: SpawnCliError): never {
if (e.kind === "non_zero_exit")
throw new Error(`office-agent failed (exit ${e.exitCode}): ${e.stderr}`);
if (e.kind === "timeout")
throw new Error("office-agent: timed out");
if (e.kind === "timeout") throw new Error("office-agent: timed out");
throw new Error(`office-agent: spawn failed: ${e.message}`);
}
@@ -5,7 +5,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Workflow Dashboard</title>
<script>
(function () {
(() => {
var t = localStorage.getItem("theme");
if (t === "dark" || (!t && matchMedia("(prefers-color-scheme: dark)").matches)) {
document.documentElement.classList.add("dark");
@@ -54,10 +54,14 @@ type CallExpression = {
arguments: Array<AstExpression>;
};
type AstExpression = Identifier | MemberExpression | CallExpression | {
type: string;
[key: string]: unknown;
};
type AstExpression =
| Identifier
| MemberExpression
| CallExpression
| {
type: string;
[key: string]: unknown;
};
type VariableDeclarator = {
id: Identifier | null;
@@ -258,15 +262,21 @@ function createLimitResolver(options: LimitLineOptions): (id: string) => Resolve
}
function shouldProcess(id: string, options: LimitLineOptions): boolean {
return options.include.test(id) && !id.includes("node_modules") && (options.exclude === null || !options.exclude.test(id));
return (
options.include.test(id) &&
!id.includes("node_modules") &&
(options.exclude === null || !options.exclude.test(id))
);
}
// --- Plugin ---
function viteLimitLinePlugin(
userOptions: Partial<LimitLineOptions> = {},
): Array<Plugin> {
const options: LimitLineOptions = { ...DEFAULT_OPTIONS, ...userOptions, overrides: userOptions.overrides ?? [] };
function viteLimitLinePlugin(userOptions: Partial<LimitLineOptions> = {}): Array<Plugin> {
const options: LimitLineOptions = {
...DEFAULT_OPTIONS,
...userOptions,
overrides: userOptions.overrides ?? [],
};
const resolve = createLimitResolver(options);
const rawCodeCache = new Map<string, string>();
@@ -358,5 +368,5 @@ function viteLimitLinePlugin(
];
}
export { viteLimitLinePlugin };
export type { LimitLineOptions, LimitLineOverride };
export { viteLimitLinePlugin };
@@ -55,10 +55,7 @@ export function ResizablePanel({
}, []);
return (
<div
className={cn("relative shrink-0", className)}
style={{ ...style, width }}
>
<div className={cn("relative shrink-0", className)} style={{ ...style, width }}>
{children}
<div
className="absolute top-0 -right-1 w-2 h-full cursor-col-resize z-10 group"
@@ -9,9 +9,7 @@ import type { DocumentMeta } from "../src/roles.js";
const documentModerator = tableToModerator(documentTable);
function makeCtx(
steps: ModeratorContext<DocumentMeta>["steps"],
): ModeratorContext<DocumentMeta> {
function makeCtx(steps: ModeratorContext<DocumentMeta>["steps"]): ModeratorContext<DocumentMeta> {
return {
threadId: "01TEST000000000000000000TR",
depth: 0,
@@ -25,7 +23,11 @@ function writerGenerateStep(): RoleStep<DocumentMeta> {
return {
role: "writer",
contentHash: "STUBHASHWRITER001",
meta: { mode: "generate", outputDocx: "/out/output.docx", sourceDocx: null } satisfies WriterMeta,
meta: {
mode: "generate",
outputDocx: "/out/output.docx",
sourceDocx: null,
} satisfies WriterMeta,
refs: [],
timestamp: 1,
};
@@ -35,7 +37,11 @@ function writerEditStep(): RoleStep<DocumentMeta> {
return {
role: "writer",
contentHash: "STUBHASHWRITER002",
meta: { mode: "edit", outputDocx: "/out/modified.docx", sourceDocx: "/out/original.docx" } satisfies WriterMeta,
meta: {
mode: "edit",
outputDocx: "/out/modified.docx",
sourceDocx: "/out/original.docx",
} satisfies WriterMeta,
refs: [],
timestamp: 1,
};
@@ -1,7 +1,11 @@
{
"name": "@uncaged/workflow-template-document",
"version": "0.1.0",
"files": ["src", "dist", "package.json"],
"files": [
"src",
"dist",
"package.json"
],
"type": "module",
"types": "src/index.ts",
"exports": {
+3 -1
View File
@@ -12,13 +12,15 @@
"test": "bun run --filter '*' test",
"changeset": "bunx changeset",
"version": "bunx changeset version",
"release": "bun run build && bun test && npx changeset publish --no-git-tag"
"release": "bun run build && bun test && node scripts/publish-all.mjs"
},
"devDependencies": {
"@agentclientprotocol/sdk": "^0.22.1",
"@biomejs/biome": "^2.4.14",
"@changesets/cli": "^2.31.0",
"@types/node": "^25.7.0",
"@types/xxhashjs": "^0.2.4",
"@uncaged/workflow-agent-hermes": "workspace:*",
"bun-types": "^1.3.13"
}
}
+3 -3
View File
@@ -1,6 +1,6 @@
{
"name": "@uncaged/cli-workflow",
"version": "0.1.0",
"version": "0.5.0",
"files": [
"src",
"dist",
@@ -11,8 +11,8 @@
"uwf": "./src/cli.ts"
},
"dependencies": {
"@uncaged/json-cas": "^0.3.0",
"@uncaged/json-cas-fs": "^0.3.0",
"@uncaged/json-cas": "^0.4.0",
"@uncaged/json-cas-fs": "^0.4.0",
"@uncaged/workflow-agent-kit": "workspace:^",
"@uncaged/workflow-moderator": "workspace:^",
"@uncaged/workflow-protocol": "workspace:^",
@@ -0,0 +1,181 @@
import { mkdir, readdir, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, expect, test } from "vitest";
import { cmdLogClean, cmdLogList, cmdLogShow } from "../commands/log.js";
let storageRoot: string;
beforeEach(async () => {
storageRoot = join(tmpdir(), `uwf-log-test-${Date.now()}-${Math.random().toString(36).slice(2)}`);
await mkdir(join(storageRoot, "logs"), { recursive: true });
});
afterEach(async () => {
await rm(storageRoot, { recursive: true, force: true });
});
const entry1 = JSON.stringify({
ts: "2026-05-20T10:00:00.000Z",
pid: "1716200000000-1234",
tag: "W9F3RK2M",
msg: "process start",
thread: "01J1234ABCDEF",
workflow: "solve-issue",
});
const entry2 = JSON.stringify({
ts: "2026-05-20T10:00:01.000Z",
pid: "1716200000000-1234",
tag: "ABC12345",
msg: "step executed",
thread: "01J1234ABCDEF",
workflow: "solve-issue",
});
const entry3 = JSON.stringify({
ts: "2026-05-20T10:00:02.000Z",
pid: "1716200000000-5678",
tag: "XYZ98765",
msg: "different process",
thread: "01JOTHER000000",
workflow: "review-code",
});
const oldEntry = JSON.stringify({
ts: "2026-05-19T08:00:00.000Z",
pid: "1716200000000-9999",
tag: "OLD1TAG1",
msg: "old entry",
thread: "01JOLD0000000",
workflow: "solve-issue",
});
const olderEntry = JSON.stringify({
ts: "2026-05-18T08:00:00.000Z",
pid: "1716200000000-0001",
tag: "OLD2TAG2",
msg: "older entry",
thread: "01JOLDER00000",
workflow: "review-code",
});
async function writeLogFiles(): Promise<void> {
const logsDir = join(storageRoot, "logs");
await writeFile(join(logsDir, "2026-05-20.jsonl"), [entry1, entry2, entry3].join("\n") + "\n");
await writeFile(join(logsDir, "2026-05-19.jsonl"), oldEntry + "\n");
await writeFile(join(logsDir, "2026-05-18.jsonl"), olderEntry + "\n");
}
describe("cmdLogList", () => {
test("lists log files with sizes sorted by date descending", async () => {
await writeLogFiles();
const result = await cmdLogList(storageRoot);
expect(result).toHaveLength(3);
expect(result[0].name).toBe("2026-05-20.jsonl");
expect(result[0].date).toBe("2026-05-20");
expect(result[0].size).toBeGreaterThan(0);
expect(result[1].name).toBe("2026-05-19.jsonl");
expect(result[2].name).toBe("2026-05-18.jsonl");
});
test("returns empty array when no log files exist", async () => {
const result = await cmdLogList(storageRoot);
expect(result).toEqual([]);
});
test("returns empty array when logs directory does not exist", async () => {
const noLogsRoot = join(storageRoot, "nonexistent");
await mkdir(noLogsRoot, { recursive: true });
const result = await cmdLogList(noLogsRoot);
expect(result).toEqual([]);
});
});
describe("cmdLogShow", () => {
test("filters by thread ID", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, {
thread: "01J1234ABCDEF",
process: null,
date: null,
});
expect(result).toHaveLength(2);
expect(result.every((e) => e.thread === "01J1234ABCDEF")).toBe(true);
});
test("filters by process ID", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, {
thread: null,
process: "1716200000000-1234",
date: null,
});
expect(result).toHaveLength(2);
expect(result.every((e) => e.pid === "1716200000000-1234")).toBe(true);
});
test("filters by date", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, {
thread: null,
process: null,
date: "2026-05-19",
});
expect(result).toHaveLength(1);
expect(result[0].msg).toBe("old entry");
});
test("reads all files when no date filter", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, { thread: null, process: null, date: null });
expect(result).toHaveLength(5);
// sorted by ts ascending
expect(result[0].ts).toBe("2026-05-18T08:00:00.000Z");
expect(result[4].ts).toBe("2026-05-20T10:00:02.000Z");
});
test("returns empty when no matches", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, {
thread: "NONEXISTENT",
process: null,
date: null,
});
expect(result).toEqual([]);
});
test("combined thread + date filter", async () => {
await writeLogFiles();
const result = await cmdLogShow(storageRoot, {
thread: "01J1234ABCDEF",
process: null,
date: "2026-05-20",
});
expect(result).toHaveLength(2);
expect(result.every((e) => e.thread === "01J1234ABCDEF")).toBe(true);
});
});
describe("cmdLogClean", () => {
test("deletes files before given date", async () => {
await writeLogFiles();
const result = await cmdLogClean(storageRoot, "2026-05-20");
expect(result.deleted).toBe(2);
const remaining = await readdir(join(storageRoot, "logs"));
expect(remaining).toEqual(["2026-05-20.jsonl"]);
});
test("deletes nothing when all files are newer", async () => {
await writeLogFiles();
const result = await cmdLogClean(storageRoot, "2026-05-18");
expect(result.deleted).toBe(0);
});
test("handles missing logs directory gracefully", async () => {
const noLogsRoot = join(storageRoot, "nonexistent");
await mkdir(noLogsRoot, { recursive: true });
const result = await cmdLogClean(noLogsRoot, "2026-05-20");
expect(result).toEqual({ deleted: 0 });
});
});
@@ -0,0 +1,150 @@
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { afterEach, beforeEach, describe, expect, test, vi } from "vitest";
import { cmdSetup, validateModel } from "../commands/setup.js";
describe("validateModel", () => {
const BASE_URL = "https://api.example.com/v1";
const API_KEY = "sk-test-key";
const MODEL = "test-model";
afterEach(() => {
vi.restoreAllMocks();
});
test("success path — returns ok on 200", async () => {
const mockFetch = vi
.spyOn(globalThis, "fetch")
.mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
const result = await validateModel(BASE_URL, API_KEY, MODEL);
expect(result).toEqual({ ok: true, value: undefined });
expect(mockFetch).toHaveBeenCalledOnce();
const [url, opts] = mockFetch.mock.calls[0]!;
expect(url).toBe(`${BASE_URL}/chat/completions`);
expect((opts as RequestInit).headers).toEqual(
expect.objectContaining({ Authorization: `Bearer ${API_KEY}` }),
);
const body = JSON.parse((opts as RequestInit).body as string);
expect(body).toEqual({
model: MODEL,
messages: [{ role: "user", content: "hi" }],
max_tokens: 1,
});
});
test("HTTP 401 — returns error containing 401", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
);
const result = await validateModel(BASE_URL, API_KEY, MODEL);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toContain("401");
}
});
test("HTTP 404 — returns error containing 404", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response("Not Found", { status: 404, statusText: "Not Found" }),
);
const result = await validateModel(BASE_URL, API_KEY, MODEL);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toContain("404");
}
});
test("network timeout — returns error mentioning timeout", async () => {
const err = new DOMException("signal timed out", "AbortError");
vi.spyOn(globalThis, "fetch").mockRejectedValue(err);
const result = await validateModel(BASE_URL, API_KEY, MODEL);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error.toLowerCase()).toMatch(/timeout|timed out/);
}
});
test("network error (DNS/connection) — returns error mentioning connectivity", async () => {
vi.spyOn(globalThis, "fetch").mockRejectedValue(new TypeError("fetch failed"));
const result = await validateModel(BASE_URL, API_KEY, MODEL);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error.toLowerCase()).toMatch(/connect|reach|network/);
}
});
test("request body correctness", async () => {
const mockFetch = vi
.spyOn(globalThis, "fetch")
.mockResolvedValue(new Response(JSON.stringify({}), { status: 200 }));
await validateModel(BASE_URL, API_KEY, "my-special-model");
const body = JSON.parse((mockFetch.mock.calls[0]![1] as RequestInit).body as string);
expect(body).toEqual({
model: "my-special-model",
messages: [{ role: "user", content: "hi" }],
max_tokens: 1,
});
});
});
describe("cmdSetup with validation", () => {
let storageRoot: string;
beforeEach(async () => {
storageRoot = await mkdtemp(join(tmpdir(), "uwf-setup-validate-"));
});
afterEach(async () => {
vi.restoreAllMocks();
await rm(storageRoot, { recursive: true, force: true });
});
const setupArgs = () => ({
provider: "testprovider",
baseUrl: "https://api.test.com/v1",
apiKey: "sk-test",
model: "test-model",
storageRoot,
});
test("includes validation result on success", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response(JSON.stringify({}), { status: 200 }),
);
const result = await cmdSetup(setupArgs());
expect(result.validation).toEqual({ ok: true, value: undefined });
// Config files should still be written
expect(result.configPath).toBeTruthy();
expect(result.envPath).toBeTruthy();
});
test("includes validation failure — config still saved", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response("Unauthorized", { status: 401, statusText: "Unauthorized" }),
);
const result = await cmdSetup(setupArgs());
expect(result.validation).toBeDefined();
expect((result.validation as { ok: boolean }).ok).toBe(false);
// Config files should still be written despite validation failure
expect(result.configPath).toBeTruthy();
expect(result.envPath).toBeTruthy();
});
});
@@ -0,0 +1,71 @@
import { execFileSync } from "node:child_process";
import { join } from "node:path";
import { describe, expect, test } from "vitest";
const CLI_PATH = join(import.meta.dirname, "..", "cli.js");
function runCli(args: string[]): { stdout: string; stderr: string; exitCode: number } {
try {
const stdout = execFileSync("bun", ["run", CLI_PATH, ...args], {
encoding: "utf8",
env: { ...process.env, WORKFLOW_STORAGE_ROOT: "/tmp/uwf-test-nonexistent" },
stdio: ["ignore", "pipe", "pipe"],
});
return { stdout, stderr: "", exitCode: 0 };
} catch (e: unknown) {
const err = e as NodeJS.ErrnoException & { stdout?: string; stderr?: string; status?: number };
return {
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
exitCode: err.status ?? 1,
};
}
}
describe("thread step --count CLI parsing", () => {
test("--help shows -c/--count option", () => {
const result = runCli(["thread", "step", "--help"]);
expect(result.stdout).toContain("--count");
expect(result.stdout).toContain("-c");
});
test("description says 'one or more steps'", () => {
const result = runCli(["thread", "step", "--help"]);
expect(result.stdout).toContain("one or more steps");
});
});
describe("cmdThreadStep count logic", () => {
test("count=0 fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "0"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("negative count fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "-1"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("non-integer count fails with validation error", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "1.5"]);
expect(result.exitCode).not.toBe(0);
expect(result.stderr).toContain("positive integer");
});
test("count=1 is the default (no -c flag)", () => {
// Without -c, it should attempt to run 1 step (failing on missing thread, not on count validation)
const result = runCli(["thread", "step", "FAKE_THREAD_ID"]);
expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread lookup instead
expect(result.stderr).not.toContain("positive integer");
});
test("count=3 passes validation (fails on thread lookup)", () => {
const result = runCli(["thread", "step", "FAKE_THREAD_ID", "-c", "3"]);
expect(result.exitCode).not.toBe(0);
// Should NOT contain "positive integer" error — should fail on thread/storage lookup
expect(result.stderr).not.toContain("positive integer");
});
});
@@ -211,8 +211,11 @@ describe("cmdThreadRead ### Content section", () => {
roles: {
writer: {
description: "Write",
systemPrompt: "You are a writer.",
outputSchema: "placeholder00" as CasRef,
goal: "You are a writer.",
capabilities: [],
procedure: "Write content as requested.",
output: "Summarize what was written.",
meta: "placeholder00" as CasRef,
},
},
conditions: {},
+90 -7
View File
@@ -7,13 +7,16 @@ import {
cmdCasGet,
cmdCasHas,
cmdCasPut,
cmdCasPutText,
cmdCasRefs,
cmdCasReindex,
cmdCasSchemaGet,
cmdCasSchemaList,
cmdCasWalk,
} from "./commands/cas.js";
import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
import { cmdSkillCli } from "./commands/skill.js";
import {
cmdThreadFork,
cmdThreadKill,
@@ -45,7 +48,12 @@ function runAction(action: () => Promise<void>): void {
const program = new Command();
program.name("uwf").description("Stateless workflow CLI");
// eslint-disable-next-line -- dynamic import for version
const pkg = await import("../package.json", { with: { type: "json" } });
program
.name("uwf")
.description("Stateless workflow CLI")
.version(pkg.default.version, "-V, --version");
program.option("--format <fmt>", "Output format: json or yaml", "json");
const workflow = program.command("workflow").description("Workflow registry and CAS");
@@ -80,7 +88,7 @@ workflow
.action(() => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdWorkflowList(storageRoot);
const result = await cmdWorkflowList(storageRoot, process.cwd());
writeOutput(result);
});
});
@@ -95,22 +103,28 @@ thread
.action((workflow: string, opts: { prompt: string }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdThreadStart(storageRoot, workflow, opts.prompt);
const result = await cmdThreadStart(storageRoot, workflow, opts.prompt, process.cwd());
writeOutput(result);
});
});
thread
.command("step")
.description("Execute one step")
.description("Execute one or more steps")
.argument("<thread-id>", "Thread ULID")
.option("--agent <cmd>", "Override agent command")
.action((threadId: string, opts: { agent: string | undefined }) => {
.option("-c, --count <number>", "Number of steps to run (default: 1)")
.action((threadId: string, opts: { agent: string | undefined; count: string | undefined }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const agentOverride = opts.agent ?? null;
const result = await cmdThreadStep(storageRoot, threadId, agentOverride);
writeOutput(result);
const count = opts.count !== undefined ? Number(opts.count) : 1;
const results = await cmdThreadStep(storageRoot, threadId, agentOverride, count);
if (results.length === 1) {
writeOutput(results[0]);
} else {
writeOutput(results);
}
});
});
@@ -215,6 +229,15 @@ thread
});
});
const skill = program.command("skill").description("Built-in skill references for agents");
skill
.command("cli")
.description("Print a markdown reference of all uwf commands")
.action(() => {
console.log(cmdSkillCli());
});
program
.command("setup")
.description("Configure provider, model, and agent")
@@ -280,6 +303,17 @@ cas
});
});
cas
.command("put-text")
.description("Store a plain text string, print its hash")
.argument("<text>", "Text content to store")
.action((text: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
writeOutput(await cmdCasPutText(storageRoot, text));
});
});
cas
.command("has")
.description("Check if a hash exists")
@@ -346,6 +380,55 @@ casSchema
});
});
const log = program.command("log").description("Process-level debug logs");
log
.command("list")
.description("List log files with sizes")
.action(() => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdLogList(storageRoot);
writeOutput(result);
});
});
log
.command("show")
.description("Show and filter log entries")
.option("--thread <thread-id>", "Filter by thread ID")
.option("--process <pid>", "Filter by process ID")
.option("--date <date>", "Filter by date (YYYY-MM-DD)")
.action(
(opts: {
thread: string | undefined;
process: string | undefined;
date: string | undefined;
}) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdLogShow(storageRoot, {
thread: opts.thread ?? null,
process: opts.process ?? null,
date: opts.date ?? null,
});
writeOutput(result);
});
},
);
log
.command("clean")
.description("Delete log files older than given date")
.requiredOption("--before <date>", "Delete files before this date (YYYY-MM-DD)")
.action((opts: { before: string }) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdLogClean(storageRoot, opts.before);
writeOutput(result);
});
});
program.parseAsync(process.argv).catch((e: unknown) => {
const message = e instanceof Error ? e.message : String(e);
process.stderr.write(`${message}\n`);
+17 -24
View File
@@ -1,10 +1,12 @@
import { readFileSync } from "node:fs";
import { join } from "node:path";
import type { Hash, JSONSchema, Store } from "@uncaged/json-cas";
import { bootstrap, getSchema, refs, walk } from "@uncaged/json-cas";
import type { JSONSchema, Store } from "@uncaged/json-cas";
import { bootstrap, getSchema, putSchema, refs, walk } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import { TEXT_SCHEMA } from "../schemas.js";
// ---- Helpers ----
function openStore(storageRoot: string): Store {
@@ -53,18 +55,12 @@ export async function cmdCasPut(
return { hash };
}
export async function cmdCasHas(
storageRoot: string,
hash: string,
): Promise<{ exists: boolean }> {
export async function cmdCasHas(storageRoot: string, hash: string): Promise<{ exists: boolean }> {
const store = openStore(storageRoot);
return { exists: store.has(hash) };
}
export async function cmdCasRefs(
storageRoot: string,
hash: string,
): Promise<{ refs: string[] }> {
export async function cmdCasRefs(storageRoot: string, hash: string): Promise<{ refs: string[] }> {
const store = openStore(storageRoot);
const node = store.get(hash);
if (node === null) {
@@ -73,10 +69,7 @@ export async function cmdCasRefs(
return { refs: refs(store, node) };
}
export async function cmdCasWalk(
storageRoot: string,
hash: string,
): Promise<{ hashes: string[] }> {
export async function cmdCasWalk(storageRoot: string, hash: string): Promise<{ hashes: string[] }> {
const store = openStore(storageRoot);
const result: string[] = [];
walk(store, hash, (h) => {
@@ -90,9 +83,7 @@ export type SchemaListEntry = {
title: string;
};
export async function cmdCasSchemaList(
storageRoot: string,
): Promise<SchemaListEntry[]> {
export async function cmdCasSchemaList(storageRoot: string): Promise<SchemaListEntry[]> {
const store = openStore(storageRoot);
const metaHash = await bootstrap(store);
const entries: SchemaListEntry[] = [];
@@ -115,9 +106,7 @@ export async function cmdCasSchemaList(
return entries;
}
export async function cmdCasReindex(
storageRoot: string,
): Promise<{ status: string }> {
export async function cmdCasReindex(storageRoot: string): Promise<{ status: string }> {
const indexDir = join(storageRoot, "cas", "_index");
const { rmSync } = await import("node:fs");
rmSync(indexDir, { recursive: true, force: true });
@@ -126,10 +115,7 @@ export async function cmdCasReindex(
return { status: "reindexed" };
}
export async function cmdCasSchemaGet(
storageRoot: string,
hash: string,
): Promise<unknown> {
export async function cmdCasSchemaGet(storageRoot: string, hash: string): Promise<unknown> {
const store = openStore(storageRoot);
const schema = getSchema(store, hash);
if (schema === null) {
@@ -137,3 +123,10 @@ export async function cmdCasSchemaGet(
}
return schema;
}
export async function cmdCasPutText(storageRoot: string, text: string): Promise<{ hash: string }> {
const store = openStore(storageRoot);
const typeHash = await putSchema(store, TEXT_SCHEMA);
const hash = await store.put(typeHash, text);
return { hash };
}
+116
View File
@@ -0,0 +1,116 @@
import { readdir, readFile, stat, unlink } from "node:fs/promises";
import { join } from "node:path";
type LogListItem = {
name: string;
size: number;
date: string;
};
type LogShowFilter = {
thread: string | null;
process: string | null;
date: string | null;
};
type LogEntry = {
ts: string;
pid: string;
tag: string;
msg: string;
thread: string | null;
workflow: string | null;
};
type LogCleanResult = {
deleted: number;
};
function logsDir(storageRoot: string): string {
return join(storageRoot, "logs");
}
async function listLogFiles(dir: string): Promise<Array<string>> {
try {
const files = await readdir(dir);
return files.filter((f) => f.endsWith(".jsonl")).sort();
} catch {
return [];
}
}
function dateFromFilename(name: string): string {
return name.replace(".jsonl", "");
}
async function parseJsonlFile(path: string): Promise<Array<LogEntry>> {
const content = await readFile(path, "utf-8");
const lines = content
.trim()
.split("\n")
.filter((l) => l.length > 0);
return lines.map((line) => JSON.parse(line) as LogEntry);
}
export async function cmdLogList(storageRoot: string): Promise<Array<LogListItem>> {
const dir = logsDir(storageRoot);
const files = await listLogFiles(dir);
const items: Array<LogListItem> = [];
for (const name of files) {
const s = await stat(join(dir, name));
items.push({ name, size: s.size, date: dateFromFilename(name) });
}
// sort by date descending
items.sort((a, b) => (a.date > b.date ? -1 : a.date < b.date ? 1 : 0));
return items;
}
export async function cmdLogShow(
storageRoot: string,
filter: LogShowFilter,
): Promise<Array<LogEntry>> {
const dir = logsDir(storageRoot);
let files: Array<string>;
if (filter.date !== null) {
files = [`${filter.date}.jsonl`];
} else {
files = await listLogFiles(dir);
}
let entries: Array<LogEntry> = [];
for (const file of files) {
try {
const parsed = await parseJsonlFile(join(dir, file));
entries = entries.concat(parsed);
} catch {
// file doesn't exist or is unreadable, skip
}
}
if (filter.thread !== null) {
entries = entries.filter((e) => e.thread === filter.thread);
}
if (filter.process !== null) {
entries = entries.filter((e) => e.pid === filter.process);
}
entries.sort((a, b) => (a.ts < b.ts ? -1 : a.ts > b.ts ? 1 : 0));
return entries;
}
export async function cmdLogClean(storageRoot: string, before: string): Promise<LogCleanResult> {
const dir = logsDir(storageRoot);
const files = await listLogFiles(dir);
let deleted = 0;
for (const name of files) {
const date = dateFromFilename(name);
if (date < before) {
await unlink(join(dir, name));
deleted++;
}
}
return { deleted };
}
+89 -19
View File
@@ -1,10 +1,45 @@
import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { homedir } from "node:os";
import { join, resolve } from "node:path";
import { createInterface } from "node:readline/promises";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { stdin as input, stdout as output } from "node:process";
import { createInterface } from "node:readline/promises";
import type { Result } from "@uncaged/workflow-util";
import { parse, stringify } from "yaml";
import { stringify, parse } from "yaml";
/**
* Send a minimal chat completion request to verify the model is reachable.
* Returns ok on 2xx, error with reason string otherwise.
*/
export async function validateModel(
baseUrl: string,
apiKey: string,
model: string,
): Promise<Result<void, string>> {
try {
const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
const res = await fetch(url, {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model,
messages: [{ role: "user", content: "hi" }],
max_tokens: 1,
}),
signal: AbortSignal.timeout(15_000),
});
if (!res.ok) {
return { ok: false, error: `HTTP ${res.status} ${res.statusText}` };
}
return { ok: true, value: undefined };
} catch (err: unknown) {
if (err instanceof DOMException && err.name === "AbortError") {
return { ok: false, error: "Request timed out — model endpoint unreachable" };
}
return { ok: false, error: `Network error — could not reach endpoint (${String(err)})` };
}
}
/**
* Preset provider list — embedded to avoid runtime YAML loading dependency.
@@ -17,10 +52,18 @@ const PRESET_PROVIDERS = [
{ name: "openrouter", label: "OpenRouter", baseUrl: "https://openrouter.ai/api/v1" },
{ name: "venice", label: "Venice", baseUrl: "https://api.venice.ai/api/v1" },
// China
{ name: "dashscope", label: "DashScope (Alibaba)", baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1" },
{
name: "dashscope",
label: "DashScope (Alibaba)",
baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1",
},
{ name: "deepseek", label: "DeepSeek", baseUrl: "https://api.deepseek.com/v1" },
{ name: "siliconflow", label: "SiliconFlow", baseUrl: "https://api.siliconflow.cn/v1" },
{ name: "volcengine", label: "Volcengine (ByteDance)", baseUrl: "https://ark.cn-beijing.volces.com/api/v3" },
{
name: "volcengine",
label: "Volcengine (ByteDance)",
baseUrl: "https://ark.cn-beijing.volces.com/api/v3",
},
{ name: "kimi", label: "Kimi (Moonshot)", baseUrl: "https://api.moonshot.cn/v1" },
{ name: "glm", label: "GLM (Zhipu AI)", baseUrl: "https://open.bigmodel.cn/api/paas/v4" },
{ name: "stepfun", label: "StepFun", baseUrl: "https://api.stepfun.com/v1" },
@@ -98,21 +141,27 @@ function apiKeyEnvName(providerName: string): string {
* Merge setup args into config.yaml structure. Non-destructive — preserves existing entries.
*/
function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record<string, unknown> {
const providers = (typeof existing.providers === "object" && existing.providers !== null
? { ...(existing.providers as Record<string, unknown>) }
: {}) as Record<string, unknown>;
const providers = (
typeof existing.providers === "object" && existing.providers !== null
? { ...(existing.providers as Record<string, unknown>) }
: {}
) as Record<string, unknown>;
const envName = apiKeyEnvName(args.provider);
providers[args.provider] = { baseUrl: args.baseUrl, apiKeyEnv: envName };
const models = (typeof existing.models === "object" && existing.models !== null
? { ...(existing.models as Record<string, unknown>) }
: {}) as Record<string, unknown>;
const models = (
typeof existing.models === "object" && existing.models !== null
? { ...(existing.models as Record<string, unknown>) }
: {}
) as Record<string, unknown>;
models.default = { provider: args.provider, name: args.model };
const agents = (typeof existing.agents === "object" && existing.agents !== null
? { ...(existing.agents as Record<string, unknown>) }
: {}) as Record<string, unknown>;
const agents = (
typeof existing.agents === "object" && existing.agents !== null
? { ...(existing.agents as Record<string, unknown>) }
: {}
) as Record<string, unknown>;
const agentName = args.agent ?? "hermes";
if (Object.keys(agents).length === 0) {
@@ -150,12 +199,16 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
envData[envName] = args.apiKey;
saveEnvFile(envPath, envData);
// Validate model connectivity
const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
return {
configPath,
envPath,
provider: args.provider,
model: args.model,
defaultAgent: merged.defaultAgent,
validation,
};
}
@@ -211,8 +264,12 @@ async function fetchModels(baseUrl: string, apiKey: string): Promise<string[]> {
if (!res.ok) return [];
const body = (await res.json()) as { data?: { id: string }[] };
if (!Array.isArray(body.data)) return [];
const NON_CHAT = /speech|embed|image|video|audio|ocr|rerank|tts|asr|paraformer|sambert|cosyvoice|wordart|wanx|wan2|flux|stable-diffusion|gui-/i;
return body.data.map((m) => m.id).filter((id) => !NON_CHAT.test(id)).sort();
const NON_CHAT =
/speech|embed|image|video|audio|ocr|rerank|tts|asr|paraformer|sambert|cosyvoice|wordart|wanx|wan2|flux|stable-diffusion|gui-/i;
return body.data
.map((m) => m.id)
.filter((id) => !NON_CHAT.test(id))
.sort();
} catch {
return [];
}
@@ -311,7 +368,7 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s
console.log(`${providerName}/${model}\n`);
await cmdSetup({
const setupResult = await cmdSetup({
provider: providerName,
baseUrl,
apiKey,
@@ -319,6 +376,19 @@ export async function cmdSetupInteractive(storageRoot: string): Promise<Record<s
storageRoot,
});
// Show validation result
if (setupResult.validation && typeof setupResult.validation === "object") {
const v = setupResult.validation as { ok: boolean; error?: string };
if (v.ok) {
console.log("✓ Model verified — connection successful.\n");
} else {
console.log(`\n⚠ Warning: Could not reach model — ${v.error}`);
console.log(
" Config saved, but you may want to try a different model or check your API key.\n",
);
}
}
console.log("Setup complete! Get started:\n");
console.log(" uwf workflow put <workflow.yaml> Register a workflow");
console.log(' uwf thread start <name> -p "..." Start a thread');
@@ -0,0 +1 @@
export { generateCliReference as cmdSkillCli } from "@uncaged/workflow-util";
+153 -17
View File
@@ -1,4 +1,5 @@
import { execFileSync } from "node:child_process";
import { readFile } from "node:fs/promises";
import type { Store as CasStore, JSONSchema } from "@uncaged/json-cas";
import { getSchema, validate } from "@uncaged/json-cas";
import { getEnvPath, loadWorkflowConfig } from "@uncaged/workflow-agent-kit";
@@ -22,27 +23,42 @@ import type {
WorkflowConfig,
WorkflowPayload,
} from "@uncaged/workflow-protocol";
import { generateUlid } from "@uncaged/workflow-util";
import { createProcessLogger, generateUlid, type ProcessLogger } from "@uncaged/workflow-util";
import { config as loadDotenv } from "dotenv";
import { stringify } from "yaml";
import { parse, stringify } from "yaml";
import {
appendThreadHistory,
createUwfStore,
discoverProjectWorkflows,
findThreadInHistory,
loadThreadHistory,
loadThreadsIndex,
loadWorkflowRegistry,
resolveProjectWorkflowFile,
resolveWorkflowHash,
saveThreadsIndex,
type ThreadHistoryLine,
type UwfStore,
} from "../store.js";
import { isCasRef } from "../validate.js";
import { checkWorkflowFilenameConsistency, isCasRef, parseWorkflowPayload } from "../validate.js";
import { materializeWorkflowPayload } from "./workflow.js";
const END_ROLE = "$END";
export const THREAD_READ_DEFAULT_QUOTA = 4000;
const PL_THREAD_START = "7HNQ4B2X";
const PL_MODERATOR = "M3K8V9T1";
const PL_AGENT_SPAWN = "R5J2W8N4";
const PL_AGENT_DONE = "C6P9E3H7";
const PL_THREAD_ARCHIVED = "F4D8Q2K5";
const PL_STEP_ERROR = "B8T5N1V6";
function failStep(plog: ProcessLogger, message: string): never {
plog.log(PL_STEP_ERROR, message, null);
fail(message);
}
type ChainState = {
startHash: CasRef;
start: StartNodePayload;
@@ -66,11 +82,55 @@ function fail(message: string): never {
process.exit(1);
}
async function materializeLocalWorkflow(uwf: UwfStore, filePath: string): Promise<CasRef> {
let text: string;
try {
text = await readFile(filePath, "utf8");
} catch {
fail(`project workflow file not found: ${filePath}`);
}
let raw: unknown;
try {
raw = parse(text) as unknown;
} catch (e) {
fail(`invalid YAML in ${filePath}: ${e instanceof Error ? e.message : String(e)}`);
}
const payload = parseWorkflowPayload(raw);
if (payload === null) {
fail(`invalid workflow YAML in ${filePath}: expected WorkflowPayload shape`);
}
const filenameError = checkWorkflowFilenameConsistency(filePath, payload);
if (filenameError !== null) {
fail(filenameError);
}
const materialized = await materializeWorkflowPayload(uwf, payload);
const hash = await uwf.store.put(uwf.schemas.workflow, materialized);
const stored = uwf.store.get(hash);
if (stored === null || !validate(uwf.store, stored)) {
fail("stored local workflow failed schema validation");
}
return hash;
}
async function resolveWorkflowCasRef(
uwf: UwfStore,
storageRoot: string,
workflowId: string,
projectRoot: string,
): Promise<CasRef> {
// Project-local resolution: check .workflows/<workflowId>.yaml first
const localEntries = await discoverProjectWorkflows(projectRoot);
const localFile = resolveProjectWorkflowFile(localEntries, workflowId);
if (localFile !== null) {
return materializeLocalWorkflow(uwf, localFile);
}
// Global registry fallback
const registry = await loadWorkflowRegistry(storageRoot);
const hash = resolveWorkflowHash(registry, workflowId);
if (!isCasRef(hash)) {
@@ -114,11 +174,16 @@ export async function cmdThreadStart(
storageRoot: string,
workflowId: string,
prompt: string,
projectRoot: string,
): Promise<StartOutput> {
const uwf = await createUwfStore(storageRoot);
const workflowHash = await resolveWorkflowCasRef(uwf, storageRoot, workflowId);
const workflowHash = await resolveWorkflowCasRef(uwf, storageRoot, workflowId, projectRoot);
const threadId = generateUlid(Date.now()) as ThreadId;
const plog = createProcessLogger({
storageRoot,
context: { thread: threadId, workflow: workflowHash },
});
const startPayload: StartNodePayload = {
workflow: workflowHash,
prompt,
@@ -134,6 +199,12 @@ export async function cmdThreadStart(
index[threadId] = headHash;
await saveThreadsIndex(storageRoot, index);
plog.log(
PL_THREAD_START,
`thread created workflow=${workflowHash} thread=${threadId} head=${headHash}`,
null,
);
return { workflow: workflowHash, thread: threadId };
}
@@ -500,7 +571,8 @@ function formatThreadReadMarkdown(options: {
];
const roleDef = workflow.roles[item.payload.role];
if (roleDef) {
stepLines.push("", "### Prompt", "", roleDef.systemPrompt);
const prompt = roleDef.goal;
stepLines.push("", "### Prompt", "", prompt);
}
if (item.payload.detail) {
const content = extractLastAssistantContent(uwf, item.payload.detail);
@@ -574,13 +646,20 @@ function resolveAgentConfig(
return agentConfig;
}
function spawnAgent(agent: AgentConfig, threadId: ThreadId, role: string): CasRef {
function spawnAgent(
plog: ProcessLogger,
agent: AgentConfig,
threadId: ThreadId,
role: string,
edgePrompt: string,
): CasRef {
const argv = [...agent.args, threadId, role];
const env = { ...process.env, UWF_EDGE_PROMPT: edgePrompt };
let stdout: string;
try {
stdout = execFileSync(agent.command, argv, {
encoding: "utf8",
env: process.env,
env,
stdio: ["ignore", "pipe", "pipe"],
});
} catch (e) {
@@ -592,12 +671,12 @@ function spawnAgent(agent: AgentConfig, threadId: ThreadId, role: string): CasRe
? err.stderr
: err.stderr.toString("utf8");
const detail = stderr.trim() !== "" ? `: ${stderr.trim()}` : "";
fail(`agent command failed (${agent.command})${detail}`);
failStep(plog, `agent command failed (${agent.command})${detail}`);
}
const line = stdout.trim().split("\n").pop()?.trim() ?? "";
if (!isCasRef(line)) {
fail(`agent stdout is not a valid CAS hash: ${line || "(empty)"}`);
failStep(plog, `agent stdout is not a valid CAS hash: ${line || "(empty)"}`);
}
return line;
}
@@ -623,12 +702,54 @@ export async function cmdThreadStep(
storageRoot: string,
threadId: ThreadId,
agentOverride: string | null,
): Promise<StepOutput> {
count: number,
): Promise<StepOutput[]> {
if (count < 1 || !Number.isInteger(count)) {
fail(`--count must be a positive integer, got: ${count}`);
}
const workflowHash = await resolveActiveThreadWorkflowHash(storageRoot, threadId);
const plog = createProcessLogger({
storageRoot,
context: { thread: threadId, workflow: workflowHash },
});
const results: StepOutput[] = [];
for (let i = 0; i < count; i++) {
const result = await cmdThreadStepOnce(storageRoot, threadId, agentOverride, plog);
results.push(result);
if (result.done) {
break;
}
}
return results;
}
async function resolveActiveThreadWorkflowHash(
storageRoot: string,
threadId: ThreadId,
): Promise<CasRef> {
const index = await loadThreadsIndex(storageRoot);
const headHash = index[threadId];
if (headHash === undefined) {
fail(`thread not active: ${threadId}`);
}
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
return chain.start.workflow;
}
async function cmdThreadStepOnce(
storageRoot: string,
threadId: ThreadId,
agentOverride: string | null,
plog: ProcessLogger,
): Promise<StepOutput> {
const index = await loadThreadsIndex(storageRoot);
const headHash = index[threadId];
if (headHash === undefined) {
failStep(plog, `thread not active: ${threadId}`);
}
const uwf = await createUwfStore(storageRoot);
const chain = walkChain(uwf, headHash);
@@ -638,10 +759,17 @@ export async function cmdThreadStep(
const nextResult = await evaluate(workflow, context);
if (!nextResult.ok) {
fail(nextResult.error.message);
failStep(plog, `moderator evaluate failed: ${nextResult.error.message}`);
}
if (nextResult.value === END_ROLE) {
plog.log(
PL_MODERATOR,
`moderator role=${nextResult.value.role} prompt=${nextResult.value.prompt}`,
null,
);
if (nextResult.value.role === END_ROLE) {
plog.log(PL_THREAD_ARCHIVED, `thread archived head=${headHash}`, null);
await archiveThread(storageRoot, threadId, workflowHash, headHash);
return {
workflow: workflowHash,
@@ -651,18 +779,25 @@ export async function cmdThreadStep(
};
}
const role = nextResult.value;
const role = nextResult.value.role;
const edgePrompt = nextResult.value.prompt;
const config = await loadWorkflowConfig(storageRoot);
const agent = resolveAgentConfig(config, workflow, role, agentOverride);
plog.log(PL_AGENT_SPAWN, `spawning agent command=${agent.command}`, {
args: [...agent.args, threadId, role].join(" "),
});
loadDotenv({ path: getEnvPath(storageRoot) });
const newHead = spawnAgent(agent, threadId, role);
const newHead = spawnAgent(plog, agent, threadId, role, edgePrompt);
plog.log(PL_AGENT_DONE, `agent returned head=${newHead}`, null);
// Re-create store to pick up nodes written by the agent subprocess
const uwfAfter = await createUwfStore(storageRoot);
const newNode = uwfAfter.store.get(newHead);
if (newNode === null || newNode.type !== uwfAfter.schemas.stepNode) {
fail(`agent returned hash that is not a StepNode: ${newHead}`);
failStep(plog, `agent returned hash that is not a StepNode: ${newHead}`);
}
// Reload threads index to avoid overwriting changes made by the agent subprocess
@@ -674,11 +809,12 @@ export async function cmdThreadStep(
const contextAfter = buildModeratorContext(uwfAfter, chainAfter);
const afterResult = await evaluate(workflow, contextAfter);
if (!afterResult.ok) {
fail(afterResult.error.message);
failStep(plog, `post-step moderator evaluate failed: ${afterResult.error.message}`);
}
const done = afterResult.value === END_ROLE;
const done = afterResult.value.role === END_ROLE;
if (done) {
plog.log(PL_THREAD_ARCHIVED, `thread archived head=${newHead}`, null);
await archiveThread(storageRoot, threadId, workflowHash, newHead);
}
+70 -17
View File
@@ -2,22 +2,31 @@ import { readFile } from "node:fs/promises";
import type { JSONSchema } from "@uncaged/json-cas";
import { putSchema, validate } from "@uncaged/json-cas";
import type { CasRef, RoleDefinition, WorkflowPayload } from "@uncaged/workflow-protocol";
import type {
CasRef,
RoleDefinition,
Transition,
WorkflowPayload,
} from "@uncaged/workflow-protocol";
import { parse } from "yaml";
import {
createUwfStore,
discoverProjectWorkflows,
findRegistryName,
loadWorkflowRegistry,
resolveWorkflowHash,
saveWorkflowRegistry,
type UwfStore,
} from "../store.js";
import { parseWorkflowPayload } from "../validate.js";
import { checkWorkflowFilenameConsistency, parseWorkflowPayload } from "../validate.js";
export type WorkflowOrigin = "local" | "global";
export type WorkflowListEntry = {
name: string;
hash: CasRef;
origin: WorkflowOrigin;
};
export type WorkflowPutOutput = {
@@ -42,35 +51,55 @@ function isJsonSchema(value: unknown): value is JSONSchema {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
async function resolveOutputSchemaRef(
/** Normalize graph transitions: ensure condition is null (not undefined) for fallback entries. */
function normalizeGraph(graph: Record<string, Transition[]>): Record<string, Transition[]> {
const result: Record<string, Transition[]> = {};
for (const [node, transitions] of Object.entries(graph)) {
result[node] = transitions.map((t) => {
if (typeof t.prompt !== "string" || t.prompt.trim() === "") {
fail(`graph[${node}] transition to "${t.role}": prompt is required (non-empty string)`);
}
return {
role: t.role,
condition: t.condition ?? null,
prompt: t.prompt,
};
});
}
return result;
}
async function resolveFrontmatterRef(
uwf: UwfStore,
roleName: string,
outputSchema: unknown,
frontmatter: unknown,
): Promise<CasRef> {
if (!isJsonSchema(outputSchema)) {
fail(`role "${roleName}": outputSchema must be a JSON Schema object`);
if (!isJsonSchema(frontmatter)) {
fail(`role "${roleName}": frontmatter must be a JSON Schema object`);
}
const schema: JSONSchema = outputSchema.title === undefined
? { ...outputSchema, title: roleName }
: outputSchema;
const schema: JSONSchema =
frontmatter.title === undefined ? { ...frontmatter, title: roleName } : frontmatter;
return putSchema(uwf.store, schema);
}
async function materializeWorkflowPayload(
export async function materializeWorkflowPayload(
uwf: UwfStore,
raw: WorkflowPayload,
): Promise<WorkflowPayload> {
const roles: Record<string, RoleDefinition> = {};
for (const [roleName, role] of Object.entries(raw.roles)) {
const outputSchema = await resolveOutputSchemaRef(
const frontmatter = await resolveFrontmatterRef(
uwf,
`${raw.name}.${roleName}`,
role.outputSchema,
role.frontmatter,
);
roles[roleName] = {
description: role.description,
systemPrompt: role.systemPrompt,
outputSchema,
goal: role.goal,
capabilities: role.capabilities,
procedure: role.procedure,
output: role.output,
frontmatter,
};
}
return {
@@ -78,7 +107,7 @@ async function materializeWorkflowPayload(
description: raw.description,
roles,
conditions: raw.conditions,
graph: raw.graph,
graph: normalizeGraph(raw.graph),
};
}
@@ -105,6 +134,11 @@ export async function cmdWorkflowPut(
fail("invalid workflow YAML: expected WorkflowPayload shape");
}
const filenameError = checkWorkflowFilenameConsistency(filePath, payload);
if (filenameError !== null) {
fail(filenameError);
}
const uwf = await createUwfStore(storageRoot);
const materialized = await materializeWorkflowPayload(uwf, payload);
@@ -147,7 +181,26 @@ export async function cmdWorkflowShow(
};
}
export async function cmdWorkflowList(storageRoot: string): Promise<WorkflowListEntry[]> {
export async function cmdWorkflowList(
storageRoot: string,
projectRoot: string,
): Promise<WorkflowListEntry[]> {
const localEntries = await discoverProjectWorkflows(projectRoot);
const registry = await loadWorkflowRegistry(storageRoot);
return Object.entries(registry).map(([name, hash]) => ({ name, hash }));
const result: WorkflowListEntry[] = [];
const localNames = new Set<string>();
for (const entry of localEntries) {
localNames.add(entry.name);
result.push({ name: entry.name, hash: "(local)", origin: "local" });
}
for (const [name, hash] of Object.entries(registry)) {
if (!localNames.has(name)) {
result.push({ name, hash, origin: "global" });
}
}
return result;
}
+7 -7
View File
@@ -1,15 +1,14 @@
import type { Hash, Store } from "@uncaged/json-cas";
import { putSchema } from "@uncaged/json-cas";
import {
START_NODE_SCHEMA,
STEP_NODE_SCHEMA,
WORKFLOW_SCHEMA,
} from "@uncaged/workflow-protocol";
import { START_NODE_SCHEMA, STEP_NODE_SCHEMA, WORKFLOW_SCHEMA } from "@uncaged/workflow-protocol";
export const TEXT_SCHEMA = { type: "string" as const };
export type UwfSchemaHashes = {
workflow: Hash;
startNode: Hash;
stepNode: Hash;
text: Hash;
};
/**
@@ -17,10 +16,11 @@ export type UwfSchemaHashes = {
* Idempotent: safe to call on every CLI invocation.
*/
export async function registerUwfSchemas(store: Store): Promise<UwfSchemaHashes> {
const [workflow, startNode, stepNode] = await Promise.all([
const [workflow, startNode, stepNode, text] = await Promise.all([
putSchema(store, WORKFLOW_SCHEMA),
putSchema(store, START_NODE_SCHEMA),
putSchema(store, STEP_NODE_SCHEMA),
putSchema(store, TEXT_SCHEMA),
]);
return { workflow, startNode, stepNode };
return { workflow, startNode, stepNode, text };
}
+57 -3
View File
@@ -1,8 +1,8 @@
import { appendFile, mkdir, readFile, writeFile } from "node:fs/promises";
import { appendFile, mkdir, readdir, readFile, writeFile } from "node:fs/promises";
import { homedir } from "node:os";
import { join } from "node:path";
import type { Hash, Store } from "@uncaged/json-cas";
import type { BootstrapCapableStore, Hash } from "@uncaged/json-cas";
import { createFsStore } from "@uncaged/json-cas-fs";
import type { CasRef, ThreadId, ThreadListItem, ThreadsIndex } from "@uncaged/workflow-protocol";
import { parse, stringify } from "yaml";
@@ -11,6 +11,44 @@ import { registerUwfSchemas, type UwfSchemaHashes } from "./schemas.js";
export type WorkflowRegistry = Record<string, CasRef>;
/** A workflow entry discovered from the project-local .workflows/ directory. */
export type ProjectWorkflowEntry = {
/** Workflow name (from YAML `name` field, equals filename stem). */
name: string;
/** Absolute path to the YAML file. */
filePath: string;
};
/**
* Scan `<projectRoot>/.workflows/*.yaml` (non-recursive) and return discovered entries.
* Returns an empty array if the directory does not exist.
*/
export async function discoverProjectWorkflows(
projectRoot: string,
): Promise<ProjectWorkflowEntry[]> {
const dir = join(projectRoot, ".workflows");
let entries: string[];
try {
entries = await readdir(dir);
} catch (e) {
const err = e as NodeJS.ErrnoException;
if (err.code === "ENOENT" || err.code === "ENOTDIR") {
return [];
}
throw e;
}
const result: ProjectWorkflowEntry[] = [];
for (const entry of entries) {
if (!entry.endsWith(".yaml") && !entry.endsWith(".yml")) {
continue;
}
const stem = entry.endsWith(".yaml") ? entry.slice(0, -5) : entry.slice(0, -4);
result.push({ name: stem, filePath: join(dir, entry) });
}
return result;
}
/** Default filesystem root for uwf data (`~/.uncaged/workflow`). */
export function getDefaultStorageRoot(): string {
return join(homedir(), ".uncaged", "workflow");
@@ -54,7 +92,7 @@ export type ThreadHistoryLine = ThreadListItem & {
export type UwfStore = {
storageRoot: string;
store: Store;
store: BootstrapCapableStore;
schemas: UwfSchemaHashes;
};
@@ -104,6 +142,22 @@ export function resolveWorkflowHash(registry: WorkflowRegistry, id: string): Cas
return registry[id] !== undefined ? registry[id] : id;
}
/**
* Resolve a workflow name to a project-local YAML file path.
* Returns null if the name is not found in the local entries.
*/
export function resolveProjectWorkflowFile(
localEntries: ProjectWorkflowEntry[],
name: string,
): string | null {
for (const entry of localEntries) {
if (entry.name === name) {
return entry.filePath;
}
}
return null;
}
export function findRegistryName(registry: WorkflowRegistry, hash: Hash): string | null {
for (const [name, h] of Object.entries(registry)) {
if (h === hash) {
+45 -4
View File
@@ -1,3 +1,4 @@
import { basename } from "node:path";
import type { CasRef, WorkflowPayload } from "@uncaged/workflow-protocol";
const CAS_REF_PATTERN = /^[0-9A-HJKMNP-TV-Z]{13}$/;
@@ -14,10 +15,18 @@ function isRoleDefinition(value: unknown): boolean {
if (!isRecord(value)) {
return false;
}
const outputSchema = value.outputSchema;
const schemaOk = isRecord(outputSchema) && typeof outputSchema.type === "string";
const frontmatter = value.frontmatter;
const frontmatterOk = isRecord(frontmatter) && typeof frontmatter.type === "string";
const capabilities = value.capabilities;
const capabilitiesOk =
Array.isArray(capabilities) && capabilities.every((c) => typeof c === "string");
return (
typeof value.description === "string" && typeof value.systemPrompt === "string" && schemaOk
typeof value.description === "string" &&
typeof value.goal === "string" &&
capabilitiesOk &&
typeof value.procedure === "string" &&
typeof value.output === "string" &&
frontmatterOk
);
}
@@ -33,7 +42,12 @@ function isTransition(value: unknown): boolean {
return false;
}
const condition = value.condition;
return typeof value.role === "string" && (condition === null || typeof condition === "string");
return (
typeof value.role === "string" &&
typeof value.prompt === "string" &&
value.prompt.trim() !== "" &&
(condition === null || condition === undefined || typeof condition === "string")
);
}
function isStringRecord(value: unknown, itemCheck: (item: unknown) => boolean): boolean {
@@ -52,6 +66,33 @@ function isGraph(value: unknown): boolean {
);
}
/**
* Derive the expected workflow name from a file path (stem without extension).
* Returns the stem for `.yaml` / `.yml` files.
*/
export function workflowNameFromPath(filePath: string): string {
const base = basename(filePath);
if (base.endsWith(".yaml")) return base.slice(0, -5);
if (base.endsWith(".yml")) return base.slice(0, -4);
return base;
}
/**
* Check that the `name` field in a parsed payload matches the expected name
* derived from the file path. Returns an error message string on mismatch,
* or null when the names are consistent.
*/
export function checkWorkflowFilenameConsistency(
filePath: string,
payload: WorkflowPayload,
): string | null {
const expected = workflowNameFromPath(filePath);
if (payload.name !== expected) {
return `workflow name mismatch: file "${basename(filePath)}" implies name "${expected}" but YAML declares name "${payload.name}"`;
}
return null;
}
/** Validate YAML-parsed workflow document shape (outputSchema may be inline JSON Schema). */
export function parseWorkflowPayload(raw: unknown): WorkflowPayload | null {
if (!isRecord(raw)) {
@@ -0,0 +1,16 @@
import { describe, expect, test } from "bun:test";
import type { LlmToolCall } from "../src/llm/types.js";
/** Mirror OpenAI response shape for parser coverage via chatCompletionWithTools integration later. */
describe("LlmToolCall shape", () => {
test("tool call record fields", () => {
const call: LlmToolCall = {
id: "call_1",
name: "read_file",
arguments: '{"path":"README.md"}',
};
expect(call.name).toBe("read_file");
expect(JSON.parse(call.arguments)).toEqual({ path: "README.md" });
});
});
@@ -0,0 +1,21 @@
import { describe, expect, test } from "bun:test";
import { resolvePath } from "../src/tools/path.js";
import { resolve } from "node:path";
describe("resolvePath", () => {
test("resolves relative paths against cwd", () => {
const root = "/workspace/project";
const resolved = resolvePath(root, "src/foo.ts");
expect(resolved).toBe(resolve(root, "src/foo.ts"));
});
test("resolves absolute paths as-is", () => {
const resolved = resolvePath("/workspace", "/etc/hosts");
expect(resolved).toBe("/etc/hosts");
});
test("resolves parent traversal normally", () => {
const resolved = resolvePath("/workspace/project", "../other/file.ts");
expect(resolved).toBe(resolve("/workspace/project", "../other/file.ts"));
});
});
@@ -0,0 +1,67 @@
import { describe, expect, test } from "bun:test";
import type { AgentContext } from "@uncaged/workflow-agent-kit";
import { buildBuiltinPrompt } from "../src/prompt.js";
function minimalContext(overrides: Partial<AgentContext> = {}): AgentContext {
return {
threadId: "00000000000000000000000000" as AgentContext["threadId"],
role: "developer",
store: {} as AgentContext["store"],
workflow: {
name: "test",
roles: {
developer: {
goal: "Ship the fix",
capabilities: ["file-edit"],
procedure: ["Edit files"],
output: "A patch",
frontmatter: "schema-hash",
},
},
conditions: {},
graph: {},
},
start: { workflow: "wf-hash", prompt: "Fix the bug" },
steps: [],
outputFormatInstruction: "---\nstatus: done\n---",
edgePrompt: "Implement the fix described in the plan.",
isFirstVisit: true,
...overrides,
};
}
describe("buildBuiltinPrompt", () => {
test("system includes output format and role goal", () => {
const { system } = buildBuiltinPrompt(minimalContext());
expect(system).toContain("status: done");
expect(system).toContain("## Goal");
expect(system).toContain("Ship the fix");
});
test("user includes task and edge prompt", () => {
const { user } = buildBuiltinPrompt(minimalContext());
expect(user).toContain("## Task");
expect(user).toContain("Fix the bug");
expect(user).toContain("## Current Step Instruction");
expect(user).toContain("Implement the fix");
});
test("user includes history when steps exist", () => {
const { user } = buildBuiltinPrompt(
minimalContext({
steps: [
{
role: "planner",
output: { plan: "step 1" },
agent: "uwf-builtin",
detail: "detail-hash",
},
],
}),
);
expect(user).toContain("## Previous Steps");
expect(user).toContain("planner");
});
});
@@ -0,0 +1,34 @@
{
"name": "@uncaged/workflow-agent-builtin",
"version": "0.5.0",
"files": [
"src",
"dist",
"package.json"
],
"type": "module",
"bin": {
"uwf-builtin": "./src/cli.ts"
},
"exports": {
".": {
"bun": "./src/index.ts",
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
}
},
"scripts": {
"test": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.4.0",
"@uncaged/workflow-agent-kit": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
},
"devDependencies": {
"typescript": "^5.8.3"
},
"publishConfig": {
"access": "public"
}
}
@@ -0,0 +1,126 @@
import type { Store } from "@uncaged/json-cas";
import {
type AgentContext,
type AgentRunResult,
createAgent,
loadWorkflowConfig,
resolveModel,
resolveStorageRoot,
} from "@uncaged/workflow-agent-kit";
import { generateUlid } from "@uncaged/workflow-util";
import { storeBuiltinDetail } from "./detail.js";
import type { ChatMessage } from "./llm/index.js";
import { BUILTIN_CONTINUE_MAX_TURNS, BUILTIN_MAX_TURNS, runBuiltinLoop } from "./loop.js";
import { buildBuiltinPrompt } from "./prompt.js";
import type { BuiltinSessionState } from "./types.js";
const sessions = new Map<string, BuiltinSessionState>();
function getSession(sessionId: string): BuiltinSessionState {
const session = sessions.get(sessionId);
if (session === undefined) {
throw new Error(`builtin session not found: ${sessionId}`);
}
return session;
}
function buildToolContext(storageRoot: string): { cwd: string; storageRoot: string } {
return {
cwd: process.cwd(),
storageRoot,
};
}
async function runBuiltinWithMessages(
storageRoot: string,
provider: ReturnType<typeof resolveModel>,
messages: ChatMessage[],
session: BuiltinSessionState,
store: Store,
maxTurns: number,
): Promise<AgentRunResult> {
const loopResult = await runBuiltinLoop({
provider,
messages,
toolCtx: buildToolContext(storageRoot),
maxTurns,
existingTurns: session.turns,
});
session.messages = loopResult.messages;
session.turns = loopResult.turns;
const { detailHash, output } = await storeBuiltinDetail(
store,
session.sessionId,
session.model,
session.startedAtMs,
session.turns,
);
const finalOutput = output !== "" ? output : loopResult.finalText;
return { output: finalOutput, detailHash, sessionId: session.sessionId };
}
async function runBuiltin(ctx: AgentContext): Promise<AgentRunResult> {
const storageRoot = resolveStorageRoot();
const config = await loadWorkflowConfig(storageRoot);
const provider = resolveModel(config, config.defaultModel);
const sessionId = generateUlid(Date.now());
const promptParts = buildBuiltinPrompt(ctx);
const messages: ChatMessage[] = [
{ role: "system", content: promptParts.system },
{ role: "user", content: promptParts.user },
];
const session: BuiltinSessionState = {
sessionId,
model: provider.model,
startedAtMs: Date.now(),
messages,
turns: [],
};
sessions.set(sessionId, session);
return runBuiltinWithMessages(
storageRoot,
provider,
messages,
session,
ctx.store,
BUILTIN_MAX_TURNS,
);
}
async function continueBuiltin(
sessionId: string,
message: string,
store: Store,
): Promise<AgentRunResult> {
const session = getSession(sessionId);
const storageRoot = resolveStorageRoot();
const config = await loadWorkflowConfig(storageRoot);
const provider = resolveModel(config, config.defaultModel);
const messages: ChatMessage[] = [...session.messages, { role: "user", content: message }];
return runBuiltinWithMessages(
storageRoot,
provider,
messages,
session,
store,
BUILTIN_CONTINUE_MAX_TURNS,
);
}
/** Agent CLI factory: built-in LLM loop with file/shell tools. */
export function createBuiltinAgent(): () => Promise<void> {
return createAgent({
name: "builtin",
run: runBuiltin,
continue: continueBuiltin,
});
}
+6
View File
@@ -0,0 +1,6 @@
#!/usr/bin/env bun
import { createBuiltinAgent } from "./agent.js";
const main = createBuiltinAgent();
void main();
@@ -0,0 +1,115 @@
import { bootstrap, putSchema, type Store } from "@uncaged/json-cas";
import { BUILTIN_DETAIL_SCHEMA, BUILTIN_TURN_SCHEMA } from "./schemas.js";
import type {
BuiltinDetailPayload,
BuiltinLoopTurn,
BuiltinToolCall,
BuiltinTurnPayload,
BuiltinTurnRole,
} from "./types.js";
function mapToolCalls(calls: NonNullable<BuiltinLoopTurn["toolCalls"]>): BuiltinToolCall[] {
return calls.map((call) => ({
name: call.name,
args: call.args,
}));
}
function loopTurnToAssistantPayload(turn: BuiltinLoopTurn, index: number): BuiltinTurnPayload {
return {
index,
role: "assistant",
content: turn.assistantContent ?? "",
toolCalls:
turn.toolCalls !== null && turn.toolCalls.length > 0 ? mapToolCalls(turn.toolCalls) : null,
reasoning: null,
};
}
function loopTurnToToolPayloads(turn: BuiltinLoopTurn, startIndex: number): BuiltinTurnPayload[] {
if (turn.toolResults === null || turn.toolResults.length === 0) {
return [];
}
const payloads: BuiltinTurnPayload[] = [];
let index = startIndex;
for (const result of turn.toolResults) {
payloads.push({
index,
role: "tool" as BuiltinTurnRole,
content: result.content,
toolCalls: null,
reasoning: null,
});
index += 1;
}
return payloads;
}
/** Last assistant message with non-empty text. */
export function extractFinalAssistantText(turns: BuiltinLoopTurn[]): string {
for (let i = turns.length - 1; i >= 0; i--) {
const turn = turns[i];
if (turn === undefined) {
continue;
}
const text = turn.assistantContent;
if (text !== null && text.trim() !== "") {
return text;
}
}
return "";
}
type BuiltinSchemaHashes = {
turn: string;
detail: string;
};
async function registerBuiltinSchemas(store: Store): Promise<BuiltinSchemaHashes> {
await bootstrap(store);
const [turn, detail] = await Promise.all([
putSchema(store, BUILTIN_TURN_SCHEMA),
putSchema(store, BUILTIN_DETAIL_SCHEMA),
]);
return { turn, detail };
}
export async function storeBuiltinDetail(
store: Store,
sessionId: string,
model: string,
startedAtMs: number,
turns: BuiltinLoopTurn[],
nowMs: number = Date.now(),
): Promise<{ detailHash: string; output: string }> {
const schemas = await registerBuiltinSchemas(store);
const turnHashes: string[] = [];
let turnIndex = 0;
for (const loopTurn of turns) {
const assistant = loopTurnToAssistantPayload(loopTurn, turnIndex);
const assistantHash = await store.put(schemas.turn, assistant);
turnHashes.push(assistantHash);
turnIndex += 1;
const toolPayloads = loopTurnToToolPayloads(loopTurn, turnIndex);
for (const toolPayload of toolPayloads) {
const toolHash = await store.put(schemas.turn, toolPayload);
turnHashes.push(toolHash);
turnIndex += 1;
}
}
const duration = Math.max(0, nowMs - startedAtMs);
const detail: BuiltinDetailPayload = {
sessionId,
model,
duration,
turnCount: turnHashes.length,
turns: turnHashes,
};
const detailHash = await store.put(schemas.detail, detail);
const output = extractFinalAssistantText(turns);
return { detailHash, output };
}
@@ -0,0 +1,14 @@
export { createBuiltinAgent } from "./agent.js";
export { extractFinalAssistantText, storeBuiltinDetail } from "./detail.js";
export type { ChatMessage, LlmAssistantResponse, LlmToolCall } from "./llm/index.js";
export { chatCompletionWithTools } from "./llm/index.js";
export { BUILTIN_CONTINUE_MAX_TURNS, BUILTIN_MAX_TURNS, runBuiltinLoop } from "./loop.js";
export { buildBuiltinPrompt } from "./prompt.js";
export type { BuiltinTool, ToolContext } from "./tools/index.js";
export { executeBuiltinTool, getBuiltinTools } from "./tools/index.js";
export type {
BuiltinDetailPayload,
BuiltinLoopTurn,
BuiltinSessionState,
BuiltinTurnPayload,
} from "./types.js";
@@ -0,0 +1,7 @@
export { chatCompletionWithTools } from "./llm.js";
export type {
ChatMessage,
LlmAssistantResponse,
LlmToolCall,
OpenAiToolDefinition,
} from "./types.js";
@@ -0,0 +1,135 @@
import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit";
import type {
ChatMessage,
LlmAssistantResponse,
LlmToolCall,
OpenAiToolDefinition,
} from "./types.js";
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
function chatUrl(baseUrl: string): string {
const trimmed = baseUrl.replace(/\/+$/, "");
return `${trimmed}/chat/completions`;
}
function parseToolCalls(raw: unknown): LlmToolCall[] | null {
if (!Array.isArray(raw) || raw.length === 0) {
return null;
}
const calls: LlmToolCall[] = [];
for (const entry of raw) {
if (!isRecord(entry)) {
continue;
}
const id = entry.id;
const fn = entry.function;
if (typeof id !== "string" || !isRecord(fn)) {
continue;
}
const name = fn.name;
const args = fn.arguments;
if (typeof name !== "string" || typeof args !== "string") {
continue;
}
calls.push({ id, name, arguments: args });
}
return calls.length > 0 ? calls : null;
}
function parseAssistantMessage(parsed: unknown): LlmAssistantResponse {
if (!isRecord(parsed)) {
throw new Error("LLM response is not an object");
}
const choices = parsed.choices;
if (!Array.isArray(choices) || choices.length === 0) {
throw new Error("LLM response has no choices");
}
const c0 = choices[0];
if (!isRecord(c0)) {
throw new Error("LLM choice is not an object");
}
const messageObj = c0.message;
if (!isRecord(messageObj)) {
throw new Error("LLM message is not an object");
}
const contentRaw = messageObj.content;
const content =
typeof contentRaw === "string"
? contentRaw
: contentRaw === null || contentRaw === undefined
? null
: null;
const toolCalls = parseToolCalls(messageObj.tool_calls);
return { content, toolCalls };
}
function serializeMessage(message: ChatMessage): Record<string, unknown> {
if (message.role === "tool") {
return {
role: "tool",
tool_call_id: message.tool_call_id,
content: message.content,
};
}
if (message.role === "assistant") {
const base: Record<string, unknown> = {
role: "assistant",
content: message.content,
};
if (message.tool_calls !== null && message.tool_calls.length > 0) {
base.tool_calls = message.tool_calls.map((call) => ({
id: call.id,
type: "function",
function: { name: call.name, arguments: call.arguments },
}));
}
return base;
}
return { role: message.role, content: message.content };
}
/** OpenAI-compatible chat completion with tool calling (non-streaming). */
export async function chatCompletionWithTools(
provider: ResolvedLlmProvider,
messages: ChatMessage[],
tools: OpenAiToolDefinition[],
): Promise<LlmAssistantResponse> {
let response: Response;
try {
response = await fetch(chatUrl(provider.baseUrl), {
method: "POST",
headers: {
Authorization: `Bearer ${provider.apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: provider.model,
messages: messages.map(serializeMessage),
tools,
tool_choice: "auto",
}),
});
} catch (cause) {
const message = cause instanceof Error ? cause.message : String(cause);
throw new Error(`LLM network error: ${message}`);
}
const responseText = await response.text();
if (!response.ok) {
throw new Error(`LLM HTTP ${response.status}: ${responseText.slice(0, 2000)}`);
}
let parsed: unknown;
try {
parsed = JSON.parse(responseText) as unknown;
} catch (cause) {
const message = cause instanceof Error ? cause.message : String(cause);
throw new Error(`LLM invalid JSON response: ${message}`);
}
return parseAssistantMessage(parsed);
}
@@ -0,0 +1,29 @@
export type LlmToolCall = {
id: string;
name: string;
arguments: string;
};
export type LlmAssistantResponse = {
content: string | null;
toolCalls: LlmToolCall[] | null;
};
export type ChatMessage =
| { role: "system"; content: string }
| { role: "user"; content: string }
| {
role: "assistant";
content: string | null;
tool_calls: LlmToolCall[] | null;
}
| { role: "tool"; tool_call_id: string; content: string };
export type OpenAiToolDefinition = {
type: "function";
function: {
name: string;
description: string;
parameters: Record<string, unknown>;
};
};
+110
View File
@@ -0,0 +1,110 @@
import type { ResolvedLlmProvider } from "@uncaged/workflow-agent-kit";
import { createLogger } from "@uncaged/workflow-util";
import { type ChatMessage, chatCompletionWithTools, type LlmToolCall } from "./llm/index.js";
import {
builtinToolsToOpenAi,
executeBuiltinTool,
getBuiltinTools,
type ToolContext,
} from "./tools/index.js";
import type { BuiltinLoopTurn, BuiltinToolCallRecord, BuiltinToolResultRecord } from "./types.js";
const log = createLogger({ sink: { kind: "stderr" } });
export const BUILTIN_MAX_TURNS = 30;
export const BUILTIN_CONTINUE_MAX_TURNS = 5;
export type RunBuiltinLoopOptions = {
provider: ResolvedLlmProvider;
messages: ChatMessage[];
toolCtx: ToolContext;
maxTurns: number;
existingTurns: BuiltinLoopTurn[];
};
export type RunBuiltinLoopResult = {
finalText: string;
messages: ChatMessage[];
turns: BuiltinLoopTurn[];
};
function mapToolCalls(calls: LlmToolCall[]): BuiltinToolCallRecord[] {
return calls.map((call) => ({
id: call.id,
name: call.name,
args: call.arguments,
}));
}
/** Agent run loop: LLM ↔ tools until no tool_calls or maxTurns. */
export async function runBuiltinLoop(
options: RunBuiltinLoopOptions,
): Promise<RunBuiltinLoopResult> {
const messages = [...options.messages];
const turns = [...options.existingTurns];
const openAiTools = builtinToolsToOpenAi(getBuiltinTools());
let finalText = "";
for (let turn = 0; turn < options.maxTurns; turn++) {
log("8K2M4N7P", `builtin loop turn ${turn + 1}/${options.maxTurns}`);
const response = await chatCompletionWithTools(options.provider, messages, openAiTools);
const assistantMessage: ChatMessage = {
role: "assistant",
content: response.content,
tool_calls: response.toolCalls,
};
messages.push(assistantMessage);
if (response.toolCalls === null || response.toolCalls.length === 0) {
finalText = response.content ?? "";
turns.push({
assistantContent: response.content,
toolCalls: null,
toolResults: null,
});
break;
}
const toolCallRecords = mapToolCalls(response.toolCalls);
const toolResults: BuiltinToolResultRecord[] = [];
for (const call of response.toolCalls) {
const result = await executeBuiltinTool(call.name, call.arguments, options.toolCtx);
toolResults.push({
toolCallId: call.id,
name: call.name,
content: result,
});
messages.push({
role: "tool",
tool_call_id: call.id,
content: result,
});
}
turns.push({
assistantContent: response.content,
toolCalls: toolCallRecords,
toolResults,
});
}
if (finalText === "" && messages.length > 0) {
for (let i = messages.length - 1; i >= 0; i--) {
const msg = messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
finalText = msg.content;
break;
}
}
}
return { finalText, messages, turns };
}
@@ -0,0 +1,50 @@
import { type AgentContext, buildRolePrompt } from "@uncaged/workflow-agent-kit";
function buildHistorySummary(steps: AgentContext["steps"]): string {
if (steps.length === 0) {
return "";
}
const lines: string[] = ["## Previous Steps"];
for (let i = 0; i < steps.length; i++) {
const step = steps[i];
if (step === undefined) {
continue;
}
lines.push("");
lines.push(`### Step ${i + 1}: ${step.role}`);
lines.push(`Output: ${JSON.stringify(step.output)}`);
lines.push(`Agent: ${step.agent}`);
}
return lines.join("\n");
}
export type BuiltinPromptParts = {
system: string;
user: string;
};
/** Assemble system prompt (role + format) and user prompt (task + edge + history). */
export function buildBuiltinPrompt(ctx: AgentContext): BuiltinPromptParts {
const roleDef = ctx.workflow.roles[ctx.role];
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
const systemParts: string[] = [];
if (ctx.outputFormatInstruction !== "") {
systemParts.push(ctx.outputFormatInstruction, "");
}
systemParts.push(rolePrompt);
const userParts: string[] = ["## Task", ctx.start.prompt];
if (ctx.edgePrompt !== "") {
userParts.push("", "## Current Step Instruction", ctx.edgePrompt);
}
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
userParts.push("", historyBlock);
}
return {
system: systemParts.join("\n"),
user: userParts.join("\n"),
};
}
@@ -0,0 +1,46 @@
import type { JSONSchema } from "@uncaged/json-cas";
const BUILTIN_TOOL_CALL_SCHEMA: JSONSchema = {
type: "object",
required: ["name", "args"],
properties: {
name: { type: "string" },
args: { type: "string" },
},
additionalProperties: false,
};
export const BUILTIN_TURN_SCHEMA: JSONSchema = {
title: "builtin-turn",
type: "object",
required: ["index", "role", "content"],
properties: {
index: { type: "integer" },
role: { type: "string", enum: ["assistant", "tool"] },
content: { type: "string" },
toolCalls: {
anyOf: [{ type: "array", items: BUILTIN_TOOL_CALL_SCHEMA }, { type: "null" }],
},
reasoning: {
anyOf: [{ type: "string" }, { type: "null" }],
},
},
additionalProperties: false,
};
export const BUILTIN_DETAIL_SCHEMA: JSONSchema = {
title: "builtin-detail",
type: "object",
required: ["sessionId", "model", "duration", "turnCount", "turns"],
properties: {
sessionId: { type: "string" },
model: { type: "string" },
duration: { type: "integer" },
turnCount: { type: "integer" },
turns: {
type: "array",
items: { type: "string", format: "cas_ref" },
},
},
additionalProperties: false,
};
@@ -0,0 +1,44 @@
import type { OpenAiToolDefinition } from "../llm/index.js";
import { readFileTool } from "./read-file.js";
import { runCommandTool } from "./run-command.js";
import type { BuiltinTool, ToolContext } from "./types.js";
import { writeFileTool } from "./write-file.js";
export { resolvePath } from "./path.js";
export type { BuiltinTool, ToolContext } from "./types.js";
const BUILTIN_TOOLS: BuiltinTool[] = [readFileTool, writeFileTool, runCommandTool];
export function getBuiltinTools(): readonly BuiltinTool[] {
return BUILTIN_TOOLS;
}
export function builtinToolsToOpenAi(tools: readonly BuiltinTool[]): OpenAiToolDefinition[] {
return tools.map((tool) => ({
type: "function",
function: {
name: tool.name,
description: tool.description,
parameters: tool.parameters as Record<string, unknown>,
},
}));
}
export async function executeBuiltinTool(
name: string,
argsJson: string,
ctx: ToolContext,
): Promise<string> {
const tool = BUILTIN_TOOLS.find((t) => t.name === name);
if (tool === undefined) {
return `Error: unknown tool ${name}`;
}
let args: unknown;
try {
args = JSON.parse(argsJson) as unknown;
} catch {
return "Error: tool arguments must be valid JSON";
}
return tool.execute(args, ctx);
}
@@ -0,0 +1,6 @@
import { resolve } from "node:path";
/** Resolve a path relative to the working directory. */
export function resolvePath(cwd: string, inputPath: string): string {
return resolve(cwd, inputPath);
}
@@ -0,0 +1,41 @@
import { readFile, stat } from "node:fs/promises";
import { resolvePath } from "./path.js";
import type { BuiltinTool } from "./types.js";
const MAX_READ_BYTES = 512 * 1024;
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
export const readFileTool: BuiltinTool = {
name: "read_file",
description: "Read a UTF-8 text file from the workspace.",
parameters: {
type: "object",
required: ["path"],
properties: {
path: { type: "string", description: "Relative or absolute path within the workspace." },
},
additionalProperties: false,
},
execute: async (args, ctx) => {
if (!isRecord(args) || typeof args.path !== "string") {
return "Error: path must be a string";
}
const resolved = resolvePath(ctx.cwd, args.path);
try {
const info = await stat(resolved);
if (!info.isFile()) {
return "Error: not a file";
}
if (info.size > MAX_READ_BYTES) {
return `Error: file exceeds ${MAX_READ_BYTES} byte limit`;
}
return await readFile(resolved, "utf8");
} catch (cause) {
const message = cause instanceof Error ? cause.message : String(cause);
return `Error: ${message}`;
}
},
};
@@ -0,0 +1,96 @@
import { spawn } from "node:child_process";
import { resolvePath } from "./path.js";
import type { BuiltinTool } from "./types.js";
const COMMAND_TIMEOUT_MS = 60_000;
const MAX_OUTPUT_CHARS = 32_000;
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
function truncate(text: string, maxChars: number): string {
if (text.length <= maxChars) {
return text;
}
return `${text.slice(0, maxChars)}\n...(truncated)`;
}
function runShell(
command: string,
cwd: string,
): Promise<{ stdout: string; stderr: string; code: number }> {
return new Promise((resolve, reject) => {
const child = spawn(command, {
cwd,
env: process.env,
shell: true,
stdio: ["ignore", "pipe", "pipe"],
});
let stdout = "";
let stderr = "";
child.stdout?.on("data", (chunk: Buffer) => {
stdout += chunk.toString();
});
child.stderr?.on("data", (chunk: Buffer) => {
stderr += chunk.toString();
});
const timer = setTimeout(() => {
child.kill("SIGTERM");
}, COMMAND_TIMEOUT_MS);
child.on("error", (cause) => {
clearTimeout(timer);
const message = cause instanceof Error ? cause.message : String(cause);
reject(new Error(message));
});
child.on("close", (code) => {
clearTimeout(timer);
resolve({ stdout, stderr, code: code ?? 1 });
});
});
}
export const runCommandTool: BuiltinTool = {
name: "run_command",
description:
"Run a shell command. Output is truncated to 32KB.",
parameters: {
type: "object",
required: ["command"],
properties: {
command: { type: "string", description: "Shell command to execute." },
cwd: {
type: "string",
description: "Optional working directory relative to workspace root.",
},
},
additionalProperties: false,
},
execute: async (args, ctx) => {
if (!isRecord(args) || typeof args.command !== "string") {
return "Error: command must be a string";
}
let workDir = ctx.cwd;
if (args.cwd !== undefined && args.cwd !== null) {
if (typeof args.cwd !== "string") {
return "Error: cwd must be a string";
}
workDir = resolvePath(ctx.cwd, args.cwd);
}
try {
const { stdout, stderr, code } = await runShell(args.command, workDir);
const out = truncate(
`exit_code: ${code}\n--- stdout ---\n${stdout}\n--- stderr ---\n${stderr}`,
MAX_OUTPUT_CHARS,
);
return out;
} catch (cause) {
const message = cause instanceof Error ? cause.message : String(cause);
return `Error: ${message}`;
}
},
};
@@ -0,0 +1,13 @@
import type { JSONSchema } from "@uncaged/json-cas";
export type ToolContext = {
cwd: string;
storageRoot: string;
};
export type BuiltinTool = {
name: string;
description: string;
parameters: JSONSchema;
execute: (args: unknown, ctx: ToolContext) => Promise<string>;
};
@@ -0,0 +1,36 @@
import { mkdir, writeFile } from "node:fs/promises";
import { dirname } from "node:path";
import { resolvePath } from "./path.js";
import type { BuiltinTool } from "./types.js";
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
export const writeFileTool: BuiltinTool = {
name: "write_file",
description: "Write UTF-8 text to a file in the workspace (creates parent directories).",
parameters: {
type: "object",
required: ["path", "content"],
properties: {
path: { type: "string", description: "Relative or absolute path within the workspace." },
content: { type: "string", description: "File contents to write." },
},
additionalProperties: false,
},
execute: async (args, ctx) => {
if (!isRecord(args) || typeof args.path !== "string" || typeof args.content !== "string") {
return "Error: path and content must be strings";
}
const resolved = resolvePath(ctx.cwd, args.path);
try {
await mkdir(dirname(resolved), { recursive: true });
await writeFile(resolved, args.content, "utf8");
return `Wrote ${args.content.length} bytes to ${args.path}`;
} catch (cause) {
const message = cause instanceof Error ? cause.message : String(cause);
return `Error: ${message}`;
}
},
};
@@ -0,0 +1,50 @@
import type { ChatMessage } from "./llm/index.js";
export type BuiltinToolCallRecord = {
id: string;
name: string;
args: string;
};
export type BuiltinToolResultRecord = {
toolCallId: string;
name: string;
content: string;
};
export type BuiltinLoopTurn = {
assistantContent: string | null;
toolCalls: BuiltinToolCallRecord[] | null;
toolResults: BuiltinToolResultRecord[] | null;
};
export type BuiltinSessionState = {
sessionId: string;
model: string;
startedAtMs: number;
messages: ChatMessage[];
turns: BuiltinLoopTurn[];
};
export type BuiltinTurnRole = "assistant" | "tool";
export type BuiltinToolCall = {
name: string;
args: string;
};
export type BuiltinTurnPayload = {
index: number;
role: BuiltinTurnRole;
content: string;
toolCalls: BuiltinToolCall[] | null;
reasoning: string | null;
};
export type BuiltinDetailPayload = {
sessionId: string;
model: string;
duration: number;
turnCount: number;
turns: string[];
};
@@ -0,0 +1,9 @@
{
"extends": "../../tsconfig.json",
"compilerOptions": {
"rootDir": "src",
"outDir": "dist"
},
"include": ["src"],
"references": [{ "path": "../workflow-agent-kit" }, { "path": "../workflow-util" }]
}
@@ -0,0 +1,63 @@
import { describe, expect, test } from "bun:test";
import type { AgentContext } from "@uncaged/workflow-agent-kit";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { buildClaudeCodePrompt } from "../src/claude-code.js";
function makeCtx(overrides: Partial<AgentContext> = {}): AgentContext {
return {
threadId: "01JTEST0000000000000000000" as ThreadId,
edgePrompt: "Proceed with the assigned role.",
isFirstVisit: true,
workflow: {
roles: {
developer: {
description: "TDD implementation per test spec",
goal: "Write code",
capabilities: ["coding"],
procedure: "1. Read spec\n2. Write code",
output: "List files changed",
frontmatter: "",
},
},
conditions: {},
graph: {},
},
role: "developer",
start: { prompt: "Fix the bug", workflowHash: "abc123", threadId: "t1" },
steps: [],
store: {} as AgentContext["store"],
outputFormatInstruction: "Use YAML frontmatter",
...overrides,
};
}
describe("buildClaudeCodePrompt", () => {
test("assembles outputFormatInstruction + role prompt + task prompt", () => {
const result = buildClaudeCodePrompt(makeCtx());
expect(result).toMatch(/^Use YAML frontmatter/);
expect(result).toContain("Write code");
expect(result).toContain("## Task\nFix the bug");
});
test("includes previous steps as history summary", () => {
const ctx = makeCtx({
steps: [{ role: "planner", output: '{"plan":"do X"}', agent: "hermes" }],
});
const result = buildClaudeCodePrompt(ctx);
expect(result).toContain("## Previous Steps");
expect(result).toContain("Step 1: planner");
expect(result).toContain("do X");
});
test("omits history section when steps array is empty", () => {
const result = buildClaudeCodePrompt(makeCtx({ steps: [] }));
expect(result).not.toContain("## Previous Steps");
});
test("works without outputFormatInstruction", () => {
const result = buildClaudeCodePrompt(makeCtx({ outputFormatInstruction: "" }));
expect(result).not.toMatch(/^\s*\n/);
expect(result).toContain("Write code");
expect(result).toContain("## Task");
});
});
@@ -0,0 +1,218 @@
import { describe, expect, test } from "bun:test";
import { createMemoryStore, walk } from "@uncaged/json-cas";
import {
parseClaudeCodeJsonOutput,
parseClaudeCodeStreamOutput,
storeClaudeCodeDetail,
storeClaudeCodeRawOutput,
} from "../src/session-detail.js";
import type { ClaudeCodeParsedResult } from "../src/types.js";
describe("parseClaudeCodeJsonOutput", () => {
test("parses valid claude -p --output-format json output", () => {
const stdout = JSON.stringify({
type: "result",
subtype: "success",
result: "Done fixing bug",
session_id: "75e2167f-abc",
num_turns: 3,
total_cost_usd: 0.08,
duration_ms: 10276,
stop_reason: "end_turn",
usage: { input_tokens: 100, output_tokens: 50 },
});
const parsed = parseClaudeCodeJsonOutput(stdout);
expect(parsed).not.toBeNull();
expect(parsed!.type).toBe("result");
expect(parsed!.subtype).toBe("success");
expect(parsed!.result).toBe("Done fixing bug");
expect(parsed!.sessionId).toBe("75e2167f-abc");
expect(parsed!.numTurns).toBe(3);
expect(parsed!.totalCostUsd).toBe(0.08);
expect(parsed!.durationMs).toBe(10276);
expect(parsed!.stopReason).toBe("end_turn");
expect(parsed!.usage.inputTokens).toBe(100);
expect(parsed!.usage.outputTokens).toBe(50);
expect(parsed!.turns).toEqual([]);
});
test("returns null for non-JSON output", () => {
const parsed = parseClaudeCodeJsonOutput("Some random text\nwithout JSON");
expect(parsed).toBeNull();
});
test("returns null when session_id is missing", () => {
const stdout = JSON.stringify({ type: "result", result: "hi", subtype: "success" });
const parsed = parseClaudeCodeJsonOutput(stdout);
expect(parsed).toBeNull();
});
});
describe("parseClaudeCodeStreamOutput", () => {
test("parses stream-json output with turns", () => {
const lines = [
JSON.stringify({
type: "system",
subtype: "init",
session_id: "sess-123",
model: "claude-sonnet-4.5",
tools: ["Bash", "Read"],
}),
JSON.stringify({
type: "assistant",
message: {
role: "assistant",
content: [
{ type: "text", text: "I'll list the files." },
{ type: "tool_use", id: "tool_1", name: "Bash", input: { command: "ls" } },
],
},
session_id: "sess-123",
}),
JSON.stringify({
type: "user",
message: {
role: "user",
content: [
{ type: "tool_result", tool_use_id: "tool_1", content: "file1.ts\nfile2.ts" },
],
},
session_id: "sess-123",
}),
JSON.stringify({
type: "assistant",
message: {
role: "assistant",
content: [{ type: "text", text: "There are 2 files." }],
},
session_id: "sess-123",
}),
JSON.stringify({
type: "result",
subtype: "success",
result: "There are 2 files.",
session_id: "sess-123",
num_turns: 2,
total_cost_usd: 0.05,
duration_ms: 5000,
stop_reason: "end_turn",
usage: {
input_tokens: 200,
output_tokens: 30,
cache_read_input_tokens: 100,
cache_creation_input_tokens: 0,
},
}),
];
const stdout = lines.join("\n");
const parsed = parseClaudeCodeStreamOutput(stdout);
expect(parsed).not.toBeNull();
expect(parsed!.model).toBe("claude-sonnet-4.5");
expect(parsed!.sessionId).toBe("sess-123");
expect(parsed!.result).toBe("There are 2 files.");
expect(parsed!.stopReason).toBe("end_turn");
expect(parsed!.usage.inputTokens).toBe(200);
expect(parsed!.usage.outputTokens).toBe(30);
expect(parsed!.usage.cacheReadInputTokens).toBe(100);
// Turns: assistant(text+tool), tool_result, assistant(text)
expect(parsed!.turns).toHaveLength(3);
expect(parsed!.turns[0]!.role).toBe("assistant");
expect(parsed!.turns[0]!.content).toBe("I'll list the files.");
expect(parsed!.turns[0]!.toolCalls).toHaveLength(1);
expect(parsed!.turns[0]!.toolCalls![0]!.name).toBe("Bash");
expect(parsed!.turns[1]!.role).toBe("tool_result");
expect(parsed!.turns[1]!.content).toBe("file1.ts\nfile2.ts");
expect(parsed!.turns[2]!.role).toBe("assistant");
expect(parsed!.turns[2]!.content).toBe("There are 2 files.");
expect(parsed!.turns[2]!.toolCalls).toBeNull();
});
test("returns null when no result line", () => {
const stdout = JSON.stringify({ type: "system", model: "test" });
expect(parseClaudeCodeStreamOutput(stdout)).toBeNull();
});
test("skips invalid JSON lines gracefully", () => {
const lines = [
"not json",
JSON.stringify({
type: "result",
subtype: "success",
result: "ok",
session_id: "s1",
num_turns: 1,
total_cost_usd: 0.01,
duration_ms: 1000,
stop_reason: "end_turn",
usage: {},
}),
];
const parsed = parseClaudeCodeStreamOutput(lines.join("\n"));
expect(parsed).not.toBeNull();
expect(parsed!.result).toBe("ok");
expect(parsed!.turns).toHaveLength(0);
});
});
describe("storeClaudeCodeDetail", () => {
const baseParsed: ClaudeCodeParsedResult = {
type: "result",
subtype: "success",
result: "The answer",
sessionId: "abc-123",
numTurns: 5,
totalCostUsd: 0.12,
durationMs: 15000,
model: "claude-sonnet-4.5",
stopReason: "end_turn",
usage: { inputTokens: 100, outputTokens: 50, cacheReadInputTokens: 0, cacheCreationInputTokens: 0 },
turns: [
{ index: 0, role: "assistant", content: "hello", toolCalls: null },
{ index: 1, role: "tool_result", content: "world", toolCalls: null },
],
};
test("stores detail with per-turn CAS nodes", async () => {
const store = createMemoryStore();
const { detailHash, output, sessionId } = await storeClaudeCodeDetail(store, baseParsed);
expect(detailHash).toHaveLength(13);
expect(output).toBe("The answer");
expect(sessionId).toBe("abc-123");
const node = await store.get(detailHash);
expect(node).not.toBeNull();
expect(node!.payload.model).toBe("claude-sonnet-4.5");
expect(node!.payload.stopReason).toBe("end_turn");
expect(node!.payload.usage.inputTokens).toBe(100);
expect(node!.payload.turns).toHaveLength(2);
// Verify turn CAS nodes
const turn0 = await store.get(node!.payload.turns[0]);
expect(turn0).not.toBeNull();
expect(turn0!.payload.role).toBe("assistant");
expect(turn0!.payload.content).toBe("hello");
});
test("detail node is walkable from root", async () => {
const store = createMemoryStore();
const { detailHash } = await storeClaudeCodeDetail(store, baseParsed);
const visited: string[] = [];
walk(store, detailHash, (hash) => visited.push(hash));
expect(visited.length).toBeGreaterThan(0);
});
});
describe("storeClaudeCodeRawOutput", () => {
test("stores raw text when JSON parsing fails", async () => {
const store = createMemoryStore();
const rawText = "Claude produced plain text without JSON";
const hash = await storeClaudeCodeRawOutput(store, rawText);
expect(hash).toHaveLength(13);
const node = await store.get(hash);
expect(node).not.toBeNull();
expect(node!.payload.text).toBe(rawText);
});
});
@@ -0,0 +1,33 @@
{
"name": "@uncaged/workflow-agent-claude-code",
"version": "0.1.0",
"files": [
"src",
"dist",
"package.json"
],
"type": "module",
"bin": {
"uwf-claude-code": "./src/cli.ts"
},
"exports": {
".": {
"bun": "./src/index.ts",
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
}
},
"scripts": {
"test": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.4.0",
"@uncaged/workflow-agent-kit": "workspace:^"
},
"devDependencies": {
"typescript": "^5.8.3"
},
"publishConfig": {
"access": "public"
}
}
@@ -0,0 +1,178 @@
import { spawn } from "node:child_process";
import type { Store } from "@uncaged/json-cas";
import { createLogger } from "@uncaged/workflow-util";
import {
type AgentContext,
type AgentRunResult,
buildRolePrompt,
createAgent,
getCachedSessionId,
setCachedSessionId,
} from "@uncaged/workflow-agent-kit";
import { parseClaudeCodeStreamOutput, storeClaudeCodeDetail } from "./session-detail.js";
const log = createLogger({ sink: { kind: "stderr" } });
const CLAUDE_COMMAND = "claude";
const CLAUDE_MAX_TURNS = 90;
function buildHistorySummary(steps: AgentContext["steps"]): string {
if (steps.length === 0) {
return "";
}
const lines: string[] = ["## Previous Steps"];
for (let i = 0; i < steps.length; i++) {
const step = steps[i];
if (step === undefined) {
continue;
}
lines.push("");
lines.push(`### Step ${i + 1}: ${step.role}`);
lines.push(`Output: ${JSON.stringify(step.output)}`);
lines.push(`Agent: ${step.agent}`);
}
return lines.join("\n");
}
/** Assemble system prompt, task, and prior step outputs for Claude Code. */
export function buildClaudeCodePrompt(ctx: AgentContext): string {
const roleDef = ctx.workflow.roles[ctx.role];
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
const parts: string[] = [];
if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
parts.push("", historyBlock);
}
return parts.join("\n");
}
function spawnClaude(args: string[]): Promise<{ stdout: string; stderr: string }> {
return new Promise((resolve, reject) => {
const child = spawn(CLAUDE_COMMAND, args, {
env: process.env,
shell: false,
stdio: ["ignore", "pipe", "pipe"],
});
let stdout = "";
let stderr = "";
child.stdout?.on("data", (chunk: Buffer) => {
stdout += chunk.toString();
});
child.stderr?.on("data", (chunk: Buffer) => {
stderr += chunk.toString();
});
child.on("error", (cause) => {
const message = cause instanceof Error ? cause.message : String(cause);
reject(new Error(`claude spawn failed: ${message}`));
});
child.on("close", (code) => {
if (code === 0) {
resolve({ stdout, stderr });
return;
}
const detail = stderr.trim() !== "" ? ` stderr=${stderr.trim()}` : "";
reject(new Error(`claude exited with code ${code ?? "null"}${detail}`));
});
});
}
function spawnClaudeRun(prompt: string): Promise<{ stdout: string; stderr: string }> {
return spawnClaude([
"-p",
prompt,
"--output-format",
"stream-json",
"--verbose",
"--dangerously-skip-permissions",
"--max-turns",
String(CLAUDE_MAX_TURNS),
]);
}
function spawnClaudeResume(
sessionId: string,
message: string,
): Promise<{ stdout: string; stderr: string }> {
return spawnClaude([
"-p",
message,
"--resume",
sessionId,
"--output-format",
"stream-json",
"--verbose",
"--dangerously-skip-permissions",
"--max-turns",
String(CLAUDE_MAX_TURNS),
]);
}
async function processClaudeOutput(stdout: string, store: Store): Promise<AgentRunResult> {
const parsed = parseClaudeCodeStreamOutput(stdout);
if (parsed !== null) {
const { detailHash, output, sessionId } = await storeClaudeCodeDetail(store, parsed);
return { output, detailHash, sessionId };
}
throw new Error(
`Claude Code returned unparseable output (first 200 chars): ${stdout.slice(0, 200)}`,
);
}
async function runClaudeCode(ctx: AgentContext): Promise<AgentRunResult> {
const fullPrompt = buildClaudeCodePrompt(ctx);
// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
if (!ctx.isFirstVisit) {
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
if (cachedSessionId !== null) {
try {
const { stdout } = await spawnClaudeResume(cachedSessionId, fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
}
return result;
} catch (err) {
log("5VKR8N3Q", "resume failed for session %s, falling back to fresh run: %s", cachedSessionId, err);
}
}
}
const { stdout } = await spawnClaudeRun(fullPrompt);
const result = await processClaudeOutput(stdout, ctx.store);
if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId(ctx.threadId, ctx.role, result.sessionId);
}
return result;
}
async function continueClaudeCode(
sessionId: string,
message: string,
store: Store,
): Promise<AgentRunResult> {
const { stdout } = await spawnClaudeResume(sessionId, message);
return processClaudeOutput(stdout, store);
}
/** Agent CLI factory: parses argv, runs Claude Code, extracts output, writes StepNode. */
export function createClaudeCodeAgent(): () => Promise<void> {
return createAgent({
name: "claude-code",
run: runClaudeCode,
continue: continueClaudeCode,
});
}
@@ -0,0 +1,6 @@
#!/usr/bin/env bun
import { createClaudeCodeAgent } from "./claude-code.js";
const main = createClaudeCodeAgent();
void main();
@@ -0,0 +1,7 @@
export { buildClaudeCodePrompt, createClaudeCodeAgent } from "./claude-code.js";
export {
parseClaudeCodeJsonOutput,
parseClaudeCodeStreamOutput,
storeClaudeCodeDetail,
storeClaudeCodeRawOutput,
} from "./session-detail.js";
@@ -0,0 +1,64 @@
import type { JSONSchema } from "@uncaged/json-cas";
export const CLAUDE_CODE_DETAIL_SCHEMA: JSONSchema = {
title: "claude-code-detail",
type: "object",
required: [
"sessionId",
"model",
"subtype",
"durationMs",
"numTurns",
"totalCostUsd",
"stopReason",
"usage",
"turns",
],
properties: {
sessionId: { type: "string" },
model: { type: "string" },
subtype: { type: "string" },
durationMs: { type: "integer" },
numTurns: { type: "integer" },
totalCostUsd: { type: "number" },
stopReason: { type: "string" },
usage: {
type: "object",
properties: {
inputTokens: { type: "integer" },
outputTokens: { type: "integer" },
cacheReadInputTokens: { type: "integer" },
cacheCreationInputTokens: { type: "integer" },
},
required: ["inputTokens", "outputTokens", "cacheReadInputTokens", "cacheCreationInputTokens"],
},
turns: {
type: "array",
items: { type: "string" },
},
},
additionalProperties: false,
};
export const CLAUDE_CODE_TURN_SCHEMA: JSONSchema = {
title: "claude-code-turn",
type: "object",
required: ["index", "role", "content", "toolCalls"],
properties: {
index: { type: "integer" },
role: { type: "string" },
content: { type: "string" },
toolCalls: {},
},
additionalProperties: false,
};
export const CLAUDE_CODE_RAW_OUTPUT_SCHEMA: JSONSchema = {
title: "claude-code-raw-output",
type: "object",
required: ["text"],
properties: {
text: { type: "string" },
},
additionalProperties: false,
};
@@ -0,0 +1,259 @@
import { bootstrap, putSchema, type Store } from "@uncaged/json-cas";
import {
CLAUDE_CODE_DETAIL_SCHEMA,
CLAUDE_CODE_RAW_OUTPUT_SCHEMA,
CLAUDE_CODE_TURN_SCHEMA,
} from "./schemas.js";
import type {
ClaudeCodeDetailPayload,
ClaudeCodeParsedResult,
ClaudeCodeToolCall,
ClaudeCodeTurnPayload,
} from "./types.js";
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
function safeNumber(v: unknown, fallback = 0): number {
return typeof v === "number" ? v : fallback;
}
function safeString(v: unknown, fallback = ""): string {
return typeof v === "string" ? v : fallback;
}
/**
* Extract tool calls from an assistant message content array.
*/
function extractToolCalls(content: unknown[]): ClaudeCodeToolCall[] {
const calls: ClaudeCodeToolCall[] = [];
for (const item of content) {
if (isRecord(item) && item.type === "tool_use" && typeof item.name === "string") {
calls.push({
name: item.name,
input: typeof item.input === "string" ? item.input : JSON.stringify(item.input ?? {}),
});
}
}
return calls;
}
/**
* Extract text content from a message content array.
*/
function extractTextContent(content: unknown[]): string {
const texts: string[] = [];
for (const item of content) {
if (isRecord(item) && item.type === "text" && typeof item.text === "string") {
texts.push(item.text);
}
}
return texts.join("\n");
}
/**
* Extract tool result content from a user message content array.
*/
function extractToolResultContent(content: unknown[]): string {
const results: string[] = [];
for (const item of content) {
if (isRecord(item) && item.type === "tool_result") {
const text = typeof item.content === "string" ? item.content : "";
results.push(text);
}
}
return results.join("\n");
}
/**
* Parse Claude Code stream-json (NDJSON) output.
* Each line is a JSON object with type: "system" | "assistant" | "user" | "result".
*/
export function parseClaudeCodeStreamOutput(stdout: string): ClaudeCodeParsedResult | null {
const lines = stdout.trim().split("\n");
const turns: ClaudeCodeTurnPayload[] = [];
let resultLine: Record<string, unknown> | null = null;
let model = "";
let turnIndex = 0;
for (const line of lines) {
let parsed: unknown;
try {
parsed = JSON.parse(line);
} catch {
continue;
}
if (!isRecord(parsed)) continue;
const type = parsed.type;
if (type === "system" && typeof parsed.model === "string") {
model = parsed.model;
}
if (type === "assistant" && isRecord(parsed.message)) {
const msg = parsed.message;
const content = Array.isArray(msg.content) ? msg.content : [];
const textContent = extractTextContent(content as unknown[]);
const toolCalls = extractToolCalls(content as unknown[]);
// Only record turns that have actual content
if (textContent !== "" || toolCalls.length > 0) {
turns.push({
index: turnIndex++,
role: "assistant",
content: textContent,
toolCalls: toolCalls.length > 0 ? toolCalls : null,
});
}
}
if (type === "user" && isRecord(parsed.message)) {
const msg = parsed.message;
const content = Array.isArray(msg.content) ? msg.content : [];
const resultContent = extractToolResultContent(content as unknown[]);
if (resultContent !== "") {
turns.push({
index: turnIndex++,
role: "tool_result",
content: resultContent,
toolCalls: null,
});
}
}
if (type === "result") {
resultLine = parsed;
}
}
if (resultLine === null) return null;
const sessionId = resultLine.session_id;
const result = resultLine.result;
const subtype = resultLine.subtype;
if (typeof sessionId !== "string" || typeof result !== "string" || typeof subtype !== "string") {
return null;
}
const usage = isRecord(resultLine.usage) ? resultLine.usage : {};
return {
type: safeString(resultLine.type, "result"),
subtype: subtype as ClaudeCodeParsedResult["subtype"],
result,
sessionId,
numTurns: safeNumber(resultLine.num_turns),
totalCostUsd: safeNumber(resultLine.total_cost_usd),
durationMs: safeNumber(resultLine.duration_ms),
model,
stopReason: safeString(resultLine.stop_reason),
usage: {
inputTokens: safeNumber(usage.input_tokens),
outputTokens: safeNumber(usage.output_tokens),
cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
},
turns,
};
}
/**
* Legacy: parse Claude Code plain JSON output (non-streaming).
* Falls back when stream-json is not available.
*/
export function parseClaudeCodeJsonOutput(stdout: string): ClaudeCodeParsedResult | null {
let parsed: unknown;
try {
parsed = JSON.parse(stdout.trim());
} catch {
return null;
}
if (!isRecord(parsed)) return null;
const sessionId = parsed.session_id;
const result = parsed.result;
const subtype = parsed.subtype;
if (typeof sessionId !== "string" || typeof result !== "string" || typeof subtype !== "string") {
return null;
}
const usage = isRecord(parsed.usage) ? parsed.usage : {};
return {
type: safeString(parsed.type, "result"),
subtype: subtype as ClaudeCodeParsedResult["subtype"],
result,
sessionId,
numTurns: safeNumber(parsed.num_turns),
totalCostUsd: safeNumber(parsed.total_cost_usd),
durationMs: safeNumber(parsed.duration_ms),
model: "",
stopReason: safeString(parsed.stop_reason),
usage: {
inputTokens: safeNumber(usage.input_tokens),
outputTokens: safeNumber(usage.output_tokens),
cacheReadInputTokens: safeNumber(usage.cache_read_input_tokens),
cacheCreationInputTokens: safeNumber(usage.cache_creation_input_tokens),
},
turns: [],
};
}
type ClaudeCodeSchemaHashes = {
detail: string;
turn: string;
rawOutput: string;
};
async function registerSchemas(store: Store): Promise<ClaudeCodeSchemaHashes> {
await bootstrap(store);
const [detail, turn, rawOutput] = await Promise.all([
putSchema(store, CLAUDE_CODE_DETAIL_SCHEMA),
putSchema(store, CLAUDE_CODE_TURN_SCHEMA),
putSchema(store, CLAUDE_CODE_RAW_OUTPUT_SCHEMA),
]);
return { detail, turn, rawOutput };
}
/** Store parsed Claude Code result with per-turn breakdown as CAS detail nodes. */
export async function storeClaudeCodeDetail(
store: Store,
parsed: ClaudeCodeParsedResult,
): Promise<{ detailHash: string; output: string; sessionId: string }> {
const schemas = await registerSchemas(store);
// Store each turn as an individual CAS node
const turnHashes: string[] = [];
for (const turn of parsed.turns) {
const hash = await store.put(schemas.turn, turn);
turnHashes.push(hash);
}
const detail: ClaudeCodeDetailPayload = {
sessionId: parsed.sessionId,
model: parsed.model,
subtype: parsed.subtype,
durationMs: parsed.durationMs,
numTurns: parsed.numTurns,
totalCostUsd: parsed.totalCostUsd,
stopReason: parsed.stopReason,
usage: parsed.usage,
turns: turnHashes,
};
const detailHash = await store.put(schemas.detail, detail);
return { detailHash, output: parsed.result, sessionId: parsed.sessionId };
}
/** Fallback: store raw text output when JSON parsing fails. */
export async function storeClaudeCodeRawOutput(store: Store, rawOutput: string): Promise<string> {
const schemas = await registerSchemas(store);
return store.put(schemas.rawOutput, { text: rawOutput });
}
@@ -0,0 +1,53 @@
export type ClaudeCodeResultSubtype = "success" | "error_max_turns" | "error_budget";
/** A single tool call within an assistant turn. */
export type ClaudeCodeToolCall = {
name: string;
input: string;
};
/** A single turn (assistant text, tool use, or tool result). */
export type ClaudeCodeTurnPayload = {
index: number;
role: "assistant" | "tool_result";
content: string;
toolCalls: ClaudeCodeToolCall[] | null;
};
/** Top-level detail stored as CAS node. */
export type ClaudeCodeDetailPayload = {
sessionId: string;
model: string;
subtype: string;
durationMs: number;
numTurns: number;
totalCostUsd: number;
stopReason: string;
usage: {
inputTokens: number;
outputTokens: number;
cacheReadInputTokens: number;
cacheCreationInputTokens: number;
};
turns: string[]; // CAS hashes of ClaudeCodeTurnPayload
};
/** Intermediate parsed result from stream-json output. */
export type ClaudeCodeParsedResult = {
type: string;
subtype: ClaudeCodeResultSubtype;
result: string;
sessionId: string;
numTurns: number;
totalCostUsd: number;
durationMs: number;
model: string;
stopReason: string;
usage: {
inputTokens: number;
outputTokens: number;
cacheReadInputTokens: number;
cacheCreationInputTokens: number;
};
turns: ClaudeCodeTurnPayload[];
};
@@ -0,0 +1,6 @@
{
"extends": "../../tsconfig.json",
"compilerOptions": { "rootDir": "src", "outDir": "dist" },
"include": ["src"],
"references": [{ "path": "../workflow-agent-kit" }]
}
@@ -0,0 +1,77 @@
import { afterEach, beforeEach, describe, expect, it } from "bun:test";
import { HermesAcpClient } from "../src/acp-client.js";
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
describe("HermesAcpClient", () => {
let client: HermesAcpClient;
beforeEach(() => {
client = new HermesAcpClient();
});
afterEach(async () => {
await client.close();
});
it(
"connect() returns a UUID sessionId",
async () => {
const sessionId = await client.connect(process.cwd());
expect(typeof sessionId).toBe("string");
expect(sessionId).toMatch(UUID_RE);
},
{ timeout: 2 * 60 * 1000 },
);
it(
"prompt() returns a non-empty text response",
async () => {
await client.connect(process.cwd());
const result = await client.prompt("Reply with exactly the word: PONG");
expect(typeof result.text).toBe("string");
expect(result.text.length).toBeGreaterThan(0);
expect(typeof result.sessionId).toBe("string");
expect(result.sessionId).toMatch(UUID_RE);
},
{ timeout: 2 * 60 * 1000 },
);
it(
"prompt() can be called twice on the same session (resume)",
async () => {
await client.connect(process.cwd());
const first = await client.prompt("Say the word ALPHA and nothing else.");
expect(first.text.length).toBeGreaterThan(0);
const second = await client.prompt("Now say the word BETA and nothing else.");
expect(second.text.length).toBeGreaterThan(0);
expect(first.sessionId).toBe(second.sessionId);
},
{ timeout: 2 * 60 * 1000 },
);
it(
"prompt() collects structured messages including tool calls",
async () => {
await client.connect(process.cwd());
const result = await client.prompt("Run this command: echo TOOL_DETAIL_TEST");
expect(result.messages.length).toBeGreaterThan(0);
// Should have at least one tool message (the echo command)
const toolMessages = result.messages.filter((m) => m.role === "tool");
expect(toolMessages.length).toBeGreaterThan(0);
// Tool message should contain the output
const toolContent = toolMessages[0]?.content ?? "";
expect(toolContent).toContain("TOOL_DETAIL_TEST");
// Should have assistant messages with tool_calls
const assistantWithTools = result.messages.filter(
(m) => m.role === "assistant" && m.tool_calls !== null,
);
expect(assistantWithTools.length).toBeGreaterThan(0);
},
{ timeout: 2 * 60 * 1000 },
);
});
@@ -0,0 +1,78 @@
import { describe, expect, test } from "bun:test";
import type { AgentContext } from "@uncaged/workflow-agent-kit";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { buildHermesPrompt } from "../src/hermes.js";
function makeCtx(overrides: Partial<AgentContext> = {}): AgentContext {
return {
threadId: "01JTEST0000000000000000000" as ThreadId,
edgePrompt: "Proceed with the assigned role.",
isFirstVisit: true,
workflow: {
roles: {
developer: {
description: "TDD implementation per test spec",
goal: "Write code",
capabilities: ["coding"],
procedure: "1. Read spec\n2. Write code",
output: "List files changed",
frontmatter: "",
},
},
conditions: {},
graph: {},
},
role: "developer",
start: { prompt: "Fix the bug", workflowHash: "abc123", threadId: "t1" },
steps: [],
store: {} as AgentContext["store"],
outputFormatInstruction: "Use YAML frontmatter",
...overrides,
};
}
describe("buildHermesPrompt", () => {
test("first visit uses full role prompt and includes moderator instruction", () => {
const result = buildHermesPrompt(
makeCtx({ edgePrompt: "Focus on the failing test.", isFirstVisit: true }),
);
expect(result).toMatch(/^Use YAML frontmatter/);
expect(result).toContain("Write code");
expect(result).toContain("## Task\nFix the bug");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Focus on the failing test.");
});
test("re-entry uses continuation prompt with edge instruction", () => {
const ctx = makeCtx({
isFirstVisit: false,
edgePrompt: "The reviewer rejected your work. Fix the issues.",
steps: [
{ role: "developer", output: { summary: "Initial fix" }, agent: "uwf-hermes" },
{ role: "reviewer", output: { approved: false }, agent: "uwf-hermes" },
],
});
const result = buildHermesPrompt(ctx);
expect(result).not.toContain("## Task");
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("The reviewer rejected your work.");
});
test("forced first visit via isFirstVisit uses initial prompt even when role appears in history", () => {
const result = buildHermesPrompt(
makeCtx({
isFirstVisit: true,
steps: [{ role: "developer", output: { done: true }, agent: "uwf-hermes" }],
edgePrompt: "Retry with a fresh approach.",
}),
);
expect(result).toContain("## Task");
expect(result).toContain("Retry with a fresh approach.");
expect(result).not.toContain("## What Happened Since Your Last Turn");
});
});
@@ -0,0 +1,56 @@
import { afterEach, describe, expect, it } from "bun:test";
import { HermesAcpClient } from "../src/acp-client.js";
/**
* E2E test for cross-process session resume.
*
* Simulates the workflow re-entry scenario:
* 1. Client A: connect → prompt → close (developer first run)
* 2. Client B: resume(sessionId) → prompt (developer re-entry after reviewer reject)
*
* This is what happens when uwf thread step spawns uwf-hermes twice for the same role.
*/
describe("HermesAcpClient cross-process resume", () => {
const clients: HermesAcpClient[] = [];
afterEach(async () => {
for (const c of clients) {
await c.close();
}
clients.length = 0;
});
it(
"resume() after close — second prompt returns non-empty text",
async () => {
// --- Client A: first run ---
const clientA = new HermesAcpClient();
clients.push(clientA);
await clientA.connect(process.cwd());
const first = await clientA.prompt(
"Remember the secret code: WATERMELON. Reply with exactly: ACKNOWLEDGED",
);
expect(first.text.length).toBeGreaterThan(0);
const sessionId = first.sessionId;
// Close client A (simulates uwf-hermes process exit)
await clientA.close();
// --- Client B: resume (simulates re-entry) ---
const clientB = new HermesAcpClient();
clients.push(clientB);
await clientB.resume(sessionId, process.cwd());
const second = await clientB.prompt(
"What was the secret code I told you earlier? Reply with just the code word.",
);
// The critical assertion: resumed session produces non-empty output
expect(second.text.length).toBeGreaterThan(0);
expect(second.sessionId).toBe(sessionId);
},
{ timeout: 3 * 60 * 1000 },
);
});
+5 -3
View File
@@ -1,6 +1,6 @@
{
"name": "@uncaged/workflow-agent-hermes",
"version": "0.1.0",
"version": "0.5.0",
"files": [
"src",
"dist",
@@ -21,8 +21,10 @@
"test": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.3.0",
"@uncaged/workflow-agent-kit": "workspace:^"
"@uncaged/json-cas": "^0.4.0",
"@uncaged/workflow-agent-kit": "workspace:^",
"@uncaged/workflow-protocol": "workspace:^",
"@uncaged/workflow-util": "workspace:^"
},
"devDependencies": {
"typescript": "^5.8.3"
@@ -0,0 +1,393 @@
import type { ChildProcess } from "node:child_process";
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";
import type { HermesSessionMessage } from "./types.js";
const HERMES_COMMAND = "hermes";
const PROTOCOL_VERSION = 1;
type JsonRpcResponse = {
jsonrpc: "2.0";
id: number;
result?: unknown;
error?: { code: number; message: string };
};
type PendingRequest = {
resolve: (value: JsonRpcResponse) => void;
reject: (reason: Error) => void;
};
/** Tracks in-flight tool calls so we can build complete messages when they finish. */
type PendingToolCall = {
name: string;
args: string;
};
export type AcpPromptResult = {
text: string;
sessionId: string;
messages: HermesSessionMessage[];
};
export class HermesAcpClient {
private process: ChildProcess | null = null;
private nextId = 1;
private sessionId: string | null = null;
private stderrBuffer = "";
private pending = new Map<number, PendingRequest>();
// Message collection state
private messageChunks: string[] = [];
private reasoningChunks: string[] = [];
private pendingTools = new Map<string, PendingToolCall>();
messages: HermesSessionMessage[] = [];
/** Spawn hermes acp, initialize, create session */
async connect(cwd: string): Promise<string> {
await this.ensureProcess();
await this.initialize();
const sessionResponse = (await this.sendRequest("session/new", {
cwd,
mcpServers: [],
})) as { result: { sessionId: string } };
const sessionId = sessionResponse.result?.sessionId;
if (typeof sessionId !== "string" || sessionId === "") {
throw new Error(`session/new did not return a sessionId: ${JSON.stringify(sessionResponse)}`);
}
this.sessionId = sessionId;
return sessionId;
}
/** Spawn hermes acp, initialize, resume an existing session */
async resume(sessionId: string, cwd: string): Promise<string> {
await this.ensureProcess();
await this.initialize();
const response = await this.sendRequest("session/resume", {
cwd,
sessionId,
mcpServers: [],
});
if ((response as { error?: unknown }).error !== undefined) {
throw new Error(
`session/resume failed: ${JSON.stringify((response as { error: unknown }).error)}`,
);
}
this.sessionId = sessionId;
return sessionId;
}
/** Send prompt and collect full response text + structured messages. */
async prompt(text: string): Promise<AcpPromptResult> {
if (this.sessionId === null) {
throw new Error("Not connected — call connect() first");
}
this.messageChunks = [];
this.reasoningChunks = [];
const response = await this.sendRequest("session/prompt", {
sessionId: this.sessionId,
prompt: [{ type: "text", text }],
});
if ((response as { error?: unknown }).error !== undefined) {
throw new Error(
`session/prompt failed: ${JSON.stringify((response as { error: unknown }).error)}`,
);
}
// Flush any trailing assistant text that wasn't followed by a tool call.
this.flushAssistantMessage();
// Extract the final assistant text from collected messages.
let finalText = "";
for (let i = this.messages.length - 1; i >= 0; i--) {
const msg = this.messages[i];
if (
msg !== undefined &&
msg.role === "assistant" &&
msg.content !== null &&
msg.content.trim() !== ""
) {
finalText = msg.content;
break;
}
}
return {
text: finalText,
sessionId: this.sessionId,
messages: this.messages,
};
}
/** Close the connection */
async close(): Promise<void> {
if (this.process === null) {
return;
}
this.sessionId = null;
this.process.stdin?.end();
const proc = this.process;
await new Promise<void>((resolve) => {
proc.on("close", () => resolve());
setTimeout(resolve, 5000);
});
this.process = null;
}
// ---- JSON-RPC transport ----
private sendRequest(
method: string,
params: Record<string, unknown>,
timeoutMs = 10 * 60 * 1000,
): Promise<JsonRpcResponse> {
const id = this.nextId++;
return new Promise<JsonRpcResponse>((resolve, reject) => {
const timer = setTimeout(() => {
this.pending.delete(id);
reject(new Error(`Timeout waiting for response to ${method} (id=${id})`));
}, timeoutMs);
this.pending.set(id, {
resolve: (value) => {
clearTimeout(timer);
resolve(value);
},
reject: (err) => {
clearTimeout(timer);
reject(err);
},
});
this.writeLine(JSON.stringify({ jsonrpc: "2.0", id, method, params }));
});
}
private sendNotification(method: string, params?: Record<string, unknown>): void {
const message: Record<string, unknown> = { jsonrpc: "2.0", method };
if (params !== undefined) {
message.params = params;
}
this.writeLine(JSON.stringify(message));
}
private writeLine(line: string): void {
if (this.process?.stdin === null || this.process?.stdin === undefined) {
throw new Error("Cannot write: hermes acp process stdin not available");
}
this.process.stdin.write(`${line}\n`);
}
private handleLine(line: string): void {
if (line === "") {
return;
}
let parsed: unknown;
try {
parsed = JSON.parse(line);
} catch {
return;
}
const msg = parsed as Record<string, unknown>;
const hasId = "id" in msg && msg.id !== undefined && msg.id !== null;
const hasMethod = typeof msg.method === "string";
// JSON-RPC response to one of our requests (has "id" but no "method")
if (hasId && !hasMethod) {
const response = msg as unknown as JsonRpcResponse;
const handler = this.pending.get(response.id);
if (handler !== undefined) {
this.pending.delete(response.id);
handler.resolve(response);
}
return;
}
// Server-initiated JSON-RPC request: session/request_permission (has "id" + "method")
if (msg.method === "session/request_permission" && hasId) {
const params = msg.params as Record<string, unknown> | undefined;
const options = (params?.options ?? []) as Array<{ optionId?: string }>;
const firstOptionId = options[0]?.optionId ?? "";
this.writeLine(
JSON.stringify({
jsonrpc: "2.0",
id: msg.id,
result: { outcome: { outcome: "selected", optionId: firstOptionId } },
}),
);
return;
}
// JSON-RPC notification — session/update (no "id")
if (msg.method === "session/update") {
const params = msg.params as Record<string, unknown> | undefined;
const update = params?.update as Record<string, unknown> | undefined;
if (update !== undefined) {
this.handleSessionUpdate(update);
}
return;
}
}
// ---- Session update → structured messages ----
private handleSessionUpdate(update: Record<string, unknown>): void {
const updateType = update.sessionUpdate as string;
switch (updateType) {
case "agent_message_chunk": {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.messageChunks.push(content.text);
}
break;
}
case "agent_thought_chunk": {
const content = update.content as { type?: string; text?: string } | undefined;
if (content?.type === "text" && typeof content.text === "string") {
this.reasoningChunks.push(content.text);
}
break;
}
case "tool_call": {
const title = (update.title as string) ?? "";
const rawInput = update.rawInput;
const args =
rawInput !== undefined && rawInput !== null ? JSON.stringify(rawInput) : "";
const toolCallId = update.toolCallId as string;
this.pendingTools.set(toolCallId, { name: title, args });
// Flush accumulated assistant text before tool call
this.flushAssistantMessage();
break;
}
case "tool_call_update": {
const status = update.status as string | undefined;
if (status === "completed" || status === "failed") {
const toolCallId = update.toolCallId as string;
const pending = this.pendingTools.get(toolCallId);
const toolName = pending?.name ?? toolCallId;
const rawOutput = update.rawOutput;
const outputStr =
rawOutput !== undefined && rawOutput !== null
? typeof rawOutput === "string"
? rawOutput
: JSON.stringify(rawOutput)
: "";
this.messages.push({
role: "assistant",
content: null,
reasoning: null,
tool_calls: [{ function: { name: toolName, arguments: pending?.args ?? "" } }],
});
this.messages.push({
role: "tool",
content: outputStr,
reasoning: null,
tool_calls: null,
});
this.pendingTools.delete(toolCallId);
}
break;
}
default:
break;
}
}
/** Flush any accumulated text/reasoning into an assistant message. */
private flushAssistantMessage(): void {
const text = this.messageChunks.join("");
const reasoning = this.reasoningChunks.join("");
if (text !== "" || reasoning !== "") {
this.messages.push({
role: "assistant",
content: text || null,
reasoning: reasoning || null,
tool_calls: null,
});
}
this.messageChunks = [];
this.reasoningChunks = [];
}
private rejectAll(err: Error): void {
for (const handler of this.pending.values()) {
handler.reject(err);
}
this.pending.clear();
}
private async ensureProcess(): Promise<void> {
if (this.process !== null) {
return;
}
const child = spawn(HERMES_COMMAND, ["acp"], {
env: process.env,
shell: false,
stdio: ["pipe", "pipe", "pipe"],
});
this.process = child;
child.stderr?.on("data", (chunk: Buffer) => {
this.stderrBuffer += chunk.toString();
});
child.on("error", (cause) => {
const message = cause instanceof Error ? cause.message : String(cause);
this.rejectAll(new Error(`hermes acp spawn failed: ${message}`));
});
child.on("close", (code) => {
if (code !== 0 && this.pending.size > 0) {
const detail = this.stderrBuffer.trim() !== "" ? ` stderr=${this.stderrBuffer.trim()}` : "";
this.rejectAll(
new Error(`hermes acp exited unexpectedly with code ${code ?? "null"}${detail}`),
);
}
});
if (child.stdout === null) {
throw new Error("hermes acp process stdout is not available");
}
const rl = createInterface({ input: child.stdout });
rl.on("line", (line) => {
this.handleLine(line.trim());
});
}
private async initialize(): Promise<void> {
const initResponse = await this.sendRequest("initialize", {
protocolVersion: PROTOCOL_VERSION,
clientInfo: { name: "uwf", version: "0.1.0" },
capabilities: {},
});
if ((initResponse as { error?: unknown }).error !== undefined) {
throw new Error(
`initialize failed: ${JSON.stringify((initResponse as { error: unknown }).error)}`,
);
}
this.sendNotification("initialized");
}
}
+144 -72
View File
@@ -1,16 +1,18 @@
import { spawn } from "node:child_process";
import { type AgentContext, type AgentRunResult, createAgent } from "@uncaged/workflow-agent-kit";
import type { Store } from "@uncaged/json-cas";
import {
loadHermesSession,
parseSessionIdFromStdout,
storeHermesRawOutput,
storeHermesSessionDetail,
} from "./session-detail.js";
type AgentContext,
type AgentRunResult,
buildContinuationPrompt,
buildRolePrompt,
createAgent,
} from "@uncaged/workflow-agent-kit";
import { createLogger } from "@uncaged/workflow-util";
const HERMES_COMMAND = "hermes";
const HERMES_MAX_TURNS = 90;
import { HermesAcpClient } from "./acp-client.js";
import { getCachedSessionId, isResumeDisabled, setCachedSessionId } from "./session-cache.js";
import { storeHermesSessionDetail } from "./session-detail.js";
const log = createLogger({ sink: { kind: "stderr" } });
function buildHistorySummary(steps: AgentContext["steps"]): string {
if (steps.length === 0) {
@@ -31,87 +33,157 @@ function buildHistorySummary(steps: AgentContext["steps"]): string {
return lines.join("\n");
}
/** Assemble system prompt, task, and prior step outputs for Hermes. */
export function buildHermesPrompt(ctx: AgentContext): string {
function buildInitialPrompt(ctx: AgentContext): string {
const roleDef = ctx.workflow.roles[ctx.role];
const systemPrompt = roleDef?.systemPrompt ?? "";
const rolePrompt = roleDef !== undefined ? buildRolePrompt(roleDef) : "";
const parts: string[] = [];
if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
if (ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(systemPrompt, "", "## Task", ctx.start.prompt);
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
const historyBlock = buildHistorySummary(ctx.steps);
if (historyBlock !== "") {
parts.push("", historyBlock);
}
parts.push("", "## Moderator Instruction", "", ctx.edgePrompt);
return parts.join("\n");
}
function spawnHermesChat(prompt: string): Promise<{ stdout: string; stderr: string }> {
return new Promise((resolve, reject) => {
const args = [
"chat",
"-q",
prompt,
"--yolo",
"--max-turns",
String(HERMES_MAX_TURNS),
"--quiet",
];
const child = spawn(HERMES_COMMAND, args, {
env: process.env,
shell: false,
stdio: ["ignore", "pipe", "pipe"],
});
/** Assemble system prompt, task, and prior step outputs for Hermes. */
export function buildHermesPrompt(ctx: AgentContext): string {
if (!ctx.isFirstVisit) {
const parts: string[] = [];
if (ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, "");
}
parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
return parts.join("\n");
}
let stdout = "";
let stderr = "";
child.stdout?.on("data", (chunk: Buffer) => {
stdout += chunk.toString();
});
child.stderr?.on("data", (chunk: Buffer) => {
stderr += chunk.toString();
});
child.on("error", (cause) => {
const message = cause instanceof Error ? cause.message : String(cause);
reject(new Error(`hermes spawn failed: ${message}`));
});
child.on("close", (code) => {
if (code === 0) {
resolve({ stdout, stderr });
return;
}
const detail = stderr.trim() !== "" ? ` stderr=${stderr.trim()}` : "";
reject(new Error(`hermes exited with code ${code ?? "null"}${detail}`));
});
});
return buildInitialPrompt(ctx);
}
async function runHermes(ctx: AgentContext): Promise<AgentRunResult> {
const fullPrompt = buildHermesPrompt(ctx);
const { stdout, stderr } = await spawnHermesChat(fullPrompt);
const { store } = ctx;
async function storePromptResult(
store: Store,
sessionId: string,
messages: Awaited<ReturnType<HermesAcpClient["prompt"]>>["messages"],
): Promise<{ detailHash: string }> {
const session = {
session_id: sessionId,
model: "",
session_start: new Date().toISOString(),
messages,
};
return storeHermesSessionDetail(store, session);
}
// --quiet mode: session_id may be on stdout or stderr
const sessionId = parseSessionIdFromStdout(stderr) ?? parseSessionIdFromStdout(stdout);
if (sessionId !== null) {
const session = await loadHermesSession(sessionId);
if (session !== null) {
const { detailHash, output } = await storeHermesSessionDetail(store, session);
return { output, detailHash };
type PromptAttempt = {
useContinuation: boolean;
resumed: boolean;
};
async function prepareSession(
client: HermesAcpClient,
ctx: AgentContext,
cwd: string,
): Promise<PromptAttempt> {
if (ctx.isFirstVisit || isResumeDisabled()) {
await client.connect(cwd);
return { useContinuation: false, resumed: false };
}
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role);
if (cachedSessionId === null) {
log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`);
await client.connect(cwd);
return { useContinuation: false, resumed: false };
}
try {
await client.resume(cachedSessionId, cwd);
log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`);
return { useContinuation: true, resumed: true };
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`);
await client.close();
await client.connect(cwd);
return { useContinuation: false, resumed: false };
}
}
/**
* Agent CLI factory: parses argv, runs Hermes, extracts output, writes StepNode.
*
* A single ACP client is shared across run() and continue() calls so that
* frontmatter retry loops keep the same Hermes session context. The client
* is closed once the agent process exits (via process.on("exit")).
*/
export function createHermesAgent(): () => Promise<void> {
const client = new HermesAcpClient();
// Ensure cleanup regardless of how the process exits.
process.on("exit", () => {
void client.close();
});
async function runPrompt(ctx: AgentContext, useContinuation: boolean): Promise<AgentRunResult> {
const effectiveCtx = useContinuation ? ctx : { ...ctx, isFirstVisit: true };
const fullPrompt = buildHermesPrompt(effectiveCtx);
const { text, sessionId, messages } = await client.prompt(fullPrompt);
const { detailHash } = await storePromptResult(ctx.store, sessionId, messages);
if (!isResumeDisabled()) {
await setCachedSessionId(ctx.threadId, ctx.role, sessionId);
}
return { output: text, detailHash, sessionId };
}
async function runHermes(ctx: AgentContext): Promise<AgentRunResult> {
const cwd = process.cwd();
const attempt = await prepareSession(client, ctx, cwd);
try {
return await runPrompt(ctx, attempt.useContinuation);
} catch (error) {
if (!attempt.resumed) {
throw error;
}
const message = error instanceof Error ? error.message : String(error);
log("8FQW2R6N", `continuation prompt failed, retrying with initial prompt: ${message}`);
await client.close();
await client.connect(cwd);
return runPrompt(ctx, false);
}
}
const detailHash = await storeHermesRawOutput(store, stdout);
return { output: stdout, detailHash };
}
async function continueHermes(
_sessionId: string,
message: string,
store: Store,
): Promise<AgentRunResult> {
// Client is already connected from runHermes — same ACP session,
// so the agent sees the full conversation history (crucial for retries).
const { text, sessionId, messages } = await client.prompt(message);
const { detailHash } = await storePromptResult(store, sessionId, messages);
return { output: text, detailHash, sessionId };
}
/** Agent CLI factory: parses argv, runs Hermes, extracts output, writes StepNode. */
export function createHermesAgent(): () => Promise<void> {
return createAgent({
const agentMain = createAgent({
name: "hermes",
run: runHermes,
continue: continueHermes,
});
// Wrap to ensure ACP client is closed after agent completes,
// so the hermes subprocess exits and bun can terminate.
return async () => {
try {
await agentMain();
} finally {
await client.close();
}
};
}
@@ -1 +1,2 @@
export { HermesAcpClient } from "./acp-client.js";
export { buildHermesPrompt, createHermesAgent } from "./hermes.js";
@@ -0,0 +1,17 @@
// Re-export session cache from the shared agent-kit package.
export { getCachedSessionId, setCachedSessionId } from "@uncaged/workflow-agent-kit";
export function isResumeDisabled(): boolean {
// Hermes ACP session/resume is broken: _restore fails for custom providers
// because resolve_runtime_provider("custom") throws and base_url/api_mode
// are lost in the fallback path. Resume silently creates a new session
// (different sessionId, no history), causing empty-text responses.
// See: https://github.com/NousResearch/hermes-agent/issues/13489
// Disable by default until upstream fixes the bug. Set UWF_HERMES_RESUME=1
// to opt back in.
const enableFlag = process.env.UWF_HERMES_RESUME;
if (enableFlag === "1" || enableFlag === "true") {
return false;
}
return true;
}
@@ -0,0 +1,70 @@
import type { StepContext } from "@uncaged/workflow-protocol";
import { describe, expect, test } from "vitest";
import { buildContinuationPrompt } from "../src/build-continuation-prompt.js";
const reviewerStep: StepContext = {
role: "reviewer",
output: { approved: false, comments: "Missing tests" },
detail: "2MXBG6PN4A8JR",
agent: "uwf-hermes",
};
const developerStep: StepContext = {
role: "developer",
output: { filesChanged: ["src/app.ts"], summary: "Initial fix" },
detail: "1VPBG9SM5E7WK",
agent: "uwf-hermes",
};
describe("buildContinuationPrompt", () => {
test("includes steps after the last matching role and the edge prompt", () => {
const steps: StepContext[] = [
developerStep,
reviewerStep,
{
role: "planner",
output: { plan: "revise approach" },
detail: "7BQST3VW9F2MA",
agent: "uwf-hermes",
},
];
const result = buildContinuationPrompt(
steps,
"developer",
"The reviewer rejected your implementation. Read their feedback and fix the issues.",
);
expect(result).toContain("## What Happened Since Your Last Turn");
expect(result).toContain("### Step 2: reviewer");
expect(result).toContain("Missing tests");
expect(result).toContain("### Step 3: planner");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("The reviewer rejected your implementation.");
expect(result).not.toContain("Initial fix");
});
test("uses all steps when the role has not run before", () => {
const result = buildContinuationPrompt(
[developerStep, reviewerStep],
"planner",
"Continue from the reviewer feedback.",
);
expect(result).toContain("### Step 1: developer");
expect(result).toContain("### Step 2: reviewer");
expect(result).toContain("Continue from the reviewer feedback.");
});
test("still includes moderator instruction when there are no intervening steps", () => {
const result = buildContinuationPrompt(
[developerStep],
"developer",
"Please revise your work.",
);
expect(result).not.toContain("## What Happened Since Your Last Turn");
expect(result).toContain("## Moderator Instruction");
expect(result).toContain("Please revise your work.");
});
});
@@ -2,13 +2,32 @@ import { describe, expect, test } from "vitest";
import { buildOutputFormatInstruction } from "../src/build-output-format-instruction.js";
const PLANNER_SCHEMA = {
type: "object",
properties: {
status: { type: "string", enum: ["ready", "insufficient_info"] },
plan: { type: "string" },
},
required: ["status"],
additionalProperties: false,
};
const REVIEWER_SCHEMA = {
type: "object",
properties: {
approved: { type: "boolean" },
},
required: ["approved"],
additionalProperties: false,
};
describe("buildOutputFormatInstruction", () => {
test("always includes the frontmatter example block", () => {
const result = buildOutputFormatInstruction({});
expect(result).toContain("---");
expect(result).toContain("status: done");
expect(result).toContain("confidence:");
expect(result).toContain("scope: role");
expect(result).not.toContain("status: done");
expect(result).not.toContain("confidence:");
expect(result).not.toContain("scope: role");
});
test("always marks frontmatter as the primary deliverable", () => {
@@ -16,17 +35,36 @@ describe("buildOutputFormatInstruction", () => {
expect(result).toContain("primary deliverable");
});
test("lists fields from a flat object schema", () => {
test("generates planner-specific YAML example from schema", () => {
const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
expect(result).toContain("status: ready # required | ready | insufficient_info");
expect(result).toContain("plan: <string>");
expect(result).not.toContain("status: done");
expect(result).not.toContain("confidence:");
expect(result).not.toContain("artifacts:");
});
test("generates reviewer-specific YAML example from schema", () => {
const result = buildOutputFormatInstruction(REVIEWER_SCHEMA);
expect(result).toContain("approved: true # required | true | false");
expect(result).not.toContain("status:");
});
test("lists fields from a flat object schema with required marker", () => {
const schema = {
type: "object",
properties: {
status: { type: "string" },
confidence: { type: "number" },
},
required: ["status"],
};
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("`status`");
expect(result).toContain("`status` (required)");
expect(result).toContain("`confidence`");
expect(result).not.toContain("`confidence` (required)");
expect(result).toContain("status: <string> # required");
expect(result).toContain("confidence: <number>");
});
test("lists union of fields from an anyOf schema", () => {
@@ -45,6 +83,8 @@ describe("buildOutputFormatInstruction", () => {
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("`alpha`");
expect(result).toContain("`beta`");
expect(result).toContain("alpha: <string>");
expect(result).toContain("beta: <number>");
});
test("lists union of fields from a oneOf schema", () => {
@@ -63,6 +103,8 @@ describe("buildOutputFormatInstruction", () => {
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("`foo`");
expect(result).toContain("`bar`");
expect(result).toContain("foo: <string>");
expect(result).toContain("bar: true # true | false");
});
test("falls back gracefully for a non-object schema with no properties", () => {
@@ -80,6 +122,45 @@ describe("buildOutputFormatInstruction", () => {
const result = buildOutputFormatInstruction(schema);
const matches = [...result.matchAll(/`shared`/g)];
expect(matches.length).toBe(1);
expect(result).toContain("shared: <string>");
});
test("marks required when any union variant requires the field", () => {
const schema = {
anyOf: [
{
type: "object",
properties: { shared: { type: "string" } },
required: ["shared"],
},
{ type: "object", properties: { shared: { type: "number" } } },
],
};
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("`shared` (required)");
expect(result).toContain("shared: <string> # required");
});
test("explicitly forbids extra frontmatter fields", () => {
const result = buildOutputFormatInstruction(PLANNER_SCHEMA);
expect(result).toMatch(/\b(only|exclusively)\b.*fields/i);
expect(result).toMatch(/do not add (extra|additional|other) fields/i);
});
test("forbids extra fields even for empty schema", () => {
const result = buildOutputFormatInstruction({});
expect(result).toMatch(/do not add (extra|additional|other) fields/i);
});
test("forbids extra fields for anyOf/oneOf schemas", () => {
const schema = {
anyOf: [
{ type: "object", properties: { alpha: { type: "string" } } },
{ type: "object", properties: { beta: { type: "number" } } },
],
};
const result = buildOutputFormatInstruction(schema);
expect(result).toMatch(/do not add (extra|additional|other) fields/i);
});
test("includes focus reminder about role scope", () => {
@@ -0,0 +1,81 @@
import type { RoleDefinition } from "@uncaged/workflow-protocol";
import { describe, expect, test } from "vitest";
import { buildRolePrompt } from "../src/build-role-prompt.js";
describe("buildRolePrompt", () => {
test("all fields present", () => {
const role: RoleDefinition = {
description: "A coder",
goal: "You are a senior developer.",
capabilities: ["cursor-agent", "file-edit"],
procedure: "Implement the feature.",
output: "Summarize changes.",
meta: "placeholder00000" as string,
};
const result = buildRolePrompt(role);
expect(result).toContain("## Goal");
expect(result).toContain("You are a senior developer.");
expect(result).toContain("## Capabilities");
expect(result).toContain("- cursor-agent");
expect(result).toContain("- file-edit");
expect(result).toContain("## Prepare");
expect(result).toContain("uwf CLI Reference");
expect(result).toContain("cursor-agent, file-edit");
expect(result).toContain("## Procedure");
expect(result).toContain("Implement the feature.");
expect(result).toContain("## Output");
expect(result).toContain("Summarize changes.");
});
test("empty fields are omitted but Prepare is always present", () => {
const role: RoleDefinition = {
description: "A reviewer",
goal: "You are a code reviewer.",
capabilities: [],
procedure: "Review the PR diff carefully.",
output: "",
meta: "placeholder00000" as string,
};
const result = buildRolePrompt(role);
expect(result).toContain("## Goal");
expect(result).toContain("## Prepare");
expect(result).toContain("uwf CLI Reference");
expect(result).toContain("## Procedure");
expect(result).not.toContain("## Capabilities");
expect(result).not.toContain("## Output");
});
test("all empty still includes Prepare section", () => {
const role: RoleDefinition = {
description: "Minimal",
goal: "",
capabilities: [],
procedure: "",
output: "",
meta: "placeholder00000" as string,
};
const result = buildRolePrompt(role);
expect(result).toContain("## Prepare");
expect(result).toContain("uwf CLI Reference");
expect(result).not.toContain("## Goal");
expect(result).not.toContain("## Capabilities");
expect(result).not.toContain("## Procedure");
expect(result).not.toContain("## Output");
});
test("capabilities rendered as bullet list", () => {
const role: RoleDefinition = {
description: "Agent",
goal: "",
capabilities: ["search", "code", "browse"],
procedure: "",
output: "",
meta: "placeholder00000" as string,
};
const result = buildRolePrompt(role);
expect(result).toContain("## Capabilities");
expect(result).toContain("- search");
expect(result).toContain("- code");
expect(result).toContain("- browse");
});
});
@@ -1,6 +1,5 @@
import { describe, expect, test } from "vitest";
import { createMemoryStore, putSchema } from "@uncaged/json-cas";
import { describe, expect, test } from "vitest";
import { tryFrontmatterFastPath } from "../src/frontmatter.js";
@@ -30,6 +29,27 @@ const STRICT_SCHEMA = {
additionalProperties: false,
};
/** Role-specific schema (reviewer) — only approved, no standard agent fields. */
const REVIEWER_SCHEMA = {
type: "object",
properties: {
approved: { type: "boolean" },
},
required: ["approved"],
additionalProperties: false,
};
/** Role-specific schema (planner) — custom status enum + plan hash. */
const PLANNER_SCHEMA = {
type: "object",
properties: {
status: { type: "string", enum: ["ready", "insufficient_info"] },
plan: { type: "string" },
},
required: ["status"],
additionalProperties: false,
};
async function makeStoreWithSchema(schema: Record<string, unknown>) {
const store = createMemoryStore();
const schemaHash = await putSchema(store, schema);
@@ -68,7 +88,8 @@ describe("tryFrontmatterFastPath — happy path", () => {
test("stored CAS node payload matches frontmatter fields", async () => {
const { store, schemaHash } = await makeStoreWithSchema(FRONTMATTER_SCHEMA);
const raw = "---\nstatus: done\nnext: null\nconfidence: null\nartifacts: []\nscope: role\n---\n\nBody.";
const raw =
"---\nstatus: done\nnext: null\nconfidence: null\nartifacts: []\nscope: role\n---\n\nBody.";
const result = await tryFrontmatterFastPath(raw, schemaHash, store);
expect(result).not.toBeNull();
@@ -134,3 +155,48 @@ describe("tryFrontmatterFastPath — fallback: schema mismatch", () => {
expect(result).toBeNull();
});
});
// ── Role-specific schema fields ───────────────────────────────────────────────
describe("tryFrontmatterFastPath — role-specific fields", () => {
test("extracts approved only for reviewer schema (no extra standard fields)", async () => {
const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
const raw = "---\napproved: true\n---\n\nReview passed.";
const result = await tryFrontmatterFastPath(raw, schemaHash, store);
expect(result).not.toBeNull();
const node = store.get(result!.outputHash);
expect(node).not.toBeNull();
const payload = node!.payload as Record<string, unknown>;
expect(payload).toEqual({ approved: true });
expect(payload.status).toBeUndefined();
expect(payload.scope).toBeUndefined();
});
test("extracts plan and role-specific status for planner schema", async () => {
const { store, schemaHash } = await makeStoreWithSchema(PLANNER_SCHEMA);
const raw = "---\nstatus: ready\nplan: 01HASHPLANNER0001\n---\n\nSpec summary.";
const result = await tryFrontmatterFastPath(raw, schemaHash, store);
expect(result).not.toBeNull();
const node = store.get(result!.outputHash);
expect(node).not.toBeNull();
const payload = node!.payload as Record<string, unknown>;
expect(payload.status).toBe("ready");
expect(payload.plan).toBe("01HASHPLANNER0001");
expect(payload.scope).toBeUndefined();
});
test("returns null when required role-specific field is missing", async () => {
const { store, schemaHash } = await makeStoreWithSchema(REVIEWER_SCHEMA);
const raw = "---\nstatus: done\nscope: role\n---\n\nBody.";
const result = await tryFrontmatterFastPath(raw, schemaHash, store);
expect(result).toBeNull();
});
});
+3 -3
View File
@@ -1,6 +1,6 @@
{
"name": "@uncaged/workflow-agent-kit",
"version": "0.1.0",
"version": "0.5.0",
"files": [
"src",
"dist",
@@ -18,8 +18,8 @@
"test": "bun test"
},
"dependencies": {
"@uncaged/json-cas": "^0.3.0",
"@uncaged/json-cas-fs": "^0.3.0",
"@uncaged/json-cas": "^0.4.0",
"@uncaged/json-cas-fs": "^0.4.0",
"@uncaged/workflow-protocol": "workspace:^",
"@uncaged/workflow-util": "workspace:^",
"dotenv": "^16.6.1",
@@ -0,0 +1,53 @@
import type { StepContext } from "@uncaged/workflow-protocol";
function formatStep(step: StepContext, stepNumber: number): string {
return [
`### Step ${stepNumber}: ${step.role}`,
`Output: ${JSON.stringify(step.output)}`,
`Agent: ${step.agent}`,
].join("\n");
}
function findLastRoleIndex(steps: StepContext[], role: string): number {
for (let i = steps.length - 1; i >= 0; i--) {
const step = steps[i];
if (step !== undefined && step.role === role) {
return i;
}
}
return -1;
}
/**
* Build a continuation prompt for a role re-entry.
*
* Finds the most recent step for `role`, collects everything after it as context,
* and appends the moderator edge prompt as the instruction.
*/
export function buildContinuationPrompt(
steps: StepContext[],
role: string,
edgePrompt: string,
): string {
const lastIndex = findLastRoleIndex(steps, role);
const sinceSteps = lastIndex >= 0 ? steps.slice(lastIndex + 1) : steps;
const parts: string[] = [];
if (sinceSteps.length > 0) {
parts.push("## What Happened Since Your Last Turn");
const baseStepNumber = lastIndex >= 0 ? lastIndex + 2 : 1;
for (let i = 0; i < sinceSteps.length; i++) {
const step = sinceSteps[i];
if (step === undefined) {
continue;
}
parts.push("");
parts.push(formatStep(step, baseStepNumber + i));
}
parts.push("");
}
parts.push("## Moderator Instruction", "", edgePrompt);
return parts.join("\n");
}
@@ -1,5 +1,11 @@
import type { JSONSchema } from "@uncaged/json-cas";
type SchemaProperty = {
name: string;
schema: JSONSchema;
required: boolean;
};
/**
* Extract top-level property names from a JSON Schema object.
*
@@ -9,9 +15,44 @@ import type { JSONSchema } from "@uncaged/json-cas";
*
* Returns an empty array for schemas with no inspectable property definitions.
*/
function extractSchemaFields(schema: JSONSchema): string[] {
export function extractSchemaFields(schema: JSONSchema): string[] {
return extractSchemaProperties(schema).map((p) => p.name);
}
function extractSchemaProperties(schema: JSONSchema): SchemaProperty[] {
const objectSchemas = collectObjectSchemas(schema);
if (objectSchemas.length === 0) {
return [];
}
const byName = new Map<string, SchemaProperty>();
for (const objectSchema of objectSchemas) {
const requiredSet = new Set(
Array.isArray(objectSchema.required) ? (objectSchema.required as string[]) : [],
);
const properties = objectSchema.properties as Record<string, JSONSchema> | null | undefined;
if (typeof properties !== "object" || properties === null) {
continue;
}
for (const [name, propSchema] of Object.entries(properties)) {
const required = requiredSet.has(name);
const existing = byName.get(name);
if (existing === undefined) {
byName.set(name, { name, schema: propSchema, required });
} else if (required) {
byName.set(name, { ...existing, required: true });
}
}
}
return [...byName.values()];
}
function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
if (typeof schema.properties === "object" && schema.properties !== null) {
return Object.keys(schema.properties as Record<string, unknown>);
return [schema];
}
const unionKey = Array.isArray(schema.anyOf)
@@ -20,18 +61,109 @@ function extractSchemaFields(schema: JSONSchema): string[] {
? "oneOf"
: null;
if (unionKey !== null) {
const variants = schema[unionKey] as JSONSchema[];
const fieldSet = new Set<string>();
for (const variant of variants) {
for (const field of extractSchemaFields(variant)) {
fieldSet.add(field);
}
}
return [...fieldSet];
if (unionKey === null) {
return [];
}
return [];
const variants = schema[unionKey] as JSONSchema[];
const result: JSONSchema[] = [];
for (const variant of variants) {
result.push(...collectObjectSchemas(variant));
}
return result;
}
function resolvePropertySchema(prop: JSONSchema): JSONSchema {
if (Array.isArray(prop.enum) && prop.enum.length > 0) {
return prop;
}
const unionKey = Array.isArray(prop.anyOf) ? "anyOf" : Array.isArray(prop.oneOf) ? "oneOf" : null;
if (unionKey !== null) {
const variants = prop[unionKey] as JSONSchema[];
const nonNull = variants.filter((v) => v.type !== "null");
if (nonNull.length === 1) {
return nonNull[0];
}
}
return prop;
}
function formatYamlScalar(value: unknown): string {
if (typeof value === "boolean") {
return String(value);
}
if (typeof value === "number") {
return String(value);
}
return String(value);
}
function buildPropertyComment(parts: string[]): string {
const filtered = parts.filter((p) => p.length > 0);
return filtered.length > 0 ? ` # ${filtered.join(" | ")}` : "";
}
function buildPropertyExampleLine(prop: SchemaProperty): string {
const resolved = resolvePropertySchema(prop.schema);
const commentParts: string[] = [];
if (prop.required) {
commentParts.push("required");
}
if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
const enumValues = resolved.enum.map((v) => String(v));
commentParts.push(...enumValues);
const first = resolved.enum[0];
return `${prop.name}: ${formatYamlScalar(first)}${buildPropertyComment(commentParts)}`;
}
if (resolved.type === "boolean") {
commentParts.push("true", "false");
return `${prop.name}: true${buildPropertyComment(commentParts)}`;
}
if (resolved.type === "string") {
return `${prop.name}: <string>${buildPropertyComment(commentParts)}`;
}
if (resolved.type === "number" || resolved.type === "integer") {
return `${prop.name}: <number>${buildPropertyComment(commentParts)}`;
}
if (resolved.type === "array") {
return `${prop.name}:\n - <item>${buildPropertyComment(commentParts)}`;
}
if (resolved.type === "object") {
return `${prop.name}: <object>${buildPropertyComment(commentParts)}`;
}
return `${prop.name}: <value>${buildPropertyComment(commentParts)}`;
}
function buildYamlExampleBlock(properties: SchemaProperty[]): string {
if (properties.length === 0) {
return "---\n\n... your markdown work here ...";
}
const lines = properties.map((p) => buildPropertyExampleLine(p));
return `---\n${lines.join("\n")}\n---\n\n... your markdown work here ...`;
}
function buildFieldList(properties: SchemaProperty[]): string {
if (properties.length === 0) {
return " (schema fields will be extracted automatically)";
}
return properties
.map((p) => {
const suffix = p.required ? " (required)" : "";
return ` - \`${p.name}\`${suffix}`;
})
.join("\n");
}
/**
@@ -42,28 +174,16 @@ function extractSchemaFields(schema: JSONSchema): string[] {
* system prompt so the deliverable format is the first thing the agent sees.
*/
export function buildOutputFormatInstruction(schema: JSONSchema): string {
const fields = extractSchemaFields(schema);
const fieldList =
fields.length > 0
? fields.map((f) => ` - \`${f}\``).join("\n")
: " (schema fields will be extracted automatically)";
const properties = extractSchemaProperties(schema);
const yamlExample = buildYamlExampleBlock(properties);
const fieldList = buildFieldList(properties);
return `## Deliverable Format
Your response MUST begin with a YAML frontmatter block followed by your markdown work:
\`\`\`
---
status: done # done | needs_input | in_progress | failed
next: <role-name> # suggested next role, or omit
confidence: 0.9 # 0.0–1.0, your self-assessed confidence
artifacts: # list of file paths or CAS hashes you produced
- path/to/file.ts
scope: role # role | thread
---
... your markdown work here ...
${yamlExample}
\`\`\`
The frontmatter is the **primary deliverable** — the engine reads it directly.
@@ -71,5 +191,7 @@ Your meta output must satisfy these fields:
${fieldList}
Output ONLY the fields listed above. Do not add extra fields that are not specified in the schema.
Focus exclusively on YOUR role's deliverable. Do not perform actions outside your role's scope.`;
}
@@ -0,0 +1,44 @@
import type { RoleDefinition } from "@uncaged/workflow-protocol";
import { generateCliReference } from "@uncaged/workflow-util";
/**
* Build the role prompt from a RoleDefinition.
*
* Assembles structured sections: Goal, Capabilities, Prepare, Procedure, Output.
* Empty strings and empty arrays are omitted from the output.
*
* The Prepare section always inlines the uwf CLI reference so the agent has
* workflow knowledge without needing to run an external command. The capabilities
* array is rendered as keyword hints for implicit skill loading.
*/
export function buildRolePrompt(role: RoleDefinition): string {
const sections: string[] = [];
if (role.goal !== "") {
sections.push(`## Goal\n\n${role.goal}`);
}
if (role.capabilities.length > 0) {
const list = role.capabilities.map((c) => `- ${c}`).join("\n");
sections.push(`## Capabilities\n\n${list}`);
}
const prepareLines: string[] = [generateCliReference()];
if (role.capabilities.length > 0) {
const keywords = role.capabilities.join(", ");
prepareLines.push(
`You have the following capabilities: ${keywords}. Load relevant skills matching these keywords before starting work.`,
);
}
sections.push(`## Prepare\n\n${prepareLines.join("\n\n")}`);
if (role.procedure !== "") {
sections.push(`## Procedure\n\n${role.procedure}`);
}
if (role.output !== "") {
sections.push(`## Output\n\n${role.output}`);
}
return sections.join("\n\n");
}
+20 -15
View File
@@ -6,8 +6,8 @@ import type {
StepNodePayload,
ThreadId,
} from "@uncaged/workflow-protocol";
import { createAgentStore, loadThreadsIndex, resolveStorageRoot } from "./storage.js";
import type { AgentStore } from "./storage.js";
import { createAgentStore, loadThreadsIndex, resolveStorageRoot } from "./storage.js";
import type { AgentContext } from "./types.js";
type ChainState = {
@@ -21,11 +21,15 @@ function fail(message: string): never {
throw new Error(message);
}
function walkChain(
store: Store,
schemas: AgentStore["schemas"],
headHash: CasRef,
): ChainState {
function readEdgePrompt(): string {
const value = process.env.UWF_EDGE_PROMPT;
if (value === undefined || value === "") {
fail("UWF_EDGE_PROMPT environment variable is required");
}
return value;
}
function walkChain(store: Store, schemas: AgentStore["schemas"], headHash: CasRef): ChainState {
const headNode = store.get(headHash);
if (headNode === null) {
fail(`CAS node not found: ${headHash}`);
@@ -78,10 +82,7 @@ function walkChain(
};
}
function expandOutput(
store: Store,
outputRef: CasRef,
): unknown {
function expandOutput(store: Store, outputRef: CasRef): unknown {
const node = store.get(outputRef);
if (node === null) {
return {};
@@ -106,11 +107,7 @@ async function buildHistory(
return history;
}
async function loadWorkflow(
store: Store,
schemas: AgentStore["schemas"],
workflowRef: CasRef,
) {
async function loadWorkflow(store: Store, schemas: AgentStore["schemas"], workflowRef: CasRef) {
const node = store.get(workflowRef);
if (node === null) {
fail(`workflow CAS node not found: ${workflowRef}`);
@@ -144,6 +141,8 @@ export async function buildContext(threadId: ThreadId, role: string): Promise<Ag
}
const steps = await buildHistory(store, chain.stepsNewestFirst);
const edgePrompt = readEdgePrompt();
const isFirstVisit = !steps.some((s) => s.role === role);
return {
threadId,
@@ -153,6 +152,8 @@ export async function buildContext(threadId: ThreadId, role: string): Promise<Ag
workflow,
store,
outputFormatInstruction: "",
edgePrompt,
isFirstVisit,
};
}
@@ -189,6 +190,8 @@ export async function buildContextWithMeta(
}
const steps = await buildHistory(store, chain.stepsNewestFirst);
const edgePrompt = readEdgePrompt();
const isFirstVisit = !steps.some((s) => s.role === role);
return {
threadId,
@@ -198,6 +201,8 @@ export async function buildContextWithMeta(
workflow,
store,
outputFormatInstruction: "",
edgePrompt,
isFirstVisit,
meta: { storageRoot, store, schemas, headHash, chain },
};
}
+143 -9
View File
@@ -1,13 +1,139 @@
import { validate } from "@uncaged/json-cas";
import type { Store } from "@uncaged/json-cas";
import { getSchema, validate } from "@uncaged/json-cas";
import type { CasRef } from "@uncaged/workflow-protocol";
import { parseFrontmatterMarkdown, validateFrontmatter } from "@uncaged/workflow-util";
import {
type AgentFrontmatter,
createLogger,
parseFrontmatterMarkdown,
validateFrontmatter,
} from "@uncaged/workflow-util";
import { parse as parseYaml } from "yaml";
import { extractSchemaFields } from "./build-output-format-instruction.js";
const log = createLogger({ sink: { kind: "stderr" } });
const STANDARD_KEYS = ["status", "next", "confidence", "artifacts", "scope"] as const;
type StandardKey = (typeof STANDARD_KEYS)[number];
export type FrontmatterFastPathResult = {
body: string;
outputHash: CasRef;
};
function extractYamlBlock(raw: string): string | null {
const fence = "---";
if (!raw.startsWith(fence)) {
return null;
}
const rest = raw.slice(fence.length);
if (rest.length > 0 && rest[0] !== "\n" && rest[0] !== "\r") {
return null;
}
const afterOpen = rest.startsWith("\n") ? rest.slice(1) : rest;
const closeIndex = afterOpen.indexOf(`\n${fence}`);
if (closeIndex === -1) {
return null;
}
return afterOpen.slice(0, closeIndex);
}
function parseRawFrontmatterFields(raw: string): Record<string, unknown> {
const yamlText = extractYamlBlock(raw);
if (yamlText === null) {
return {};
}
try {
const parsed = parseYaml(yamlText);
if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
return {};
}
return parsed as Record<string, unknown>;
} catch {
return {};
}
}
function defaultCandidate(frontmatter: AgentFrontmatter): Record<string, unknown> {
return {
status: frontmatter.status,
next: frontmatter.next,
confidence: frontmatter.confidence,
artifacts: [...frontmatter.artifacts],
scope: frontmatter.scope,
};
}
function pickStandardField(frontmatter: AgentFrontmatter, key: StandardKey): unknown {
switch (key) {
case "status":
return frontmatter.status;
case "next":
return frontmatter.next;
case "confidence":
return frontmatter.confidence;
case "artifacts":
return [...frontmatter.artifacts];
case "scope":
return frontmatter.scope;
}
}
function isStandardKey(key: string): key is StandardKey {
return (STANDARD_KEYS as readonly string[]).includes(key);
}
function pickFieldValue(
field: string,
frontmatter: AgentFrontmatter,
rawFields: Record<string, unknown>,
): unknown | undefined {
if (!isStandardKey(field)) {
return Object.hasOwn(rawFields, field) ? rawFields[field] : undefined;
}
const coerced = pickStandardField(frontmatter, field);
if (field === "artifacts" || field === "scope") {
return coerced;
}
if (coerced !== null) {
return coerced;
}
return Object.hasOwn(rawFields, field) ? rawFields[field] : coerced;
}
/**
* Build a CAS candidate object from schema property keys and parsed frontmatter.
*
* When the schema has no inspectable properties, falls back to the five standard
* agent frontmatter fields for backward compatibility.
*/
function buildCandidate(
frontmatter: AgentFrontmatter,
rawFields: Record<string, unknown>,
schemaFields: string[],
): Record<string, unknown> {
if (schemaFields.length === 0) {
return defaultCandidate(frontmatter);
}
const candidate: Record<string, unknown> = {};
for (const field of schemaFields) {
const value = pickFieldValue(field, frontmatter, rawFields);
if (value !== undefined) {
candidate[field] = value;
}
}
return candidate;
}
/**
* Try to satisfy `outputSchema` from frontmatter fields alone.
*
@@ -32,16 +158,22 @@ export async function tryFrontmatterFastPath(
const validationErrors = validateFrontmatter(frontmatter);
if (validationErrors.length > 0) {
log(
"9GNPS4WY",
`frontmatter validation errors: ${validationErrors.map((e) => e.message).join("; ")}`,
);
return null;
}
const candidate: Record<string, unknown> = {
status: frontmatter.status,
next: frontmatter.next,
confidence: frontmatter.confidence,
artifacts: [...frontmatter.artifacts],
scope: frontmatter.scope,
};
const schema = getSchema(store, outputSchema);
if (schema === null) {
log("8FHMR2QX", `output schema not found in CAS: ${outputSchema}`);
return null;
}
const schemaFields = extractSchemaFields(schema);
const rawFields = parseRawFrontmatterFields(raw);
const candidate = buildCandidate(frontmatter, rawFields, schemaFields);
let outputHash: CasRef;
let node: ReturnType<Store["get"]>;
@@ -50,10 +182,12 @@ export async function tryFrontmatterFastPath(
outputHash = await store.put(outputSchema, candidate);
node = store.get(outputHash);
} catch {
log("2KMQT7NR", "failed to store frontmatter candidate in CAS");
return null;
}
if (node === null || !validate(store, node)) {
log("2KMQT7NR", "stored frontmatter candidate failed schema validation");
return null;
}
+12 -3
View File
@@ -1,3 +1,6 @@
export { buildContinuationPrompt } from "./build-continuation-prompt.js";
export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
export { buildRolePrompt } from "./build-role-prompt.js";
export type { BuildContextMeta } from "./context.js";
export { buildContext, buildContextWithMeta } from "./context.js";
export type { ExtractResult, ResolvedLlmProvider } from "./extract.js";
@@ -6,9 +9,15 @@ export {
resolveExtractModelAlias,
resolveModel,
} from "./extract.js";
export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
export type { FrontmatterFastPathResult } from "./frontmatter.js";
export { tryFrontmatterFastPath } from "./frontmatter.js";
export { createAgent } from "./run.js";
export { getConfigPath, getEnvPath, loadWorkflowConfig } from "./storage.js";
export type { AgentContext, AgentOptions, AgentRunFn, AgentRunResult } from "./types.js";
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
export { getCachedSessionId, setCachedSessionId } from "./session-cache.js";
export type {
AgentContext,
AgentContinueFn,
AgentOptions,
AgentRunFn,
AgentRunResult,
} from "./types.js";
+37 -34
View File
@@ -1,14 +1,14 @@
import { getSchema, validate } from "@uncaged/json-cas";
import type { CasRef, StepNodePayload, ThreadId } from "@uncaged/workflow-protocol";
import { config as loadDotenv } from "dotenv";
import { buildContextWithMeta } from "./context.js";
import { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
import { extract } from "./extract.js";
import { buildContextWithMeta } from "./context.js";
import { tryFrontmatterFastPath } from "./frontmatter.js";
import type { AgentStore } from "./storage.js";
import { getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
import type { AgentContext, AgentOptions, AgentRunResult } from "./types.js";
import { getEnvPath, resolveStorageRoot } from "./storage.js";
import type { AgentOptions } from "./types.js";
const MAX_FRONTMATTER_RETRIES = 2;
function fail(message: string): never {
process.stderr.write(`${message}\n`);
@@ -67,31 +67,16 @@ async function writeStepNode(options: {
return hash;
}
async function runAgent(options: AgentOptions, ctx: AgentContext): Promise<AgentRunResult> {
return runWithMessage("agent run failed", () => options.run(ctx));
}
async function extractOutput(
async function tryExtractOutput(
rawOutput: string,
outputSchema: CasRef,
storageRoot: string,
ctx: Awaited<ReturnType<typeof buildContextWithMeta>>,
): Promise<CasRef> {
const fastPath = await runWithMessage("frontmatter fast path", () =>
tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store),
).catch(() => null);
): Promise<CasRef | null> {
const fastPath = await tryFrontmatterFastPath(rawOutput, outputSchema, ctx.meta.store);
if (fastPath !== null) {
return fastPath.outputHash;
}
const config = await runWithMessage("failed to load config", () =>
loadWorkflowConfig(storageRoot),
);
const extracted = await runWithMessage("extract failed", () =>
extract(rawOutput, outputSchema, config),
);
return extracted.hash;
return null;
}
async function persistStep(options: {
@@ -113,11 +98,6 @@ async function persistStep(options: {
});
}
/**
* Create an agent CLI entrypoint.
* Parses argv (`<thread-id> <role>`), runs the agent, extracts structured output,
* writes StepNode to CAS, and prints the new node hash to stdout.
*/
export function createAgent(options: AgentOptions): () => Promise<void> {
return async function main(): Promise<void> {
const { threadId, role } = parseArgv(process.argv);
@@ -131,13 +111,36 @@ export function createAgent(options: AgentOptions): () => Promise<void> {
fail(`unknown role: ${role}`);
}
const outputSchema = getSchema(ctx.meta.store, roleDef.outputSchema);
if (outputSchema !== null) {
ctx.outputFormatInstruction = buildOutputFormatInstruction(outputSchema);
const frontmatterSchema = getSchema(ctx.meta.store, roleDef.frontmatter);
if (frontmatterSchema !== null) {
ctx.outputFormatInstruction = buildOutputFormatInstruction(frontmatterSchema);
}
let agentResult = await runWithMessage("agent run failed", () => options.run(ctx));
// Try to extract frontmatter; retry via continue if it fails
let outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
for (let retry = 0; retry < MAX_FRONTMATTER_RETRIES && outputHash === null; retry++) {
const correctionMessage =
"Your previous response did not contain valid YAML frontmatter matching the role schema.\n" +
"You MUST begin your response with a YAML frontmatter block (--- delimited).\n" +
"Please output ONLY the corrected frontmatter block followed by your work.";
agentResult = await runWithMessage("agent continue failed", () =>
options.continue(agentResult.sessionId, correctionMessage, ctx.meta.store),
);
outputHash = await tryExtractOutput(agentResult.output, roleDef.frontmatter, ctx);
}
if (outputHash === null) {
fail(
"Agent output does not contain valid YAML frontmatter matching the role schema " +
`after ${MAX_FRONTMATTER_RETRIES} retries.\n` +
`Raw output (first 500 chars): ${agentResult.output.slice(0, 500)}`,
);
}
const agentResult = await runAgent(options, ctx);
const outputHash = await extractOutput(agentResult.output, roleDef.outputSchema, storageRoot, ctx);
const stepHash = await persistStep({
ctx,
outputHash,
@@ -0,0 +1,75 @@
import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
import { randomBytes } from "node:crypto";
import { dirname, join } from "node:path";
import type { ThreadId } from "@uncaged/workflow-protocol";
import { resolveStorageRoot } from "./storage.js";
type SessionCache = Record<string, string>;
function getCachePath(): string {
return join(resolveStorageRoot(), "cache", "agent-sessions.json");
}
function cacheKey(threadId: ThreadId, role: string): string {
return `${threadId}:${role}`;
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null && !Array.isArray(value);
}
async function readCache(): Promise<SessionCache> {
const path = getCachePath();
try {
const text = await readFile(path, "utf8");
const raw = JSON.parse(text) as unknown;
if (!isRecord(raw)) {
return {};
}
const cache: SessionCache = {};
for (const [key, value] of Object.entries(raw)) {
if (typeof value === "string" && value !== "") {
cache[key] = value;
}
}
return cache;
} catch (e) {
const err = e as NodeJS.ErrnoException;
if (err.code === "ENOENT") {
return {};
}
throw e;
}
}
async function writeCache(cache: SessionCache): Promise<void> {
const path = getCachePath();
const dir = dirname(path);
await mkdir(dir, { recursive: true });
// Atomic write: write to temp file then rename to avoid partial reads on concurrent access.
// NOTE: Current workflow execution is serial (execFileSync), so true concurrency doesn't occur.
// This is a safety net for future parallel execution.
const tmpPath = join(dir, `.agent-sessions.${randomBytes(4).toString("hex")}.tmp`);
await writeFile(tmpPath, `${JSON.stringify(cache, null, 2)}\n`, "utf8");
await rename(tmpPath, path);
}
/** Read the cached session ID for a thread+role pair. */
export async function getCachedSessionId(threadId: ThreadId, role: string): Promise<string | null> {
const cache = await readCache();
const sessionId = cache[cacheKey(threadId, role)];
return sessionId ?? null;
}
/** Write the session ID for a thread+role pair into the cache. */
export async function setCachedSessionId(
threadId: ThreadId,
role: string,
sessionId: string,
): Promise<void> {
const cache = await readCache();
cache[cacheKey(threadId, role)] = sessionId;
await writeCache(cache);
}
+18 -1
View File
@@ -7,21 +7,38 @@ export type AgentContext = ModeratorContext & {
store: Store;
workflow: WorkflowPayload;
/**
* Prepend to the role's systemPrompt when building the agent prompt.
* Prepend to the role's prompt when building the agent prompt.
* Contains the frontmatter deliverable format instruction derived from the
* role's output schema. Populated by `createAgent` at run time.
*/
outputFormatInstruction: string;
/**
* Edge prompt from the graph transition that led to this role (UWF_EDGE_PROMPT).
* Always the real moderator instruction for this step.
*/
edgePrompt: string;
/**
* True when the current role has not appeared in steps history before this invocation.
*/
isFirstVisit: boolean;
};
export type AgentRunResult = {
output: string;
detailHash: string;
sessionId: string;
};
export type AgentContinueFn = (
sessionId: string,
message: string,
store: AgentContext["store"],
) => Promise<AgentRunResult>;
export type AgentRunFn = (ctx: AgentContext) => Promise<AgentRunResult>;
export type AgentOptions = {
name: string;
run: AgentRunFn;
continue: AgentContinueFn;
};
@@ -0,0 +1,25 @@
{
"$schema": "https://ui.shadcn.com/schema.json",
"style": "base-nova",
"rsc": false,
"tsx": true,
"tailwind": {
"config": "",
"css": "src/index.css",
"baseColor": "neutral",
"cssVariables": true,
"prefix": ""
},
"iconLibrary": "lucide",
"rtl": false,
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"hooks": "@/hooks"
},
"menuColor": "default",
"menuAccent": "subtle",
"registries": {}
}

Some files were not shown because too many files have changed in this diff Show More