Compare commits

..

34 Commits

Author SHA1 Message Date
xiaoju 09b7ddf6d0 feat: add developer skill — coding conventions + architecture guide
CI / test (pull_request) Successful in 1m26s
Adds 'uwf skill developer' for contributors to the workflow engine.
Covers: monorepo structure, dependency layers, functional-first conventions,
error handling, logging with tagged logger, development workflow,
testing, publishing, key modules (moderator, extract pipeline, createAgent).

Refs #541
2026-05-26 17:11:07 +00:00
xiaomo c4e94bbe56 Merge pull request 'feat: add author skill — workflow YAML design guide' (#547) from feat/539-skill-author into main
CI / test (push) Successful in 1m11s
2026-05-26 17:04:50 +00:00
xiaoju dbefe793f2 feat: add author skill — workflow YAML design guide
CI / test (pull_request) Successful in 1m4s
Adds 'uwf skill author' for agents/humans designing workflow definitions.
Covers: YAML structure, role definition, frontmatter schema design,
graph routing, edge prompts, self-testing, and common pitfalls.

Refs #539
2026-05-26 17:02:53 +00:00
xiaomo 6483bc4861 Merge pull request 'feat: add user skill — CLI guide with quick start' (#546) from feat/538-skill-user into main
CI / test (push) Successful in 1m40s
2026-05-26 16:27:43 +00:00
xiaoju fecb02b115 feat: add user skill — CLI guide with quick start and typical workflows
CI / test (pull_request) Successful in 1m26s
Adds 'uwf skill user' command for agents/humans using the uwf CLI.
Covers setup, workflow management, thread lifecycle, step operations,
CAS queries, logging, and global options with a Quick Start guide.

Refs #538
2026-05-26 16:24:39 +00:00
xiaomo 87938c1886 Merge pull request 'feat: add actor skill — frontmatter protocol + CAS reference' (#545) from feat/540-skill-actor into main
CI / test (push) Failing after 23s
2026-05-26 15:44:31 +00:00
xiaoju 95a130136b feat: add actor skill — frontmatter protocol + CAS reference
CI / test (pull_request) Failing after 8m9s
Adds 'uwf skill actor' command for agents executing workflow roles.
Covers the two things an actor needs to know:
1. Frontmatter output protocol (status field, schema-defined fields)
2. CAS operations (put, get, refs, walk, merkle DAG pattern)

Refs #540
2026-05-26 15:32:03 +00:00
xiaomo aba5642908 Merge pull request 'ci: use test:ci to skip integration tests in CI' (#543) from fix/ci-skip-integration-tests into main
CI / test (push) Successful in 3m32s
2026-05-26 15:26:02 +00:00
xingyue 168e604602 ci: use test:ci to skip integration tests in CI
CI / test (pull_request) Successful in 9m13s
The HermesAcpClient integration tests require a live Hermes agent
process and always timeout (3 × 120s) in CI containers, causing
every CI run to fail for ~6 minutes before reporting failure.

Switch from `bun run test` to `bun run test:ci` which was already
defined in all testable packages — workflow-agent-hermes's test:ci
runs only unit tests (__tests__/*.test.ts), skipping integration/.
2026-05-26 23:08:16 +08:00
xiaoju d50159c5a7 refactor: split e2e-walkthrough into 6 roles with dedicated cleanup
CI / test (push) Failing after 11m29s
- bootstrap: Docker + bun install + bun link + verify
- config-and-registry: config get/set/list + workflow add/show/list
- thread-ops: thread start/list/show/exec
- inspect: step list/show + thread read + CAS get/has/refs/walk
- cancel-and-fork: cancel + fork + logs
- cleanup: docker rm -f (all fail paths route here)

小橘 🍊
2026-05-26 14:47:44 +00:00
xiaoju 9a7ad34e55 chore: move e2e-walkthrough to .workflows/, fix CI, clean .plan/
CI / test (push) Failing after 11m54s
- e2e-walkthrough.yaml: examples/ → .workflows/ (project workflows, not examples)
- .gitea/workflows/ci.yml: bun test → bun run test (avoid legacy-packages)
- .plan/: removed stale test spec from #335

小橘 🍊
2026-05-26 14:37:46 +00:00
xiaoju 4193157124 refactor(hermes): clean up loadHermesSessionFromDb
CI / test (push) Failing after 11m14s
- Remove unnecessary Promise.resolve() wrappers (sync function)
- Use try/finally for db.close() instead of manual close at each exit
- Flatten nested try/catch

Follow-up to #535 review nits.

小橘 🍊
2026-05-26 14:27:31 +00:00
xiaomo 6ff1414cf0 Merge pull request 'fix(hermes): add SQLite fallback for loadHermesSession' (#536) from fix/535-sqlite-fallback into main
CI / test (push) Failing after 9m23s
Merge pull request #536: fix(hermes): add SQLite fallback for loadHermesSession
2026-05-26 14:24:42 +00:00
xiaoju 37f4203b40 fix(hermes): add SQLite fallback for loadHermesSession (#535)
CI / test (pull_request) Failing after 9m52s
When sessions.write_json_snapshots is disabled, Hermes only writes to
state.db (SQLite). loadHermesSession now falls back to reading from
~/.hermes/state.db when the JSON file is missing.

- Add getHermesDbPath() and loadHermesSessionFromDb() functions
- Use bun:sqlite with readonly mode, try-catch for graceful errors
- JSON file still takes priority (fast path)
- Filter messages to user/assistant/tool roles
- Convert unix timestamps to ISO 8601 strings
2026-05-26 14:19:15 +00:00
xiaoju c4ec22bb4f chore: e2e-walkthrough uses bun link for container-internal uwf
CI / test (push) Failing after 8m19s
外层: bun install -g @uncaged/cli-workflow@0.5.0 (+ agents)
内层: bun link 本地 packages,完全隔离

小橘 🍊
2026-05-26 13:14:54 +00:00
xiaoju 427f47d72c fix: release script uses filtered test, publish 0.5.0
CI / test (push) Failing after 10m18s
小橘 🍊
2026-05-26 13:02:45 +00:00
xiaoju 9f25745e1e chore: exit pre mode, clean stale changesets for 0.5.0 release
小橘 🍊
2026-05-26 13:00:49 +00:00
xiaoju 82247c86ce feat: add e2e-walkthrough workflow definition
CI / test (push) Failing after 8m29s
Dogfooding: uwf tests uwf. Replaces the monolithic bash script with a
4-role workflow (bootstrap → setup-and-registry → thread-lifecycle →
cancel-fork-and-logs), each executing inside an isolated Docker container.

小橘 🍊
2026-05-26 12:49:13 +00:00
xiaoju 0ef2d8fec2 feat: add E2E walkthrough script (Docker-based, WIP)
CI / test (push) Failing after 7m41s
Runs full uwf CLI walkthrough inside a Docker container with isolated
storage root. Tests: setup, workflow add/list/show, thread start/exec/
cancel/fork, step list/show, CAS operations, config get/set, logs.

Approach: mount host $HOME into node:22-bookworm container, override
UNCAGED_WORKFLOW_STORAGE_ROOT with tmpdir. No mock LLM — real agents.

Known issues documented in header comments (jq quoting, apt-get
startup time, lockfile conflicts).

小橘 🍊(NEKO Team)
2026-05-26 12:40:47 +00:00
xiaoju aa14fd08e0 chore: add dev environment check script
CI / test (push) Failing after 8m26s
scripts/check-dev-env.sh validates all prerequisites:
- Runtime: bun, node, python3
- Tools: hermes, claude-code
- Workflow: repo, build, uwf/agent symlinks, config
- Docker (optional, for E2E tests)

Non-interactive, actionable fix instructions on failure.
Designed for both humans and agents.
2026-05-26 12:25:25 +00:00
xiaonuo e43d4f3bbf Merge pull request 'fix: config validation and agent name normalization (#531, #532, #533)' (#534) from fix/531-532-533 into main
CI / test (push) Failing after 9m10s
2026-05-26 06:09:56 +00:00
xiaoju b0c73b5439 fix(cli): fix config masking, agent normalization, and add key validation
CI / test (pull_request) Failing after 17m6s
This commit addresses three related issues in the CLI config and setup commands:

1. Issue #531: Fix config list apiKey masking
   - maskApiKeys() now checks for 'apiKey' instead of 'apiKeyEnv'
   - Updated tests to use apiKey field throughout

2. Issue #532: Add config set key validation
   - Reject unknown top-level keys with helpful error messages
   - Reject unknown nested fields in providers/models/agents
   - Reject incomplete paths and nested paths on scalar keys
   - Added VALID_CONFIG_KEYS schema and validateConfigKey() function

3. Issue #533: Fix agent name double-prefix in setup
   - mergeConfig() now uses _agentNameFromBinary() to normalize agent names
   - 'uwf-hermes' input now produces 'hermes' key with 'uwf-hermes' command
   - Added tests for prefixed agent names

All tests passing, no regressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-26 05:57:55 +00:00
xiaonuo bbbe4651c2 Merge pull request 'refactor: apiKeyEnv → apiKey, store actual secret in config' (#530) from fix/528-refactor-apikey into main
CI / test (push) Failing after 35s
2026-05-26 05:37:51 +00:00
xiaonuo 7dfe0eb6a9 Merge pull request 'feat(cli): add uwf config get/set/list subcommand' (#527) from fix/526-config-subcommand into main
CI / test (push) Has been cancelled
2026-05-26 05:37:32 +00:00
xiaoju 47a4268b9b docs: update all documentation to reflect apiKey refactoring (#528)
CI / test (pull_request) Failing after 33s
Update all documentation files that contained outdated apiKeyEnv
references to use the new apiKey approach.

## Changes

- docs/architecture.md: Update config example to use apiKey field
- docs/wf-stateless-design.md: Update config examples and type
  definitions to use apiKey instead of apiKeyEnv
- docs/builtin-agent-research.md: Update ProviderConfig type
  definition and code examples

All documentation now consistent with the code implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-26 05:34:49 +00:00
xiaoju 0c90b88e08 refactor(protocol,cli,agent): replace apiKeyEnv with apiKey (#528)
CI / test (pull_request) Failing after 7m20s
Breaking change: Store API keys directly in config.yaml instead of
environment variable names.

## Changes

### @uncaged/workflow-protocol
- Change ProviderConfig.apiKeyEnv: string → apiKey: string
- Update README to reflect new type

### @uncaged/workflow-util-agent
- extract.ts: Remove dotenv loading, use providerEntry.apiKey directly
- storage.ts: Update normalizeProviders to validate apiKey field
- Update error messages to reference apiKey instead of apiKeyEnv

### @uncaged/cli-workflow
- setup.ts: Write actual API key to config.yaml, not .env
- Remove apiKeyEnvName(), loadEnvFile(), saveEnvFile() functions
- Remove getEnvPath() function
- Update cmdSetup to not return envPath in result
- Update README to reflect config.yaml stores API keys
- Fix setup-validate.test.ts to not expect envPath in result

## Verification
-  bun run build passes
-  All tests pass (260/260 in cli-workflow, 55/55 in util-agent)
-  bun run check passes (only pre-existing warnings)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-26 05:23:33 +00:00
xiaoju 5583a9da00 chore: retrigger CI
CI / test (pull_request) Failing after 1m36s
2026-05-26 05:21:11 +00:00
xiaoju 4a0cb7c615 ci: replace lint+typecheck with unified check step
CI / test (pull_request) Failing after 9m1s
Fixes CI failure — 'lint' script didn't exist in package.json.
bun run check already covers tsc + biome + log-tag lint.
2026-05-26 05:04:47 +00:00
xiaoju fa97a7c92a feat(cli): add uwf config get/set/list subcommand
CI / test (pull_request) Failing after 23m14s
Add configuration management commands to uwf CLI:
- uwf config list: display all config values (masks API keys)
- uwf config get <key>: retrieve specific value using dot notation
- uwf config set <key> <value>: update config value with auto-creation

Implementation:
- New file packages/cli-workflow/src/commands/config.ts with helper functions
- Comprehensive test coverage (32 tests) in config.test.ts
- Supports nested path navigation via dot notation
- Auto-creates intermediate objects when setting new paths
- Masks apiKeyEnv values in list output for security

Resolves #526

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-25 16:21:51 +00:00
xiaomo 0097633a3b Merge pull request 'fix: cancelled threads show distinct "cancelled" status' (#525) from fix/522-cancelled-thread-status into main
CI / test (push) Failing after 5m57s
2026-05-25 15:51:29 +00:00
xiaomo 04591296b2 Merge pull request 'fix: bin entry point to dist/cli.js for node compatibility' (#524) from fix/523-bin-entry-point into main
CI / test (push) Has been cancelled
2026-05-25 15:51:18 +00:00
xiaoju 96039dbbbf fix: cancelled threads show distinct status instead of completed
CI / test (pull_request) Failing after 34s
Fixes #522
2026-05-25 15:39:59 +00:00
xiaoju 5230462b8d fix: bin entry point to dist/cli.js for node compatibility
CI / test (pull_request) Failing after 9m12s
Fixes #523
2026-05-25 15:35:55 +00:00
xiaomo 4a39d3fdef Merge pull request 'feat(skill): expand uwf skill with architecture, yaml, moderator, list subcommands' (#521) from fix/517-expand-skill into main
CI / test (push) Failing after 22m38s
2026-05-25 15:00:34 +00:00
49 changed files with 2856 additions and 508 deletions
-5
View File
@@ -1,5 +0,0 @@
---
"@uncaged/workflow-util": patch
---
Replace optionalEnv/requireEnv with unified env(name, fallback) API
-5
View File
@@ -1,5 +0,0 @@
---
"@uncaged/workflow-protocol": patch
---
fix: correct internal dependency versions for prerelease
-5
View File
@@ -1,5 +0,0 @@
---
"@uncaged/workflow-util-agent": patch
---
fix: include create-agent-adapter.ts in published src
-5
View File
@@ -1,5 +0,0 @@
---
"@uncaged/workflow-protocol": patch
---
fix: use npm publish with pinned deps instead of bun publish (workspace:^ resolution bug)
+1 -1
View File
@@ -1,5 +1,5 @@
{
"mode": "pre",
"mode": "exit",
"tag": "alpha",
"initialVersions": {
"@uncaged/cli-workflow": "0.4.5",
-5
View File
@@ -1,5 +0,0 @@
---
"@uncaged/workflow-protocol": minor
---
feat: AgentFn<Opt> type boundary and createAgentAdapter bridging function (RFC #252)
+3 -6
View File
@@ -18,11 +18,8 @@ jobs:
- name: Install dependencies
run: bun install
- name: Lint
run: bun run lint
- name: Type check
run: bun run typecheck
- name: Check
run: bun run check
- name: Test
run: bun test
run: bun run test:ci
+2 -1
View File
@@ -12,4 +12,5 @@ packages/workflow-template-develop/develop.esm.js
.DS_Store
*.py
.claude
tmp
tmp.worktrees/
.worktrees/
-83
View File
@@ -1,83 +0,0 @@
# Test Spec: uwf setup model connectivity validation (#335)
## Context
File: `packages/cli-workflow/src/commands/setup.ts`
Test file: `packages/cli-workflow/src/__tests__/setup-validate.test.ts`
After `cmdSetup` writes config, it should send a test chat completion request to verify the configured model is reachable. If validation fails, warn the user (don't abort — config is already saved).
## Implementation Notes
- Add a `validateModel(baseUrl, apiKey, model)` function that sends a minimal chat completion request (`POST /chat/completions` with `messages: [{role:"user",content:"hi"}]`, `max_tokens: 1`)
- Returns `Result<void, string>` — ok if 2xx response, error with reason string otherwise
- Use `AbortSignal.timeout(15_000)` for the request
- Both `cmdSetup` and `cmdSetupInteractive` should call it after saving config
- `cmdSetup` returns validation result in its return object: `{ ...existing, validation: { ok: true } | { ok: false, error: string } }`
- `cmdSetupInteractive` prints a warning to console if validation fails, success message if it passes
- Use the project logger (`createLogger`) — no raw `console.log` except in interactive CLI output (per CLAUDE.md)
## Test Cases (vitest)
### 1. `validateModel` — success path
- Mock `fetch` to return `{ status: 200, ok: true, json: () => ({}) }`
- Call `validateModel(baseUrl, apiKey, model)`
- Assert returns `{ ok: true, value: undefined }`
- Assert fetch was called with correct URL (`${baseUrl}/chat/completions`), correct headers (`Authorization: Bearer ${apiKey}`), correct body (model, messages, max_tokens: 1)
### 2. `validateModel` — HTTP error (401 unauthorized)
- Mock `fetch` to return `{ status: 401, ok: false, statusText: "Unauthorized" }`
- Call `validateModel(baseUrl, apiKey, model)`
- Assert returns `{ ok: false, error: <string containing "401"> }`
### 3. `validateModel` — HTTP error (404 model not found)
- Mock `fetch` to return `{ status: 404, ok: false, statusText: "Not Found" }`
- Assert returns `{ ok: false, error: <string containing "404"> }`
### 4. `validateModel` — network timeout
- Mock `fetch` to throw `DOMException` with name `AbortError`
- Assert returns `{ ok: false, error: <string containing "timeout" or "unreachable"> }`
### 5. `validateModel` — network error (DNS failure, connection refused)
- Mock `fetch` to throw `TypeError("fetch failed")`
- Assert returns `{ ok: false, error: <string mentioning connectivity> }`
### 6. `cmdSetup` — includes validation result on success
- Mock global `fetch` for `/chat/completions` to succeed
- Call `cmdSetup({ provider, baseUrl, apiKey, model, storageRoot })`
- Assert returned object has `validation: { ok: true, value: undefined }`
- Assert config files are still written (existing behavior preserved)
### 7. `cmdSetup` — includes validation result on failure (config still saved)
- Mock global `fetch` for `/chat/completions` to return 401
- Call `cmdSetup({ ... })`
- Assert returned object has `validation: { ok: false, error: ... }`
- Assert `config.yaml` and `.env` are still written (validation failure doesn't prevent saving)
### 8. `cmdSetupInteractive` — prints success message on validation pass
- Mock `fetch` for both `/models` and `/chat/completions` to succeed
- Mock stdin to provide valid selections
- Capture console output
- Assert output contains a success message like "Model verified" or "✓"
### 9. `cmdSetupInteractive` — prints warning on validation failure
- Mock `fetch`: `/models` succeeds, `/chat/completions` returns 401
- Mock stdin for valid selections
- Capture console output
- Assert output contains a warning about model not being reachable and suggests trying a different model
### 10. `validateModel` — request body correctness
- Mock `fetch` to capture the request body
- Call `validateModel(baseUrl, apiKey, "test-model")`
- Assert body is `{ model: "test-model", messages: [{role: "user", content: "hi"}], max_tokens: 1 }`
## Export Requirements
- `validateModel` must be exported (for direct unit testing)
- Signature: `async function validateModel(baseUrl: string, apiKey: string, model: string): Promise<Result<void, string>>`
- `Result` type: `{ ok: true; value: T } | { ok: false; error: E }` (project convention)
## Files to Create/Modify
- **New**: `packages/cli-workflow/src/__tests__/setup-validate.test.ts` — all test cases above
- **Modify**: `packages/cli-workflow/src/commands/setup.ts` — add `validateModel`, integrate into `cmdSetup` and `cmdSetupInteractive`
+269
View File
@@ -0,0 +1,269 @@
name: "e2e-walkthrough"
description: "End-to-end walkthrough of uwf CLI. Dogfooding: uwf tests uwf. Each role validates a phase of the CLI surface inside an isolated Docker container."
roles:
bootstrap:
description: "Start Docker container with isolated storage, verify uwf is runnable"
goal: "You are an E2E test runner. Set up an isolated Docker environment and verify basic uwf functionality."
capabilities:
- docker
- shell
procedure: |
1. Start a Docker container with isolated storage:
```
docker run -d --name uwf-e2e-$$ \
-v $HOME:$HOME \
-e HOME=$HOME \
-e UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage \
-w ~/repos/workflow \
node:22-bookworm \
sleep infinity
```
2. Inside the container, install bun, install deps, then `bun link` all packages
so that `uwf`, `uwf-hermes`, `uwf-builtin` are on PATH (from source):
```
docker exec uwf-e2e-$$ bash -c '
# Install bun
curl -fsSL https://bun.sh/install | bash
export PATH="$HOME/.bun/bin:$PATH"
# Isolated storage
mkdir -p $UNCAGED_WORKFLOW_STORAGE_ROOT
# Install workspace deps
cd ~/repos/workflow && bun install --frozen-lockfile
# bun link each package that has a bin entry
cd packages/cli-workflow && bun link && cd ../..
cd packages/workflow-agent-hermes && bun link && cd ../..
cd packages/workflow-agent-builtin && bun link && cd ../..
'
```
3. Verify all three commands are available inside the container:
```
docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf --version'
docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf-hermes --help'
docker exec uwf-e2e-$$ bash -c 'export PATH="$HOME/.bun/bin:$PATH" && uwf-builtin --help'
```
4. Copy host config if it exists:
```
docker exec uwf-e2e-$$ bash -c '
if [ -f $HOME/.uncaged/workflow/config.yaml ]; then
cp $HOME/.uncaged/workflow/config.yaml $UNCAGED_WORKFLOW_STORAGE_ROOT/config.yaml
fi
'
```
Report the container name and confirm uwf + agents are working.
Set containerName to the Docker container name for subsequent roles.
output: "Report uwf version and container readiness. Set $status to pass with containerName, or fail with error."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
containerName: { type: string }
required: [$status, containerName]
- properties:
$status: { const: "fail" }
error: { type: string }
required: [$status, error]
config-and-registry:
description: "Validate uwf config commands and workflow registration"
goal: "You are an E2E test runner. Validate uwf config operations and workflow registration inside the Docker container."
capabilities:
- docker
- shell
procedure: |
Use the container from the previous step (containerName is in your prompt).
All commands run via: `docker exec <containerName> bash -c '...'`
All commands use `uwf` (installed via `bun link` inside the container).
Remember to set env vars in each exec:
export PATH="$HOME/.bun/bin:$PATH"
export UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
Config tests:
1. `uwf config list` — verify it returns valid JSON
2. `uwf config set models.test.name test-model` — set a test key
3. `uwf config get models.test.name` — verify it returns "test-model"
Workflow registration tests:
4. `uwf workflow add ~/repos/workflow/examples/solve-issue.yaml` — register workflow
5. Verify the output contains a hash
6. `uwf workflow list` — verify non-empty array
7. Capture the workflow name from the list
8. `uwf workflow show <name>` — verify it returns roles
Report all test results with pass/fail counts.
output: "Report test results. Set $status to pass (with workflowName and containerName) or fail."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
workflowName: { type: string }
containerName: { type: string }
required: [$status, workflowName, containerName]
- properties:
$status: { const: "fail" }
error: { type: string }
containerName: { type: string }
required: [$status, error, containerName]
thread-ops:
description: "Test thread start, list, show, and exec"
goal: "You are an E2E test runner. Validate thread creation and execution inside the Docker container."
capabilities:
- docker
- shell
procedure: |
Use the container (containerName) and workflow (workflowName) from your prompt.
All commands via: `docker exec <containerName> bash -c '...'`
Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
1. `uwf thread start <workflowName> -p 'E2E test: what is 2+2?'` — capture thread ID from JSON output
2. `uwf thread list` — verify the thread appears in the list
3. `uwf thread show <threadId>` — verify head pointer exists
4. `uwf thread exec <threadId> --agent uwf-builtin` — execute one step
5. Verify exec returns JSON with a head field
Report results. Pass threadId and containerName forward.
output: "Report test results. Set $status to pass (with threadId, workflowName, containerName) or fail."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
threadId: { type: string }
workflowName: { type: string }
containerName: { type: string }
required: [$status, threadId, workflowName, containerName]
- properties:
$status: { const: "fail" }
error: { type: string }
containerName: { type: string }
required: [$status, error, containerName]
inspect:
description: "Test step list/show, thread read, and CAS operations"
goal: "You are an E2E test runner. Validate read and inspect operations inside the Docker container."
capabilities:
- docker
- shell
procedure: |
Use the container (containerName) and threadId from your prompt.
All commands via: `docker exec <containerName> bash -c '...'`
Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
Step inspection:
1. `uwf step list <threadId>` — verify steps array has length > 1
2. Capture the last step hash from the output
3. `uwf step show <lastStepHash>` — verify it returns a role field
Thread read:
4. `uwf thread read <threadId>` — verify non-empty output
CAS operations:
5. `uwf cas get <lastStepHash>` — verify returns a type field
6. `uwf cas has <lastStepHash>` — verify exits 0
7. `uwf cas refs <lastStepHash>` — list refs (may be empty)
8. `uwf cas walk <lastStepHash>` — verify returns non-empty array
Report results. Pass threadId, lastStepHash, workflowName, containerName forward.
output: "Report test results. Set $status to pass (with threadId, lastStepHash, workflowName, containerName) or fail."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
threadId: { type: string }
lastStepHash: { type: string }
workflowName: { type: string }
containerName: { type: string }
required: [$status, threadId, lastStepHash, workflowName, containerName]
- properties:
$status: { const: "fail" }
error: { type: string }
containerName: { type: string }
required: [$status, error, containerName]
cancel-and-fork:
description: "Test thread cancel, step fork, and log inspection"
goal: "You are an E2E test runner. Validate cancel, fork, and log operations inside the Docker container."
capabilities:
- docker
- shell
procedure: |
Use containerName, threadId, lastStepHash, and workflowName from your prompt.
All commands via: `docker exec <containerName> bash -c '...'`
Set env: PATH="$HOME/.bun/bin:$PATH" UNCAGED_WORKFLOW_STORAGE_ROOT=/tmp/uwf-e2e-storage
Cancel:
1. Start a second thread: `uwf thread start <workflowName> -p 'E2E cancel test'`
2. Cancel it: `uwf thread cancel <secondThreadId>`
3. Verify it appears in completed list: `uwf thread list --status completed`
Fork:
4. Fork from the first thread's last step: `uwf step fork <lastStepHash>`
5. Verify fork creates a new thread with a different ID
Logs:
6. `uwf log list` — verify output (may be empty)
7. `uwf log show --thread <threadId>` — verify runs without error
Report results with summary.
output: "Report test results with summary. Set $status to pass or fail."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
containerName: { type: string }
summary: { type: string }
required: [$status, containerName, summary]
- properties:
$status: { const: "fail" }
error: { type: string }
containerName: { type: string }
required: [$status, error, containerName]
cleanup:
description: "Remove Docker container"
goal: "You are an E2E test runner. Clean up the Docker container used for testing."
capabilities:
- docker
- shell
procedure: |
Remove the Docker container (containerName is in your prompt):
1. `docker rm -f <containerName>`
2. Verify the container is gone: `docker ps -a --filter name=<containerName> --format '{{.Names}}'` should return empty
Report cleanup result.
output: "Report cleanup result. Set $status to pass or fail."
frontmatter:
oneOf:
- properties:
$status: { const: "pass" }
summary: { type: string }
required: [$status, summary]
- properties:
$status: { const: "fail" }
error: { type: string }
required: [$status, error]
graph:
$START:
_: { role: "bootstrap", prompt: "Set up the Docker container and verify uwf is runnable." }
bootstrap:
pass: { role: "config-and-registry", prompt: "Container {{{containerName}}} is ready. Validate config and workflow registration." }
fail: { role: "$END", prompt: "Bootstrap failed: {{{error}}}. No container was created." }
config-and-registry:
pass: { role: "thread-ops", prompt: "Config and registry OK. Workflow '{{{workflowName}}}' registered. Container: {{{containerName}}}. Now test thread operations." }
fail: { role: "cleanup", prompt: "Config/registry failed: {{{error}}}. Clean up container {{{containerName}}}." }
thread-ops:
pass: { role: "inspect", prompt: "Thread ops OK. threadId={{{threadId}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test inspect operations." }
fail: { role: "cleanup", prompt: "Thread ops failed: {{{error}}}. Clean up container {{{containerName}}}." }
inspect:
pass: { role: "cancel-and-fork", prompt: "Inspect OK. threadId={{{threadId}}}, lastStepHash={{{lastStepHash}}}, workflowName={{{workflowName}}}, containerName={{{containerName}}}. Now test cancel, fork, and logs." }
fail: { role: "cleanup", prompt: "Inspect failed: {{{error}}}. Clean up container {{{containerName}}}." }
cancel-and-fork:
pass: { role: "cleanup", prompt: "All tests passed! {{{summary}}}. Clean up container {{{containerName}}}." }
fail: { role: "cleanup", prompt: "Cancel/fork failed: {{{error}}}. Clean up container {{{containerName}}}." }
cleanup:
pass: { role: "$END", prompt: "E2E walkthrough complete. {{{summary}}}" }
fail: { role: "$END", prompt: "Cleanup failed: {{{error}}}. Manual cleanup may be needed." }
-220
View File
@@ -1,220 +0,0 @@
name: "retrospect-workflow"
description: "Post-execution retrospective: analyze a completed thread, find inefficiencies, and improve the workflow definition."
roles:
analyst:
description: "Scans thread execution for anomalies and produces a findings report"
goal: "You are a workflow execution analyst. You review completed thread data to find inefficiencies, wasted effort, and procedure gaps."
capabilities:
- data-analysis
procedure: |
You receive a completed thread ID in your task prompt.
Phase 0 — Validation (must pass before any analysis):
1. Run `uwf step list <thread-id>` to get thread metadata including the workflow hash
2. Run `uwf workflow show <workflow-hash>` to get the workflow name
3. Verify the workflow exists locally: check `.workflows/<name>.yaml` in the current repo
- If NOT found: output $status=wrong_project with the workflow name. Do NOT proceed.
4. Compare the thread's workflow hash against the current registered version:
- Run `uwf workflow show <name>` to get the current hash
- If hashes differ: the thread ran on an older version. Note this — you will need to diff versions after analysis.
Phase 1 — Overview scan:
5. From the step list, compute a health signal for each step:
- Duration: flag if >2x the median of other steps
- Output tokens: flag if >2x the median
- Status flow: flag non-happy-path transitions (rejected, fix_code, fix_spec, hook_failed)
- Step count: flag if the same role appears more than expected (indicates loops)
6. If no anomalies found AND versions match: output $status=clean
7. If no anomalies found BUT versions differ:
- Diff the two workflow versions to check if any procedure changes are relevant
- If the current version already addresses potential concerns: output $status=clean with a note
- Otherwise: proceed to Phase 2
Phase 2 — Targeted deep-dive (only for flagged steps):
8. For each flagged step, run `uwf step show <hash>` to get the detail with turns
9. Analyze the turn sequence for:
- Repeated tool calls with the same or similar input (blind retries)
- Tool errors followed by no strategy change (same approach retried)
- Unnecessary exploration (reading files or running commands unrelated to the task)
- Hallucinated commands or flags (commands that don't exist or wrong syntax)
- Excessive turns before reaching the goal
10. For each finding, record:
- Which role and step hash
- What happened (specific turn indices and commands)
- Root cause hypothesis (procedure gap, missing pitfall, unclear instruction)
- Suggested fix (what to add/change in the procedure)
11. If versions differ: compare findings against the version diff.
Mark any finding that is already fixed in the current version as "resolved_in_current".
Only report findings that are NOT yet addressed.
Output a structured findings report. Set $status=clean if nothing actionable, $status=findings if unresolved issues exist, or $status=wrong_project if the workflow doesn't belong here.
output: "A findings report with per-issue root cause and suggested procedure fixes. Set $status to clean or findings (with report hash)."
frontmatter:
oneOf:
- properties:
$status: { const: "clean" }
summary: { type: string }
required: [$status, summary]
- properties:
$status: { const: "findings" }
report: { type: string }
targetWorkflow: { type: string }
required: [$status, report, targetWorkflow]
- properties:
$status: { const: "wrong_project" }
workflowName: { type: string }
required: [$status, workflowName]
proposer:
description: "Translates findings into concrete workflow edits"
goal: "You are a workflow improvement proposer. You read the analyst's findings and produce specific, minimal edits to the workflow YAML."
capabilities:
- planning
procedure: |
1. Read the analyst's findings report from your task prompt
2. Locate the target workflow YAML:
- Workflow definitions live in the WORKFLOW ENGINE repo (where `uwf` is developed), NOT in the repo that was analyzed.
- Find it via: `uwf workflow show <targetWorkflow> --format yaml` to read the current definition
- The physical file is `.workflows/<targetWorkflow>.yaml` in the workflow engine repo
- Use `git rev-parse --show-toplevel` in the current directory to find the workflow engine repo root
3. Read the current workflow YAML to understand existing procedures
4. For each finding, draft a minimal edit:
- Prefer adding a pitfall note or clarifying instruction over restructuring
- If a procedure step is ambiguous, make it explicit
- If a tool usage pattern is wrong, add a "Do NOT" or "IMPORTANT" note
- Keep edits surgical — don't rewrite procedures that work fine
5. Check if existing tests need updating (search for test files referencing the workflow)
6. Produce a change plan as CAS text node via `uwf cas put-text "<plan>"`
The plan should list each edit with:
- File path
- What to change (old text → new text, or addition)
- Why (linked to which finding)
- Any test updates needed
output: "A change plan stored in CAS. Set $status to ready (with plan hash and repoPath) or no_action (if findings don't warrant changes)."
frontmatter:
oneOf:
- properties:
$status: { const: "ready" }
plan: { type: string }
repoPath: { type: string }
required: [$status, plan, repoPath]
- properties:
$status: { const: "no_action" }
reason: { type: string }
required: [$status, reason]
developer:
description: "Applies the proposed workflow edits"
goal: "You are a developer agent. You apply workflow YAML edits and update related tests."
capabilities:
- coding
procedure: |
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
The workflow definitions live in THIS repo (the workflow engine), not the repo that was analyzed.
Before starting any work, set up an isolated worktree:
1. Use `git rev-parse --show-toplevel` to find the repo root (do NOT use repoPath from proposer — that's the analyzed repo)
2. `git fetch origin` to get latest refs
3. `git worktree add .worktrees/retrospect/<short-slug> -b retrospect/<short-slug> origin/main`
4. `cd .worktrees/retrospect/<short-slug> && bun install`
5. ALL subsequent work must happen inside the worktree directory.
Then apply changes:
6. Read the change plan from CAS: `uwf cas get <plan hash>`
7. Apply each edit from the plan to the workflow YAML
8. Update or add tests as specified in the plan
9. Run `bun run build` and `bun test` to verify
10. Run `bun run check` for lint
11. Commit with message: `improve: <workflow-name> — <brief summary>`
output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
frontmatter:
oneOf:
- properties:
$status: { const: "done" }
branch: { type: string }
worktree: { type: string }
required: [$status, branch, worktree]
- properties:
$status: { const: "failed" }
reason: { type: string }
required: [$status, reason]
reviewer:
description: "Reviews the workflow edits for correctness"
goal: "You are a reviewer. You verify that workflow edits are minimal, correct, and actually address the findings."
capabilities:
- code-review
procedure: |
The worktree path is provided in your task prompt. cd into it first.
Review criteria:
1. Each edit must trace back to a specific finding — no drive-by changes
2. Edits should be minimal — don't rewrite working procedures
3. New pitfall notes or instructions must be clear and actionable
4. Tests must be updated if assertions changed
5. `bun run build` and `bun test` must pass
6. `bunx biome check` must pass
IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files.
output: "Explain your decision. Set $status to approved (with branch/worktree) or rejected (with comments)."
frontmatter:
oneOf:
- properties:
$status: { const: "approved" }
branch: { type: string }
worktree: { type: string }
required: [$status, branch, worktree]
- properties:
$status: { const: "rejected" }
comments: { type: string }
worktree: { type: string }
required: [$status, comments, worktree]
committer:
description: "Commits and creates PR"
goal: "You are a committer agent. You create a clean commit and push a PR."
capabilities: []
procedure: |
The worktree path, branch name, and repo info are provided in your task prompt.
cd into the worktree first.
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Stage all changes: `git add -A`
2. Commit with a descriptive message: `git commit -m "improve: <workflow> — <summary>"`
3. Push the branch: `git push -u origin <branch-name>`
- If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --title "..." --description "..."`
- IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files. cd to the repo root first.
- Do NOT pass `--repo` — let tea auto-detect from the main repo's git remote.
- PR description must include: What / Why / Findings / Changes sections
- On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
5. After PR creation, clean up the worktree:
- cd to the repo root (parent of .worktrees)
- `git worktree remove <worktree-path>`
output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
frontmatter:
oneOf:
- properties:
$status: { const: "committed" }
prUrl: { type: string }
required: [$status, prUrl]
- properties:
$status: { const: "hook_failed" }
error: { type: string }
required: [$status, error]
graph:
$START:
_: { role: "analyst", prompt: "Analyze completed thread {{{threadId}}} for execution anomalies." }
analyst:
clean: { role: "$END", prompt: "No issues found. Thread executed cleanly." }
findings: { role: "proposer", prompt: "Findings report: {{{report}}}. Target workflow: {{{targetWorkflow}}}. Propose minimal edits." }
wrong_project: { role: "$END", prompt: "Thread uses workflow '{{{workflowName}}}' which does not exist in this project. Run retrospect from the correct repo." }
proposer:
no_action: { role: "$END", prompt: "No actionable changes needed: {{{reason}}}." }
ready: { role: "developer", prompt: "Apply the change plan (CAS hash: {{{plan}}}) to the workflow definitions in this repo." }
developer:
done: { role: "reviewer", prompt: "Review workflow edits on branch {{{branch}}} at {{{worktree}}}." }
failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
reviewer:
rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in {{{worktree}}}." }
approved: { role: "committer", prompt: "Approved. Commit and push branch {{{branch}}} from {{{worktree}}}." }
committer:
hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit." }
committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow improved." }
+4 -5
View File
@@ -155,13 +155,12 @@ roles:
cd into the worktree first.
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
1. Check `git status` — if working tree is clean and branch is ahead of origin, skip to step 3 (push).
2. If there are unstaged/uncommitted changes: `git add -A` then `git commit -m "type: description\n\nFixes #N"`
1. Stage all changes: `git add -A`
2. Commit with a descriptive message referencing the issue: `git commit -m "type: description\n\nFixes #N"`
3. Push the branch: `git push -u origin <branch-name>`
- If push hook fails: capture the error log in your output, mark hook_failed
4. On push success: create a PR via `tea pr create --title "..." --description "..."`
- IMPORTANT: `tea pr create` must run from the MAIN repo directory (not a worktree), because tea cannot detect the repo from worktree `.git` files. cd to the repo root first.
- Do NOT pass `--repo` — let tea auto-detect from the main repo's git remote.
4. On push success: create a PR via `tea pr create --repo <owner/repo> --title "..." --description "..."`
- Extract owner/repo from: `git remote get-url origin | sed 's/.*[:/]\([^/]*\/[^.]*\).*/\1/'`
- PR description must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
- On tea failure: capture stderr/stdout, include PR details for manual creation, mark hook_failed
5. After PR creation, clean up the worktree:
+1
View File
@@ -4,6 +4,7 @@
"includes": [
"**",
"!**/dist",
"!.worktrees",
"!**/node_modules",
"!**/legacy-packages",
"!scripts",
+1 -1
View File
@@ -391,7 +391,7 @@ Everything else is immutable CAS content.
providers:
openrouter:
baseUrl: "https://openrouter.ai/api/v1"
apiKeyEnv: "OPENROUTER_API_KEY"
apiKey: "sk-..."
models:
sonnet:
+2 -2
View File
@@ -402,7 +402,7 @@ workflow 怎么配置和使用 model?
```136:160:packages/workflow-protocol/src/types.ts
export type ProviderConfig = {
baseUrl: string;
apiKeyEnv: string;
apiKey: string;
};
export type ModelConfig = {
@@ -429,7 +429,7 @@ export type WorkflowConfig = {
export function resolveModel(config: WorkflowConfig, alias: ModelAlias): ResolvedLlmProvider {
const modelEntry = config.models[alias];
const providerEntry = config.providers[modelEntry.provider];
const apiKey = process.env[providerEntry.apiKeyEnv];
const apiKey = providerEntry.apiKey;
return { baseUrl: providerEntry.baseUrl, apiKey, model: modelEntry.name };
}
```
+4 -4
View File
@@ -280,13 +280,13 @@ threads.yaml: { "01J7K9M2XNPQR5VWBCDF8G3H4T": "8FWKR3TN5V1QA" }
providers:
openai:
baseUrl: "https://api.openai.com/v1"
apiKeyEnv: "OPENAI_API_KEY"
apiKey: "sk-..."
anthropic:
baseUrl: "https://api.anthropic.com/v1"
apiKeyEnv: "ANTHROPIC_API_KEY"
apiKey: "sk-ant-..."
openrouter:
baseUrl: "https://openrouter.ai/api/v1"
apiKeyEnv: "OPENROUTER_API_KEY"
apiKey: "sk-or-..."
models:
sonnet:
@@ -465,7 +465,7 @@ type Scenario = string; // e.g. "extract"
type ProviderConfig = {
baseUrl: string;
apiKeyEnv: string; // env var name to read API key from
apiKey: string; // API key stored directly
};
type ModelConfig = {
+1 -1
View File
@@ -14,7 +14,7 @@
"test:ci": "bun run --filter './packages/*' test:ci",
"changeset": "bunx changeset",
"version": "bunx changeset version",
"release": "bun run build && bun test && node scripts/publish-all.mjs"
"release": "bun run build && bun run test && node scripts/publish-all.mjs"
},
"devDependencies": {
"@agentclientprotocol/sdk": "^0.22.1",
+1 -1
View File
@@ -119,7 +119,7 @@ uwf setup --provider openai --base-url https://api.openai.com/v1 \
--api-key sk-... --model gpt-4o --agent hermes
```
Config: `~/.uncaged/workflow/config.yaml`. API keys: `~/.uncaged/workflow/.env`.
Config: `~/.uncaged/workflow/config.yaml` (includes API keys).
### Skill
+1 -1
View File
@@ -8,7 +8,7 @@
],
"type": "module",
"bin": {
"uwf": "./src/cli.ts"
"uwf": "./dist/cli.js"
},
"dependencies": {
"@uncaged/json-cas": "^0.5.3",
@@ -0,0 +1,622 @@
import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { describe, expect, test } from "vitest";
import {
cmdConfigGet,
cmdConfigList,
cmdConfigSet,
getConfigPath,
getNestedValue,
maskApiKeys,
parseDotPath,
setNestedValue,
} from "../commands/config.js";
describe("config command", () => {
// Helper function to create a test config
function createTestConfig(tempDir: string, content: string): string {
const configPath = getConfigPath(tempDir);
writeFileSync(configPath, content, "utf8");
return configPath;
}
// Sample test config
const sampleConfig = `providers:
dashscope:
baseUrl: https://dashscope.aliyuncs.com/compatible-mode/v1
apiKey: sk-test-dashscope-key
openai:
baseUrl: https://api.openai.com/v1
apiKey: sk-test-openai-key
models:
default:
provider: dashscope
name: qwen-max
gpt4:
provider: openai
name: gpt-4
agents:
hermes:
command: uwf-hermes
args:
- --provider
- dashscope
claude-code:
command: claude-code
args:
- --profile
- work
defaultAgent: hermes
defaultModel: default
`;
describe("helper functions", () => {
describe("parseDotPath", () => {
test("splits dot notation correctly", () => {
expect(parseDotPath("a.b.c")).toEqual(["a", "b", "c"]);
expect(parseDotPath("defaultAgent")).toEqual(["defaultAgent"]);
expect(parseDotPath("providers.dashscope.baseUrl")).toEqual([
"providers",
"dashscope",
"baseUrl",
]);
});
});
describe("getNestedValue", () => {
test("traverses nested objects", () => {
const obj = {
a: { b: { c: "value" } },
x: "simple",
};
expect(getNestedValue(obj, ["a", "b", "c"])).toBe("value");
expect(getNestedValue(obj, ["x"])).toBe("simple");
});
test("returns undefined for non-existent paths", () => {
const obj = { a: { b: "value" } };
expect(getNestedValue(obj, ["a", "c"])).toBeUndefined();
expect(getNestedValue(obj, ["x", "y"])).toBeUndefined();
});
});
describe("setNestedValue", () => {
test("creates intermediate objects and sets value", () => {
const obj: Record<string, unknown> = {};
setNestedValue(obj, ["a", "b", "c"], "value");
expect(obj).toEqual({ a: { b: { c: "value" } } });
});
test("preserves existing values", () => {
const obj: Record<string, unknown> = { a: { x: "keep" } };
setNestedValue(obj, ["a", "b"], "new");
expect(obj).toEqual({ a: { x: "keep", b: "new" } });
});
test("overwrites existing value at path", () => {
const obj: Record<string, unknown> = { a: { b: "old" } };
setNestedValue(obj, ["a", "b"], "new");
expect(obj).toEqual({ a: { b: "new" } });
});
});
describe("maskApiKeys", () => {
test("deep clones and masks all apiKey values in providers", () => {
const config = {
providers: {
dashscope: {
baseUrl: "https://example.com",
apiKey: "sk-test-key-12345",
},
openai: {
baseUrl: "https://api.openai.com",
apiKey: "sk-another-secret",
},
},
models: {
default: { provider: "dashscope" },
},
};
const masked = maskApiKeys(config);
expect(masked).toEqual({
providers: {
dashscope: {
baseUrl: "https://example.com",
apiKey: "***MASKED***",
},
openai: {
baseUrl: "https://api.openai.com",
apiKey: "***MASKED***",
},
},
models: {
default: { provider: "dashscope" },
},
});
// Ensure it's a deep clone
expect(masked).not.toBe(config);
});
test("handles config without providers", () => {
const config = { models: { default: { provider: "test" } } };
const masked = maskApiKeys(config);
expect(masked).toEqual(config);
});
});
});
describe("cmdConfigList", () => {
test("returns full config when file exists", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigList(tempDir);
expect(result).toBeDefined();
expect(typeof result).toBe("object");
expect(result).toHaveProperty("providers");
expect(result).toHaveProperty("models");
expect(result).toHaveProperty("agents");
expect(result).toHaveProperty("defaultAgent");
expect(result).toHaveProperty("defaultModel");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("masks all apiKey values in providers section", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = (await cmdConfigList(tempDir)) as Record<string, unknown>;
const providers = result.providers as Record<string, unknown>;
const dashscope = providers.dashscope as Record<string, unknown>;
const openai = providers.openai as Record<string, unknown>;
expect(dashscope.apiKey).toBe("***MASKED***");
expect(openai.apiKey).toBe("***MASKED***");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when config file doesn't exist", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
await expect(cmdConfigList(tempDir)).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("returns empty object when config file is empty", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, "");
const result = await cmdConfigList(tempDir);
expect(result).toEqual({});
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when config file is invalid YAML", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, "invalid: yaml: [broken");
await expect(cmdConfigList(tempDir)).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
});
describe("cmdConfigGet", () => {
test("retrieves top-level string value (defaultAgent)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "defaultAgent");
expect(result).toBe("hermes");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("retrieves top-level string value (defaultModel)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "defaultModel");
expect(result).toBe("default");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("retrieves nested object (providers.dashscope)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "providers.dashscope");
expect(result).toEqual({
baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1",
apiKey: "sk-test-dashscope-key",
});
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("retrieves deeply nested string (providers.dashscope.baseUrl)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
expect(result).toBe("https://dashscope.aliyuncs.com/compatible-mode/v1");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("retrieves nested string in models (models.default.provider)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "models.default.provider");
expect(result).toBe("dashscope");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("retrieves array value (agents.hermes.args)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigGet(tempDir, "agents.hermes.args");
expect(result).toEqual(["--provider", "dashscope"]);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when key doesn't exist", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigGet(tempDir, "nonexistent.key")).rejects.toThrow(/Key not found/);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when config file doesn't exist", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
await expect(cmdConfigGet(tempDir, "defaultAgent")).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when accessing property on non-object", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigGet(tempDir, "defaultAgent.foo")).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
});
describe("cmdConfigSet", () => {
test("sets top-level string value (defaultAgent)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigSet(tempDir, "defaultAgent", "claude-code");
expect(result).toEqual({ key: "defaultAgent", value: "claude-code" });
// Verify it was written
const updated = await cmdConfigGet(tempDir, "defaultAgent");
expect(updated).toBe("claude-code");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("sets nested string value (providers.dashscope.baseUrl)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const newUrl = "https://new-api.example.com/v1";
const result = await cmdConfigSet(tempDir, "providers.dashscope.baseUrl", newUrl);
expect(result).toEqual({
key: "providers.dashscope.baseUrl",
value: newUrl,
});
// Verify it was written
const updated = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
expect(updated).toBe(newUrl);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("creates new nested path (providers.newprovider.baseUrl)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const newUrl = "https://new-provider.com/v1";
const result = await cmdConfigSet(tempDir, "providers.newprovider.baseUrl", newUrl);
expect(result).toEqual({
key: "providers.newprovider.baseUrl",
value: newUrl,
});
// Verify it was created
const updated = await cmdConfigGet(tempDir, "providers.newprovider.baseUrl");
expect(updated).toBe(newUrl);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("sets array value for args key with valid JSON array", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const newArgs = '["--new", "--flags"]';
const result = await cmdConfigSet(tempDir, "agents.hermes.args", newArgs);
expect(result).toEqual({
key: "agents.hermes.args",
value: ["--new", "--flags"],
});
// Verify it was written
const updated = await cmdConfigGet(tempDir, "agents.hermes.args");
expect(updated).toEqual(["--new", "--flags"]);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("preserves existing config values when updating one key", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await cmdConfigSet(tempDir, "defaultAgent", "claude-code");
// Verify other values are preserved
const defaultModel = await cmdConfigGet(tempDir, "defaultModel");
expect(defaultModel).toBe("default");
const dashscopeUrl = await cmdConfigGet(tempDir, "providers.dashscope.baseUrl");
expect(dashscopeUrl).toBe("https://dashscope.aliyuncs.com/compatible-mode/v1");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("creates config file if it doesn't exist", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
const result = await cmdConfigSet(tempDir, "defaultAgent", "hermes");
expect(result).toEqual({ key: "defaultAgent", value: "hermes" });
// Verify file was created
const configPath = getConfigPath(tempDir);
const content = readFileSync(configPath, "utf8");
expect(content).toContain("defaultAgent: hermes");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when setting property on non-object", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "defaultAgent.foo", "bar")).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("throws error when array value is invalid JSON for args key", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(
cmdConfigSet(tempDir, "agents.hermes.args", "[invalid json"),
).rejects.toThrow();
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("sets deeply nested model config (models.gpt4.provider)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigSet(tempDir, "models.gpt4.provider", "new-provider");
expect(result).toEqual({
key: "models.gpt4.provider",
value: "new-provider",
});
// Verify it was written
const updated = await cmdConfigGet(tempDir, "models.gpt4.provider");
expect(updated).toBe("new-provider");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("sets agent command (agents.claude-code.command)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
const result = await cmdConfigSet(tempDir, "agents.claude-code.command", "new-command");
expect(result).toEqual({
key: "agents.claude-code.command",
value: "new-command",
});
// Verify it was written
const updated = await cmdConfigGet(tempDir, "agents.claude-code.command");
expect(updated).toBe("new-command");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
});
describe("cmdConfigSet validation", () => {
test("rejects unknown top-level key", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "unknownKey", "value")).rejects.toThrow(
/Unknown config key.*unknownKey/,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects unknown nested key in providers", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(
cmdConfigSet(tempDir, "providers.myProvider.unknownField", "value"),
).rejects.toThrow(/Unknown field.*unknownField.*providers/);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects unknown nested key in models", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "models.default.invalidField", "value")).rejects.toThrow(
/Unknown field.*invalidField.*models/,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects unknown nested key in agents", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "agents.hermes.badField", "value")).rejects.toThrow(
/Unknown field.*badField.*agents/,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects nested path on scalar key (defaultAgent)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "defaultAgent.foo", "value")).rejects.toThrow(
/defaultAgent.*scalar|Cannot set property/i,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects nested path on scalar key (defaultModel)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "defaultModel.bar", "value")).rejects.toThrow(
/defaultModel.*scalar|Cannot set property/i,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects incomplete nested path (providers without field)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "providers.myProvider", "value")).rejects.toThrow(
/incomplete path|must specify a field/i,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects incomplete nested path (models without field)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "models.myModel", "value")).rejects.toThrow(
/incomplete path|must specify a field/i,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("rejects incomplete nested path (agents without field)", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await expect(cmdConfigSet(tempDir, "agents.myAgent", "value")).rejects.toThrow(
/incomplete path|must specify a field/i,
);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("allows valid nested keys in providers", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await cmdConfigSet(tempDir, "providers.newprovider.baseUrl", "https://example.com");
await cmdConfigSet(tempDir, "providers.newprovider.apiKey", "sk-test");
const baseUrl = await cmdConfigGet(tempDir, "providers.newprovider.baseUrl");
const apiKey = await cmdConfigGet(tempDir, "providers.newprovider.apiKey");
expect(baseUrl).toBe("https://example.com");
expect(apiKey).toBe("sk-test");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("allows valid nested keys in models", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await cmdConfigSet(tempDir, "models.gpt4.provider", "openai");
await cmdConfigSet(tempDir, "models.gpt4.name", "gpt-4o");
const provider = await cmdConfigGet(tempDir, "models.gpt4.provider");
const name = await cmdConfigGet(tempDir, "models.gpt4.name");
expect(provider).toBe("openai");
expect(name).toBe("gpt-4o");
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
test("allows valid nested keys in agents", async () => {
const tempDir = mkdtempSync(join(tmpdir(), "test-config-"));
try {
createTestConfig(tempDir, sampleConfig);
await cmdConfigSet(tempDir, "agents.hermes.command", "uwf-hermes");
await cmdConfigSet(tempDir, "agents.hermes.args", '["--flag"]');
const command = await cmdConfigGet(tempDir, "agents.hermes.command");
const args = await cmdConfigGet(tempDir, "agents.hermes.args");
expect(command).toBe("uwf-hermes");
expect(args).toEqual(["--flag"]);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
});
});
@@ -40,6 +40,7 @@ describe("resolveHeadHash", () => {
workflow: workflowHash,
head: headHash,
completedAt: Date.now(),
reason: null,
});
const result = await resolveHeadHash(tmpDir, threadId);
@@ -64,6 +65,7 @@ describe("resolveHeadHash", () => {
workflow: workflowHash,
head: historicalHash,
completedAt: Date.now(),
reason: null,
});
const result = await resolveHeadHash(tmpDir, threadId);
@@ -87,18 +89,21 @@ describe("resolveHeadHash", () => {
workflow: workflowHash,
head: hash1,
completedAt: Date.now() - 2000,
reason: null,
});
await appendThreadHistory(tmpDir, {
thread: threadId2,
workflow: workflowHash,
head: hash2,
completedAt: Date.now() - 1000,
reason: null,
});
await appendThreadHistory(tmpDir, {
thread: threadId3,
workflow: workflowHash,
head: hash3,
completedAt: Date.now(),
reason: null,
});
const result = await resolveHeadHash(tmpDir, threadId2);
@@ -134,4 +134,34 @@ describe("cmdSetup agent configuration", () => {
const config2 = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
expect(config2.defaultAgent).toBe("builtin");
});
test("normalizes agent name with uwf- prefix to bare name", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response(JSON.stringify({}), { status: 200 }),
);
const result = await cmdSetup({ ...baseArgs(), agent: "uwf-hermes" });
expect(result.defaultAgent).toBe("hermes");
const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
expect(config.agents.hermes).toEqual({ command: "uwf-hermes", args: [] });
expect(config.defaultAgent).toBe("hermes");
// Verify no duplicate uwf- prefix
expect(config.agents["uwf-hermes"]).toBeUndefined();
});
test("normalizes uwf-claude-code to claude-code", async () => {
vi.spyOn(globalThis, "fetch").mockResolvedValue(
new Response(JSON.stringify({}), { status: 200 }),
);
const result = await cmdSetup({ ...baseArgs(), agent: "uwf-claude-code" });
expect(result.defaultAgent).toBe("claude-code");
const config = parse(readFileSync(join(storageRoot, "config.yaml"), "utf8"));
expect(config.agents["claude-code"]).toEqual({ command: "uwf-claude-code", args: [] });
expect(config.defaultAgent).toBe("claude-code");
// Verify no duplicate uwf- prefix
expect(config.agents["uwf-claude-code"]).toBeUndefined();
});
});
@@ -129,9 +129,8 @@ describe("cmdSetup with validation", () => {
const result = await cmdSetup(setupArgs());
expect(result.validation).toEqual({ ok: true, value: undefined });
// Config files should still be written
// Config file should still be written
expect(result.configPath).toBeTruthy();
expect(result.envPath).toBeTruthy();
});
test("includes validation failure — config still saved", async () => {
@@ -143,8 +142,7 @@ describe("cmdSetup with validation", () => {
expect(result.validation).toBeDefined();
expect((result.validation as { ok: boolean }).ok).toBe(false);
// Config files should still be written despite validation failure
// Config file should still be written despite validation failure
expect(result.configPath).toBeTruthy();
expect(result.envPath).toBeTruthy();
});
});
@@ -6,10 +6,14 @@ import { describe, expect, test } from "vitest";
const __dirname = dirname(fileURLToPath(import.meta.url));
import {
cmdSkillActor,
cmdSkillArchitecture,
cmdSkillAuthor,
cmdSkillCli,
cmdSkillDeveloper,
cmdSkillList,
cmdSkillModerator,
cmdSkillUser,
cmdSkillYaml,
} from "../commands/skill.js";
@@ -21,8 +25,11 @@ describe("skill commands", () => {
expect(result).toContain("architecture");
expect(result).toContain("yaml");
expect(result).toContain("moderator");
expect(result).toContain("actor");
expect(result).toContain("user");
expect(result).toContain("author");
expect(result).toContain("developer");
for (const name of result) {
expect(typeof name).toBe("string");
expect(name).toMatch(/^\S+$/);
}
});
@@ -62,6 +69,45 @@ describe("skill commands", () => {
expect(result).toContain("uwf");
});
test("skill actor returns non-empty markdown string", () => {
const result = cmdSkillActor();
expect(typeof result).toBe("string");
expect(result).toContain("frontmatter");
expect(result).toContain("CAS");
expect(result).toContain("status");
expect(result.length).toBeGreaterThan(200);
});
test("skill user returns non-empty markdown string", () => {
const result = cmdSkillUser();
expect(typeof result).toBe("string");
expect(result).toContain("uwf");
expect(result).toContain("thread");
expect(result).toContain("workflow");
expect(result).toContain("Quick Start");
expect(result.length).toBeGreaterThan(500);
});
test("skill author returns non-empty markdown string", () => {
const result = cmdSkillAuthor();
expect(typeof result).toBe("string");
expect(result).toContain("frontmatter");
expect(result).toContain("graph");
expect(result).toContain("$START");
expect(result).toContain("$END");
expect(result).toContain("$status");
expect(result.length).toBeGreaterThan(500);
});
test("skill developer returns non-empty markdown string", () => {
const result = cmdSkillDeveloper();
expect(typeof result).toBe("string");
expect(result).toContain("Monorepo");
expect(result).toContain("CAS");
expect(result).toContain("Biome");
expect(result.length).toBeGreaterThan(500);
});
test("skill help subcommand is suppressed", () => {
const output = execFileSync("bun", ["src/cli.ts", "skill", "--help"], {
cwd: join(__dirname, "..", ".."),
@@ -73,6 +119,10 @@ describe("skill commands", () => {
expect(output).toContain("architecture");
expect(output).toContain("yaml");
expect(output).toContain("moderator");
expect(output).toContain("actor");
expect(output).toContain("user");
expect(output).toContain("author");
expect(output).toContain("developer");
expect(output).toContain("list");
});
});
@@ -24,7 +24,7 @@ describe("solve-issue workflow: tea pr create worktree fix", () => {
"solve-issue.yaml",
);
test("committer procedure should require running tea pr create from main repo directory", async () => {
test("committer procedure should include --repo flag in tea pr create command", async () => {
const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload;
@@ -32,12 +32,19 @@ describe("solve-issue workflow: tea pr create worktree fix", () => {
const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined();
// Verify the procedure includes tea pr create
// Verify the procedure includes tea pr create with --repo flag
expect(committerProcedure).toContain("tea pr create");
expect(committerProcedure).toContain("--repo");
// Verify the procedure warns about running from main repo dir (not worktree)
expect(committerProcedure).toMatch(/main repo directory/i);
expect(committerProcedure).toMatch(/not a worktree/i);
// Verify the --repo flag appears before or together with tea pr create
// This ensures the command is: tea pr create --repo <owner/repo> ...
const teaPrCreateMatch = committerProcedure?.match(/tea pr create[^\n]*/);
expect(teaPrCreateMatch).not.toBeNull();
if (teaPrCreateMatch) {
const teaCommandLine = teaPrCreateMatch[0];
expect(teaCommandLine).toContain("--repo");
}
});
test("committer procedure should mention repo extraction from git remote", async () => {
@@ -0,0 +1,85 @@
import { mkdtemp } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { describe, expect, test } from "vitest";
import { appendThreadHistory, loadThreadHistory } from "../store.js";
describe("thread cancel status", () => {
test("cancelled history entry has reason 'cancelled'", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
const threadId = "01JTEST000000000000CANCEL1" as ThreadId;
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: "test-workflow",
head: "test-head-hash" as CasRef,
completedAt: Date.now(),
reason: "cancelled",
});
const history = await loadThreadHistory(tmpDir);
expect(history).toHaveLength(1);
expect(history[0]?.reason).toBe("cancelled");
});
test("completed history entry has reason 'completed'", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
const threadId = "01JTEST000000000000CANCEL2" as ThreadId;
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: "test-workflow",
head: "test-head-hash" as CasRef,
completedAt: Date.now(),
reason: "completed",
});
const history = await loadThreadHistory(tmpDir);
expect(history).toHaveLength(1);
expect(history[0]?.reason).toBe("completed");
});
test("legacy history entry without reason parses as null", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
const threadId = "01JTEST000000000000CANCEL3" as ThreadId;
// Simulate legacy entry without reason field
await appendThreadHistory(tmpDir, {
thread: threadId,
workflow: "test-workflow",
head: "test-head-hash" as CasRef,
completedAt: Date.now(),
reason: null,
});
const history = await loadThreadHistory(tmpDir);
expect(history).toHaveLength(1);
expect(history[0]?.reason).toBeNull();
});
test("mixed completed and cancelled entries preserve distinct reasons", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "uwf-cancel-test-"));
await appendThreadHistory(tmpDir, {
thread: "01JTEST000000000000CANCEL4" as ThreadId,
workflow: "test-workflow",
head: "head1" as CasRef,
completedAt: Date.now(),
reason: "completed",
});
await appendThreadHistory(tmpDir, {
thread: "01JTEST000000000000CANCEL5" as ThreadId,
workflow: "test-workflow",
head: "head2" as CasRef,
completedAt: Date.now(),
reason: "cancelled",
});
const history = await loadThreadHistory(tmpDir);
expect(history).toHaveLength(2);
expect(history[0]?.reason).toBe("completed");
expect(history[1]?.reason).toBe("cancelled");
});
});
@@ -74,6 +74,7 @@ async function completeThread(
workflow: workflowHash,
head: headHash,
completedAt: Date.now(),
reason: null,
});
}
@@ -758,6 +758,7 @@ describe("cmdStepList with completed threads", () => {
workflow: workflowHash,
head: step2Hash,
completedAt: Date.now(),
reason: null,
});
const result = await cmdStepList(tmpDir, threadId);
@@ -886,6 +887,7 @@ describe("cmdStepShow with completed threads", () => {
workflow: workflowHash,
head: stepHash,
completedAt: Date.now(),
reason: null,
});
const result = await cmdStepShow(tmpDir, stepHash);
@@ -949,6 +951,7 @@ describe("cmdThreadRead with completed threads", () => {
workflow: workflowHash,
head: stepHash,
completedAt: Date.now(),
reason: null,
});
const markdown = await cmdThreadRead(tmpDir, threadId, THREAD_READ_DEFAULT_QUOTA, null, false);
@@ -1011,6 +1014,7 @@ describe("cmdThreadRead with completed threads", () => {
workflow: workflowHash,
head: step3Hash,
completedAt: Date.now(),
reason: null,
});
const markdown = await cmdThreadRead(
+78 -4
View File
@@ -1,4 +1,4 @@
#!/usr/bin/env bun
#!/usr/bin/env node
import type { CasRef, ThreadId } from "@uncaged/workflow-protocol";
import { Command } from "commander";
@@ -13,13 +13,18 @@ import {
cmdCasSchemaList,
cmdCasWalk,
} from "./commands/cas.js";
import { cmdConfigGet, cmdConfigList, cmdConfigSet } from "./commands/config.js";
import { cmdLogClean, cmdLogList, cmdLogShow } from "./commands/log.js";
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
import {
cmdSkillActor,
cmdSkillArchitecture,
cmdSkillAuthor,
cmdSkillCli,
cmdSkillDeveloper,
cmdSkillList,
cmdSkillModerator,
cmdSkillUser,
cmdSkillYaml,
} from "./commands/skill.js";
import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
@@ -181,11 +186,11 @@ function parseStatusFilter(status: string | undefined): ThreadStatus[] | null {
if (raw === "active") return ["idle", "running"];
const parts = raw.split(",").map((s) => s.trim());
const validStatuses: ThreadStatus[] = ["idle", "running", "completed"];
const validStatuses: ThreadStatus[] = ["idle", "running", "completed", "cancelled"];
for (const part of parts) {
if (!validStatuses.includes(part as ThreadStatus)) {
process.stderr.write(
`Invalid status: ${part}. Must be one of: idle, running, completed, active\n`,
`Invalid status: ${part}. Must be one of: idle, running, completed, cancelled, active\n`,
);
process.exit(1);
}
@@ -238,7 +243,7 @@ thread
.description("List threads")
.option(
"--status <status>",
"Filter by status: idle, running, completed, active (idle+running), or comma-separated values",
"Filter by status: idle, running, completed, cancelled, active (idle+running), or comma-separated values",
)
.option("--after <date>", "Filter threads created after this date (ISO or relative like '7d')")
.option("--before <date>", "Filter threads created before this date (ISO or relative like '7d')")
@@ -502,6 +507,27 @@ skill
console.log(cmdSkillYaml());
});
skill
.command("actor")
.description("Print the actor reference (frontmatter protocol + CAS)")
.action(() => {
console.log(cmdSkillActor());
});
skill
.command("author")
.description("Print the author reference (workflow YAML design guide)")
.action(() => {
console.log(cmdSkillAuthor());
});
skill
.command("developer")
.description("Print the developer reference (coding conventions + architecture)")
.action(() => {
console.log(cmdSkillDeveloper());
});
skill
.command("moderator")
.description("Print the moderator reference")
@@ -509,6 +535,13 @@ skill
console.log(cmdSkillModerator());
});
skill
.command("user")
.description("Print the user reference (CLI guide + typical workflows)")
.action(() => {
console.log(cmdSkillUser());
});
skill
.command("list")
.description("List all available skill names")
@@ -711,6 +744,47 @@ log
});
});
const config = program.command("config").description("Configuration management");
config
.command("list")
.description("Display all configuration values (masks API keys)")
.action(() => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdConfigList(storageRoot);
writeOutput(result);
});
});
config
.command("get")
.description("Get a specific configuration value")
.argument(
"<key>",
"Dot-notation path to config value (e.g., defaultAgent, providers.dashscope.baseUrl)",
)
.action((key: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdConfigGet(storageRoot, key);
writeOutput({ value: result });
});
});
config
.command("set")
.description("Set a specific configuration value")
.argument("<key>", "Dot-notation path to config value")
.argument("<value>", "New value (use JSON array for 'args' key, e.g., '[\"--flag\"]')")
.action((key: string, value: string) => {
const storageRoot = resolveStorageRoot();
runAction(async () => {
const result = await cmdConfigSet(storageRoot, key, value);
writeOutput(result);
});
});
program.parseAsync(process.argv).catch((e: unknown) => {
const message = e instanceof Error ? e.message : String(e);
process.stderr.write(`${message}\n`);
@@ -0,0 +1,289 @@
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { parse, stringify } from "yaml";
/**
* Valid configuration key schema
*/
const VALID_CONFIG_KEYS: Record<string, { nested: boolean; knownFields?: string[] }> = {
providers: {
nested: true,
knownFields: ["baseUrl", "apiKey"],
},
models: {
nested: true,
knownFields: ["provider", "name"],
},
agents: {
nested: true,
knownFields: ["command", "args"],
},
defaultAgent: { nested: false },
defaultModel: { nested: false },
};
/**
* Validate a config key path against the known schema
*/
function validateConfigKey(path: string[]): void {
if (path.length === 0) {
throw new Error("Path cannot be empty");
}
const topLevel = path[0];
const schema = VALID_CONFIG_KEYS[topLevel];
if (!schema) {
const validKeys = Object.keys(VALID_CONFIG_KEYS).join(", ");
throw new Error(`Unknown config key: ${topLevel}. Valid top-level keys are: ${validKeys}`);
}
// Scalar keys cannot have nested paths
if (!schema.nested && path.length > 1) {
throw new Error(`${topLevel} is a scalar key and cannot have nested properties`);
}
// Nested keys must have at least 3 segments (e.g., providers.myProvider.baseUrl)
if (schema.nested && path.length < 3) {
const fields = schema.knownFields?.join(", ") ?? "";
throw new Error(
`Incomplete path for ${topLevel}. Must specify a field (e.g., ${topLevel}.<name>.<field>). Valid fields: ${fields}`,
);
}
// Validate the field name for nested keys
if (schema.nested && path.length >= 3 && schema.knownFields) {
const field = path[path.length - 1];
if (!schema.knownFields.includes(field)) {
throw new Error(
`Unknown field '${field}' in ${topLevel}. Valid fields are: ${schema.knownFields.join(", ")}`,
);
}
}
}
/**
* Returns the path to the config.yaml file
*/
export function getConfigPath(storageRoot: string): string {
return join(storageRoot, "config.yaml");
}
/**
* Load and parse YAML config file
*/
export function loadConfig(configPath: string): Record<string, unknown> {
if (!existsSync(configPath)) {
throw new Error(`Config file not found: ${configPath}`);
}
const content = readFileSync(configPath, "utf8");
if (!content.trim()) {
return {};
}
try {
const parsed = parse(content);
return (parsed ?? {}) as Record<string, unknown>;
} catch (error) {
throw new Error(
`Invalid YAML in config file: ${error instanceof Error ? error.message : String(error)}`,
);
}
}
/**
* Save config as YAML
*/
export function saveConfig(configPath: string, config: Record<string, unknown>): void {
const dir = join(configPath, "..");
if (!existsSync(dir)) {
mkdirSync(dir, { recursive: true });
}
const yaml = stringify(config);
writeFileSync(configPath, yaml, "utf8");
}
/**
* Parse dot-notation key into path segments
*/
export function parseDotPath(key: string): string[] {
return key.split(".");
}
/**
* Get nested value from object using path array
*/
export function getNestedValue(obj: Record<string, unknown>, path: string[]): unknown {
let current: unknown = obj;
for (const segment of path) {
if (current === null || current === undefined || typeof current !== "object") {
return undefined;
}
current = (current as Record<string, unknown>)[segment];
}
return current;
}
/**
* Set nested value in object using path array (mutates obj)
*/
export function setNestedValue(obj: Record<string, unknown>, path: string[], value: unknown): void {
if (path.length === 0) {
throw new Error("Path cannot be empty");
}
let current: Record<string, unknown> = obj;
// Navigate/create to the parent of the target
for (let i = 0; i < path.length - 1; i++) {
const segment = path[i];
const next = current[segment];
if (next === null || next === undefined) {
// Create intermediate object
const newObj: Record<string, unknown> = {};
current[segment] = newObj;
current = newObj;
} else if (typeof next === "object" && !Array.isArray(next)) {
// Navigate into existing object
current = next as Record<string, unknown>;
} else {
// Cannot navigate into non-object
throw new Error(
`Cannot set property '${path[i + 1]}' on non-object at path '${path.slice(0, i + 1).join(".")}'`,
);
}
}
// Set the final value
const lastSegment = path[path.length - 1];
current[lastSegment] = value;
}
/**
* Deep clone and mask all apiKey values in providers section
*/
export function maskApiKeys(config: Record<string, unknown>): Record<string, unknown> {
// Deep clone
const cloned = JSON.parse(JSON.stringify(config)) as Record<string, unknown>;
// Mask apiKey values in providers
if (cloned.providers && typeof cloned.providers === "object") {
const providers = cloned.providers as Record<string, unknown>;
for (const providerName of Object.keys(providers)) {
const provider = providers[providerName];
if (provider && typeof provider === "object") {
const providerObj = provider as Record<string, unknown>;
if ("apiKey" in providerObj) {
providerObj.apiKey = "***MASKED***";
}
}
}
}
return cloned;
}
/**
* List all configuration values (masks API keys)
*/
export async function cmdConfigList(storageRoot: string): Promise<unknown> {
const configPath = getConfigPath(storageRoot);
const config = loadConfig(configPath);
const masked = maskApiKeys(config);
return masked;
}
/**
* Get a specific configuration value
*/
export async function cmdConfigGet(storageRoot: string, key: string): Promise<unknown> {
const configPath = getConfigPath(storageRoot);
const config = loadConfig(configPath);
const path = parseDotPath(key);
const value = getNestedValue(config, path);
if (value === undefined) {
throw new Error(`Key not found: ${key}`);
}
return value;
}
/**
* Parse value for args key (must be JSON array)
*/
function parseArgsValue(value: string): unknown {
if (value.startsWith("[")) {
try {
const parsed = JSON.parse(value);
if (!Array.isArray(parsed)) {
throw new Error("Value must be an array");
}
return parsed;
} catch (error) {
throw new Error(
`Invalid JSON array for args key: ${error instanceof Error ? error.message : String(error)}`,
);
}
}
throw new Error("Value for 'args' key must be a JSON array starting with '['");
}
/**
* Validate that we're not setting a property on a non-object
*/
function validateParentPath(
config: Record<string, unknown>,
path: string[],
lastSegment: string,
): void {
if (path.length > 1) {
const parentPath = path.slice(0, -1);
const parent = getNestedValue(config, parentPath);
if (parent !== null && parent !== undefined && typeof parent !== "object") {
throw new Error(
`Cannot set property '${lastSegment}' on non-object at path '${parentPath.join(".")}'`,
);
}
}
}
/**
* Set a specific configuration value
*/
export async function cmdConfigSet(
storageRoot: string,
key: string,
value: string,
): Promise<unknown> {
const configPath = getConfigPath(storageRoot);
// Load existing config or create empty one
let config: Record<string, unknown>;
if (existsSync(configPath)) {
config = loadConfig(configPath);
} else {
config = {};
}
const path = parseDotPath(key);
// Validate the key path
validateConfigKey(path);
const lastSegment = path[path.length - 1];
// Parse value if it's for an array key (args)
let parsedValue: unknown = value;
if (lastSegment === "args") {
parsedValue = parseArgsValue(value);
}
// Validate we're not setting a property on a non-object
validateParentPath(config, path, lastSegment);
setNestedValue(config, path, parsedValue);
saveConfig(configPath, config);
return { key, value: parsedValue };
}
+2 -46
View File
@@ -85,10 +85,6 @@ function getConfigPath(root: string): string {
return join(root, "config.yaml");
}
function getEnvPath(root: string): string {
return join(root, ".env");
}
/**
* Load existing config.yaml or return empty structure.
*/
@@ -106,37 +102,6 @@ function loadExistingConfig(configPath: string): Record<string, unknown> {
return {};
}
/**
* Load existing .env as key=value map.
*/
function loadEnvFile(envPath: string): Record<string, string> {
const env: Record<string, string> = {};
try {
if (existsSync(envPath)) {
for (const line of readFileSync(envPath, "utf8").split("\n")) {
const trimmed = line.trim();
if (trimmed === "" || trimmed.startsWith("#")) continue;
const eq = trimmed.indexOf("=");
if (eq > 0) {
env[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
}
}
}
} catch {
// ignore
}
return env;
}
function saveEnvFile(envPath: string, env: Record<string, string>): void {
const lines = Object.entries(env).map(([k, v]) => `${k}=${v}`);
writeFileSync(envPath, `${lines.join("\n")}\n`, "utf8");
}
function apiKeyEnvName(providerName: string): string {
return `${providerName.toUpperCase().replace(/[^A-Z0-9]/g, "_")}_API_KEY`;
}
// ──────────────────────────────────────────────────────────────────────────────
// Extracted helpers — _discoverAgents
// ──────────────────────────────────────────────────────────────────────────────
@@ -397,8 +362,7 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
: {}
) as Record<string, unknown>;
const envName = apiKeyEnvName(args.provider);
providers[args.provider] = { baseUrl: args.baseUrl, apiKeyEnv: envName };
providers[args.provider] = { baseUrl: args.baseUrl, apiKey: args.apiKey };
const models = (
typeof existing.models === "object" && existing.models !== null
@@ -413,7 +377,7 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
: {}
) as Record<string, unknown>;
const agentName = args.agent ?? "hermes";
const agentName = _agentNameFromBinary(args.agent ?? "hermes");
// Ensure the selected agent has an entry
if (!agents[agentName]) {
agents[agentName] = { command: `uwf-${agentName}`, args: [] };
@@ -437,25 +401,17 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
mkdirSync(storageRoot, { recursive: true });
const configPath = getConfigPath(storageRoot);
const envPath = getEnvPath(storageRoot);
const existing = loadExistingConfig(configPath);
const merged = mergeConfig(existing, args);
writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8");
// Write API key to .env
const envName = apiKeyEnvName(args.provider);
const envData = loadEnvFile(envPath);
envData[envName] = args.apiKey;
saveEnvFile(envPath, envData);
// Validate model connectivity
const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
return {
configPath,
envPath,
provider: args.provider,
model: args.model,
defaultAgent: merged.defaultAgent,
+14 -1
View File
@@ -1,11 +1,24 @@
export {
generateActorReference as cmdSkillActor,
generateArchitectureReference as cmdSkillArchitecture,
generateAuthorReference as cmdSkillAuthor,
generateCliReference as cmdSkillCli,
generateDeveloperReference as cmdSkillDeveloper,
generateModeratorReference as cmdSkillModerator,
generateUserReference as cmdSkillUser,
generateYamlReference as cmdSkillYaml,
} from "@uncaged/workflow-util";
const SKILL_NAMES = ["cli", "architecture", "yaml", "moderator"] as const;
const SKILL_NAMES = [
"cli",
"architecture",
"yaml",
"moderator",
"actor",
"user",
"author",
"developer",
] as const;
export function cmdSkillList(): ReadonlyArray<string> {
return [...SKILL_NAMES];
+8 -3
View File
@@ -331,7 +331,7 @@ export async function cmdThreadShow(storageRoot: string, threadId: ThreadId): Pr
fail(`thread not found: ${threadId}`);
}
export type ThreadStatus = "idle" | "running" | "completed";
export type ThreadStatus = "idle" | "running" | "completed" | "cancelled";
export type ThreadListItemWithStatus = ThreadListItem & {
status: ThreadStatus;
@@ -389,7 +389,7 @@ async function collectCompletedThreads(
thread: entry.thread,
workflow: entry.workflow,
head: entry.head,
status: "completed",
status: entry.reason === "cancelled" ? "cancelled" : "completed",
});
}
}
@@ -444,7 +444,10 @@ export async function cmdThreadList(
let items = await collectActiveThreads(storageRoot, uwf, index);
// Collect completed threads (if relevant for status filter)
const includeCompleted = statusFilter === null || statusFilter.includes("completed");
const includeCompleted =
statusFilter === null ||
statusFilter.includes("completed") ||
statusFilter.includes("cancelled");
if (includeCompleted) {
const activeIds = new Set(items.map((i) => i.thread));
const completedItems = await collectCompletedThreads(storageRoot, activeIds);
@@ -811,6 +814,7 @@ async function archiveThread(
workflow,
head,
completedAt: Date.now(),
reason: "completed",
});
}
@@ -1147,6 +1151,7 @@ export async function cmdThreadCancel(
workflow,
head,
completedAt: Date.now(),
reason: "cancelled",
};
await appendThreadHistory(storageRoot, historyEntry);
+10 -1
View File
@@ -88,6 +88,7 @@ export function getHistoryPath(storageRoot: string): string {
export type ThreadHistoryLine = ThreadListItem & {
completedAt: number;
reason: "completed" | "cancelled" | null;
};
export type UwfStore = {
@@ -228,7 +229,15 @@ export async function loadThreadHistory(storageRoot: string): Promise<ThreadHist
typeof head === "string" &&
typeof completedAt === "number"
) {
lines.push({ thread: thread as ThreadId, workflow, head, completedAt });
const reason = rec.reason;
const parsedReason = reason === "completed" || reason === "cancelled" ? reason : null;
lines.push({
thread: thread as ThreadId,
workflow,
head,
completedAt,
reason: parsedReason,
});
}
}
return lines;
View File
@@ -1,9 +1,15 @@
import { Database } from "bun:sqlite";
import { describe, expect, test } from "bun:test";
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { createMemoryStore, refs, validate, walk } from "@uncaged/json-cas";
import {
computeDurationMs,
extractLastAssistantContent,
getHermesDbPath,
loadHermesSessionFromDb,
messageToTurnPayload,
parseSessionIdFromStdout,
storeHermesSessionDetail,
@@ -124,3 +130,236 @@ describe("storeHermesSessionDetail", () => {
}
});
});
// ── SQLite fallback tests ──────────────────────────────────────────
function createTestDb(dbPath: string): Database {
const db = new Database(dbPath);
db.run(`CREATE TABLE sessions (
id TEXT PRIMARY KEY,
model TEXT NOT NULL,
started_at INTEGER NOT NULL
)`);
db.run(`CREATE TABLE messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
role TEXT NOT NULL,
content TEXT,
reasoning TEXT,
tool_calls TEXT,
FOREIGN KEY (session_id) REFERENCES sessions(id)
)`);
return db;
}
describe("getHermesDbPath", () => {
test("returns correct path", () => {
const { homedir } = require("node:os");
const { join } = require("node:path");
expect(getHermesDbPath()).toBe(join(homedir(), ".hermes", "state.db"));
});
});
describe("loadHermesSessionFromDb", () => {
test("returns session data from SQLite", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
const sessionId = "test-session-001";
const startedAt = 1748099519;
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"claude-opus-4.6",
startedAt,
]);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "user", "hello", null, null],
);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "assistant", "hi there", "thinking...", null],
);
db.close();
const result = await loadHermesSessionFromDb(sessionId, dbPath);
expect(result).not.toBeNull();
expect(result!.session_id).toBe(sessionId);
expect(result!.model).toBe("claude-opus-4.6");
expect(result!.messages).toHaveLength(2);
expect(result!.messages[0]!.role).toBe("user");
expect(result!.messages[0]!.content).toBe("hello");
expect(result!.messages[1]!.role).toBe("assistant");
expect(result!.messages[1]!.content).toBe("hi there");
expect(result!.messages[1]!.reasoning).toBe("thinking...");
await rm(tmpDir, { recursive: true });
});
test("returns null when no session exists in DB", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
db.close();
const result = await loadHermesSessionFromDb("nonexistent", dbPath);
expect(result).toBeNull();
await rm(tmpDir, { recursive: true });
});
test("returns null when DB file does not exist", async () => {
const result = await loadHermesSessionFromDb("any-id", "/tmp/nonexistent-hermes-db.db");
expect(result).toBeNull();
});
test("correctly parses tool_calls from DB JSON string", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
const sessionId = "test-tool-calls";
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"gpt-4",
1748099519,
]);
const toolCallsJson = JSON.stringify([
{ function: { name: "read_file", arguments: '{"path":"x"}' } },
]);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "assistant", "", null, toolCallsJson],
);
db.close();
const result = await loadHermesSessionFromDb(sessionId, dbPath);
expect(result).not.toBeNull();
expect(result!.messages[0]!.tool_calls).toEqual([
{ function: { name: "read_file", arguments: '{"path":"x"}' } },
]);
await rm(tmpDir, { recursive: true });
});
test("handles null fields in DB messages gracefully", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
const sessionId = "test-nulls";
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"model",
1748099519,
]);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "assistant", null, null, null],
);
db.close();
const result = await loadHermesSessionFromDb(sessionId, dbPath);
expect(result).not.toBeNull();
const msg = result!.messages[0]!;
expect(msg.content).toBeNull();
expect(msg.reasoning).toBeNull();
expect(msg.tool_calls).toBeNull();
await rm(tmpDir, { recursive: true });
});
test("messages ordered by insertion order", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
const sessionId = "test-order";
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"model",
1748099519,
]);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "user", "first", null, null],
);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "assistant", "second", null, null],
);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "user", "third", null, null],
);
db.close();
const result = await loadHermesSessionFromDb(sessionId, dbPath);
expect(result).not.toBeNull();
expect(result!.messages.map((m) => m.content)).toEqual(["first", "second", "third"]);
await rm(tmpDir, { recursive: true });
});
test("converts unix timestamp to ISO string for session_start", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const db = createTestDb(dbPath);
const sessionId = "test-timestamp";
const startedAt = 1748099519;
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"model",
startedAt,
]);
db.close();
const result = await loadHermesSessionFromDb(sessionId, dbPath);
expect(result).not.toBeNull();
expect(result!.session_start).toBe(new Date(startedAt * 1000).toISOString());
await rm(tmpDir, { recursive: true });
});
});
describe("loadHermesSession with SQLite fallback", () => {
test("JSON file takes priority over DB", async () => {
const tmpDir = await mkdtemp(join(tmpdir(), "hermes-test-"));
const dbPath = join(tmpDir, "state.db");
const jsonPath = join(tmpDir, "session.json");
// Create DB with one model value
const db = createTestDb(dbPath);
const sessionId = "test-priority";
db.run("INSERT INTO sessions (id, model, started_at) VALUES (?, ?, ?)", [
sessionId,
"db-model",
1748099519,
]);
db.run(
"INSERT INTO messages (session_id, role, content, reasoning, tool_calls) VALUES (?, ?, ?, ?, ?)",
[sessionId, "user", "from db", null, null],
);
db.close();
// Create JSON file with a different model value
const jsonData: HermesSessionJson = {
session_id: sessionId,
model: "json-model",
session_start: "2026-05-24T12:00:00.000Z",
messages: [{ role: "user", content: "from json", reasoning: null, tool_calls: null }],
};
await writeFile(jsonPath, JSON.stringify(jsonData));
// loadHermesSession reads from JSON path, so we test the existing function directly
// The JSON-first priority is inherent in the implementation
const { readFile } = await import("node:fs/promises");
const text = await readFile(jsonPath, "utf8");
const parsed = JSON.parse(text);
expect(parsed.model).toBe("json-model");
await rm(tmpDir, { recursive: true });
});
});
@@ -1,3 +1,4 @@
import { Database } from "bun:sqlite";
import { readFile } from "node:fs/promises";
import { homedir } from "node:os";
import { join } from "node:path";
@@ -108,15 +109,99 @@ function parseSessionJson(raw: unknown): HermesSessionJson | null {
return { session_id, model, session_start, messages };
}
export function getHermesDbPath(): string {
return join(homedir(), ".hermes", "state.db");
}
type DbSessionRow = {
id: string;
model: string;
started_at: number;
};
type DbMessageRow = {
role: string;
content: string | null;
reasoning: string | null;
tool_calls: string | null;
};
function parseDbToolCalls(raw: string | null): HermesSessionMessage["tool_calls"] {
if (raw === null) {
return null;
}
try {
const parsed = JSON.parse(raw) as unknown;
return parseToolCalls(parsed);
} catch {
return null;
}
}
function dbMessageToSessionMessage(row: DbMessageRow): HermesSessionMessage {
return {
role: row.role,
content: row.content ?? null,
reasoning: row.reasoning ?? null,
tool_calls: parseDbToolCalls(row.tool_calls),
};
}
export function loadHermesSessionFromDb(
sessionId: string,
dbPath: string | null = null,
): HermesSessionJson | null {
const resolvedPath = dbPath ?? getHermesDbPath();
let db: InstanceType<typeof Database> | null = null;
try {
db = new Database(resolvedPath, { readonly: true });
const session = db
.query("SELECT id, model, started_at FROM sessions WHERE id = ?")
.get(sessionId) as DbSessionRow | null;
if (session === null) {
return null;
}
const rows = db
.query(
"SELECT role, content, reasoning, tool_calls FROM messages WHERE session_id = ? ORDER BY id",
)
.all(sessionId) as DbMessageRow[];
const messages: HermesSessionMessage[] = [];
for (const row of rows) {
const role = row.role;
if (role !== "user" && role !== "assistant" && role !== "tool") {
continue;
}
messages.push(dbMessageToSessionMessage(row));
}
return {
session_id: session.id,
model: session.model,
session_start: new Date(session.started_at * 1000).toISOString(),
messages,
};
} catch {
return null;
} finally {
db?.close();
}
}
export async function loadHermesSession(sessionId: string): Promise<HermesSessionJson | null> {
const path = getHermesSessionPath(sessionId);
try {
const text = await readFile(path, "utf8");
const raw = JSON.parse(text) as unknown;
return parseSessionJson(raw);
const result = parseSessionJson(raw);
if (result !== null) {
return result;
}
} catch {
return null;
// JSON file not available, fall through to DB
}
return loadHermesSessionFromDb(sessionId);
}
export function computeDurationMs(sessionStart: string, nowMs: number = Date.now()): number {
+1 -1
View File
@@ -100,7 +100,7 @@ type ProviderAlias = string;
type ModelAlias = string;
type AgentAlias = string;
type ProviderConfig = { baseUrl: string; apiKeyEnv: string };
type ProviderConfig = { baseUrl: string; apiKey: string };
type ModelConfig = {
provider: ProviderAlias;
name: string;
+1 -1
View File
@@ -151,7 +151,7 @@ export type Scenario = string;
export type ProviderConfig = {
baseUrl: string;
apiKeyEnv: string;
apiKey: string;
};
export type ModelConfig = {
+4 -6
View File
@@ -1,8 +1,7 @@
import { getSchema, validate } from "@uncaged/json-cas";
import type { CasRef, ModelAlias, WorkflowConfig } from "@uncaged/workflow-protocol";
import { config as loadDotenv } from "dotenv";
import { createAgentStore, getEnvPath, resolveStorageRoot } from "./storage.js";
import { createAgentStore, resolveStorageRoot } from "./storage.js";
export type ResolvedLlmProvider = {
baseUrl: string;
@@ -38,9 +37,9 @@ export function resolveModel(config: WorkflowConfig, alias: ModelAlias): Resolve
if (providerEntry === undefined) {
throw new Error(`unknown provider "${modelEntry.provider}" for model "${alias}"`);
}
const apiKey = process.env[providerEntry.apiKeyEnv];
const apiKey = providerEntry.apiKey;
if (apiKey === undefined || apiKey === "") {
throw new Error(`missing API key env var: ${providerEntry.apiKeyEnv}`);
throw new Error(`missing API key for provider: ${modelEntry.provider}`);
}
return {
baseUrl: providerEntry.baseUrl,
@@ -130,7 +129,7 @@ export type ExtractResult = {
/**
* Call an OpenAI-compatible LLM to extract structured output matching outputSchema.
* Loads config.yaml and .env from the workflow storage root.
* Loads config.yaml from the workflow storage root.
*/
export async function extract(
rawOutput: string,
@@ -138,7 +137,6 @@ export async function extract(
config: WorkflowConfig,
): Promise<ExtractResult> {
const storageRoot = resolveStorageRoot();
loadDotenv({ path: getEnvPath(storageRoot) });
const { store } = await createAgentStore(storageRoot);
const schema = getSchema(store, outputSchema);
+4 -4
View File
@@ -84,11 +84,11 @@ function normalizeProviders(raw: unknown): Record<ProviderAlias, ProviderConfig>
throw new Error(`config.providers.${name} must be a mapping`);
}
const baseUrl = entry.baseUrl;
const apiKeyEnv = entry.apiKeyEnv;
if (typeof baseUrl !== "string" || typeof apiKeyEnv !== "string") {
throw new Error(`config.providers.${name} requires baseUrl and apiKeyEnv`);
const apiKey = entry.apiKey;
if (typeof baseUrl !== "string" || typeof apiKey !== "string") {
throw new Error(`config.providers.${name} requires baseUrl and apiKey`);
}
providers[name] = { baseUrl, apiKeyEnv };
providers[name] = { baseUrl, apiKey };
}
return providers;
}
-78
View File
@@ -1,78 +0,0 @@
# @uncaged/workflow-util
## 0.5.0-alpha.4
### Patch Changes
- Replace optionalEnv/requireEnv with unified env(name, fallback) API
- Updated dependencies [f74b482]
- Updated dependencies [f74b482]
- @uncaged/workflow-protocol@0.5.0-alpha.4
## 0.5.0-alpha.3
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.5.0-alpha.3
## 0.5.0-alpha.2
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.5.0-alpha.2
## 0.5.0-alpha.1
### Patch Changes
- @uncaged/workflow-protocol@0.5.0-alpha.1
## 0.5.0-alpha.0
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.5.0-alpha.0
## 0.4.5
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.4.5
## 0.4.4
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.4.4
## 0.4.3
### Patch Changes
- Include src/ in published packages so bun runtime can resolve the 'bun' exports condition.
- Updated dependencies
- @uncaged/workflow-protocol@0.4.3
## 0.4.2
### Patch Changes
- Fix workspace dependency resolution: use workspace:^ so published packages resolve to compatible versions instead of exact (non-existent) versions.
- Updated dependencies
- @uncaged/workflow-protocol@0.4.2
## 0.4.0
### Minor Changes
- Fix package exports for published packages and adopt changesets for version management.
### Patch Changes
- Updated dependencies
- @uncaged/workflow-protocol@0.4.0
@@ -0,0 +1,68 @@
export function generateActorReference(): string {
return `# Actor Reference
You are executing a workflow role. Your system prompt defines your goal, procedure, and output requirements. This reference covers two things you need to know about the workflow engine.
## 1. Frontmatter Output Protocol
Your response **MUST** begin with a YAML frontmatter block at byte position 0 — no preamble text before it.
\`\`\`
---
status: done
myField: some value
---
... markdown body (your work, explanation, notes) ...
\`\`\`
### Standard Field
| Field | Values | Default | Description |
|-------|--------|---------|-------------|
| \`status\` | \`done\`, \`needs_input\`, \`in_progress\`, \`failed\` | \`done\` | Completion signal — determines which graph edge the moderator follows next |
### Schema-Defined Fields
Your role's output schema (shown in the system prompt under "Deliverable Format") defines additional fields. Output **only** the fields listed there — do not invent extra fields.
### Body
Everything after the closing \`---\` fence is the markdown body. Use it for explanations, logs, or human-readable notes. The body is stored but not parsed by the engine.
### Retry
If the engine cannot parse your frontmatter, it will ask you to retry (up to 2 times). Just output the corrected frontmatter block — don't panic.
## 2. CAS (Content-Addressable Store)
Your frontmatter output is automatically stored in CAS. You can also **use CAS directly** to store intermediate artifacts, build merkle DAGs for large outputs, or reference data from previous steps.
### Commands
\`\`\`
uwf cas put-text <text> # store plain text, print hash
uwf cas put <type-hash> <json> # store typed JSON data, print hash
uwf cas get <hash> # read a CAS node (type + payload)
uwf cas has <hash> # check if a hash exists
uwf cas refs <hash> # list direct references from a node
uwf cas walk <hash> # recursive traversal from a node
uwf cas schema list # list registered schemas
uwf cas schema get <hash> # show a schema definition
\`\`\`
### Merkle DAG Pattern
For large outputs, store parts individually and reference their hashes:
\`\`\`bash
# Store individual sections
HASH1=$(uwf cas put-text "section 1 content")
HASH2=$(uwf cas put-text "section 2 content")
# Reference hashes in your frontmatter or in a parent node
\`\`\`
This enables progressive loading — consumers can fetch the root and resolve children on demand.
`;
}
@@ -0,0 +1,183 @@
export function generateAuthorReference(): string {
return `# Author Reference
Guide for designing and writing workflow YAML definitions.
## Workflow Structure
\`\`\`yaml
name: solve-issue # verb-first kebab-case
description: "..." # human-readable summary
roles: # named actors
planner:
description: "..." # short purpose
goal: "..." # system-level goal for the agent
capabilities: [...] # skill keywords the agent should load
procedure: | # step-by-step instructions
1. Do this
2. Do that
output: "..." # what the agent should produce
frontmatter: # JSON Schema for structured output
oneOf:
- properties:
$status: { const: "ready" }
plan: { type: string }
required: [$status, plan]
- properties:
$status: { const: "failed" }
error: { type: string }
required: [$status, error]
graph: # status-based routing
$START:
_: { role: planner, prompt: "Analyze the issue." }
planner:
ready: { role: developer, prompt: "Implement {{{plan}}}." }
failed: { role: $END, prompt: "Failed: {{{error}}}" }
\`\`\`
## Role Definition
| Field | Purpose |
|-------|---------|
| \`description\` | Short description for humans and moderator context |
| \`goal\` | Injected as the agent's system-level objective |
| \`capabilities\` | Keyword tags — agent loads matching skills before starting |
| \`procedure\` | Step-by-step instructions the agent follows |
| \`output\` | Describes what to produce and which \`$status\` values to use |
| \`frontmatter\` | JSON Schema defining the structured output fields |
### Role Design Principles
- **Single responsibility** — each role does one thing well
- **Minimal context** — don't overload a role with too many steps; split if needed
- **Clear status values** — each status should map to a distinct graph edge
- **Explicit output** — tell the agent exactly what \`$status\` values are valid
## Frontmatter Schema
The \`frontmatter\` field is a standard JSON Schema. It defines the structured fields the agent must output in YAML frontmatter.
### \`$status\` Field
\`$status\` is the only standard field. Its value determines which graph edge the moderator follows. Use \`const\` to constrain each variant:
\`\`\`yaml
frontmatter:
oneOf:
- properties:
$status: { const: "done" }
result: { type: string }
required: [$status, result]
- properties:
$status: { const: "failed" }
error: { type: string }
required: [$status, error]
\`\`\`
### Custom Fields
Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
### Flat Schema (Single Status)
When a role has only one outcome:
\`\`\`yaml
frontmatter:
properties:
$status: { const: "done" }
summary: { type: string }
required: [$status, summary]
\`\`\`
## Graph Routing
The graph maps each role's \`$status\` values to the next role:
\`\`\`
graph[role][$status] → { role: nextRole, prompt: edgePrompt }
\`\`\`
### Special Nodes
| Node | Purpose |
|------|---------|
| \`$START\` | Entry point — status key is always \`_\` (unconditional) |
| \`$END\` | Terminal — thread completes and is archived |
### Edge Prompts
Use triple-brace Mustache (\`{{{field}}}\`) to pass data from the previous step's output:
\`\`\`yaml
graph:
planner:
ready: { role: developer, prompt: "Implement plan {{{plan}}} in {{{repoPath}}}." }
\`\`\`
The fields referenced must exist in the source role's frontmatter schema.
### Loops and Branching
Roles can route back to previous roles (loops) or to different roles based on status (branching):
\`\`\`yaml
graph:
reviewer:
approved: { role: tester, prompt: "Run tests." }
rejected: { role: developer, prompt: "Fix: {{{comments}}}" } # loop back
\`\`\`
### Fail Routing
Route failures to a cleanup role or \`$END\`:
\`\`\`yaml
graph:
developer:
done: { role: reviewer, prompt: "Review changes." }
failed: { role: cleanup, prompt: "Clean up: {{{error}}}" }
\`\`\`
## Self-Testing
### Step-by-Step Verification
\`\`\`bash
# Start a thread directly from YAML file (no registration needed)
uwf thread start my-workflow.yaml -p "Test prompt"
# Or register first, then start by name
uwf workflow add my-workflow.yaml
uwf thread start my-workflow -p "Test prompt"
# Execute one step at a time to verify routing
uwf thread exec <thread-id>
# Inspect step output
uwf step list <thread-id>
uwf step show <step-hash>
# Check the CAS data
uwf cas get <output-hash>
\`\`\`
### Validation Checklist
1. Every \`$status\` value in a role's frontmatter has a matching edge in the graph
2. Every field referenced in edge prompts (\`{{{field}}}\`) exists in the source role's schema
3. Every role referenced in the graph exists in \`roles\`
4. \`$START\` has exactly one edge with key \`_\`
5. At least one path leads to \`$END\`
6. No orphan roles (defined but never routed to)
## Common Pitfalls
- **Missing graph edge** — if a role can produce \`$status: failed\` but the graph has no \`failed\` edge, the moderator will error
- **Mustache field mismatch** — referencing \`{{{branch}}}\` in an edge prompt but the source schema has \`branchName\` instead
- **Overly complex roles** — a role with 20 steps should be split; each role should be completable in one agent turn
- **No fail path** — always handle failure; route to cleanup or \`$END\`
`;
}
@@ -0,0 +1,140 @@
export function generateDeveloperReference(): string {
return `# Developer Reference
Guide for contributing to the workflow engine codebase.
## Monorepo Structure
\`\`\`
packages/
workflow-protocol/ # Shared types (WorkflowPayload, StepNodePayload, etc.)
workflow-util/ # Base32, ULID, logger, frontmatter parsing, skill references
workflow-util-agent/ # createAgent factory, context builder, extract pipeline
workflow-agent-hermes/ # uwf-hermes CLI (spawns Hermes chat sessions)
workflow-agent-builtin/ # uwf-builtin CLI (direct LLM calls via OpenAI API)
cli-workflow/ # uwf CLI (moderator, thread/step/cas/config commands)
\`\`\`
Dependency layers (each only imports from packages above it):
\`\`\`
protocol → util → util-agent → agent-hermes / agent-builtin / cli-workflow
\`\`\`
External CAS: \`@uncaged/json-cas\` (store API, hashing, schema validation) + \`@uncaged/json-cas-fs\` (filesystem backend).
## Coding Conventions
### Functional-first
| Rule | Description |
|------|-------------|
| \`type\` over \`interface\` | All type definitions use \`type\` |
| \`function\` over \`class\` | Pure functions + closures, no class |
| No \`this\` | Functions must not depend on \`this\` context |
| No inheritance | No \`extends\`, \`implements\`, \`abstract\` |
| No optional properties | Use \`T \\| null\` instead of \`?:\` |
| Immutability first | Use \`Readonly<T>\`, \`as const\`, avoid mutation |
Classes allowed only when required by third-party libraries or for Error subclasses.
### Error Handling
- \`Result<T, E>\` type for expected failures (\`ok\`/\`err\` constructors from \`@uncaged/workflow-util\`)
- \`throw\` only for unrecoverable bugs
- No try-catch for flow control
### Async
Always \`async/await\`, never \`.then()\` chains.
### Logging
\`console.*\` is banned (Biome \`noConsole\` rule). Use the structured logger:
\`\`\`typescript
import { createLogger } from "@uncaged/workflow-util";
const log = createLogger();
log("4KNMR2PX", "Loading workflow..."); // 8-char Crockford Base32 tag
\`\`\`
Each call site gets a unique hand-written tag. \`grep "4KNMR2PX"\` in logs → instant code location.
CLI package (\`@uncaged/cli-workflow\`) may use \`console.log\` for user-facing output with a biome-ignore comment.
### No Dynamic Import
No \`await import()\` in production code. Always static top-level \`import\`. Test files are exempt.
### Naming
- Workflow names: verb-first kebab-case (\`solve-issue\`, \`review-code\`)
- IDs: Crockford Base32 — CAS hash (XXH64, 13-char), Thread ID (ULID, 26-char)
## Development Workflow
\`\`\`bash
bun install # install all workspace deps
bun run build # tsc --build (all packages)
bun run check # tsc + biome check + lint-log-tags
bun run format # biome format --write
bun test # run all tests
\`\`\`
Before committing: \`bun run check\` + \`bun test\` must both pass.
### Testing
- \`cli-workflow\`: vitest
- Other packages: \`bun test\`
- Test files live in \`__tests__/\` directories
### Publishing
Fixed-mode versioning — all \`@uncaged/*\` packages share the same version number.
\`\`\`bash
bun changeset # describe the change
bun version # bump versions + changelogs
bun release # build + test + publish to npmjs
\`\`\`
## Key Modules
### Moderator (\`cli-workflow/src/moderator/\`)
Status-based graph evaluator. Reads \`graph[lastRole][output.$status]\` to determine the next role. Zero LLM cost.
### Extract Pipeline (\`workflow-util-agent/src/\`)
1. Agent produces frontmatter markdown
2. \`parseFrontmatterMarkdown()\` extracts YAML frontmatter
3. \`tryFrontmatterFastPath()\` validates against role's output schema
4. If fast path fails, retries up to 2 times via agent continue
5. Validated output stored as CAS node
### createAgent Factory (\`workflow-util-agent/src/run.ts\`)
Shared entry point for all agent CLIs. Handles:
- Argument parsing (\`--thread\`, \`--role\`, \`--prompt\`)
- Context building (thread history, workflow definition)
- Output extraction and CAS persistence
- Frontmatter retry loop
### CAS Integration
All data is CAS-addressed via \`@uncaged/json-cas\`:
- \`store.put(schemaHash, data)\` → content hash
- \`store.get(hash)\` → node
- \`validate(store, node)\` → schema check
- Schemas registered at workflow add time
## Commit Convention
\`\`\`
<type>(<scope>): <description>
type: feat | fix | refactor | docs | chore | test
scope: workflow | cli | moderator | agent-kit | hermes | util | protocol
\`\`\`
`;
}
+4
View File
@@ -1,6 +1,9 @@
export { generateActorReference } from "./actor-reference.js";
export { generateArchitectureReference } from "./architecture-reference.js";
export { generateAuthorReference } from "./author-reference.js";
export { encodeUint64AsCrockford } from "./base32.js";
export { generateCliReference } from "./cli-reference.js";
export { generateDeveloperReference } from "./developer-reference.js";
export { env } from "./env.js";
export type {
AgentFrontmatter,
@@ -27,4 +30,5 @@ export { err, ok } from "./result.js";
export { getDefaultWorkflowStorageRoot, getGlobalCasDir } from "./storage-root.js";
export type { LogFn, Result } from "./types.js";
export { extractUlidTimestamp, generateUlid } from "./ulid.js";
export { generateUserReference } from "./user-reference.js";
export { generateYamlReference } from "./yaml-reference.js";
@@ -0,0 +1,125 @@
export function generateUserReference(): string {
return `# User Reference
Guide for using the uwf CLI to manage workflows and threads.
## Quick Start
\`\`\`bash
# 1. Configure provider and model
uwf setup
# 2. Register a workflow
uwf workflow add my-workflow.yaml
# 3. Start a thread (creates but does not execute)
uwf thread start my-workflow -p "Build a login page"
# 4. Execute the thread (runs moderator → agent → extract cycles)
uwf thread exec <thread-id> # one step
uwf thread exec <thread-id> -c 10 # up to 10 steps
uwf thread exec <thread-id> -c 10 --background # run in background
\`\`\`
## Concepts
- **Workflow** — YAML definition with roles and a routing graph; stored as a CAS node
- **Thread** — A running instance of a workflow; a chain of step nodes in CAS
- **Step** — One moderator → agent → extract cycle; contains the role's structured output
- **CAS** — Content-addressable store; every artifact is hashed (XXH64, Crockford Base32)
## Setup
\`\`\`
uwf setup # interactive wizard
uwf setup --provider <name> --base-url <url> \\
--api-key <key> --model <name> # non-interactive
[--agent <name>] # optional default agent
\`\`\`
Config is stored at \`~/.uncaged/workflow/config.yaml\`. Override storage root with \`UNCAGED_WORKFLOW_STORAGE_ROOT\`.
## Workflow Commands
\`\`\`
uwf workflow add <file> # register from YAML file
uwf workflow show <id> # show by name or CAS hash
uwf workflow list # list all registered workflows
\`\`\`
You can also pass a file path directly to \`uwf thread start\` without registering first.
## Thread Lifecycle
\`\`\`
uwf thread start <workflow> -p <prompt> # create thread
uwf thread exec <thread-id> # execute one step
[--agent <cmd>] # override agent
[-c, --count <n>] # run n steps
[--background] # run in background
uwf thread show <thread-id> # show head pointer
uwf thread list # list all threads
[--status <filter>] # idle, running, completed, cancelled, active (comma-separated)
[--after <thread-id>] # pagination: after this thread
[--before <thread-id>] # pagination: before this thread
[--skip <n>] # skip first n results
[--take <n>] # limit results
uwf thread read <thread-id> # render context as markdown
[--quota <chars>] # max output chars (default 4000)
[--before <step-hash>] # pagination
[--start] # include start step
uwf thread stop <thread-id> # stop background execution
uwf thread cancel <thread-id> # cancel and archive thread
\`\`\`
### Typical Lifecycle
\`\`\`
start → exec (repeat) → thread reaches $END → auto-completed
→ or: cancel to abort
\`\`\`
## Step Commands
\`\`\`
uwf step list <thread-id> # list all steps
uwf step show <step-hash> # show step details
uwf step fork <step-hash> # fork thread from a step (branch)
\`\`\`
Forking creates a new thread that shares history up to the fork point — useful for retrying from a known-good state.
## CAS Commands
\`\`\`
uwf cas get <hash> # read a node (type + payload)
[--timestamp] # include timestamp
uwf cas put <type-hash> <data> # store typed JSON, print hash
uwf cas put-text <text> # store plain text, print hash
uwf cas has <hash> # check existence
uwf cas refs <hash> # list direct references
uwf cas walk <hash> # recursive traversal
uwf cas reindex # rebuild type index
uwf cas schema list # list schemas
uwf cas schema get <hash> # show schema definition
\`\`\`
## Log Commands
\`\`\`
uwf log list # list log files
uwf log show # show log entries
[--thread <id>] # filter by thread
[--process <pid>] # filter by process
[--date <YYYY-MM-DD>] # filter by date
uwf log clean --before <date> # delete old logs
\`\`\`
## Global Options
\`\`\`
uwf --format <json|yaml> # output format (default: json)
uwf -V, --version # print version
\`\`\`
`;
}
+120
View File
@@ -0,0 +1,120 @@
#!/usr/bin/env bash
# Check development environment prerequisites for uncaged/workflow.
# Non-interactive — prints actionable fix instructions on failure.
# Exit 0 = all good, exit 1 = missing dependencies.
set -euo pipefail
errors=0
check() {
local name="$1" check_cmd="$2" fix_msg="$3"
if eval "$check_cmd" >/dev/null 2>&1; then
echo "$name"
else
echo "$name"
echo " Fix: $fix_msg"
errors=$((errors + 1))
fi
}
check_version() {
local name="$1" cmd="$2" fix_msg="$3"
local version
if version=$(eval "$cmd" 2>/dev/null | head -1); then
echo "$name$version"
else
echo "$name"
echo " Fix: $fix_msg"
errors=$((errors + 1))
fi
}
echo "=== Runtime ==="
check_version "bun" "bun --version" \
"curl -fsSL https://bun.sh/install | bash"
check_version "node" "node --version" \
"Install Node.js 20+: https://nodejs.org/"
check_version "python3" "python3 --version" \
"Install Python 3.11+: https://www.python.org/ or use uv: curl -LsSf https://astral.sh/uv/install.sh | sh && uv python install 3.11"
echo ""
echo "=== Tools ==="
check_version "hermes" "hermes --version" \
"See https://github.com/hermes-ai/hermes-agent for installation. Typical: pip install hermes-agent (or uv pip install -e . for dev)"
check_version "claude" "claude --version" \
"npm install -g @anthropic-ai/claude-code"
echo ""
echo "=== Workflow ==="
# Check repo location
REPO_DIR="${WORKFLOW_REPO:-$(cd "$(dirname "$0")/.." && pwd)}"
check "repo at ~/repos/workflow or WORKFLOW_REPO set" \
"[ -f '$REPO_DIR/packages/cli-workflow/src/cli.ts' ]" \
"Clone the repo: git clone https://git.shazhou.work/uncaged/workflow ~/repos/workflow"
# Check bun install
check "node_modules installed" \
"[ -d '$REPO_DIR/node_modules' ]" \
"cd $REPO_DIR && bun install"
# Check build
check "packages built (dist/)" \
"[ -f '$REPO_DIR/packages/cli-workflow/dist/cli.js' ]" \
"cd $REPO_DIR && bun run build"
# Check uwf is runnable
check_version "uwf" "bun $REPO_DIR/packages/cli-workflow/src/cli.ts --version" \
"cd $REPO_DIR && bun install && bun run build"
# Check uwf symlink
check "uwf in PATH" \
"command -v uwf" \
"sudo ln -sf $REPO_DIR/packages/cli-workflow/dist/cli.js /usr/bin/uwf && sudo chmod +x /usr/bin/uwf"
# Check uwf-hermes
check "uwf-hermes in PATH" \
"command -v uwf-hermes" \
"bun link in packages/workflow-agent-hermes, or: echo '#!/usr/bin/env bun' > ~/.local/bin/uwf-hermes && echo 'import \"$REPO_DIR/packages/workflow-agent-hermes/src/cli.ts\"' >> ~/.local/bin/uwf-hermes && chmod +x ~/.local/bin/uwf-hermes"
# Check uwf-claude-code
check "uwf-claude-code in PATH" \
"command -v uwf-claude-code" \
"Create wrapper: echo '#!/bin/bash\nexec bun run $REPO_DIR/packages/workflow-agent-claude-code/src/cli.ts \"\$@\"' > ~/.local/bin/uwf-claude-code && chmod +x ~/.local/bin/uwf-claude-code"
echo ""
echo "=== Config ==="
# Check workflow config exists
CONFIG_DIR="${UNCAGED_WORKFLOW_STORAGE_ROOT:-$HOME/.uncaged/workflow}"
check "config.yaml exists" \
"[ -f '$CONFIG_DIR/config.yaml' ]" \
"Run: uwf setup"
# Check config has apiKey (not apiKeyEnv)
if [ -f "$CONFIG_DIR/config.yaml" ]; then
check "config uses apiKey (not legacy apiKeyEnv)" \
"grep -q 'apiKey:' '$CONFIG_DIR/config.yaml' && ! grep -q 'apiKeyEnv:' '$CONFIG_DIR/config.yaml'" \
"Run: uwf setup (re-configure to write apiKey directly)"
fi
echo ""
echo "=== Docker (optional, for E2E tests) ==="
check_version "docker" "docker --version" \
"sudo apt install -y docker.io && sudo usermod -aG docker \$USER"
check "docker daemon running" \
"docker info" \
"sudo systemctl start docker"
echo ""
if [ "$errors" -gt 0 ]; then
echo "⚠️ $errors issue(s) found. Fix them and re-run this script."
exit 1
else
echo "🎉 All checks passed!"
exit 0
fi
+377
View File
@@ -0,0 +1,377 @@
#!/usr/bin/env bash
# E2E walkthrough for uncaged/workflow.
# Runs inside Docker with isolated UNCAGED_WORKFLOW_STORAGE_ROOT.
# Exercises: setup → workflow add → thread start/exec → cancel/fork → read/inspect.
#
# Usage:
# sudo -E scripts/e2e-walkthrough.sh [--agent <agent>] [--provider <provider>] [--model <model>] [--api-key <key>]
#
# Requires: Docker running, $HOME mount approach (see scripts/check-dev-env.sh).
# Produces: JSON report on stdout, logs in $E2E_DIR.
#
# IMPORTANT: Must run with `sudo -E` to preserve $HOME (Docker needs root).
#
# Known Issues (WIP):
# 1. `echo '$OUT' | jq` breaks when $OUT contains single quotes (e.g. workflow show
# output with YAML). Fix: use heredoc or pipe variable directly.
# 2. Config may still have old `apiKeyEnv` field — thread exec will fail with
# "no API key". Fix: re-run `uwf setup` or manually set `apiKey` in config.
# 3. Bootstrap installs jq via apt-get which adds ~30s startup time.
# Consider baking a custom image or using node's JSON.parse instead.
# 4. `bun install` in container may modify host's lockfile/node_modules.
# Consider `--frozen-lockfile` or read-only mount for non-essential paths.
set -euo pipefail
# --- Args ---
AGENT="uwf-builtin"
PROVIDER=""
MODEL=""
API_KEY=""
KEEP_CONTAINER=false
while [[ $# -gt 0 ]]; do
case "$1" in
--agent) AGENT="$2"; shift 2 ;;
--provider) PROVIDER="$2"; shift 2 ;;
--model) MODEL="$2"; shift 2 ;;
--api-key) API_KEY="$2"; shift 2 ;;
--keep) KEEP_CONTAINER=true; shift ;;
*) echo "Unknown arg: $1" >&2; exit 1 ;;
esac
done
# --- Resolve paths ---
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
REPO_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
E2E_DIR=$(mktemp -d /tmp/uwf-e2e-XXXXXX)
CONTAINER_NAME="uwf-e2e-$(date +%s)"
echo "=== uwf E2E walkthrough ===" >&2
echo "Agent: $AGENT" >&2
echo "Provider: ${PROVIDER:-"(from config)"}" >&2
echo "Model: ${MODEL:-"(from config)"}" >&2
echo "E2E dir: $E2E_DIR" >&2
echo "Container: $CONTAINER_NAME" >&2
echo "" >&2
# --- Cleanup ---
cleanup() {
if [ "$KEEP_CONTAINER" = false ]; then
docker rm -f "$CONTAINER_NAME" 2>/dev/null || true
fi
}
trap cleanup EXIT
# --- Build inner script ---
# This runs INSIDE the container with an isolated storage root.
cat > "$E2E_DIR/run.sh" << 'INNER_SCRIPT'
#!/usr/bin/env bash
set -euo pipefail
# Isolated storage — never touches host's ~/.uncaged/workflow
export UNCAGED_WORKFLOW_STORAGE_ROOT="/tmp/uwf-e2e-storage"
mkdir -p "$UNCAGED_WORKFLOW_STORAGE_ROOT"
REPO_DIR="$1"
AGENT="$2"
PROVIDER="$3"
MODEL="$4"
API_KEY="$5"
# Ensure tools are in PATH (derive HOME from REPO_DIR to avoid container HOME issues)
REAL_HOME="${6:-$HOME}"
export HOME="$REAL_HOME"
export PATH="$REAL_HOME/.bun/bin:$REAL_HOME/.hermes/hermes-agent/venv/bin:$REAL_HOME/.local/share/npm/bin:$PATH"
# Resolve uwf
UWF="bun $REPO_DIR/packages/cli-workflow/src/cli.ts"
PASS=0
FAIL=0
RESULTS=()
run_test() {
local name="$1"
shift
local output exit_code
echo "--- TEST: $name ---" >&2
output=$("$@" 2>&1) && exit_code=0 || exit_code=$?
if [ $exit_code -eq 0 ]; then
PASS=$((PASS + 1))
RESULTS+=("{\"name\":\"$name\",\"status\":\"pass\"}")
echo " ✅ PASS" >&2
else
FAIL=$((FAIL + 1))
# Escape output for JSON
local escaped
escaped=$(echo "$output" | head -5 | tr '\n' ' ' | sed 's/"/\\"/g' | cut -c1-200)
RESULTS+=("{\"name\":\"$name\",\"status\":\"fail\",\"error\":\"$escaped\"}")
echo " ❌ FAIL: $output" >&2
fi
echo "$output"
}
assert_contains() {
local haystack="$1" needle="$2"
if echo "$haystack" | grep -q "$needle"; then
return 0
else
echo "Expected to contain: $needle" >&2
echo "Got: $haystack" >&2
return 1
fi
}
assert_json_field() {
local json="$1" field="$2"
if echo "$json" | jq -e ".$field" >/dev/null 2>&1; then
return 0
else
echo "Missing JSON field: $field" >&2
return 1
fi
}
# ============================================================
# Phase 1: Environment check
# ============================================================
echo "" >&2
echo "=== Phase 1: Environment ===" >&2
run_test "uwf --version" bash -c "$UWF --version"
# ============================================================
# Phase 2: Setup (non-interactive)
# ============================================================
echo "" >&2
echo "=== Phase 2: Setup ===" >&2
if [ -n "$PROVIDER" ] && [ -n "$MODEL" ] && [ -n "$API_KEY" ]; then
SETUP_CMD="$UWF setup --provider $PROVIDER --base-url https://api.openai.com/v1 --api-key $API_KEY --model $MODEL"
if [ -n "$AGENT" ]; then
SETUP_CMD="$SETUP_CMD --agent $AGENT"
fi
run_test "uwf setup (non-interactive)" bash -c "$SETUP_CMD"
else
# Copy host config if available
if [ -f "$HOME/.uncaged/workflow/config.yaml" ]; then
cp "$HOME/.uncaged/workflow/config.yaml" "$UNCAGED_WORKFLOW_STORAGE_ROOT/config.yaml"
echo " Copied host config.yaml" >&2
fi
fi
# Test config commands
OUT=$(run_test "uwf config list" bash -c "$UWF config list")
run_test "config list is valid JSON" bash -c "echo '$OUT' | jq . >/dev/null"
# ============================================================
# Phase 3: Workflow registration
# ============================================================
echo "" >&2
echo "=== Phase 3: Workflow registration ===" >&2
# Use the example workflow
EXAMPLE_WF="$REPO_DIR/examples/solve-issue.yaml"
if [ ! -f "$EXAMPLE_WF" ]; then
echo "No example workflow found, creating minimal test workflow" >&2
EXAMPLE_WF="/tmp/test-workflow.yaml"
cat > "$EXAMPLE_WF" << 'WF'
name: test-e2e
roles:
worker:
goal: "Respond to the prompt with a brief answer."
outputSchema:
type: object
required: ["$status", "answer"]
properties:
$status:
type: string
enum: ["done"]
answer:
type: string
graph:
- from: $START
to: worker
- from: worker
condition:
$status: done
to: $END
WF
fi
OUT=$(run_test "uwf workflow add" bash -c "$UWF workflow add $EXAMPLE_WF")
run_test "workflow add returns hash" bash -c "echo '$OUT' | jq -e '.hash'"
OUT=$(run_test "uwf workflow list" bash -c "$UWF workflow list")
run_test "workflow list is non-empty" bash -c "echo '$OUT' | jq -e 'length > 0'"
# Get workflow name
WF_NAME=$(echo "$OUT" | jq -r '.[0].name // empty')
run_test "workflow has a name" bash -c "[ -n '$WF_NAME' ]"
OUT=$(run_test "uwf workflow show" bash -c "$UWF workflow show $WF_NAME")
run_test "workflow show returns roles" bash -c "echo '$OUT' | jq -e '.payload.roles'"
# ============================================================
# Phase 4: Thread lifecycle
# ============================================================
echo "" >&2
echo "=== Phase 4: Thread lifecycle ===" >&2
# Start a thread
OUT=$(run_test "uwf thread start" bash -c "$UWF thread start $WF_NAME -p 'E2E test: what is 2+2?'")
THREAD_ID=$(echo "$OUT" | jq -r '.thread // empty')
run_test "thread start returns thread ID" bash -c "[ -n '$THREAD_ID' ]"
# List threads
OUT=$(run_test "uwf thread list" bash -c "$UWF thread list")
run_test "thread appears in list" bash -c "echo '$OUT' | jq -e '.[] | select(.thread==\"$THREAD_ID\")'"
# Show thread
OUT=$(run_test "uwf thread show" bash -c "$UWF thread show $THREAD_ID")
run_test "thread show returns head" bash -c "echo '$OUT' | jq -e '.head'"
# Execute one step
EXEC_ARGS=""
if [ -n "$AGENT" ]; then
EXEC_ARGS="--agent $AGENT"
fi
OUT=$(run_test "uwf thread exec (1 step)" bash -c "$UWF thread exec $THREAD_ID $EXEC_ARGS")
run_test "thread exec returns step info" bash -c "echo '$OUT' | jq -e '.head'"
# ============================================================
# Phase 5: Read & Inspect
# ============================================================
echo "" >&2
echo "=== Phase 5: Read & Inspect ===" >&2
# Step list
OUT=$(run_test "uwf step list" bash -c "$UWF step list $THREAD_ID")
STEP_COUNT=$(echo "$OUT" | jq '.steps | length')
run_test "step list has steps" bash -c "[ $STEP_COUNT -gt 1 ]"
# Get last step hash
LAST_STEP=$(echo "$OUT" | jq -r '.steps[-1].hash // empty')
run_test "last step has hash" bash -c "[ -n '$LAST_STEP' ]"
# Step show
if [ -n "$LAST_STEP" ]; then
OUT=$(run_test "uwf step show" bash -c "$UWF step show $LAST_STEP")
run_test "step show returns role" bash -c "echo '$OUT' | jq -e '.role'"
fi
# Thread read
OUT=$(run_test "uwf thread read" bash -c "$UWF thread read $THREAD_ID")
run_test "thread read produces output" bash -c "[ -n '$OUT' ]"
# CAS operations
if [ -n "$LAST_STEP" ]; then
OUT=$(run_test "uwf cas get" bash -c "$UWF cas get $LAST_STEP")
run_test "cas get returns type" bash -c "echo '$OUT' | jq -e '.type'"
OUT=$(run_test "uwf cas has" bash -c "$UWF cas has $LAST_STEP")
OUT=$(run_test "uwf cas refs" bash -c "$UWF cas refs $LAST_STEP")
OUT=$(run_test "uwf cas walk" bash -c "$UWF cas walk $LAST_STEP")
run_test "cas walk returns nodes" bash -c "echo '$OUT' | jq -e 'length > 0'"
fi
# ============================================================
# Phase 6: Cancel & Fork
# ============================================================
echo "" >&2
echo "=== Phase 6: Cancel & Fork ===" >&2
# Start a second thread for cancel test
OUT=$(run_test "thread start (for cancel)" bash -c "$UWF thread start $WF_NAME -p 'E2E cancel test'")
CANCEL_THREAD=$(echo "$OUT" | jq -r '.thread // empty')
if [ -n "$CANCEL_THREAD" ]; then
OUT=$(run_test "uwf thread cancel" bash -c "$UWF thread cancel $CANCEL_THREAD")
run_test "cancelled thread status" bash -c "$UWF thread list --status completed | jq -e '.[] | select(.thread==\"$CANCEL_THREAD\")'"
fi
# Fork from the first thread's last step
if [ -n "$LAST_STEP" ]; then
OUT=$(run_test "uwf step fork" bash -c "$UWF step fork $LAST_STEP")
FORK_THREAD=$(echo "$OUT" | jq -r '.thread // empty')
run_test "fork creates new thread" bash -c "[ -n '$FORK_THREAD' ] && [ '$FORK_THREAD' != '$THREAD_ID' ]"
fi
# ============================================================
# Phase 7: Log inspection
# ============================================================
echo "" >&2
echo "=== Phase 7: Logs ===" >&2
OUT=$(run_test "uwf log list" bash -c "$UWF log list")
OUT=$(run_test "uwf log show" bash -c "$UWF log show --thread $THREAD_ID 2>&1 || true")
# ============================================================
# Phase 8: Config operations
# ============================================================
echo "" >&2
echo "=== Phase 8: Config get/set ===" >&2
OUT=$(run_test "uwf config get defaultAgent" bash -c "$UWF config get defaultAgent")
OUT=$(run_test "uwf config set (test key)" bash -c "$UWF config set models.test.name test-model")
OUT=$(run_test "uwf config get (verify set)" bash -c "$UWF config get models.test.name")
run_test "config set value persisted" bash -c "echo '$OUT' | grep -q 'test-model'"
# ============================================================
# Report
# ============================================================
echo "" >&2
echo "=== Results ===" >&2
echo "Pass: $PASS Fail: $FAIL" >&2
# JSON report
echo "{"
echo " \"pass\": $PASS,"
echo " \"fail\": $FAIL,"
echo " \"agent\": \"$AGENT\","
echo " \"tests\": [$(IFS=,; echo "${RESULTS[*]}")]"
echo "}"
[ $FAIL -eq 0 ]
INNER_SCRIPT
chmod +x "$E2E_DIR/run.sh"
# --- Run in Docker ---
echo "Starting Docker container..." >&2
# --- Build bootstrap script (runs first inside container) ---
cat > "$E2E_DIR/bootstrap.sh" << BOOTSTRAP
#!/usr/bin/env bash
set -uo pipefail
echo "Installing jq..." >&2
apt-get update -qq >&2 && apt-get install -y -qq jq >&2
echo "jq installed" >&2
# All tools come from host via mount
export HOME='$HOME'
export PATH="$HOME/.bun/bin:$HOME/.hermes/hermes-agent/venv/bin:$HOME/.local/share/npm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
# Ensure bun modules are resolved for this environment
cd '$REPO_DIR'
echo "Running bun install..." >&2
which bun >&2
bun install 2>&1 | tail -3 >&2
echo "bun install done" >&2
# Run E2E (pass HOME explicitly as 6th arg)
bash /e2e/run.sh '$REPO_DIR' '$AGENT' '$PROVIDER' '$MODEL' '$API_KEY' '$HOME'
BOOTSTRAP
chmod +x "$E2E_DIR/bootstrap.sh"
docker run --rm \
--name "$CONTAINER_NAME" \
-v "$HOME:$HOME" \
-v "$E2E_DIR:/e2e" \
-e HOME="$HOME" \
-w "$REPO_DIR" \
node:22-bookworm \
bash /e2e/bootstrap.sh