Compare commits

49 Commits

Author SHA1 Message Date
xiaoju 69ec8c2c5e release: v0.1.2
CI / check (pull_request) Successful in 3m6s
2026-06-07 15:44:00 +08:00
xingyue 81aa282c92 Merge pull request 'chore: release prep — proman bump + protocol 0.1.1 align' (#152) from release/next into main
CI / check (push) Successful in 2m56s
2026-06-07 07:41:37 +00:00
xingyue a620defbcf chore: bump versions via proman (protocol 0.1.1 align npm + session-resume fix)
CI / check (pull_request) Successful in 3m19s
2026-06-07 15:35:15 +08:00
scottwei 439891f6b6 Merge pull request 'revert: undo #150 release bump (changeset + version bump should not be triggered by a dependency upgrade)' (#151) from revert/150-release-bump into main
CI / check (push) Successful in 3m40s
Reviewed-on: #151
Reviewed-by: scottwei <shazhou.ww@gmail.com>
2026-06-07 07:33:54 +00:00
xingyue df244c52e8 Revert "Merge pull request 'chore: release — bump @ocas/* ^0.4.0, @shazhou/proman ^0.6.3' (#150) from release/bump-ocas-proman into main"
CI / check (pull_request) Successful in 3m45s
This reverts commit 9d0c6df62c, reversing
changes made to 00d960daba.
2026-06-07 15:25:31 +08:00
xiaomo cb6e0d6a11 Merge pull request 'chore: add changeset for session resume fix (#139)' (#141) from chore/139-changeset into main
CI / check (push) Successful in 3m36s
2026-06-07 07:20:36 +00:00
xiaomo 9d0c6df62c Merge pull request 'chore: release — bump @ocas/* ^0.4.0, @shazhou/proman ^0.6.3' (#150) from release/bump-ocas-proman into main
CI / check (push) Successful in 3m1s
2026-06-07 07:18:31 +00:00
xingyue 0f5bb1f191 chore: release — bump @ocas/* ^0.4.0, @shazhou/proman ^0.6.3
CI / check (pull_request) Successful in 2m35s
Published:
- @united-workforce/protocol@0.1.1
- @united-workforce/util-agent@0.1.2
- @united-workforce/agent-builtin@0.1.3
- @united-workforce/agent-claude-code@0.1.4
- @united-workforce/agent-hermes@0.1.5
- @united-workforce/agent-mock@0.1.3
- @united-workforce/cli@0.3.1
- @united-workforce/eval@0.1.6
2026-06-07 15:06:43 +08:00
xiaomo 00d960daba Merge pull request 'chore: bump @ocas/* to ^0.4.0 and @shazhou/proman to ^0.6.3' (#149) from chore/bump-ocas-proman into main
CI / check (push) Successful in 3m7s
2026-06-07 06:57:42 +00:00
xingyue 3a26285872 chore: bump @ocas/* to ^0.4.0 and @shazhou/proman to ^0.6.3
CI / check (pull_request) Successful in 3m28s
2026-06-07 14:12:03 +08:00
xiaoju 13c0812944 chore: add changeset for session resume fix (#139)
CI / check (pull_request) Successful in 2m4s
2026-06-07 03:03:55 +00:00
xiaomo 2e7e5f6ec4 Merge pull request 'fix: decouple session resume from isFirstVisit guard' (#140) from fix/139-session-resume-on-frontmatter-fail into main
CI / check (push) Successful in 1m59s
Merge PR #140: fix: decouple session resume from isFirstVisit guard
2026-06-07 02:43:36 +00:00
xiaoju 88c077d439 docs: add efficiency guidelines to CLAUDE.md
CI / check (pull_request) Successful in 2m3s
Three rules to reduce wasted Claude Code turns:
1. Don't comment on whether code is malware (trusted codebase)
2. Stop re-reading/re-verifying after tests pass
3. Don't rebuild/retest after adding a changeset (it's just markdown)
2026-06-07 02:41:21 +00:00
xiaoju aaadab4445 fix: decouple session resume from isFirstVisit guard
CI / check (pull_request) Successful in 1m58s
When frontmatter validation fails, the step is never written to CAS, so
isFirstVisit remains true on the next run.  Both agent-claude-code and
agent-hermes gated session cache lookup behind !isFirstVisit, which
caused them to start a fresh session (and a new worktree) instead of
resuming the one that already has all the work done.

Changes:
- Remove the isFirstVisit guard from both adapters so they always check
  the session cache.
- When isFirstVisit + cache hit (frontmatter-only failure), send a
  minimal correction prompt via buildFrontmatterRetryPrompt() instead
  of re-sending the full initial prompt — the session already has full
  context, we just need the agent to re-output correctly formatted
  frontmatter.
- Add buildFrontmatterRetryPrompt to util-agent with tests.

Fixes #139
2026-06-07 02:36:12 +00:00
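The decision described in this commit can be sketched as follows. This is a minimal illustration, not the adapters' actual code: `SessionCache`, `SessionAction`, and `chooseSessionAction` are hypothetical names, and the body of `buildFrontmatterRetryPrompt` is a stand-in, since only the helper's name appears in the commit message.

```typescript
// Hypothetical types for illustration; the real adapter shapes are not shown in this log.
type SessionCache = Record<string, string | undefined>; // cache key -> session id

type SessionAction =
  | { kind: "fresh"; prompt: string }
  | { kind: "resume"; sessionId: string; prompt: string };

// Stand-in for the helper named in the commit; the real prompt wording is an assumption.
function buildFrontmatterRetryPrompt(): string {
  return "Your previous output had invalid frontmatter. Re-output it in the required format.";
}

function chooseSessionAction(
  cache: SessionCache,
  cacheKey: string,
  isFirstVisit: boolean,
  initialPrompt: string,
  continuationPrompt: string,
): SessionAction {
  // Always consult the cache. Before the fix, this lookup was skipped when isFirstVisit was true.
  const sessionId = cache[cacheKey];
  if (sessionId === undefined) {
    return { kind: "fresh", prompt: initialPrompt };
  }
  // First visit plus a cache hit means the earlier attempt failed only on frontmatter
  // validation, so the session already holds full context and needs a short correction.
  const prompt = isFirstVisit ? buildFrontmatterRetryPrompt() : continuationPrompt;
  return { kind: "resume", sessionId, prompt };
}
```

The key point is that the cache lookup no longer depends on `isFirstVisit`; that flag only selects which prompt is sent to a resumed session.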
xiaomo adf7837975 Merge pull request 'chore: add changeset + doc update requirements to solve-issue workflow' (#138) from chore/workflow-changeset-docs into main
CI / check (push) Successful in 2m0s
Merge PR #138: chore: add changeset + doc update requirements to solve-issue workflow
2026-06-06 23:09:17 +00:00
xiaoju 513846f4ab fix: update solve-issue test path from .workflows/ to examples/
CI / check (pull_request) Successful in 1m52s
Tests were referencing the old .workflows/ directory which no longer exists.
Updated workflow path and aligned assertions with current procedure content.

小橘 🍊(NEKO Team)
2026-06-06 23:01:33 +00:00
xiaoju aee123cc82 chore: add changeset + doc update requirements to solve-issue workflow
CI / check (pull_request) Failing after 2m4s
Developer: steps 12-13 — add changeset with correct bump type, update docs
Reviewer: checks 6-7 — verify changeset exists, docs updated for user-facing changes

Synced from ocas PR #86.
小橘 🍊
2026-06-06 22:45:42 +00:00
xiaoju 8ddada5879 chore: clean up workflow YAML — bun→pnpm, enum→const, deduplicate
CI / check (push) Failing after 3m6s
- solve-issue.yaml: bun→pnpm (5 refs), examples/ is now canonical
- Delete redundant workflows/solve-issue.yaml and .workflows/solve-issue.yaml
- analyze-topic.yaml + eval-simple.yaml: enum→const for $status
- Archive normalize-bun-monorepo.yaml and e2e-walkthrough.yaml to legacy-packages/

Closes #137
小橘 🍊
2026-06-06 10:56:28 +00:00
xiaoju aa732f5466 chore: bump eval to 0.1.5
CI / check (push) Successful in 3m56s
Fix workspace:^ not being replaced in 0.1.4 publish (was published with npm instead of pnpm).

小橘 🍊
2026-06-06 08:57:24 +00:00
xiaoju e354fc4341 chore: bump eval to 0.1.4
CI / check (push) Successful in 3m1s
小橘 🍊(NEKO Team)
2026-06-06 08:02:33 +00:00
xiaoju 0e7e3ea44b fix: invalid Crockford Base32 log tag in eval list command
CI / check (pull_request) Successful in 3m57s
CI / check (push) Successful in 3m31s
L is not a valid Crockford Base32 character. Replace with H.

小橘 🍊(NEKO Team)
2026-06-06 07:57:00 +00:00
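Crockford Base32 omits the letters I, L, O, and U from its symbol set, which is why a tag containing L is rejected. A minimal strict check, assuming a hypothetical helper name `isCrockfordBase32` (the project's own validation code is not shown here):

```typescript
// Crockford Base32 symbol set: digits plus letters, excluding I, L, O, and U.
const CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Hypothetical helper: true only if every character is an encoding symbol.
// Lowercase input is accepted because the encoding is case-insensitive.
function isCrockfordBase32(tag: string): boolean {
  if (tag.length === 0) return false;
  const upper = tag.toUpperCase();
  for (let i = 0; i < upper.length; i++) {
    if (CROCKFORD_ALPHABET.indexOf(upper.charAt(i)) === -1) return false;
  }
  return true;
}
```

Note that lenient decoders may map L to 1 on input, but L is never a valid output symbol, so a literal tag should avoid it.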
xiaoju aa454c85dd chore: bump versions for release
CI / check (push) Successful in 2m56s
- @united-workforce/util: 0.1.3 → 0.1.4
- @united-workforce/util-agent: 0.1.0 → 0.1.1
- @united-workforce/agent-hermes: 0.1.3 → 0.1.4
- @united-workforce/agent-claude-code: 0.1.2 → 0.1.3
2026-06-06 04:40:27 +00:00
xiaomo 6dd7d521be Merge pull request 'chore: deduplicate debate frontmatter with YAML anchor' (#135) from chore/debate-yaml-cleanup into main
CI / check (push) Successful in 2m40s
Merge PR #135: chore: deduplicate debate frontmatter with YAML anchor
2026-06-06 04:23:12 +00:00
xiaoju 950dc056d8 chore: deduplicate debate frontmatter with YAML anchor
CI / check (pull_request) Successful in 2m22s
Use &debater-frontmatter anchor for the shared oneOf schema between
proponent and opponent roles. Procedure blocks remain duplicated
since YAML anchors cannot be embedded inside block scalars.

capabilities: [] kept — required by WorkflowPayload type.

Addresses review suggestions from #133.
2026-06-06 04:16:13 +00:00
xiaomo d360b85374 Merge pull request 'docs: upgrade debate example + fix: UWF_HERMES_BIN env support' (#133) from docs/upgrade-debate-example into main
CI / check (push) Successful in 3m1s
Merge PR #133: docs: upgrade debate example + fix: UWF_HERMES_BIN env support
2026-06-06 04:11:13 +00:00
xiaoju 509dfad857 fix: support UWF_HERMES_BIN env var for hermes binary path
CI / check (pull_request) Successful in 3m28s
Replace hardcoded HERMES_COMMAND constant with resolveHermesCommand()
that checks UWF_HERMES_BIN first, falling back to 'hermes' via PATH.

This fixes environments where hermes is installed in a venv or
non-standard location that isn't in the non-login shell PATH
(e.g. ~/.local/bin symlink only available in login shell).

Refs #134
2026-06-06 03:59:08 +00:00
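The lookup order described in this commit can be sketched as below. The function name comes from the commit message, but its real signature is not shown; taking the environment as a parameter and treating a blank value as unset are assumptions made to keep the sketch testable.

```typescript
type Env = Record<string, string | undefined>;

// Sketch of the resolution order: explicit override first, then the bare command name.
// In real use the caller would pass process.env.
function resolveHermesCommand(env: Env): string {
  const override = env["UWF_HERMES_BIN"];
  // Assumption of this sketch: an empty or whitespace-only value counts as unset.
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  // Fallback: let the spawning call resolve "hermes" through PATH.
  return "hermes";
}
```

With this shape, an environment where hermes lives inside a venv only needs `UWF_HERMES_BIN` set to the absolute path of the binary.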
xiaoju 58b84e3b3c docs: upgrade debate example — 3 roles, oneOf routing, bounded termination
CI / check (pull_request) Failing after 11m23s
Replace the original 2-role debate with a 3-role version featuring:
- proponent/opponent/host roles (was: for/against)
- oneOf + const status routing (was: enum)
- Critical thinking framework in procedure (pre-speech reflection,
  evidence discipline, anti-fragility)
- Bounded termination via Thread Progress (3rd speech → final)
- Host role for impartial summary and verdict

Based on xiaonuo's debate workflow design.
2026-06-06 03:30:54 +00:00
xiaomo f821ac99f4 Merge pull request 'docs: add upgrading section to usage reference' (#132) from feat/usage-upgrade-hint into main
CI / check (push) Successful in 2m8s
2026-06-06 03:00:09 +00:00
xiaoju 2c4700c49f docs: add upgrading section to usage reference
CI / check (pull_request) Successful in 2m27s
2026-06-06 02:57:25 +00:00
xiaomo 4410afcd4a Merge pull request 'fix: render const values as literals in output format instruction (#129)' (#130) from fix/129-const-prompt into main
CI / check (push) Successful in 2m29s
2026-06-06 01:44:24 +00:00
xiaoju a0e254a681 fix: render const values as literals in output format instruction (#129)
CI / check (pull_request) Successful in 1m48s
buildOutputFormatInstruction now renders const fields with their actual
value (e.g. $status: greeted) instead of the type placeholder (<string>).
Also adds early return in resolvePropertySchema for const properties.

Fixes #129
2026-06-06 01:12:13 +00:00
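The behavior change can be illustrated with a small renderer. This is a sketch only: `PropertySchema` and `renderOutputFormat` are hypothetical, and the real `buildOutputFormatInstruction` output format is not shown in this log.

```typescript
// Minimal JSON-Schema-like property shape; only the fields this sketch reads.
interface PropertySchema {
  type?: string;
  const?: string | number | boolean;
}

// Hypothetical renderer: emits one "key: value" line per property.
function renderOutputFormat(properties: Record<string, PropertySchema>): string {
  const lines: string[] = [];
  for (const key of Object.keys(properties)) {
    const schema = properties[key];
    if (schema === undefined) continue;
    // const fields render their literal value; everything else gets a type placeholder.
    const value =
      schema.const !== undefined ? String(schema.const) : "<" + (schema.type ?? "any") + ">";
    lines.push(key + ": " + value);
  }
  return lines.join("\n");
}
```

For a schema with `$status: { const: "greeted" }` and `message: { type: "string" }`, this yields `$status: greeted` followed by `message: <string>`, so the agent sees the exact status literal it must emit.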
xiaomo dd77b40f6c Merge pull request 'feat: inject thread progress into agent prompt (#127)' (#128) from feat/127-inject-turn-count into main
CI / check (push) Successful in 1m44s
2026-06-06 00:53:10 +00:00
xiaoju 5ed6f68e4b feat: inject thread progress into agent prompt (#127)
CI / check (pull_request) Successful in 1m42s
Agents now receive a Thread Progress section showing current step number
and role visit count, eliminating tool calls to count turns.

- util-agent: new buildThreadProgress() helper
- agent-hermes: inject before continuation/first-visit prompt
- agent-claude-code: same injection point

Fixes #127
2026-06-06 00:40:12 +00:00
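A sketch of what such a helper might compute follows. The name `buildThreadProgress` is taken from the commit, but the `StepRecord` shape, the signature, and the exact section wording are assumptions for illustration.

```typescript
// Hypothetical step record; the real thread history shape is not shown in this log.
interface StepRecord {
  role: string;
}

// Builds a short section telling the agent where it is in the thread.
function buildThreadProgress(history: StepRecord[], currentRole: string): string {
  const stepNumber = history.length + 1; // the step about to run
  const priorVisits = history.filter((step) => step.role === currentRole).length;
  return [
    "## Thread Progress",
    "- Current step: " + stepNumber,
    "- Visits to role \"" + currentRole + "\": " + (priorVisits + 1) + " (including this one)",
  ].join("\n");
}
```

Because the counts are computed from the recorded history and injected into the prompt, the agent does not need to spend tool calls reconstructing them.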
xiaoju 1ed0bf1f76 chore: clean changesets after v0.3.0 release
CI / check (push) Successful in 1m43s
2026-06-06 00:14:00 +00:00
xiaoju d97840cf8d chore: release cli@0.3.0 util@0.1.3 agent-hermes@0.1.3 agent-claude-code@0.1.2 agent-builtin@0.1.2 agent-mock@0.1.2
CI / check (push) Successful in 1m46s
2026-06-06 00:13:48 +00:00
xiaomo b560818f1a Merge pull request 'fix: bootstrap — session restart hint + v0.2.1 migration note' (#125) from fix/123-session-restart-hint into main
CI / check (push) Successful in 1m42s
2026-06-05 23:54:24 +00:00
xiaoju f989dee85b fix: bootstrap — remind to restart session after skill install/update
CI / check (pull_request) Successful in 1m42s
- Step 3 (fresh install): warn skills not active until new session
- Step 2 (upgrade): same reminder after regenerating skills
- Step 3 (upgrade): add v0.2.1 migration note for enum → const

Refs #123
2026-06-05 23:48:53 +00:00
xiaomo 7e4a59de7e Merge pull request 'fix: workflow-authoring docs — type:object + const vs enum clarity (#123)' (#124) from fix/123-workflow-authoring-docs into main
CI / check (push) Successful in 1m42s
2026-06-05 23:33:57 +00:00
xiaoju 68079cc003 fix: unify $status to const-only, drop enum support (#123)
CI / check (pull_request) Successful in 1m43s
- Validator: hasStatusConst/getConstStatuses replace enum checks
- enum in $status is now rejected with clear error message
- All docs/examples/tests migrated from enum to const/oneOf
- bootstrap hello.yaml updated

Fixes #123
2026-06-05 23:31:56 +00:00
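The const-only rule can be sketched like this. `getConstStatuses` and `hasStatusConst` are named in the commit, but their real signatures, the schema types below, and the error wording are assumptions.

```typescript
// Simplified schema shapes; only the parts the check needs.
interface StatusSchema {
  const?: string;
  enum?: string[];
}
interface Variant {
  properties?: { $status?: StatusSchema };
}
interface Frontmatter extends Variant {
  oneOf?: Variant[];
}

// Collects $status literals from a flat schema or from each oneOf branch.
function getConstStatuses(fm: Frontmatter): string[] {
  const variants: Variant[] = fm.oneOf !== undefined ? fm.oneOf : [fm];
  const statuses: string[] = [];
  for (const variant of variants) {
    const status = variant.properties !== undefined ? variant.properties.$status : undefined;
    if (status === undefined) continue;
    if (status.enum !== undefined) {
      // enum is rejected outright rather than silently accepted.
      throw new Error("$status must use const (or oneOf with const); enum is not supported");
    }
    if (status.const !== undefined) statuses.push(status.const);
  }
  return statuses;
}

function hasStatusConst(fm: Frontmatter): boolean {
  return getConstStatuses(fm).length > 0;
}
```

A multi-exit role lists one `const` per `oneOf` branch, while a single-exit role uses a flat schema with a single `const`.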
xiaoju 1a37928bb9 fix: workflow-authoring docs — type:object + const vs enum clarity (#123)
CI / check (pull_request) Successful in 1m41s
- Add type:object to all frontmatter examples (flat and oneOf)
- Restructure $status section: Multi-exit (oneOf/const) vs Single-exit (flat/enum)
- Add Important rules box clarifying validation requirements
- Restore Custom Fields subsection

Fixes #123
2026-06-05 23:13:54 +00:00
xiaomo 57511a93fe Merge pull request 'fix: bootstrap agent discovery + adapter version independence (#120)' (#122) from fix/120-agent-discovery into main
CI / check (push) Successful in 1m44s
2026-06-05 22:35:54 +00:00
xiaoju adc3982a4a fix: bootstrap agent discovery + adapter version independence (#120)
CI / check (pull_request) Successful in 1m42s
- Step 1: detect hermes/claude before choosing adapter
- Adapter versions independent from CLI — install @latest
- ACP verification: hermes acp --help
- Remove uwf-builtin (not ready)

Refs #120
2026-06-05 22:29:35 +00:00
xiaomo 4580388270 Merge pull request 'fix: bootstrap docs — pnpm/npm parity, adapter order, preset table (#118)' (#119) from fix/118-bootstrap-ux into main
CI / check (push) Successful in 2m29s
2026-06-05 16:48:47 +00:00
xiaoju caba82fe36 fix: bootstrap PATH fix guidance — find binary location + update shell config (#118 #1)
CI / check (pull_request) Successful in 1m44s
2026-06-05 16:45:33 +00:00
xiaoju 6aee2ed5ef fix: bootstrap docs — pnpm/npm parity, adapter order, preset table (#118)
CI / check (pull_request) Successful in 2m27s
- Show pnpm and npm install commands side-by-side
- Clarify adapter must be installed before uwf setup --agent
- Add version verification steps with PATH troubleshooting
- --agent takes adapter command name (uwf-hermes), not npm package
- Preset providers shown as table with default base URLs
- Non-preset providers must specify --base-url manually

Fixes #118 (#2, #3, #4, #5)
2026-06-05 16:41:35 +00:00
xiaomo 709b9dc1e5 Merge pull request 'fix: suppress ExperimentalWarning, PEP 668 guidance, setup help (#116)' (#117) from fix/116-setup-ux-2 into main
CI / check (push) Successful in 2m21s
2026-06-05 16:15:27 +00:00
xiaoju 7a788a9d90 fix: suppress ExperimentalWarning, PEP 668 guidance, setup help
CI / check (pull_request) Successful in 2m31s
- All 5 CLI bins: shebang --disable-warning=ExperimentalWarning
- Remove NODE_OPTIONS injection from thread.ts spawn (redundant now)
- Bootstrap pip install: venv (recommended) / pipx / source options
- setup --help mentions interactive wizard mode
- Update shebang test to accept -S flag

Fixes #116
2026-06-05 16:12:06 +00:00
xiaomo e5af5e9027 Merge pull request 'fix: setup UX improvements (#114)' (#115) from fix/114-setup-ux into main
CI / check (push) Successful in 2m43s
2026-06-05 15:45:02 +00:00
xiaoju fde87b6274 fix: setup UX improvements — adapter check, ENOENT, SQLite warning, VERSION, PATH docs
CI / check (pull_request) Successful in 2m24s
- setup validates adapter binary availability, prints install command if missing
- setup prints 'Config saved to <path> ✓' on success
- spawn ENOENT gives actionable error with which command
- SQLite ExperimentalWarning suppressed via NODE_OPTIONS
- bootstrap VERSION reads cli package.json (was reading util)
- bootstrap PATH guidance is shell-agnostic

Fixes #114
2026-06-05 15:42:22 +00:00
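One way to turn a spawn failure into an actionable message is sketched below. Node.js sets `code` to `"ENOENT"` on the error when the binary cannot be found; the formatter name and the message wording are hypothetical, since the CLI's real text is not shown in this log.

```typescript
// Subset of the error object this sketch inspects.
interface SpawnErrorLike {
  code?: string;
  message: string;
}

// Hypothetical formatter: special-cases a missing binary, passes other errors through.
function describeSpawnError(err: SpawnErrorLike, command: string): string {
  if (err.code === "ENOENT") {
    return (
      "Command not found: " + command + "\n" +
      "Check that it is installed and on PATH, e.g. run: which " + command
    );
  }
  return "Failed to start " + command + ": " + err.message;
}
```

A caller would typically attach this to the child process's `error` event and print the result before exiting.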
58 changed files with 823 additions and 937 deletions
-9
@@ -1,9 +0,0 @@
---
"@united-workforce/cli": patch
---
fix: expand bootstrap prompt with full onboarding and upgrade guide
Bootstrap now covers two scenarios:
- Fresh install: CLI + adapter installation, `uwf setup` configuration, skill installation, end-to-end verification
- Upgrade: package update, skill regeneration, breaking change migrations (e.g. $START new/resume)
-8
@@ -1,8 +0,0 @@
---
"@united-workforce/cli": patch
---
fix: bootstrap adds Step 0 environment pre-flight check
- Pre-flight checks for node, pnpm/npm, global bin PATH, hermes CLI with FIX instructions (#112)
- Install commands changed from npm to pnpm (with npm fallback)
-9
@@ -1,9 +0,0 @@
---
"@united-workforce/cli": patch
"@united-workforce/util": patch
---
fix: workflow-authoring flat schema example uses enum, bootstrap adds PATH guidance
- workflow-authoring: flat schema example uses `enum: [done]` instead of bare `const` (#110.3)
- bootstrap: adds `which hermes` check and PATH guidance for venv installs (#110.4)
-10
@@ -1,10 +0,0 @@
---
"@united-workforce/cli": patch
---
fix: preset provider base-url auto-fill, bootstrap ACP docs, friendlier name mismatch error
- `uwf setup --provider dashscope` now auto-fills `--base-url` from preset list (#106)
- Bootstrap guide documents uwf-hermes ACP dependency (`pip install hermes-agent[acp]`) (#107)
- Bootstrap verify step uses inline workflow instead of missing `examples/eval-simple.yaml` (#107)
- Workflow filename mismatch error now suggests how to fix it (#108)
-9
@@ -1,9 +0,0 @@
---
"@united-workforce/cli": minor
"@united-workforce/util": patch
---
feat: replace $START `_` status with `new`/`resume` semantics
BREAKING: All workflow YAML files must update `$START._` to `$START.new` + `$START.resume`.
The `resume` edge prompt replaces the previously hardcoded resume message in the CLI.
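The migration this changeset requires can be sketched on an already-parsed graph. This is an illustration under assumptions: `migrateStart` is a hypothetical helper, `_` is assumed to be the only legacy `$START` edge, and the resume prompt must be supplied by the workflow author because the old one was hardcoded in the CLI.

```typescript
interface Edge {
  role: string;
  prompt: string;
}
type Graph = Record<string, Record<string, Edge>>;

// Hypothetical helper: rewrites $START._ into $START.new and $START.resume.
function migrateStart(graph: Graph, resumePrompt: string): Graph {
  const start = graph["$START"];
  const legacy = start !== undefined ? start["_"] : undefined;
  if (legacy === undefined) return graph; // nothing to migrate

  // Copy every node, then replace $START with the two new edges.
  const migrated: Graph = {};
  for (const key of Object.keys(graph)) {
    const node = graph[key];
    if (node !== undefined) migrated[key] = node;
  }
  migrated["$START"] = {
    new: { role: legacy.role, prompt: legacy.prompt },
    resume: { role: legacy.role, prompt: resumePrompt },
  };
  return migrated;
}
```

Both edges point at the same entry role here; a workflow could route `resume` elsewhere if that suits it better.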
-247
@@ -1,247 +0,0 @@
name: "solve-issue"
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
roles:
  planner:
    description: "Analyzes issue and outputs a TDD test spec"
    goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
    capabilities:
      - issue-analysis
      - planning
    procedure: |
      On first run (no previous steps):
      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
      2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
      3. Assess whether the issue has enough information to produce a test spec
      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
      On subsequent runs (bounced back by tester with fix_spec):
      1. Read the tester's output from the previous step to understand what's wrong with the spec
      2. Revise the test spec accordingly
      After producing the test spec:
      1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
      2. Put the plan hash in frontmatter.plan (required when $status=ready)
      3. Set repoPath to the absolute path of the repository root
      IMPORTANT: Extract the repo remote (owner/repo) from git:
      ```bash
      git remote get-url origin | sed 's|.*[:/]\([^/]*/[^.]*\).*|\1|'
      ```
      Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.
    output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
    frontmatter:
      oneOf:
        - properties:
            $status: { const: "ready" }
            plan: { type: string }
            repoPath: { type: string }
            repoRemote: { type: string }
          required: [$status, plan, repoPath, repoRemote]
        - properties:
            $status: { const: "insufficient_info" }
            reason: { type: string }
          required: [$status, reason]
  developer:
    description: "TDD implementation per test spec"
    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
    capabilities:
      - coding
    procedure: |
      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
      The repo path and other details are provided in your task prompt.
      Before starting any work, set up an isolated worktree:
      1. cd into the repo path provided in your task prompt
      2. `git fetch origin` to get latest refs
      3. First time (no existing branch):
         - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
         - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
      4. If bounced back from reviewer or tester (branch already exists):
         - cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
         - `git fetch origin && git rebase origin/main`
      5. ALL subsequent work must happen inside the worktree directory.
      Then implement TDD:
      6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)
      7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
      8. Write tests first based on the spec
      9. Implement the code to make tests pass
      10. Ensure `bun run build` passes with no errors
      11. Run `bun test` to verify all tests pass
          - If tests fail on first run:
            * Read the test output carefully for missing imports or setup issues
            * Check if you're running tests from the correct working directory (package root vs workspace root)
            * Fix the immediate issue and rerun ONCE
            * If tests still fail after 2 attempts: check the test spec for ambiguities
            * If stuck after 3 test cycles: set $status=failed with detailed error report rather than continuing blind retries
      12. MANDATORY VERIFICATION before reporting done:
          - Run `git branch --show-current` and confirm branch name matches expected
          - Run `git status` and verify changed files exist
          - Run `ls -la <key-implementation-files>` to verify they exist on disk
          - If ANY verification fails: retry the implementation, do NOT report done
      If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
      or repeated attempts fail), set $status=failed with a reason.
    output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
    frontmatter:
      oneOf:
        - properties:
            $status: { const: "done" }
            branch: { type: string }
            worktree: { type: string }
            repoRemote: { type: string }
          required: [$status, branch, worktree]
        - properties:
            $status: { const: "failed" }
            reason: { type: string }
          required: [$status, reason]
  reviewer:
    description: "Code standards compliance check"
    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
    capabilities:
      - code-review
      - static-analysis
    procedure: |
      The worktree path is provided in your task prompt. cd into it first.
      CRITICAL: You MUST execute every verification command below. Do NOT report results without running the actual commands. Do NOT rely on prior context or assumptions.
      Before reviewing, verify the worktree and branch exist:
      0. Run `cd <worktree-path> && pwd` to confirm the path is accessible
         - If the cd fails: the worktree truly doesn't exist, reject with that reason
         - If the cd succeeds: proceed with step 1 below
      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
      2. If the branch doesn't correspond to the issue, flag it in your output and reject
      Then perform code review:
      Hard checks (must all pass):
      3. `bun run build` — no build errors
      4. `bunx biome check` — no lint violations
      5. TypeScript strict mode — no type errors
      Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
      - Naming conventions, module boundaries, code style
      - No `console.log` in production code
      - No dynamic imports in production code
      Only review standards compliance. Do NOT test functionality.
      If rejecting, you MUST explain the specific reason in your output.
    output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
    frontmatter:
      oneOf:
        - properties:
            $status: { const: "approved" }
            branch: { type: string }
            worktree: { type: string }
            repoRemote: { type: string }
          required: [$status, branch, worktree]
        - properties:
            $status: { const: "rejected" }
            comments: { type: string }
            worktree: { type: string }
            repoRemote: { type: string }
          required: [$status, comments, worktree]
  tester:
    description: "Functional correctness verification"
    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
    capabilities:
      - testing
    procedure: |
      The worktree path is provided in your task prompt. cd into it first.
      1. Run `bun test` for automated test verification
      2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
      3. Verify each scenario in the spec is covered and passing
      4. Determine outcome:
         - passed: all scenarios verified, tests pass
         - fix_code: tests fail or implementation doesn't match spec → send back to developer
         - fix_spec: the spec itself is wrong or incomplete → send back to planner
    output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
    frontmatter:
      oneOf:
        - properties:
            $status: { const: "passed" }
            branch: { type: string }
            worktree: { type: string }
            repoRemote: { type: string }
          required: [$status, branch, worktree]
        - properties:
            $status: { const: "fix_code" }
            report: { type: string }
            repoRemote: { type: string }
            worktree: { type: string }
            branch: { type: string }
          required: [$status, report]
        - properties:
            $status: { const: "fix_spec" }
            report: { type: string }
            repoRemote: { type: string }
            worktree: { type: string }
            branch: { type: string }
          required: [$status, report]
  committer:
    description: "Commits and creates PR"
    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
    capabilities: []
    procedure: |
      The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.
      cd into the worktree first.
      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
      1. Check `git status` — if working tree is clean and branch is ahead of origin, skip to step 3 (push).
      2. If there are unstaged/uncommitted changes: `git add -A` then `git commit -m "type: description\n\nFixes #N"`
      3. Push the branch: `git push -u origin <branch-name>`
      4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.
         - If no output or push failed: capture the error, mark hook_failed
      5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):
         ```bash
         GITEA_TOKEN=$(cfg get GITEA_TOKEN)
         curl -s -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
           "https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls" \
           -d '{"title":"...","body":"...","head":"<branch>","base":"main"}'
         ```
         - The repo remote (owner/repo format, e.g. "shazhou/united-workforce") is given in your task prompt — use it directly.
         - PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
      6. **Verify PR was created** — parse the curl response JSON: it must contain a `"number"` field. Print the PR URL.
         - If curl returns an error or no number field: capture the response, mark hook_failed
      7. After PR creation, clean up the worktree:
         - cd to the repo root (parent of .worktrees)
         - `git worktree remove <worktree-path>`
    output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
    frontmatter:
      oneOf:
        - properties:
            $status: { const: "committed" }
            prUrl: { type: string }
            repoRemote: { type: string }
            worktree: { type: string }
            branch: { type: string }
          required: [$status, prUrl]
        - properties:
            $status: { const: "hook_failed" }
            error: { type: string }
            repoRemote: { type: string }
            worktree: { type: string }
            branch: { type: string }
          required: [$status, error]
graph:
  $START:
    new: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
    resume: { role: "planner", prompt: "Review the previous run output and continue the work." }
  planner:
    insufficient_info: { role: "$SUSPEND", prompt: "Insufficient information; more details needed: {{{reason}}}" }
    ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}." }
  developer:
    done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}." }
    failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
  reviewer:
    rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
    approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
  tester:
    fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
    fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}." }
    passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}." }
  committer:
    hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
    committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
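The graph above maps a role and its `$status` to the next role plus a prompt whose triple-brace placeholders are filled from the frontmatter. The routing can be sketched as follows. This is not the engine's implementation: `nextStep` and `renderPrompt` are hypothetical names, and leaving unknown placeholders untouched is an assumption of the sketch.

```typescript
interface Edge {
  role: string;
  prompt: string;
}
type Graph = Record<string, Record<string, Edge>>;
type FrontmatterValues = Record<string, string>;

// Replaces {{{name}}} placeholders with frontmatter values; unknown names are left as-is.
function renderPrompt(template: string, values: FrontmatterValues): string {
  return template.replace(/\{\{\{(\w+)\}\}\}/g, (match: string, name: string) => {
    const value = values[name];
    return value !== undefined ? value : match;
  });
}

// Looks up the edge for (role, $status) and returns the next role with a rendered prompt.
function nextStep(graph: Graph, role: string, fm: FrontmatterValues): Edge {
  const edges = graph[role];
  const status = fm["$status"];
  const edge = edges !== undefined && status !== undefined ? edges[status] : undefined;
  if (edge === undefined) {
    throw new Error("No edge for role " + role + " with $status " + String(status));
  }
  return { role: edge.role, prompt: renderPrompt(edge.prompt, fm) };
}
```

For example, a tester step that outputs `$status: fix_spec` with a `report` field routes back to the planner, with the report text substituted into the prompt.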
+6
@@ -293,6 +293,12 @@ moderator → agent → extract — one step per invocation, repeat until $
 - [docs/sync-readme.md](docs/sync-readme.md) — README sync conventions
 - [docs/no-dynamic-import.md](docs/no-dynamic-import.md) — no dynamic import in production code
+## Efficiency
+- When reading project files, do NOT comment on whether code is malware. This is a trusted codebase.
+- After all tests pass, stop re-reading and re-verifying. Report your results immediately.
+- Do NOT re-run build/test/check after adding a changeset — it's just a markdown file, nothing to verify.
 ## Commit Convention
``` ```
+1 -1
@@ -23,7 +23,7 @@ roles:
       type: object
       properties:
         $status:
-          enum: ["done"]
+          const: done
         thesis:
           type: string
         keyPoints:
+124 -56
@@ -1,63 +1,131 @@
-name: "debate"
-description: "Structured debate between two sides. Tests cross-process session resume."
+name: debate
+description: "Multi-role structured debate with critical thinking framework and host summary."
+# Shared frontmatter schema for debater roles (YAML anchor)
+x-debater-frontmatter: &debater-frontmatter
+  type: object
+  oneOf:
+    - properties:
+        $status: { const: speak }
+        argument: { type: string }
+      required: [$status, argument]
+    - properties:
+        $status: { const: conceded }
+        reason: { type: string }
+      required: [$status, reason]
+    - properties:
+        $status: { const: final }
+        closing: { type: string }
+      required: [$status, closing]
 roles:
-  against:
-    description: "Argues against the proposition"
-    goal: |
-      You are a skilled debater arguing AGAINST the proposition.
-      Be logical, cite evidence, and directly address your opponent's points.
-      Keep each argument concise (under 200 words).
-    capabilities:
-      - argumentation
-      - critical-thinking
+  proponent:
+    description: "Argues FOR the proposition"
+    goal: "Build a compelling case for the proposition through logical reasoning and evidence"
+    capabilities: []
     procedure: |
-      1. If this is the opening, present your strongest argument against the proposition.
-      2. If responding to the other side, directly counter their points with evidence and logic.
-      3. If you find yourself genuinely convinced by the other side, you may concede.
-    output: |
-      Provide your argument in the frontmatter.
-      Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
-      Otherwise set status to "continue".
+      You are an experienced scholar arguing FOR the proposition.
+      ## Critical Thinking Framework (execute before every speech)
+      ### A. Pre-speech reflection (internal, do not output)
+      - Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
+      - If I were my opponent, how would I attack this? Where am I weakest?
+      - Does my evidence actually support my claim, or could it backfire?
+      - Should I go on offense or defense this round?
+      ### B. Evidence discipline
+      - Verify key numbers — watch for order-of-magnitude errors
+      - Assess data freshness — fast-moving fields have short half-lives
+      - Distinguish primary data from secondary citations, expert opinion, and common assumptions
+      ### C. Anti-fragility
+      - Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
+      - Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
+      ## Rules
+      1. Check Thread Progress to see how many times you have spoken.
+      2. On your 3rd speech, you MUST output $status: final (closing statement).
+      3. If genuinely convinced by the opponent, output $status: conceded.
+      4. Otherwise output $status: speak and counter the opponent's points.
+      5. Be rigorous, cite evidence, stay concise.
+    output: "Debate argument"
+    frontmatter: *debater-frontmatter
+  opponent:
+    description: "Argues AGAINST the proposition"
+    goal: "Build a compelling case against the proposition through logical reasoning and evidence"
+    capabilities: []
+    procedure: |
+      You are an experienced scholar arguing AGAINST the proposition.
+      ## Critical Thinking Framework (execute before every speech)
+      ### A. Pre-speech reflection (internal, do not output)
+      - Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
+      - If I were my opponent, how would I attack this? Where am I weakest?
+      - Does my evidence actually support my claim, or could it backfire?
+      - Should I go on offense or defense this round?
+      ### B. Evidence discipline
+      - Verify key numbers — watch for order-of-magnitude errors
+      - Assess data freshness — fast-moving fields have short half-lives
+      - Distinguish primary data from secondary citations, expert opinion, and common assumptions
+      ### C. Anti-fragility
+      - Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
+      - Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
+      ## Rules
+      1. Check Thread Progress to see how many times you have spoken.
+      2. On your 3rd speech, or when the proponent has issued a final statement, you MUST output $status: final.
3. If genuinely convinced by the proponent, output $status: conceded.
4. Otherwise output $status: speak and counter the proponent's points.
5. Be rigorous, cite evidence, stay concise.
output: "Debate argument"
frontmatter: *debater-frontmatter
host:
description: "Debate moderator — delivers impartial summary and verdict"
goal: "Objectively review the debate, analyze both sides, and deliver a verdict"
capabilities: []
procedure: |
You are an experienced academic debate moderator.
## Task
1. Outline each side's core arguments
2. Evaluate reasoning quality and evidence use
3. Highlight the most impactful exchanges
4. Analyze the deeper significance of the topic
5. Deliver an overall verdict
## Style
- Impartial but with independent judgment
- Substantive, not superficial
output: "Debate summary report"
frontmatter: frontmatter:
type: object type: object
properties: properties:
$status: $status: { const: done }
enum: ["continue", "conceded"] summary: { type: string }
argument: highlights: { type: string }
type: string verdict: { type: string }
required: [$status, argument] required: [$status, summary, highlights, verdict]
for:
description: "Argues for the proposition"
goal: |
You are a skilled debater arguing FOR the proposition.
Be logical, cite evidence, and directly address your opponent's points.
Keep each argument concise (under 200 words).
capabilities:
- argumentation
- critical-thinking
procedure: |
1. Read the opposing side's latest argument carefully.
2. Counter their points with evidence and logic.
3. If you find yourself genuinely convinced by the other side, you may concede.
output: |
Provide your argument in the frontmatter.
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
Otherwise set status to "continue".
frontmatter:
type: object
properties:
$status:
enum: ["continue", "conceded"]
argument:
type: string
required: [$status, argument]
graph: graph:
$START: $START:
new: { role: "against", prompt: "Present your opening argument against the proposition." } new: { role: proponent, prompt: "The debate begins. You are arguing FOR the proposition. Present your opening argument." }
resume: { role: "against", prompt: "Review the previous debate output and continue the argument against the proposition." } resume: { role: proponent, prompt: "The debate continues." }
against:
conceded: { role: "$END", prompt: "The against side conceded. Debate over." } proponent:
continue: { role: "for", prompt: "Counter the opposing argument: {{{argument}}}" } speak: { role: opponent, prompt: "Proponent argues:\n\n{{{argument}}}\n\nYou are the opponent. Counter this argument." }
for: conceded: { role: host, prompt: "The proponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
conceded: { role: "$END", prompt: "The for side conceded. Debate over." } final: { role: opponent, prompt: "Proponent's closing statement:\n\n{{{closing}}}\n\nYou are the opponent. Deliver your final response." }
continue: { role: "against", prompt: "Counter the opposing argument: {{{argument}}}" }
opponent:
speak: { role: proponent, prompt: "Opponent argues:\n\n{{{argument}}}\n\nYou are the proponent. Counter this argument." }
conceded: { role: host, prompt: "The opponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
final: { role: host, prompt: "Opponent's closing statement:\n\n{{{closing}}}\n\nThe debate is over. Please summarize." }
host:
done: { role: "$END", prompt: "Summary complete." }
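Review note: the graph above routes on the `$status` discriminator and interpolates frontmatter fields into the next prompt. The following is a minimal sketch of that routing for the proponent role only, assuming `{{{name}}}` is replaced by the frontmatter field of the same name. The names `DebaterOutput`, `Edge`, `renderPrompt`, and `nextStep` are hypothetical and are not taken from the engine.

```typescript
// Hypothetical types mirroring the shared debater frontmatter schema (speak / conceded / final).
type DebaterOutput =
  | { $status: "speak"; argument: string }
  | { $status: "conceded"; reason: string }
  | { $status: "final"; closing: string };

type Edge = { role: string; prompt: string };

// Edges for the proponent role, copied from the workflow graph in this diff.
const proponentEdges: Record<DebaterOutput["$status"], Edge> = {
  speak: {
    role: "opponent",
    prompt: "Proponent argues:\n\n{{{argument}}}\n\nYou are the opponent. Counter this argument.",
  },
  conceded: {
    role: "host",
    prompt: "The proponent conceded: {{{reason}}}\n\nPlease summarize the debate.",
  },
  final: {
    role: "opponent",
    prompt: "Proponent's closing statement:\n\n{{{closing}}}\n\nYou are the opponent. Deliver your final response.",
  },
};

// Assumed semantics: each {{{key}}} is replaced by the matching field, or by "" if absent.
function renderPrompt(template: string, fields: Record<string, string>): string {
  return template.replace(/\{\{\{(\w+)\}\}\}/g, (_match, key: string) => fields[key] ?? "");
}

// Pick the edge for the emitted $status and fill in its prompt.
function nextStep(output: DebaterOutput): Edge {
  const edge = proponentEdges[output.$status];
  return { role: edge.role, prompt: renderPrompt(edge.prompt, output) };
}
```

Because every variant pins `$status` with `const`, each status maps to exactly one edge, and only that variant's field (`argument`, `reason`, or `closing`) needs to be present for the prompt to render.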
+1 -2
@@ -18,8 +18,7 @@ roles:
type: object type: object
properties: properties:
$status: $status:
type: string const: done
enum: [done]
summary: summary:
type: string type: string
required: [$status, summary] required: [$status, summary]
+27 -7
@@ -1,5 +1,5 @@
name: "solve-issue" name: "solve-issue"
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds." description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds. Uses pnpm."
roles: roles:
planner: planner:
description: "Analyzes issue and outputs a TDD test spec" description: "Analyzes issue and outputs a TDD test spec"
@@ -80,7 +80,7 @@ roles:
2. `git fetch origin` to get latest refs 2. `git fetch origin` to get latest refs
3. First time (no existing branch): 3. First time (no existing branch):
- `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main` - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
- `cd .worktrees/fix/<issue-number>-<short-slug> && bun install` - `cd .worktrees/fix/<issue-number>-<short-slug> && pnpm install`
4. If continuing on existing branch (prompt says "Continue work on existing branch" or provides a worktree path): 4. If continuing on existing branch (prompt says "Continue work on existing branch" or provides a worktree path):
- cd directly into the worktree path provided in the prompt - cd directly into the worktree path provided in the prompt
- `git fetch origin && git rebase origin/main` - `git fetch origin && git rebase origin/main`
@@ -95,8 +95,20 @@ roles:
7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt 7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
8. Write tests first based on the spec 8. Write tests first based on the spec
9. Implement the code to make tests pass 9. Implement the code to make tests pass
10. Ensure `bun run build` passes with no errors 10. Ensure `pnpm run build` passes with no errors
11. Run `bun test` to verify all tests pass 11. Run `pnpm test` to verify all tests pass
After implementation, before reporting done:
12. Add a changeset file (`.changeset/<short-slug>.md`) with correct bump type:
- `patch` for bug fixes, internal refactors, test-only changes
- `minor` for new features, new CLI commands, new API surfaces
- `major` for breaking changes
List every affected package in the changeset frontmatter.
13. Update documentation if the change affects user-facing behavior:
- `README.md` — usage examples, feature descriptions
- `.cards/` — architecture decision records (if applicable)
- CLI prompt subcommand output (if CLI help text changes)
- CLI `--help` text (if flags/commands are added or changed)
If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors, If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
or repeated attempts fail), set $status=failed with a reason. or repeated attempts fail), set $status=failed with a reason.
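Review note: the bump-type rules in steps 12 and 13 can be sketched as a small helper. This is an illustration only: the `ChangeKind` labels and both function names are hypothetical, and the rendered layout assumes the usual changeset file shape (a frontmatter block of `"package": bump` lines followed by a summary).

```typescript
type Bump = "patch" | "minor" | "major";

// Hypothetical labels for the kinds of change listed in the developer procedure.
type ChangeKind = "bug-fix" | "refactor" | "test-only" | "feature" | "breaking";

// patch: bug fixes, internal refactors, test-only changes.
// minor: new features, CLI commands, API surfaces. major: breaking changes.
function bumpFor(kind: ChangeKind): Bump {
  switch (kind) {
    case "bug-fix":
    case "refactor":
    case "test-only":
      return "patch";
    case "feature":
      return "minor";
    case "breaking":
      return "major";
  }
}

// Render the changeset body, listing every affected package in the frontmatter.
function renderChangeset(packages: string[], bump: Bump, summary: string): string {
  const header = packages.map((name) => `"${name}": ${bump}`).join("\n");
  return `---\n${header}\n---\n\n${summary}\n`;
}
```

The reviewer's changeset check added later in this file verifies the same two things: the bump type and the list of affected packages.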
@@ -127,8 +139,8 @@ roles:
Then perform code review: Then perform code review:
Hard checks (must all pass): Hard checks (must all pass):
3. `bun run build` — no build errors 3. `pnpm run build` — no build errors
4. `bunx biome check` — no lint violations 4. `pnpm run check` — no lint violations
5. TypeScript strict mode — no type errors 5. TypeScript strict mode — no type errors
Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist): Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
@@ -136,6 +148,14 @@ roles:
- No `console.log` in production code - No `console.log` in production code
- No dynamic imports in production code - No dynamic imports in production code
Documentation & changeset checks:
6. Changeset exists in `.changeset/` with correct bump type (`patch`/`minor`/`major`) and lists all affected packages
7. If the change is user-facing, documentation is updated:
- `README.md` reflects new/changed behavior
- `.cards/` architecture cards updated if design decisions changed
- CLI prompt subcommand output updated (if it generates skill/reference content)
- CLI `--help` text matches new flags/commands
Only review standards compliance. Do NOT test functionality. Only review standards compliance. Do NOT test functionality.
If rejecting, you MUST explain the specific reason in your output. If rejecting, you MUST explain the specific reason in your output.
output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)." output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
@@ -159,7 +179,7 @@ roles:
procedure: | procedure: |
The worktree path is provided in your task prompt. cd into it first. The worktree path is provided in your task prompt. cd into it first.
1. Run `bun test` for automated test verification 1. Run `pnpm test` for automated test verification
2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history) 2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
3. Verify each scenario in the spec is covered and passing 3. Verify each scenario in the spec is covered and passing
4. Determine outcome: 4. Determine outcome:
+1 -1
@@ -21,7 +21,7 @@
"@agentclientprotocol/sdk": "^0.22.1", "@agentclientprotocol/sdk": "^0.22.1",
"@biomejs/biome": "^2.4.14", "@biomejs/biome": "^2.4.14",
"@changesets/cli": "^2.31.0", "@changesets/cli": "^2.31.0",
"@shazhou/proman": "^0.5.1", "@shazhou/proman": "^0.6.3",
"@types/node": "^25.7.0", "@types/node": "^25.7.0",
"@types/xxhashjs": "^0.2.4", "@types/xxhashjs": "^0.2.4",
"@united-workforce/agent-hermes": "workspace:*", "@united-workforce/agent-hermes": "workspace:*",
+2 -2
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/agent-builtin", "name": "@united-workforce/agent-builtin",
"version": "0.1.1", "version": "0.1.2",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -21,7 +21,7 @@
"test:ci": "vitest run __tests__/" "test:ci": "vitest run __tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"@united-workforce/util-agent": "workspace:^" "@united-workforce/util-agent": "workspace:^"
}, },
+1 -1
@@ -1,4 +1,4 @@
#!/usr/bin/env node #!/usr/bin/env -S node --disable-warning=ExperimentalWarning
// eslint-disable-next-line -- dynamic import for version // eslint-disable-next-line -- dynamic import for version
const pkg = await import("../package.json", { with: { type: "json" } }); const pkg = await import("../package.json", { with: { type: "json" } });
+8
@@ -0,0 +1,8 @@
# Changelog
## 0.1.4 — 2026-06-07
- fix: decouple session resume from isFirstVisit guard
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
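Review note: the behavior described in this changelog entry reduces to a three-way decision on `isFirstVisit` and the session cache. The sketch below is a simplified model, not the adapter code; `ResumePlan` and `planResume` are hypothetical names.

```typescript
// Hypothetical model of the resume decision described in the changelog entry.
type ResumePlan =
  | { kind: "fresh" }                                  // no cached session: start a new one
  | { kind: "frontmatter-retry"; sessionId: string }   // first visit + cache hit: minimal correction prompt
  | { kind: "continuation"; sessionId: string };       // re-entry + cache hit: resume with the full prompt

function planResume(isFirstVisit: boolean, cachedSessionId: string | null): ResumePlan {
  // The cache is consulted regardless of isFirstVisit.
  if (cachedSessionId === null) {
    return { kind: "fresh" };
  }
  // isFirstVisit with a cache hit means a previous run finished but its step was never
  // written to CAS because frontmatter validation failed.
  return isFirstVisit
    ? { kind: "frontmatter-retry", sessionId: cachedSessionId }
    : { kind: "continuation", sessionId: cachedSessionId };
}
```

In the adapters, the `frontmatter-retry` case corresponds to sending `buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)` instead of the full prompt.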
+2 -2
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/agent-claude-code", "name": "@united-workforce/agent-claude-code",
"version": "0.1.1", "version": "0.1.4",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -21,7 +21,7 @@
"test:ci": "vitest run __tests__/" "test:ci": "vitest run __tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"@united-workforce/util-agent": "workspace:^" "@united-workforce/util-agent": "workspace:^"
+21 -4
@@ -6,7 +6,9 @@ import {
type AgentContext, type AgentContext,
type AgentRunResult, type AgentRunResult,
buildContinuationPrompt, buildContinuationPrompt,
buildFrontmatterRetryPrompt,
buildRolePrompt, buildRolePrompt,
buildThreadProgress,
createAgent, createAgent,
getCachedSessionId, getCachedSessionId,
setCachedSessionId, setCachedSessionId,
@@ -27,6 +29,10 @@ export function buildClaudeCodePrompt(ctx: AgentContext): string {
if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") { if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
parts.push(ctx.outputFormatInstruction, ""); parts.push(ctx.outputFormatInstruction, "");
} }
// Inject thread progress so the agent knows step count and role visit count
parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
parts.push(rolePrompt, "", "## Task", ctx.start.prompt); parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
if (!ctx.isFirstVisit) { if (!ctx.isFirstVisit) {
@@ -171,8 +177,12 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
log("K7R2M4N8", `prompt for role=${ctx.role} (length=${fullPrompt.length}):\n${fullPrompt}`); log("K7R2M4N8", `prompt for role=${ctx.role} (length=${fullPrompt.length}):\n${fullPrompt}`);
// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry). // Try resuming a cached session. This covers both normal re-entry
if (!ctx.isFirstVisit) { // (e.g. reviewer reject → developer re-entry) AND the case where a
// previous run completed but frontmatter validation failed — the step
// was never written to CAS so isFirstVisit is still true, but the
// session cache holds a valid session we should resume.
{
const cachedSessionId = await getCachedSessionId( const cachedSessionId = await getCachedSessionId(
"claude-code", "claude-code",
ctx.threadId, ctx.threadId,
@@ -180,13 +190,20 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
ctx.storageRoot, ctx.storageRoot,
); );
if (cachedSessionId !== null) { if (cachedSessionId !== null) {
// isFirstVisit + cache hit = previous run completed but frontmatter
// validation failed. The session already has full context — send a
// minimal correction prompt instead of the full initial prompt.
const resumePrompt = ctx.isFirstVisit
? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
: fullPrompt;
try { try {
const { stdout, stderr, exitCode } = await spawnClaudeResume( const { stdout, stderr, exitCode } = await spawnClaudeResume(
cachedSessionId, cachedSessionId,
fullPrompt, resumePrompt,
model, model,
); );
const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, fullPrompt); const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, resumePrompt);
if (result.sessionId !== undefined && result.sessionId !== "") { if (result.sessionId !== undefined && result.sessionId !== "") {
await setCachedSessionId( await setCachedSessionId(
"claude-code", "claude-code",
+1 -1
@@ -1,4 +1,4 @@
#!/usr/bin/env node #!/usr/bin/env -S node --disable-warning=ExperimentalWarning
// eslint-disable-next-line -- dynamic import for version // eslint-disable-next-line -- dynamic import for version
const pkg = await import("../package.json", { with: { type: "json" } }); const pkg = await import("../package.json", { with: { type: "json" } });
+6
@@ -1,5 +1,11 @@
# @united-workforce/agent-hermes # @united-workforce/agent-hermes
## 0.1.5 — 2026-06-07
- fix: decouple session resume from isFirstVisit guard
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
## 0.1.1 ## 0.1.1
### Patch Changes ### Patch Changes
@@ -15,7 +15,8 @@ describe("Issue #551 — bin entry & engines", () => {
const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8")); const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8"));
const binPath = pkg.bin["uwf-hermes"]; const binPath = pkg.bin["uwf-hermes"];
const content = readFileSync(join(PKG_ROOT, binPath), "utf-8"); const content = readFileSync(join(PKG_ROOT, binPath), "utf-8");
expect(content.startsWith("#!/usr/bin/env node")).toBe(true); expect(content.startsWith("#!/usr/bin/env")).toBe(true);
expect(content).toContain("node");
}); });
test("README.md explains uwf-hermes is an adapter", () => { test("README.md explains uwf-hermes is an adapter", () => {
+2 -2
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/agent-hermes", "name": "@united-workforce/agent-hermes",
"version": "0.1.2", "version": "0.1.5",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -21,7 +21,7 @@
"test:ci": "vitest run __tests__/" "test:ci": "vitest run __tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"@united-workforce/util-agent": "workspace:^" "@united-workforce/util-agent": "workspace:^"
+7 -2
@@ -12,7 +12,11 @@ const OWN_VERSION = (
} }
).version; ).version;
const HERMES_COMMAND = "hermes"; /** Resolve hermes binary: `UWF_HERMES_BIN` override → default `"hermes"` via PATH. */
function resolveHermesCommand(): string {
const override = process.env.UWF_HERMES_BIN;
return override !== undefined && override !== "" ? override : "hermes";
}
const PROTOCOL_VERSION = 1; const PROTOCOL_VERSION = 1;
type JsonRpcResponse = { type JsonRpcResponse = {
@@ -271,7 +275,8 @@ export class HermesAcpClient {
return; return;
} }
const child = spawn(HERMES_COMMAND, ["acp"], { const hermesCommand = resolveHermesCommand();
const child = spawn(hermesCommand, ["acp"], {
env: process.env, env: process.env,
shell: false, shell: false,
stdio: ["pipe", "pipe", "pipe"], stdio: ["pipe", "pipe", "pipe"],
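Review note: `resolveHermesCommand()` reads `process.env` directly, which makes the override awkward to unit-test. A minimal sketch of a testable variant follows, assuming the same precedence (non-empty `UWF_HERMES_BIN`, else `"hermes"` resolved via PATH). The name `resolveHermesCommandFrom` and the example path are hypothetical.

```typescript
// Hypothetical variant that takes the environment as a parameter instead of reading process.env.
function resolveHermesCommandFrom(env: Record<string, string | undefined>): string {
  const override = env.UWF_HERMES_BIN;
  // An unset or empty override falls back to the default binary name.
  return override !== undefined && override !== "" ? override : "hermes";
}

// Example inputs; "/opt/hermes/bin/hermes" is an illustrative path only.
const defaultCommand = resolveHermesCommandFrom({});
const emptyOverride = resolveHermesCommandFrom({ UWF_HERMES_BIN: "" });
const customCommand = resolveHermesCommandFrom({ UWF_HERMES_BIN: "/opt/hermes/bin/hermes" });
```

Treating the empty string as unset matches the check in the diff, so an exported but blank variable does not break the spawn.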
+1 -1
@@ -1,4 +1,4 @@
#!/usr/bin/env node #!/usr/bin/env -S node --disable-warning=ExperimentalWarning
// eslint-disable-next-line -- dynamic import for version // eslint-disable-next-line -- dynamic import for version
const pkg = await import("../package.json", { with: { type: "json" } }); const pkg = await import("../package.json", { with: { type: "json" } });
+27 -9
@@ -5,7 +5,9 @@ import {
type AgentContext, type AgentContext,
type AgentRunResult, type AgentRunResult,
buildContinuationPrompt, buildContinuationPrompt,
buildFrontmatterRetryPrompt,
buildRolePrompt, buildRolePrompt,
buildThreadProgress,
createAgent, createAgent,
} from "@united-workforce/util-agent"; } from "@united-workforce/util-agent";
import type { AcpUsage } from "./acp-client.js"; import type { AcpUsage } from "./acp-client.js";
@@ -60,6 +62,9 @@ export function buildHermesPrompt(ctx: AgentContext): string {
parts.push(ctx.outputFormatInstruction, ""); parts.push(ctx.outputFormatInstruction, "");
} }
// Inject thread progress so the agent knows step count and role visit count
parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
if (!ctx.isFirstVisit) { if (!ctx.isFirstVisit) {
// Re-entry: show only steps since last visit, meta only // Re-entry: show only steps since last visit, meta only
parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt)); parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
@@ -98,6 +103,8 @@ async function storePromptResult(store: Store, sessionId: string): Promise<{ det
type PromptAttempt = { type PromptAttempt = {
useContinuation: boolean; useContinuation: boolean;
resumed: boolean; resumed: boolean;
/** True when resuming after a frontmatter-only failure (isFirstVisit + cache hit). */
frontmatterRetry: boolean;
}; };
async function prepareSession( async function prepareSession(
@@ -106,28 +113,36 @@ async function prepareSession(
cwd: string, cwd: string,
resumeDisabled: boolean, resumeDisabled: boolean,
): Promise<PromptAttempt> { ): Promise<PromptAttempt> {
if (ctx.isFirstVisit || resumeDisabled) { if (resumeDisabled) {
await client.connect(cwd); await client.connect(cwd);
return { useContinuation: false, resumed: false }; return { useContinuation: false, resumed: false, frontmatterRetry: false };
} }
// Check session cache regardless of isFirstVisit. A previous run may
// have completed and cached its session but failed frontmatter
// validation — the step never got written to CAS so isFirstVisit is
// still true, yet we should resume the existing session.
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role, ctx.storageRoot); const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role, ctx.storageRoot);
if (cachedSessionId === null) { if (cachedSessionId === null) {
log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`); log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`);
await client.connect(cwd); await client.connect(cwd);
return { useContinuation: false, resumed: false }; return { useContinuation: false, resumed: false, frontmatterRetry: false };
} }
try { try {
await client.resume(cachedSessionId, cwd); await client.resume(cachedSessionId, cwd);
log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`); log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`);
return { useContinuation: true, resumed: true }; return {
useContinuation: !ctx.isFirstVisit,
resumed: true,
frontmatterRetry: ctx.isFirstVisit,
};
} catch (error) { } catch (error) {
const message = error instanceof Error ? error.message : String(error); const message = error instanceof Error ? error.message : String(error);
log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`); log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`);
await client.close(); await client.close();
await client.connect(cwd); await client.connect(cwd);
return { useContinuation: false, resumed: false }; return { useContinuation: false, resumed: false, frontmatterRetry: false };
} }
} }
@@ -150,9 +165,12 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
ctx: AgentContext, ctx: AgentContext,
useContinuation: boolean, useContinuation: boolean,
beforeTurns: TurnsSnapshot, beforeTurns: TurnsSnapshot,
frontmatterRetry: boolean,
): Promise<AgentRunResult> { ): Promise<AgentRunResult> {
const effectiveCtx = useContinuation ? ctx : { ...ctx, isFirstVisit: true }; // Frontmatter retry: session has full context, just re-output the format.
const fullPrompt = buildHermesPrompt(effectiveCtx); const fullPrompt = frontmatterRetry
? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
: buildHermesPrompt(useContinuation ? ctx : { ...ctx, isFirstVisit: true });
const startMs = Date.now(); const startMs = Date.now();
const { text, sessionId, usage: acpUsage } = await client.prompt(fullPrompt); const { text, sessionId, usage: acpUsage } = await client.prompt(fullPrompt);
const durationSec = (Date.now() - startMs) / 1000; const durationSec = (Date.now() - startMs) / 1000;
@@ -184,7 +202,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
const beforeTurns = snapshotTurns(beforeSession); const beforeTurns = snapshotTurns(beforeSession);
try { try {
return await runPrompt(ctx, attempt.useContinuation, beforeTurns); return await runPrompt(ctx, attempt.useContinuation, beforeTurns, attempt.frontmatterRetry);
} catch (error) { } catch (error) {
if (!attempt.resumed) { if (!attempt.resumed) {
throw error; throw error;
@@ -195,7 +213,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
await client.close(); await client.close();
await client.connect(cwd); await client.connect(cwd);
// Fresh session after retry — reset snapshot to zero // Fresh session after retry — reset snapshot to zero
return runPrompt(ctx, false, ZERO_TURNS); return runPrompt(ctx, false, ZERO_TURNS, false);
} }
} }
+2 -2
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/agent-mock", "name": "@united-workforce/agent-mock",
"version": "0.1.1", "version": "0.1.2",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -21,7 +21,7 @@
"test:ci": "vitest run __tests__/" "test:ci": "vitest run __tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"@united-workforce/util-agent": "workspace:^", "@united-workforce/util-agent": "workspace:^",
+1 -1
@@ -1,4 +1,4 @@
#!/usr/bin/env node #!/usr/bin/env -S node --disable-warning=ExperimentalWarning
// eslint-disable-next-line -- dynamic import for version // eslint-disable-next-line -- dynamic import for version
const pkg = await import("../package.json", { with: { type: "json" } }); const pkg = await import("../package.json", { with: { type: "json" } });
+3 -3
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/cli", "name": "@united-workforce/cli",
"version": "0.2.0", "version": "0.3.0",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -11,8 +11,8 @@
"uwf": "./dist/cli.js" "uwf": "./dist/cli.js"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@ocas/fs": "^0.3.0", "@ocas/fs": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"@united-workforce/util-agent": "workspace:^", "@united-workforce/util-agent": "workspace:^",
+18 -10
@@ -28,9 +28,13 @@ roles:
$status: "ready" $status: "ready"
frontmatter: frontmatter:
type: object type: object
required: ["$status"] oneOf:
properties: - properties:
$status: { type: string, enum: ["ready", "not-ready"] } $status: { const: "ready" }
required: ["$status"]
- properties:
$status: { const: "not-ready" }
required: ["$status"]
roleB: roleB:
description: Second role description: Second role
goal: Do B goal: Do B
@@ -42,7 +46,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["done"] } $status: { const: "done" }
graph: graph:
$START: $START:
new: new:
@@ -82,9 +86,13 @@ roles:
$status: "pass" $status: "pass"
frontmatter: frontmatter:
type: object type: object
required: ["$status"] oneOf:
properties: - properties:
$status: { type: string, enum: ["pass", "fail"] } $status: { const: "pass" }
required: ["$status"]
- properties:
$status: { const: "fail" }
required: ["$status"]
roleB: roleB:
description: Pass role description: Pass role
goal: Do B goal: Do B
@@ -96,7 +104,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["done"] } $status: { const: "done" }
roleC: roleC:
description: Fail role description: Fail role
goal: Do C goal: Do C
@@ -108,7 +116,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["done"] } $status: { const: "done" }
graph: graph:
$START: $START:
new: new:
@@ -155,7 +163,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["done"] } $status: { const: "done" }
graph: graph:
$START: $START:
new: new:
@@ -21,11 +21,11 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
"..", "..",
"..", "..",
"..", "..",
".workflows", "examples",
"solve-issue.yaml", "solve-issue.yaml",
); );
test("committer procedure should use curl API instead of tea pr create", async () => { test("committer procedure should create PR via tea pr create", async () => {
const yamlContent = await readFile(workflowPath, "utf-8"); const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload; const workflow = parse(yamlContent) as WorkflowPayload;
@@ -33,25 +33,22 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
const committerProcedure = workflow.roles.committer?.procedure; const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined(); expect(committerProcedure).toBeDefined();
// Verify the procedure uses curl API, not tea pr create // Verify the procedure uses tea pr create for PR creation
expect(committerProcedure).toContain("curl"); expect(committerProcedure).toContain("tea pr create");
expect(committerProcedure).toContain("api/v1/repos"); expect(committerProcedure).toContain("git push");
expect(committerProcedure).toContain("/pulls"); expect(committerProcedure).toContain("Fixes #N");
// Verify it explicitly warns against tea pr create
expect(committerProcedure).toMatch(/do NOT use.*tea pr create/i);
}); });
test("committer procedure should reference repoRemote from task prompt", async () => { test("committer procedure should extract owner/repo from git remote", async () => {
const yamlContent = await readFile(workflowPath, "utf-8"); const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload; const workflow = parse(yamlContent) as WorkflowPayload;
const committerProcedure = workflow.roles.committer?.procedure; const committerProcedure = workflow.roles.committer?.procedure;
expect(committerProcedure).toBeDefined(); expect(committerProcedure).toBeDefined();
// Verify the procedure mentions repoRemote is provided in task prompt // Verify the procedure extracts owner/repo from remote
expect(committerProcedure).toMatch(/repo remote.*provided.*task prompt/i); expect(committerProcedure).toContain("git remote get-url origin");
expect(committerProcedure).toMatch(/owner\/repo/i); expect(committerProcedure).toContain("hook_failed");
}); });
test("committer procedure should include error handling for curl failures", async () => { test("committer procedure should include error handling for curl failures", async () => {
@@ -100,45 +97,42 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
expect(committedVariant.required).toContain("$status"); expect(committedVariant.required).toContain("$status");
}); });
test("developer procedure should include mandatory verification step", async () => { test("developer procedure should include worktree setup", async () => {
const yamlContent = await readFile(workflowPath, "utf-8"); const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload; const workflow = parse(yamlContent) as WorkflowPayload;
const developerProcedure = workflow.roles.developer?.procedure; const developerProcedure = workflow.roles.developer?.procedure;
expect(developerProcedure).toBeDefined(); expect(developerProcedure).toBeDefined();
// Verify the procedure includes mandatory verification step // Verify the procedure includes worktree setup
expect(developerProcedure).toContain("MANDATORY VERIFICATION"); expect(developerProcedure).toContain("IMPORTANT");
expect(developerProcedure).toContain("git branch --show-current"); expect(developerProcedure).toContain("git worktree add");
expect(developerProcedure).toContain("git status"); expect(developerProcedure).toContain("pnpm install");
expect(developerProcedure).toMatch(/ls -la|verify.*exist/i);
}); });
test("reviewer procedure should enforce worktree path verification", async () => { test("reviewer procedure should verify branch and run checks", async () => {
const yamlContent = await readFile(workflowPath, "utf-8"); const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload; const workflow = parse(yamlContent) as WorkflowPayload;
const reviewerProcedure = workflow.roles.reviewer?.procedure; const reviewerProcedure = workflow.roles.reviewer?.procedure;
expect(reviewerProcedure).toBeDefined(); expect(reviewerProcedure).toBeDefined();
// Verify the procedure includes critical enforcement // Verify the procedure includes branch verification and build checks
expect(reviewerProcedure).toContain("CRITICAL"); expect(reviewerProcedure).toContain("git branch --show-current");
expect(reviewerProcedure).toMatch(/cd.*pwd/); expect(reviewerProcedure).toContain("pnpm run build");
expect(reviewerProcedure).toContain( expect(reviewerProcedure).toContain("pnpm run check");
"Do NOT report results without running the actual commands",
);
}); });
test("developer procedure should include test debugging escalation", async () => { test("developer procedure should include changeset and failure handling", async () => {
const yamlContent = await readFile(workflowPath, "utf-8"); const yamlContent = await readFile(workflowPath, "utf-8");
const workflow = parse(yamlContent) as WorkflowPayload; const workflow = parse(yamlContent) as WorkflowPayload;
const developerProcedure = workflow.roles.developer?.procedure; const developerProcedure = workflow.roles.developer?.procedure;
expect(developerProcedure).toBeDefined(); expect(developerProcedure).toBeDefined();
// Verify the procedure includes test failure guidance // Verify the procedure includes changeset requirement and failure path
expect(developerProcedure).toMatch(/tests fail.*first run/i); expect(developerProcedure).toContain(".changeset/");
expect(developerProcedure).toMatch(/3 test cycles|after 3 attempts/i);
expect(developerProcedure).toContain("$status=failed"); expect(developerProcedure).toContain("$status=failed");
expect(developerProcedure).toContain("pnpm test");
}); });
}); });
@@ -54,7 +54,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["ready"] } $status: { const: "ready" }
graph: graph:
$START: $START:
new: new:
@@ -114,7 +114,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["ready"] } $status: { const: "ready" }
graph: graph:
$START: $START:
new: new:
@@ -161,7 +161,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["ready"] } $status: { const: "ready" }
graph: graph:
$START: $START:
new: new:
@@ -31,7 +31,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["ready"] } $status: { const: "ready" }
graph: graph:
$START: $START:
new: new:
@@ -54,7 +54,7 @@ roles:
type: object type: object
required: ["$status"] required: ["$status"]
properties: properties:
$status: { type: string, enum: ["ready"] } $status: { const: "ready" }
graph: graph:
$START: $START:
new: new:
@@ -17,7 +17,7 @@ function makeWorkflow(overrides?: Partial<WorkflowPayload>): WorkflowPayload {
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { properties: {
$status: { enum: ["done"] }, $status: { const: "done" },
plan: { type: "string" }, plan: { type: "string" },
}, },
required: ["$status", "plan"], required: ["$status", "plan"],
@@ -85,7 +85,7 @@ describe("Suite 1: Role Reference Integrity", () => {
output: "None", output: "None",
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { $status: { enum: ["done"] } }, properties: { $status: { const: "done" } },
required: ["$status"], required: ["$status"],
} as unknown as string, } as unknown as string,
}; };
@@ -187,7 +187,7 @@ describe("Suite 2: Graph Structure", () => {
output: "Isolated", output: "Isolated",
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { $status: { enum: ["done"] } }, properties: { $status: { const: "done" } },
required: ["$status"], required: ["$status"],
} as unknown as string, } as unknown as string,
}; };
@@ -272,8 +272,8 @@ describe("Suite 3: Status-Edge Consistency", () => {
}); });
}); });
describe("Suite 3b: Enum-Based Multi-Exit", () => { describe("Suite 3b: Enum-Based $status is Rejected", () => {
test("3b.1 enum multi-exit passes with matching graph keys", () => { test("3b.1 enum multi-exit is rejected (must use oneOf + const)", () => {
const wf = makeWorkflow(); const wf = makeWorkflow();
wf.roles.reviewer = { wf.roles.reviewer = {
...wf.roles.reviewer, ...wf.roles.reviewer,
@@ -291,52 +291,10 @@ describe("Suite 3b: Enum-Based Multi-Exit", () => {
rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null }, rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null },
}; };
const errors = validateWorkflow(wf); const errors = validateWorkflow(wf);
expect(errors).toEqual([]); expect(errors.some((e) => e.includes("must define") && e.includes("const"))).toBe(true);
}); });
test("3b.2 enum multi-exit with extra graph key", () => { test("3b.2 enum single-exit is rejected (must use const)", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done", location: null },
rejected: { role: "writer", prompt: "Fix", location: null },
timeout: { role: "$END", prompt: "Timed out", location: null },
};
const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("extra status keys: timeout"))).toBe(true);
});
test("3b.3 enum multi-exit with missing graph key", () => {
const wf = makeWorkflow();
wf.roles.reviewer = {
...wf.roles.reviewer,
frontmatter: {
type: "object",
properties: {
$status: { enum: ["approved", "rejected"] },
comments: { type: "string" },
},
required: ["$status", "comments"],
} as unknown as string,
};
wf.graph.reviewer = {
approved: { role: "$END", prompt: "Done", location: null },
};
const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("missing status keys: rejected"))).toBe(true);
});
test("3b.4 enum with single explicit value passes", () => {
const wf = makeWorkflow(); const wf = makeWorkflow();
wf.roles.writer = { wf.roles.writer = {
...wf.roles.writer, ...wf.roles.writer,
@@ -351,28 +309,71 @@ describe("Suite 3b: Enum-Based Multi-Exit", () => {
}; };
wf.graph.writer = { ready: { role: "reviewer", prompt: "Review: {{{plan}}}", location: null } }; wf.graph.writer = { ready: { role: "reviewer", prompt: "Review: {{{plan}}}", location: null } };
const errors = validateWorkflow(wf); const errors = validateWorkflow(wf);
expect(errors).toEqual([]); expect(errors.some((e) => e.includes("must define") && e.includes("const"))).toBe(true);
}); });
});
test("3b.5 enum multi-exit mustache var not in frontmatter", () => { describe("Suite 3c: Const-Based Flat Schema", () => {
test("3c.1 flat schema with const $status passes validation", () => {
const wf = makeWorkflow(); const wf = makeWorkflow();
wf.roles.reviewer = { wf.roles.writer = {
...wf.roles.reviewer, ...wf.roles.writer,
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { properties: {
$status: { enum: ["approved", "rejected"] }, $status: { const: "done" },
comments: { type: "string" }, plan: { type: "string" },
}, },
required: ["$status", "comments"], required: ["$status", "plan"],
} as unknown as string, } as unknown as string,
}; };
wf.graph.reviewer = { const errors = validateWorkflow(wf);
approved: { role: "$END", prompt: "Done: {{{nonexistent}}}", location: null }, expect(errors).toEqual([]);
rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null }, });
test("3c.2 flat schema with const $status detects extra graph key", () => {
const wf = makeWorkflow();
wf.roles.writer = {
...wf.roles.writer,
frontmatter: {
type: "object",
properties: {
$status: { const: "done" },
plan: { type: "string" },
},
required: ["$status", "plan"],
} as unknown as string,
};
wf.graph.writer = {
done: { role: "reviewer", prompt: "Review.", location: null },
extra: { role: "$END", prompt: "Nope.", location: null },
}; };
const errors = validateWorkflow(wf); const errors = validateWorkflow(wf);
expect(errors.some((e) => e.includes("nonexistent") && e.includes("not found"))).toBe(true); expect(errors.some((e) => e.includes("extra status keys") && e.includes("extra"))).toBe(true);
});
test("3c.3 flat schema with const $status validates mustache vars", () => {
const wf = makeWorkflow();
wf.roles.writer = {
...wf.roles.writer,
frontmatter: {
type: "object",
properties: {
$status: { const: "done" },
plan: { type: "string" },
},
required: ["$status", "plan"],
} as unknown as string,
};
wf.graph.writer = {
done: { role: "reviewer", prompt: "Review: {{{nonexistent}}}", location: null },
};
const errors = validateWorkflow(wf);
expect(
errors.some(
(e) => e.includes('prompt variable "nonexistent"') && e.includes('role "writer"'),
),
).toBe(true);
}); });
}); });
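The rule these suites exercise can be sketched roughly as follows. This is a minimal illustration and not the validator's actual code: `Schema`, `statusesOf`, and `checkEdges` are hypothetical names, and the `role "…" has` prefix on the edge messages is an assumption. Only the "must define" wording and the "extra/missing status keys" fragments are taken from the diff.

```typescript
// Illustrative sketch only: Schema, statusesOf and checkEdges are hypothetical
// names, not the validator's real API.
type Schema = {
  properties?: Record<string, { const?: unknown; enum?: unknown[] }>;
  oneOf?: Schema[];
};

/** Collect declared $status values, or null if any variant lacks a string const. */
function statusesOf(fm: Schema): string[] | null {
  const variants = Array.isArray(fm.oneOf) ? fm.oneOf : [fm];
  const out: string[] = [];
  for (const v of variants) {
    const c = v.properties?.$status?.const;
    if (typeof c !== "string") return null; // enum-only or missing $status is rejected
    out.push(c);
  }
  return out;
}

/** Compare declared statuses with the role's outgoing graph keys. */
function checkEdges(role: string, fm: Schema, graphKeys: string[]): string[] {
  const statuses = statusesOf(fm);
  if (statuses === null) {
    return [`role "${role}" must define "$status" as const (or oneOf with const) in frontmatter`];
  }
  const errors: string[] = [];
  const extra = graphKeys.filter((k) => !statuses.includes(k));
  const missing = statuses.filter((s) => !graphKeys.includes(s));
  // Message prefix below is assumed; only the "... status keys: ..." part is from the tests.
  if (extra.length > 0) errors.push(`role "${role}" has extra status keys: ${extra.join(", ")}`);
  if (missing.length > 0) errors.push(`role "${role}" has missing status keys: ${missing.join(", ")}`);
  return errors;
}
```

Under these assumptions, a flat schema with `$status: { const: "done" }` and a single `done` edge yields no errors, while the same schema written with `enum` is rejected before any edge comparison happens.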
@@ -480,7 +481,7 @@ describe("Suite 6: Multiple Errors Collection", () => {
output: "None", output: "None",
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { $status: { enum: ["done"] } }, properties: { $status: { const: "done" } },
required: ["$status"], required: ["$status"],
} as unknown as string, } as unknown as string,
}; };
@@ -31,7 +31,7 @@ function makeMinimalPayload(name: string, description: string): WorkflowPayload
frontmatter: { frontmatter: {
type: "object", type: "object",
properties: { properties: {
$status: { type: "string", enum: ["done"] }, $status: { const: "done" },
}, },
required: ["$status"], required: ["$status"],
} as unknown as CasRef, } as unknown as CasRef,
+2 -2
@@ -1,4 +1,4 @@
#!/usr/bin/env node #!/usr/bin/env -S node --disable-warning=ExperimentalWarning
import type { CasRef, ThreadId, ThreadStatus } from "@united-workforce/protocol"; import type { CasRef, ThreadId, ThreadStatus } from "@united-workforce/protocol";
import { Command } from "commander"; import { Command } from "commander";
@@ -542,7 +542,7 @@ prompt
program program
.command("setup") .command("setup")
.description("Configure provider, model, and agent") .description("Configure provider, model, and agent. Run without options for interactive wizard.")
.option("--provider <name>", "Provider name") .option("--provider <name>", "Provider name")
.option("--base-url <url>", "OpenAI-compatible API base URL") .option("--base-url <url>", "OpenAI-compatible API base URL")
.option("--api-key <key>", "API key") .option("--api-key <key>", "API key")
+145 -34
@@ -1,10 +1,35 @@
import { readFileSync } from "node:fs";
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";
import { import {
generateAdapterDevelopingReference, generateAdapterDevelopingReference,
generateUsageReference, generateUsageReference,
generateWorkflowAuthoringReference, generateWorkflowAuthoringReference,
VERSION,
} from "@united-workforce/util"; } from "@united-workforce/util";
// CLI package version (for bootstrap prompt — uwf --version prints this)
// Walk up from __dirname to find the nearest package.json (works from both src/ and dist/)
function _findCliVersion(): string {
let dir = dirname(fileURLToPath(import.meta.url));
for (let i = 0; i < 5; i++) {
const candidate = join(dir, "package.json");
try {
const pkg = JSON.parse(readFileSync(candidate, "utf-8")) as {
name?: string;
version?: string;
};
if (pkg.name === "@united-workforce/cli") {
return pkg.version ?? "0.0.0";
}
} catch {
// not found, keep walking
}
dir = dirname(dir);
}
return "0.0.0";
}
const CLI_VERSION = _findCliVersion();
export { export {
generateAdapterDevelopingReference as cmdPromptAdapterDeveloping, generateAdapterDevelopingReference as cmdPromptAdapterDeveloping,
generateUsageReference as cmdPromptUsage, generateUsageReference as cmdPromptUsage,
@@ -45,56 +70,113 @@ pnpm --version || npm --version
# 3. Global bin directory is in PATH # 3. Global bin directory is in PATH
# For pnpm: # For pnpm:
pnpm root -g 2>/dev/null && pnpm bin -g pnpm bin -g # prints the global bin directory
# For npm: # For npm:
npm prefix -g npm prefix -g # global prefix; bin is <prefix>/bin
# The "bin" directory printed above MUST be in your PATH. # The directory printed above MUST be in your PATH.
# Test: echo $PATH | tr ':' '\\n' | grep -E "(pnpm|npm|node)" # Test: echo \\$PATH | tr ':' '\\\\n'
# FIX: add the bin directory to ~/.profile or ~/.bashrc: # FIX: add the missing directory to your shell's startup file
# export PATH="$(pnpm bin -g):$PATH" # pnpm # (e.g. ~/.bashrc, ~/.zshrc, ~/.profile, or fish config):
# export PATH="$(npm prefix -g)/bin:$PATH" # npm # export PATH="<global-bin-dir>:$PATH"
# 4. (uwf-hermes only) hermes CLI
which hermes
# FIX: if hermes is in a virtualenv, add it to PATH:
# export PATH="$HOME/.hermes/hermes-agent/.venv/bin:$PATH"
# or create a symlink: ln -s ~/.hermes/hermes-agent/.venv/bin/hermes ~/.local/bin/hermes
\`\`\` \`\`\`
**All checks must pass before continuing.** If you had to modify PATH, verify the change persists by opening a new shell or sourcing your profile. **All checks must pass before continuing.** If you had to modify PATH, verify the change persists by opening a new shell or sourcing your shell config.
### Step 1 — Install CLI and agent adapter ### Step 1 — Discover agents and install adapter
**First, detect which supported agents are already installed on the user's machine:**
\`\`\`bash \`\`\`bash
pnpm add -g @united-workforce/cli # or: npm install -g @united-workforce/cli # Check for Hermes Agent
uwf --version # should print ${VERSION} which hermes 2>/dev/null && hermes --version
# Check for Claude Code
which claude 2>/dev/null && claude --version # should show "X.Y.Z (Claude Code)"
\`\`\` \`\`\`
Install an agent adapter (at least one is required): **Based on the results:**
| Adapter | Install | When to use | - **Only hermes found** → install \`uwf-hermes\` adapter
|---------|---------|-------------| - **Only claude found** → install \`uwf-claude-code\` adapter
| uwf-hermes | \`pnpm add -g @united-workforce/agent-hermes\` | When your agent framework is Hermes Agent | - **Both found** → ask the user which agent they want uwf to use as default
| uwf-claude-code | \`pnpm add -g @united-workforce/agent-claude-code\` | When using Claude Code CLI directly | - **Neither found** → the user must install at least one agent first:
| uwf-builtin | \`pnpm add -g @united-workforce/agent-builtin\` | Lightweight built-in agent (no external dependency) | - Hermes Agent: https://hermes-agent.nousresearch.com/docs
- Claude Code: \`npm install -g @anthropic-ai/claude-code\`
**Install the uwf CLI and the chosen adapter** using pnpm or npm:
**uwf-hermes** also requires the Hermes ACP plugin. After installing \`hermes-agent\`, run:
\`\`\`bash \`\`\`bash
pip install hermes-agent[acp] # or: pip install -e .[acp] if installed from source # CLI (required)
pnpm add -g @united-workforce/cli # or: npm install -g @united-workforce/cli
# Adapter — install the one matching the detected agent:
pnpm add -g @united-workforce/agent-hermes # or: npm i -g @united-workforce/agent-hermes
pnpm add -g @united-workforce/agent-claude-code # or: npm i -g @united-workforce/agent-claude-code
\`\`\` \`\`\`
Verify the adapter is installed: \`uwf-hermes --version\` (or whichever you chose). **⚠ Adapter versions are independent from CLI versions.** Do NOT try to match adapter version to CLI version. Just install \`@latest\` (the default).
**After installing, verify that \`uwf\` and the adapter are available in PATH:**
\`\`\`bash
uwf --version # should print ${CLI_VERSION}
uwf-hermes --version # or: uwf-claude-code --version
\`\`\`
If either command is not found, the global bin directory is not in the current shell's PATH. **You must fix this before continuing:**
1. Find where the binary was installed:
\`\`\`bash
find ~/.local ~/.hermes /usr/local -name uwf -type f 2>/dev/null
npm prefix -g # global prefix — bin is <prefix>/bin
\`\`\`
2. Add the directory to PATH permanently by appending to the user's shell config (e.g. \`~/.bashrc\`, \`~/.zshrc\`, \`~/.profile\`, or fish config):
\`\`\`bash
export PATH="<global-bin-dir>:$PATH"
\`\`\`
3. Source the updated config or open a new shell, then re-verify the commands work.
**uwf-hermes** also requires the Hermes ACP plugin. Verify with \`hermes acp --help\`. If not available, install it:
\`\`\`bash
# Option A: install into hermes venv (recommended)
source ~/.hermes/hermes-agent/.venv/bin/activate && pip install hermes-agent[acp]
# Option B: pipx
pipx install 'hermes-agent[acp]'
# Option C: if installed from source
pip install -e '.[acp]'
\`\`\`
### Step 2 — Configure provider and model ### Step 2 — Configure provider and model
uwf needs an LLM provider to run agents. **Ask the user** for their provider, API key, and model, then run: uwf needs an LLM provider to run agents. **Ask the user** for their provider, API key, and model, then run:
\`\`\`bash \`\`\`bash
uwf setup --provider <name> --base-url <url> --api-key <key> --model <model> [--agent <adapter>] uwf setup --provider <name> --api-key <key> --model <model> --agent <adapter-command>
\`\`\` \`\`\`
Preset providers (base-url is auto-filled when using a preset name): **Note:** \`--agent\` takes the adapter **command name** (e.g. \`uwf-hermes\`), not the npm package name.
openai, xai, openrouter, venice, dashscope, deepseek, siliconflow, volcengine, kimi, glm, stepfun, minimax, ollama
**Preset providers** — when using a preset name, \`--base-url\` is auto-filled and can be omitted:
| Provider | Name | Default base URL |
|----------|------|-----------------|
| OpenAI | \`openai\` | https://api.openai.com/v1 |
| xAI | \`xai\` | https://api.x.ai/v1 |
| OpenRouter | \`openrouter\` | https://openrouter.ai/api/v1 |
| Venice | \`venice\` | https://api.venice.ai/api/v1 |
| Dashscope | \`dashscope\` | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| DeepSeek | \`deepseek\` | https://api.deepseek.com/v1 |
| SiliconFlow | \`siliconflow\` | https://api.siliconflow.cn/v1 |
| VolcEngine | \`volcengine\` | https://ark.cn-beijing.volces.com/api/v3 |
| Kimi (Moonshot) | \`kimi\` | https://api.moonshot.cn/v1 |
| GLM (Zhipu AI) | \`glm\` | https://open.bigmodel.cn/api/paas/v4 |
| StepFun | \`stepfun\` | https://api.stepfun.com/v1 |
| MiniMax | \`minimax\` | https://api.minimax.io/v1 |
| Ollama (local) | \`ollama\` | http://localhost:11434/v1 |
For **non-preset providers**, you must specify \`--base-url\` manually.
Example: Example:
\`\`\`bash \`\`\`bash
@@ -119,6 +201,8 @@ Each command outputs a complete SKILL.md with YAML frontmatter. Use your agent f
Verify skills are installed by listing them (e.g. \`skills_list()\`) and confirming all three appear. Verify skills are installed by listing them (e.g. \`skills_list()\`) and confirming all three appear.
**⚠ After saving all skills, start a new session** so the agent loads the updated skill content. Skills saved in the current session are not active until the next session.
### Step 4 — Verify end-to-end ### Step 4 — Verify end-to-end
Create a minimal workflow file to test your setup: Create a minimal workflow file to test your setup:
@@ -137,7 +221,7 @@ roles:
frontmatter: frontmatter:
type: object type: object
properties: properties:
$status: { enum: [done] } $status: { const: done }
message: { type: string } message: { type: string }
required: [$status, message] required: [$status, message]
graph: graph:
@@ -164,11 +248,25 @@ If the thread reaches \`$END\` with status \`completed\`, the setup is working.
### Step 1 — Update packages ### Step 1 — Update packages
\`\`\`bash \`\`\`bash
pnpm add -g @united-workforce/cli@latest # or: npm install -g @united-workforce/cli@latest # Using pnpm
uwf --version # should print ${VERSION} pnpm add -g @united-workforce/cli@latest
# Also update your adapter(s) # Using npm
npm install -g @united-workforce/cli@latest
\`\`\`
\`\`\`bash
uwf --version # should print ${CLI_VERSION}
\`\`\`
Also update your adapter(s):
\`\`\`bash
# pnpm
pnpm add -g @united-workforce/agent-hermes@latest pnpm add -g @united-workforce/agent-hermes@latest
# npm
npm install -g @united-workforce/agent-hermes@latest
\`\`\` \`\`\`
### Step 2 — Regenerate skills ### Step 2 — Regenerate skills
@@ -181,6 +279,8 @@ uwf prompt workflow-authoring # → update skill "uwf-workflow-authoring"
uwf prompt adapter-developing # → update skill "uwf-adapter-developing" uwf prompt adapter-developing # → update skill "uwf-adapter-developing"
\`\`\` \`\`\`
**⚠ After updating skills, start a new session** to load the new skill content.
### Step 3 — Migrate workflow YAML files (if needed) ### Step 3 — Migrate workflow YAML files (if needed)
Check the changelog for breaking changes. Known migrations: Check the changelog for breaking changes. Known migrations:
@@ -199,6 +299,17 @@ Check the changelog for breaking changes. Known migrations:
Update all \`.workflow/\` and \`.workflows/\` YAML files in your projects. \`uwf workflow add\` will reject files with the old \`_\` syntax. Update all \`.workflow/\` and \`.workflows/\` YAML files in your projects. \`uwf workflow add\` will reject files with the old \`_\` syntax.
- **v0.2.1**: \`$status: { enum: [value] }\` → \`$status: { const: "value" }\`. The validator no longer accepts \`enum\` for \`$status\`. Update all workflow YAML files:
\`\`\`yaml
# Before (v0.2.0)
$status: { enum: [done] }
$status: { type: string, enum: ["ready", "failed"] }
# After (v0.2.1+)
$status: { const: "done" }
# For multi-exit, use oneOf with const (unchanged)
\`\`\`
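For projects with many workflow files, the single-value case of this migration could be scripted. The helper below is a hypothetical sketch (`migrateStatus` is not part of the uwf CLI) that works on an already-parsed `$status` definition. It deliberately refuses multi-value enums, since those need a hand-written `oneOf` with one `const` per variant.

```typescript
// Hypothetical helper, not shipped with uwf: rewrites a parsed $status definition.
type StatusDef = { type?: string; enum?: unknown[]; const?: unknown };

type Migration =
  | { kind: "unchanged"; def: StatusDef }
  | { kind: "migrated"; def: { const: string } }
  | { kind: "manual"; reason: string };

function migrateStatus(def: StatusDef): Migration {
  // No enum present: nothing to migrate.
  if (!Array.isArray(def.enum)) return { kind: "unchanged", def };

  const strings = def.enum.filter((v): v is string => typeof v === "string");
  const [only] = strings;

  // Exactly one string value: { enum: [x] } becomes { const: x }.
  if (def.enum.length === 1 && only !== undefined) {
    return { kind: "migrated", def: { const: only } };
  }

  // Multi-value (or non-string) enums need a oneOf with const per variant,
  // which depends on per-variant properties and cannot be derived here.
  return {
    kind: "manual",
    reason: "multi-value enum: rewrite as oneOf with one const per variant",
  };
}
```

Reading and writing the YAML files is left out on purpose. Any YAML library that round-trips the frontmatter object would do, and the result should still be checked with `uwf workflow add`.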
### Step 4 — Verify ### Step 4 — Verify
\`\`\`bash \`\`\`bash
+43 -1
@@ -1,3 +1,4 @@
import { execFileSync } from "node:child_process";
import { existsSync, mkdirSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs"; import { existsSync, mkdirSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { join } from "node:path"; import { join } from "node:path";
import { stdin as input, stdout as output } from "node:process"; import { stdin as input, stdout as output } from "node:process";
@@ -181,7 +182,6 @@ export async function _discoverAgents(): Promise<string[]> {
async function _tryWhichDiscovery(): Promise<string[] | null> { async function _tryWhichDiscovery(): Promise<string[] | null> {
try { try {
const { execFileSync } = await import("node:child_process");
const text = execFileSync("which", ["-a", "uwf-hermes", "uwf-claude-code", "uwf-cursor"], { const text = execFileSync("which", ["-a", "uwf-hermes", "uwf-claude-code", "uwf-cursor"], {
encoding: "utf-8", encoding: "utf-8",
stdio: ["pipe", "pipe", "pipe"], stdio: ["pipe", "pipe", "pipe"],
@@ -397,6 +397,37 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
}; };
} }
/**
* Check if the configured adapter binary (and its dependencies) are in PATH.
* Returns warnings array — empty means all good.
*/
export function _checkAdapterAvailability(agentName: string): string[] {
const warnings: string[] = [];
const binary = `uwf-${agentName}`;
try {
execFileSync("which", [binary], { encoding: "utf8", stdio: ["pipe", "pipe", "pipe"] });
} catch {
warnings.push(
`${binary} not found in PATH. Install it: pnpm add -g @united-workforce/agent-${agentName}`,
);
return warnings; // skip dependency check if adapter itself is missing
}
// uwf-hermes depends on hermes CLI
if (agentName === "hermes") {
try {
execFileSync("which", ["hermes"], { encoding: "utf8", stdio: ["pipe", "pipe", "pipe"] });
} catch {
warnings.push(
'hermes CLI not found in PATH (required by uwf-hermes). Fix: export PATH="$HOME/.hermes/hermes-agent/.venv/bin:$PATH"',
);
}
}
return warnings;
}
/** /**
* Non-interactive setup. All required args provided via CLI flags. * Non-interactive setup. All required args provided via CLI flags.
*/ */
@@ -411,15 +442,26 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8"); writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8");
// Print config path to stderr (stdout is reserved for JSON output)
console.error(`Config saved to ${configPath}`);
// Validate model connectivity // Validate model connectivity
const validation = await validateModel(args.baseUrl, args.apiKey, args.model); const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
// Check adapter availability
const agentName = _agentNameFromBinary(args.agent ?? "hermes");
const adapterWarnings = _checkAdapterAvailability(agentName);
for (const w of adapterWarnings) {
console.error(`${w}`);
}
return { return {
configPath, configPath,
provider: args.provider, provider: args.provider,
model: args.model, model: args.model,
defaultAgent: merged.defaultAgent, defaultAgent: merged.defaultAgent,
validation, validation,
adapterWarnings,
}; };
} }
+6
@@ -1004,6 +1004,12 @@ function spawnAgent(
}); });
} catch (e) { } catch (e) {
const err = e as NodeJS.ErrnoException & { stderr?: Buffer | string | null }; const err = e as NodeJS.ErrnoException & { stderr?: Buffer | string | null };
if (err.code === "ENOENT") {
failStep(
plog,
`"${agent.command}" not found in PATH. Install it or check your PATH config. Run: which ${agent.command}`,
);
}
const stderr = const stderr =
err.stderr == null err.stderr == null
? "" ? ""
+13 -13
@@ -24,22 +24,22 @@ function isOneOfSchema(fm: unknown): fm is SchemaObj & { oneOf: SchemaObj[] } {
return Array.isArray(obj.oneOf); return Array.isArray(obj.oneOf);
} }
/** Check if a frontmatter schema declares "$status" as an enum (the required form for user roles). */ /** Check if a frontmatter schema declares "$status" as const (flat schema form). */
function hasStatusEnum(fm: unknown): boolean { function hasStatusConst(fm: unknown): boolean {
if (typeof fm !== "object" || fm === null) return false; if (typeof fm !== "object" || fm === null) return false;
const obj = fm as SchemaObj; const obj = fm as SchemaObj;
const props = obj.properties as Record<string, SchemaObj> | undefined; const props = obj.properties as Record<string, SchemaObj> | undefined;
if (!props?.$status) return false; if (!props?.$status) return false;
return Array.isArray(props.$status.enum); return typeof props.$status.const === "string";
} }
/** Extract status values from an enum-based $status field. */ /** Extract status values from a const-based $status field. */
function getEnumStatuses(fm: SchemaObj): string[] { function getConstStatuses(fm: SchemaObj): string[] {
const props = fm.properties as Record<string, SchemaObj> | undefined; const props = fm.properties as Record<string, SchemaObj> | undefined;
if (!props?.$status) return []; if (!props?.$status) return [];
const statusDef = props.$status; const statusDef = props.$status;
if (!Array.isArray(statusDef.enum)) return []; if (typeof statusDef.const === "string") return [statusDef.const];
return statusDef.enum as string[]; return [];
} }
/** Get property names from a schema object. */ /** Get property names from a schema object. */
@@ -248,21 +248,21 @@ function checkRoleConsistency(payload: WorkflowPayload, errors: string[]): void
checkOneOfDiscriminant(roleName, variants, statuses, errors); checkOneOfDiscriminant(roleName, variants, statuses, errors);
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors); checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
checkMultiExitMustache(roleName, graphEntry, variants, errors); checkMultiExitMustache(roleName, graphEntry, variants, errors);
} else if (hasStatusEnum(fm)) { } else if (hasStatusConst(fm)) {
const statuses = getEnumStatuses(fm as SchemaObj); const statuses = getConstStatuses(fm as SchemaObj);
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors); checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
// For enum-based schemas, mustache vars come from the flat properties // For const-based flat schemas, mustache vars come from the flat properties
checkEnumMustache(roleName, graphEntry, fm as SchemaObj, errors); checkFlatMustache(roleName, graphEntry, fm as SchemaObj, errors);
} else { } else {
errors.push( errors.push(
`role "${roleName}" must define "$status" as an enum (or oneOf const) in frontmatter`, `role "${roleName}" must define "$status" as const (or oneOf with const) in frontmatter`,
); );
} }
} }
} }
/** Check mustache vars in all edge prompts against flat schema properties. */ /** Check mustache vars in all edge prompts against flat schema properties. */
function checkEnumMustache( function checkFlatMustache(
roleName: string, roleName: string,
graphEntry: Record<string, { role: string; prompt: string }>, graphEntry: Record<string, { role: string; prompt: string }>,
fm: SchemaObj, fm: SchemaObj,
+3 -3
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/eval", "name": "@united-workforce/eval",
"version": "0.1.3", "version": "0.1.5",
"private": false, "private": false,
"files": [ "files": [
"src", "src",
@@ -22,8 +22,8 @@
"test:ci": "vitest run __tests__/" "test:ci": "vitest run __tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@ocas/fs": "^0.3.0", "@ocas/fs": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"commander": "^14.0.3", "commander": "^14.0.3",
+1 -1
@@ -6,7 +6,7 @@ import { formatList, selectEntries } from "./format.js";
import { readEvalEntries } from "./read.js"; import { readEvalEntries } from "./read.js";
const log = createLogger({ sink: { kind: "stderr" } }); const log = createLogger({ sink: { kind: "stderr" } });
const LOG_LIST = "L5KX9R2B"; const LOG_LIST = "H5KX9R2B";
type ListCliOptions = { type ListCliOptions = {
task: string | undefined; task: string | undefined;
+3 -3
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/protocol", "name": "@united-workforce/protocol",
"version": "0.1.0", "version": "0.1.1",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -18,8 +18,8 @@
"test:ci": "vitest run src/__tests__/" "test:ci": "vitest run src/__tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@ocas/fs": "^0.3.0" "@ocas/fs": "^0.4.0"
}, },
"devDependencies": { "devDependencies": {
"typescript": "^5.8.3" "typescript": "^5.8.3"
+8
@@ -0,0 +1,8 @@
# Changelog
## 0.1.2 — 2026-06-07
- fix: decouple session resume from isFirstVisit guard
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
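The resume rule described in this entry can be sketched roughly as follows. This is a minimal illustration, not the adapters' actual code: `planRun`, `RunPlan`, and the parameter names are hypothetical, and the behaviour for a cache hit on a non-first visit (sending the normal prompt) is an assumption. Only the "always check the session cache" rule and the use of a short retry prompt on `isFirstVisit` plus a cache hit come from the changelog.

```typescript
// Hypothetical sketch: which session and prompt an adapter might pick.
type RunPlan = {
  resumeSessionId: string | null; // null means "start a fresh session"
  prompt: string;
};

function planRun(
  isFirstVisit: boolean,
  cachedSessionId: string | null, // result of a session-cache lookup
  normalPrompt: string, // whatever prompt the adapter would normally send
  retryPrompt: string, // e.g. the output of buildFrontmatterRetryPrompt()
): RunPlan {
  // The cache is consulted regardless of isFirstVisit.
  if (cachedSessionId === null) {
    return { resumeSessionId: null, prompt: normalPrompt };
  }
  // isFirstVisit with a cache hit means the previous run did the work but
  // failed frontmatter validation, so the step was never written to CAS.
  // In that case only a minimal correction prompt is needed.
  return {
    resumeSessionId: cachedSessionId,
    prompt: isFirstVisit ? retryPrompt : normalPrompt,
  };
}
```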
@@ -143,7 +143,7 @@ describe("buildOutputFormatInstruction", () => {
{ {
type: "object", type: "object",
properties: { properties: {
$status: { type: "string", enum: ["approved"] }, $status: { const: "approved" },
branch: { type: "string" }, branch: { type: "string" },
}, },
required: ["$status"], required: ["$status"],
@@ -151,7 +151,7 @@ describe("buildOutputFormatInstruction", () => {
{ {
type: "object", type: "object",
properties: { properties: {
$status: { type: "string", enum: ["rejected"] }, $status: { const: "rejected" },
comments: { type: "string" }, comments: { type: "string" },
}, },
required: ["$status"], required: ["$status"],
@@ -225,4 +225,34 @@ describe("buildOutputFormatInstruction", () => {
const result = buildOutputFormatInstruction({}); const result = buildOutputFormatInstruction({});
expect(result).toContain("Focus exclusively on YOUR role"); expect(result).toContain("Focus exclusively on YOUR role");
}); });
test("renders const value as literal in flat schema example", () => {
const schema = {
type: "object",
properties: {
$status: { type: "string", const: "greeted" },
message: { type: "string" },
},
required: ["$status", "message"],
};
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("$status: greeted");
expect(result).toContain("fixed value");
expect(result).not.toContain("$status: <string>");
});
test("renders const value for non-string types", () => {
const schema = {
type: "object",
properties: {
count: { type: "number", const: 42 },
done: { type: "boolean", const: true },
},
required: ["count", "done"],
};
const result = buildOutputFormatInstruction(schema);
expect(result).toContain("count: 42");
expect(result).toContain("done: true");
expect(result).toContain("fixed value");
});
}); });
@@ -0,0 +1,59 @@
import type { StepContext } from "@united-workforce/protocol";
import { describe, expect, test } from "vitest";
import { buildThreadProgress } from "../src/build-thread-progress.js";
function makeStep(role: string): StepContext {
return {
role,
output: {},
detail: "0000000000000" as string,
agent: "uwf-mock",
edgePrompt: "",
startedAtMs: 0,
completedAtMs: 0,
cwd: "",
assembledPrompt: null,
usage: null,
content: null,
};
}
describe("buildThreadProgress", () => {
test("first step of thread", () => {
const result = buildThreadProgress([], "proponent");
expect(result).toContain("## Thread Progress");
expect(result).toContain("first step");
expect(result).toContain("first time");
expect(result).toContain("proponent");
});
test("second step, role not seen before", () => {
const steps = [makeStep("opponent")];
const result = buildThreadProgress(steps, "proponent");
expect(result).toContain("Thread step 2");
expect(result).toContain("spoken 0 times");
});
test("role has spoken once before", () => {
const steps = [makeStep("proponent"), makeStep("opponent")];
const result = buildThreadProgress(steps, "proponent");
expect(result).toContain("Thread step 3");
expect(result).toContain("spoken 1 time before");
// singular "time" not "times"
expect(result).not.toContain("1 times");
});
test("role has spoken multiple times", () => {
const steps = [
makeStep("proponent"),
makeStep("opponent"),
makeStep("proponent"),
makeStep("opponent"),
makeStep("proponent"),
makeStep("opponent"),
];
const result = buildThreadProgress(steps, "proponent");
expect(result).toContain("Thread step 7");
expect(result).toContain("spoken 3 times");
});
});
@@ -0,0 +1,23 @@
import { describe, expect, test } from "vitest";
import { buildFrontmatterRetryPrompt } from "../src/frontmatter-retry-prompt.js";
describe("buildFrontmatterRetryPrompt", () => {
test("includes correction instruction", () => {
const result = buildFrontmatterRetryPrompt("Use YAML frontmatter");
expect(result).toContain("previous run completed");
expect(result).toContain("do NOT need to redo any work");
expect(result).toContain("corrected YAML frontmatter");
});
test("includes outputFormatInstruction when provided", () => {
const instruction = "---\nstatus: $done | $review\nsummary: string\n---";
const result = buildFrontmatterRetryPrompt(instruction);
expect(result).toContain(instruction);
});
test("works with empty outputFormatInstruction", () => {
const result = buildFrontmatterRetryPrompt("");
expect(result).not.toContain("\n\n\n");
expect(result).toContain("corrected YAML frontmatter");
});
});
+3 -3
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/util-agent", "name": "@united-workforce/util-agent",
"version": "0.1.0", "version": "0.1.2",
"files": [ "files": [
"src", "src",
"dist", "dist",
@@ -18,8 +18,8 @@
"test:ci": "vitest run __tests__/ src/__tests__/" "test:ci": "vitest run __tests__/ src/__tests__/"
}, },
"dependencies": { "dependencies": {
"@ocas/core": "^0.3.0", "@ocas/core": "^0.4.0",
"@ocas/fs": "^0.3.0", "@ocas/fs": "^0.4.0",
"@united-workforce/protocol": "workspace:^", "@united-workforce/protocol": "workspace:^",
"@united-workforce/util": "workspace:^", "@united-workforce/util": "workspace:^",
"dotenv": "^16.6.1", "dotenv": "^16.6.1",
@@ -74,6 +74,10 @@ function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
} }
function resolvePropertySchema(prop: JSONSchema): JSONSchema { function resolvePropertySchema(prop: JSONSchema): JSONSchema {
if (prop.const !== undefined) {
return prop;
}
if (Array.isArray(prop.enum) && prop.enum.length > 0) { if (Array.isArray(prop.enum) && prop.enum.length > 0) {
return prop; return prop;
} }
@@ -113,6 +117,11 @@ function buildPropertyExampleLine(prop: SchemaProperty): string {
commentParts.push("required"); commentParts.push("required");
} }
if (resolved.const !== undefined) {
commentParts.push("fixed value");
return `${prop.name}: ${formatYamlScalar(resolved.const)}${buildPropertyComment(commentParts)}`;
}
if (Array.isArray(resolved.enum) && resolved.enum.length > 0) { if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
const enumValues = resolved.enum.map((v) => String(v)); const enumValues = resolved.enum.map((v) => String(v));
commentParts.push(...enumValues); commentParts.push(...enumValues);
@@ -0,0 +1,27 @@
import type { StepContext } from "@united-workforce/protocol";
/**
* Build a compact thread-progress summary so the agent knows where it is
* in the conversation without making tool calls to count steps.
*
* Example output:
* ## Thread Progress
* Thread step 6. You (proponent) have spoken 2 times before this turn.
*/
export function buildThreadProgress(steps: StepContext[], role: string): string {
const totalSteps = steps.length;
const roleVisits = steps.filter((s) => s.role === role).length;
const parts = [`## Thread Progress`];
if (totalSteps === 0) {
parts.push(
`This is the first step of the thread. You (${role}) are speaking for the first time.`,
);
} else {
parts.push(
`Thread step ${totalSteps + 1}. You (${role}) have spoken ${roleVisits} time${roleVisits === 1 ? "" : "s"} before this turn.`,
);
}
return parts.join("\n");
}
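A short usage sketch for the function above. To stay self-contained, the block re-declares a trimmed copy named `buildThreadProgressSketch` that only reads the `role` field of each step; the `StepLike` type and the sketch name are hypothetical, while the real function takes `StepContext[]` from `@united-workforce/protocol`.

```typescript
// Hypothetical stand-in for StepContext: only the field the summary reads.
type StepLike = { role: string };

function buildThreadProgressSketch(steps: StepLike[], role: string): string {
  const totalSteps = steps.length;
  const roleVisits = steps.filter((s) => s.role === role).length;
  const parts = ["## Thread Progress"];
  if (totalSteps === 0) {
    parts.push(
      `This is the first step of the thread. You (${role}) are speaking for the first time.`,
    );
  } else {
    // The current turn is step N + 1; singular "time" only when the count is 1.
    parts.push(
      `Thread step ${totalSteps + 1}. You (${role}) have spoken ${roleVisits} time${roleVisits === 1 ? "" : "s"} before this turn.`,
    );
  }
  return parts.join("\n");
}

const progress = buildThreadProgressSketch(
  [{ role: "proponent" }, { role: "opponent" }],
  "proponent",
);
console.log(progress);
// ## Thread Progress
// Thread step 3. You (proponent) have spoken 1 time before this turn.
```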
@@ -0,0 +1,21 @@
/**
* Build a minimal prompt for retrying frontmatter output on a resumed session.
*
* Used when a previous run completed successfully but frontmatter validation
* failed — the session already has full context, we just need the agent to
* re-output correctly formatted frontmatter without redoing any work.
*/
export function buildFrontmatterRetryPrompt(outputFormatInstruction: string): string {
const parts: string[] = [
"Your previous run completed all work successfully, but the output format was incorrect.",
"You do NOT need to redo any work — all changes are already in place.",
"",
];
if (outputFormatInstruction !== "") {
parts.push(outputFormatInstruction, "");
}
parts.push(
"Please output ONLY the corrected YAML frontmatter block (--- delimited) followed by a brief summary of the work you completed.",
);
return parts.join("\n");
}
+2
@@ -1,6 +1,7 @@
export { buildContinuationPrompt } from "./build-continuation-prompt.js"; export { buildContinuationPrompt } from "./build-continuation-prompt.js";
export { buildOutputFormatInstruction } from "./build-output-format-instruction.js"; export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
export { buildRolePrompt } from "./build-role-prompt.js"; export { buildRolePrompt } from "./build-role-prompt.js";
export { buildThreadProgress } from "./build-thread-progress.js";
export type { BuildContextMeta } from "./context.js"; export type { BuildContextMeta } from "./context.js";
export { buildContext, buildContextWithMeta } from "./context.js"; export { buildContext, buildContextWithMeta } from "./context.js";
export type { ExtractResult, ResolvedLlmProvider } from "./extract.js"; export type { ExtractResult, ResolvedLlmProvider } from "./extract.js";
@@ -11,6 +12,7 @@ export {
} from "./extract.js"; } from "./extract.js";
export type { FrontmatterFastPathResult } from "./frontmatter.js"; export type { FrontmatterFastPathResult } from "./frontmatter.js";
export { tryFrontmatterFastPath } from "./frontmatter.js"; export { tryFrontmatterFastPath } from "./frontmatter.js";
export { buildFrontmatterRetryPrompt } from "./frontmatter-retry-prompt.js";
export { createAgent, parseArgv } from "./run.js"; export { createAgent, parseArgv } from "./run.js";
export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js"; export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js"; export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
+1 -1
@@ -1,6 +1,6 @@
{ {
"name": "@united-workforce/util", "name": "@united-workforce/util",
"version": "0.1.2", "version": "0.1.4",
"files": [ "files": [
"src", "src",
"dist", "dist",
+13
@@ -140,5 +140,18 @@ For specific scenarios, run the corresponding \`uwf prompt\` command:
|----------|---------|-------------| |----------|---------|-------------|
| Writing workflow YAML | \`uwf prompt workflow-authoring\` | Designing roles, conditions, graphs, and edge prompts | | Writing workflow YAML | \`uwf prompt workflow-authoring\` | Designing roles, conditions, graphs, and edge prompts |
| Building a new agent adapter | \`uwf prompt adapter-developing\` | Creating a new \`uwf-<name>\` CLI adapter | | Building a new agent adapter | \`uwf prompt adapter-developing\` | Creating a new \`uwf-<name>\` CLI adapter |
## Upgrading
\`\`\`bash
# Install the latest version
pnpm add -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
# or: npm install -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
# Verify
uwf --version
# Then run uwf prompt bootstrap and follow the upgrade instructions
\`\`\`
`; `;
} }
@@ -28,6 +28,7 @@ roles: # named actors
2. Do that 2. Do that
output: "..." # what the agent should produce output: "..." # what the agent should produce
frontmatter: # JSON Schema for structured output frontmatter: # JSON Schema for structured output
type: object
oneOf: oneOf:
- properties: - properties:
$status: { const: "ready" } $status: { const: "ready" }
@@ -71,10 +72,13 @@ The \`frontmatter\` field is a standard JSON Schema. It defines the structured f
### \`$status\` Field ### \`$status\` Field
\`$status\` is the only standard field. Its value determines which graph edge the moderator follows. Use \`const\` to constrain each variant: \`$status\` is the only standard field. Its value determines which graph edge the moderator follows.
**Multi-exit (oneOf)** use \`const\` to constrain each variant:
\`\`\`yaml \`\`\`yaml
frontmatter: frontmatter:
type: object
oneOf: oneOf:
- properties: - properties:
$status: { const: "done" } $status: { const: "done" }
@@ -86,26 +90,25 @@ frontmatter:
required: [$status, error] required: [$status, error]
\`\`\` \`\`\`
### Custom Fields **Single-exit (flat schema)** same syntax, just no \`oneOf\` wrapper:
Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
### Flat Schema (Single Status)
When a role has only one outcome, use \`enum\` with a single value:
\`\`\`yaml \`\`\`yaml
frontmatter: frontmatter:
type: object type: object
properties: properties:
$status: $status: { const: "done" }
type: string
enum: [done]
summary: { type: string } summary: { type: string }
required: [$status, summary] required: [$status, summary]
\`\`\` \`\`\`
Note: \`$status: { const: "done" }\` is **not** valid in flat schemas — the validator requires \`enum\` or \`oneOf\` with \`const\`. Use \`const\` only inside \`oneOf\` variants. **Important rules:**
- \`type: object\` is **required** at the top level of frontmatter (both flat and oneOf)
- \`$status\` always uses \`const: "value"\` — simple and consistent
- \`enum\` is **not supported** for \`$status\` — the validator will reject it
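The rules above could be checked along these lines. This is a minimal sketch under stated assumptions, not the project's validator: `Schema` and `collectStatusValues` are hypothetical names, and the error messages are illustrative.

```typescript
// Hypothetical, trimmed-down JSON Schema shape for this sketch.
type Schema = {
  type?: string;
  const?: unknown;
  enum?: unknown[];
  properties?: Record<string, Schema>;
  oneOf?: Schema[];
};

// Returns every $status value a frontmatter schema allows, or throws
// when one of the rules listed above is violated.
function collectStatusValues(frontmatter: Schema): string[] {
  if (frontmatter.type !== "object") {
    throw new Error('frontmatter must declare "type: object" at the top level');
  }
  // A flat schema is treated as a single variant; oneOf lists several.
  const variants = frontmatter.oneOf ?? [frontmatter];
  return variants.map((variant) => {
    const value = variant.properties?.["$status"]?.const;
    if (typeof value !== "string") {
      // Covers a missing $status as well as enum-based definitions.
      throw new Error('"$status" must be defined with const');
    }
    return value;
  });
}
```

With this sketch, a flat schema yields one value and a `oneOf` schema yields one value per variant, while an `enum`-based `$status` is rejected.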
### Custom Fields
Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
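To illustrate how a custom field reaches an edge prompt, here is a minimal sketch. It is a hand-rolled substitute that only handles the unescaped `{{{name}}}` form, not a real Mustache implementation, and `renderEdgePrompt` plus the sample values are hypothetical.

```typescript
// Hypothetical sketch: fill {{{name}}} placeholders from a role's
// frontmatter output. Unknown names render as an empty string, which
// matches how Mustache treats missing variables.
function renderEdgePrompt(template: string, output: Record<string, unknown>): string {
  return template.replace(/\{\{\{\s*([$\w]+)\s*\}\}\}/g, (_match, name: string) => {
    const value = output[name];
    return value === undefined || value === null ? "" : String(value);
  });
}

const prompt = renderEdgePrompt(
  "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance.",
  { $status: "done", branch: "fix/123-example", worktree: "/tmp/worktree" },
);
console.log(prompt);
// Review branch fix/123-example at /tmp/worktree for code standards compliance.
```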
## Graph Routing ## Graph Routing
+38 -36
@@ -18,8 +18,8 @@ importers:
specifier: ^2.31.0 specifier: ^2.31.0
version: 2.31.0(@types/node@25.9.1) version: 2.31.0(@types/node@25.9.1)
'@shazhou/proman': '@shazhou/proman':
specifier: ^0.5.1 specifier: ^0.6.3
version: 0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0)) version: 0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))
'@types/node': '@types/node':
specifier: ^25.7.0 specifier: ^25.7.0
version: 25.9.1 version: 25.9.1
@@ -45,8 +45,8 @@ importers:
packages/agent-builtin: packages/agent-builtin:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/util': '@united-workforce/util':
specifier: workspace:^ specifier: workspace:^
version: link:../util version: link:../util
@@ -61,8 +61,8 @@ importers:
packages/agent-claude-code: packages/agent-claude-code:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -80,8 +80,8 @@ importers:
packages/agent-hermes: packages/agent-hermes:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -99,8 +99,8 @@ importers:
packages/agent-mock: packages/agent-mock:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -121,11 +121,11 @@ importers:
packages/cli: packages/cli:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@ocas/fs': '@ocas/fs':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -231,11 +231,11 @@ importers:
packages/eval: packages/eval:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@ocas/fs': '@ocas/fs':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -256,11 +256,11 @@ importers:
packages/protocol: packages/protocol:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@ocas/fs': '@ocas/fs':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
devDependencies: devDependencies:
typescript: typescript:
specifier: ^5.8.3 specifier: ^5.8.3
@@ -275,11 +275,11 @@ importers:
packages/util-agent: packages/util-agent:
dependencies: dependencies:
'@ocas/core': '@ocas/core':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@ocas/fs': '@ocas/fs':
specifier: ^0.3.0 specifier: ^0.4.0
version: 0.3.0 version: 0.4.0
'@united-workforce/protocol': '@united-workforce/protocol':
specifier: workspace:^ specifier: workspace:^
version: link:../protocol version: link:../protocol
@@ -892,11 +892,13 @@ packages:
resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==} resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==}
engines: {node: '>= 8'} engines: {node: '>= 8'}
'@ocas/core@0.3.0': '@ocas/core@0.4.0':
resolution: {integrity: sha512-ejDDZbmQkTj2GoJg+cNjXa3eHlQGybW3PrUZlwERBvBFjjnYBLHOG7AQQYM48bI52UiqucafgZjPEYk9SZd6AQ==} resolution: {integrity: sha512-6JvHd3nr5GncMOBNaZTf9ZTWou/txONTfZbkrblmgqL/H+YuRj1FfeFY+b1ndUlfwR7AuJ6bvoSxR5RP+AbC0w==}
engines: {node: '>=22.5.0'}
'@ocas/fs@0.3.0': '@ocas/fs@0.4.0':
resolution: {integrity: sha512-/6/nICYVJWXeWx2LcPoHHJAFoqXpJoAtvhLKLS0zpkwtsZX3g0D9X6J5soHCV1QS+BOWybuOJ0+W3cB1FBRkZA==} resolution: {integrity: sha512-AQG6dk1YCL1qpSszUWUgEY+LQhYbTv5hXYrs3J2pHAi2/lY615O2cTgjwEeh6JTcrqHsFwiDsDdKIKMpADchZA==}
engines: {node: '>=22.5.0'}
'@open-draft/deferred-promise@2.2.0': '@open-draft/deferred-promise@2.2.0':
resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==} resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==}
@@ -1152,8 +1154,8 @@ packages:
'@sec-ant/readable-stream@0.4.1': '@sec-ant/readable-stream@0.4.1':
resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==} resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==}
'@shazhou/proman@0.5.1': '@shazhou/proman@0.6.3':
resolution: {integrity: sha512-GmFUvd8SAOUW/eaDIEh31pVKSE3XhbgHOZ5vSpX4xS+F8Zl6lAfhgVCjcjRK8w5d43tsH47CVorwyxQcRaJFfA==} resolution: {integrity: sha512-KguWl1xHrWXx1YWYrWj47v4NRbaQuKCm7Hd7T8dzrqnkM8UL8em3R9rC7GeDzI8YDDfriFeLTX+xb03UHkhTDA==}
hasBin: true hasBin: true
peerDependencies: peerDependencies:
'@biomejs/biome': ^2.0.0 '@biomejs/biome': ^2.0.0
@@ -3896,16 +3898,16 @@ snapshots:
'@nodelib/fs.scandir': 2.1.5 '@nodelib/fs.scandir': 2.1.5
fastq: 1.20.1 fastq: 1.20.1
'@ocas/core@0.3.0': '@ocas/core@0.4.0':
dependencies: dependencies:
ajv: 8.20.0 ajv: 8.20.0
cborg: 4.5.8 cborg: 4.5.8
liquidjs: 10.27.0 liquidjs: 10.27.0
xxhash-wasm: 1.1.0 xxhash-wasm: 1.1.0
'@ocas/fs@0.3.0': '@ocas/fs@0.4.0':
dependencies: dependencies:
'@ocas/core': 0.3.0 '@ocas/core': 0.4.0
cborg: 4.5.8 cborg: 4.5.8
'@open-draft/deferred-promise@2.2.0': {} '@open-draft/deferred-promise@2.2.0': {}
@@ -4049,7 +4051,7 @@ snapshots:
'@sec-ant/readable-stream@0.4.1': {} '@sec-ant/readable-stream@0.4.1': {}
'@shazhou/proman@0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))': '@shazhou/proman@0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))':
dependencies: dependencies:
'@biomejs/biome': 2.4.16 '@biomejs/biome': 2.4.16
typescript: 5.9.3 typescript: 5.9.3
-329
@@ -1,329 +0,0 @@
name: solve-issue
description: TDD-driven issue resolution adapted for the workflow monorepo with bun + vitest
roles:
planner:
description: Analyzes issue and outputs a TDD test spec
goal: You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify.
capabilities:
- issue-analysis
- planning
procedure: 'On first run (no previous steps):
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md) in the repo
3. Assess whether the issue has enough information to produce a test spec
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
On subsequent runs (bounced back by tester with fix_spec):
1. Read the tester''s output from the previous step to understand what''s wrong with the spec
2. Revise the test spec accordingly
After producing the test spec:
1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
2. Put the hash in frontmatter.plan (required when $status=ready)
3. Set repoPath to the absolute path of the repository root
IMPORTANT: Extract the repo remote (owner/repo) from git:
```bash
git remote get-url origin | sed ''s|.*[:/]\([^/]*/[^.]*\).*|\1|''
```
Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.'
output: Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info.
frontmatter:
oneOf:
- properties:
$status:
const: ready
plan:
type: string
repoPath:
type: string
repoRemote:
type: string
required:
- $status
- plan
- repoPath
- properties:
$status:
const: insufficient_info
reason:
type: string
required:
- $status
- reason
developer:
description: TDD implementation per test spec
goal: You are a developer agent. You implement code changes following TDD — write tests first, then implementation.
capabilities:
- coding
procedure: "IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.\nThe repo path and other details are provided in your task prompt.\n\nBefore starting any work,\
\ set up an isolated worktree:\n1. cd into the repo path provided in your task prompt\n2. `git fetch origin` to get latest refs\n3. First time (no existing branch):\n - `git worktree add .worktrees/fix/<issue-number>-<short-slug>\
\ -b fix/<issue-number>-<short-slug> origin/main`\n - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`\n4. If bounced back from reviewer or tester (branch already exists):\n - cd\
\ into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`\n - `git fetch origin && git rebase origin/main`\n5. ALL subsequent work must happen inside the worktree directory.\n\
\nThen implement TDD:\n6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)\n7. If bounced back from reviewer or tester: read the\
\ previous role's feedback in your task prompt\n8. Write tests first based on the spec (use vitest)\n9. Implement the code to make tests pass\n10. Ensure `bun run build` passes with no errors\n11.\
\ Run `bun test` to verify all tests pass\n\nIf you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,\nor repeated attempts fail), set $status=failed\
\ with a reason.\n"
output: List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason).
frontmatter:
oneOf:
- properties:
$status:
const: done
branch:
type: string
worktree:
type: string
repoRemote:
type: string
required:
- $status
- branch
- worktree
- properties:
$status:
const: failed
reason:
type: string
repoRemote:
type: string
required:
- $status
- reason
reviewer:
description: Code standards compliance check
goal: You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job).
capabilities:
- code-review
- static-analysis
procedure: 'The worktree path is provided in your task prompt. cd into it first.
Before reviewing, verify the git branch:
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
2. If the branch doesn''t correspond to the issue, flag it in your output and reject
Then perform code review:
Hard checks (must all pass):
3. `bun run build` — no build errors
4. `bunx biome check` — no lint violations
5. TypeScript strict mode — no type errors
Soft checks (review against project conventions from CLAUDE.md):
- Functional-first: functions + types, no classes (except for errors or third-party requirements)
- Named exports only, no default exports
- No optional properties (use `T | null` instead of `?:`)
- Folder module discipline: index.ts only re-exports, types in types.ts
- Crockford Base32 log tags (8-char, unique per call site)
- No `console.log` in production code (use createLogger from @united-workforce/util)
- No dynamic imports in production code
Only review standards compliance. Do NOT test functionality.
If rejecting, you MUST explain the specific reason in your output.
'
output: Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments).
frontmatter:
oneOf:
- properties:
$status:
const: approved
branch:
type: string
worktree:
type: string
repoRemote:
type: string
required:
- $status
- branch
- worktree
- properties:
$status:
const: rejected
comments:
type: string
worktree:
type: string
repoRemote:
type: string
required:
- $status
- comments
- worktree
tester:
description: Functional correctness verification
goal: You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec.
capabilities:
- testing
procedure: "The worktree path is provided in your task prompt. cd into it first.\n\n1. Run `bun test` for automated test verification\n2. Read the test spec from CAS: `ocas get <plan hash>` (find\
\ the hash from the planner step in the thread history)\n3. Verify each scenario in the spec is covered and passing\n4. Determine outcome:\n - passed: all scenarios verified, tests pass\n - fix_code:\
\ tests fail or implementation doesn't match spec → send back to developer\n - fix_spec: the spec itself is wrong or incomplete → send back to planner\n"
output: Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report).
frontmatter:
oneOf:
- properties:
$status:
const: passed
branch:
type: string
worktree:
type: string
repoRemote:
type: string
required:
- $status
- branch
- worktree
- properties:
$status:
const: fix_code
report:
type: string
repoRemote:
type: string
worktree:
type: string
branch:
type: string
required:
- $status
- report
- properties:
$status:
const: fix_spec
report:
type: string
repoRemote:
type: string
worktree:
type: string
branch:
type: string
required:
- $status
- report
committer:
description: Commits and creates PR
goal: You are a committer agent. You create a clean commit and push a PR linking the original issue.
capabilities: []
procedure: "The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.\ncd into the worktree first.\n\nNote: You inherit the developer's worktree and branch. Do NOT\
\ create a new branch.\n1. Stage all changes: `git add -A`\n2. Commit with a descriptive message referencing the issue: `git commit -m \"type: description\\n\\nFixes #N\"`\n3. Push the branch: `git\
\ push -u origin <branch-name>`\n4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.\n - If no output or push failed: capture the error, mark hook_failed\n\
5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):\n ```bash\n GITEA_TOKEN=$(cfg get GITEA_TOKEN)\n curl -s -X POST -H \"Authorization: token $GITEA_TOKEN\" -H \"Content-Type: application/json\" \\\n\
\ \"https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls\" \\\n -d '{\"title\":\"...\",\"body\":\"...\",\"head\":\"<branch>\",\"base\":\"main\"}'\n ```\n - The repo remote (owner/repo format, e.g. \"shazhou/united-workforce\") is given in your task prompt — use it directly.\n\
\ - PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref\n6. **Verify PR was created** — parse the curl response JSON: it must contain a `\"number\"` field. Print the PR URL.\n\
\ - If curl returns an error or no number field: capture the response, mark hook_failed\n7. After PR creation, clean up the worktree:\n - cd to the repo root (parent of .worktrees)\n - `git worktree remove <worktree-path>`"
output: Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error).
frontmatter:
oneOf:
- properties:
$status:
const: committed
prUrl:
type: string
repoRemote:
type: string
worktree:
type: string
branch:
type: string
required:
- $status
- prUrl
- properties:
$status:
const: hook_failed
error:
type: string
repoRemote:
type: string
worktree:
type: string
branch:
type: string
required:
- $status
- error
graph:
$START:
new:
role: planner
prompt: Analyze the issue and produce an implementation plan.
resume:
role: planner
prompt: Review the previous run output and continue the work.
planner:
insufficient_info:
role: $SUSPEND
prompt: "Insufficient information; please provide: {{{reason}}}"
ready:
role: developer
prompt: 'Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}.'
developer:
done:
role: reviewer
prompt: 'Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}.'
failed:
role: $END
prompt: 'Developer failed: {{{reason}}}. Ending workflow.'
reviewer:
rejected:
role: developer
prompt: 'Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
approved:
role: tester
prompt: 'Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
tester:
fix_code:
role: developer
prompt: 'Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
fix_spec:
role: planner
prompt: 'Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}.'
passed:
role: committer
prompt: 'All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}.'
committer:
hook_failed:
role: developer
prompt: 'Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
committed:
role: $END
prompt: 'PR created: {{{prUrl}}}. Workflow complete.'