Compare commits
28 Commits
cli@0.3.0...e4c46c8150
| Author | SHA1 | Date | |
|---|---|---|---|
| | e4c46c8150 | | |
| | 9d0c6df62c | | |
| | 0f5bb1f191 | | |
| | 00d960daba | | |
| | 3a26285872 | | |
| | 2e7e5f6ec4 | | |
| | 88c077d439 | | |
| | aaadab4445 | | |
| | adf7837975 | | |
| | 513846f4ab | | |
| | aee123cc82 | | |
| | 8ddada5879 | | |
| | aa732f5466 | | |
| | e354fc4341 | | |
| | 0e7e3ea44b | | |
| | aa454c85dd | | |
| | 6dd7d521be | | |
| | 950dc056d8 | | |
| | d360b85374 | | |
| | 509dfad857 | | |
| | 58b84e3b3c | | |
| | f821ac99f4 | | |
| | 2c4700c49f | | |
| | 4410afcd4a | | |
| | a0e254a681 | | |
| | dd77b40f6c | | |
| | 5ed6f68e4b | | |
| | 1ed0bf1f76 | | |
@@ -1,9 +0,0 @@
----
-"@united-workforce/cli": patch
----
-
-fix: expand bootstrap prompt with full onboarding and upgrade guide
-
-Bootstrap now covers two scenarios:
-- Fresh install: CLI + adapter installation, `uwf setup` configuration, skill installation, end-to-end verification
-- Upgrade: package update, skill regeneration, breaking change migrations (e.g. $START new/resume)
@@ -1,8 +0,0 @@
----
-"@united-workforce/cli": patch
----
-
-fix: bootstrap adds Step 0 environment pre-flight check
-
-- Pre-flight checks for node, pnpm/npm, global bin PATH, hermes CLI with FIX instructions (#112)
-- Install commands changed from npm to pnpm (with npm fallback)
@@ -1,9 +0,0 @@
----
-"@united-workforce/cli": patch
-"@united-workforce/util": patch
----
-
-fix: workflow-authoring flat schema example uses enum, bootstrap adds PATH guidance
-
-- workflow-authoring: flat schema example uses `enum: [done]` instead of bare `const` (#110.3)
-- bootstrap: adds `which hermes` check and PATH guidance for venv installs (#110.4)
@@ -1,14 +0,0 @@
----
-"@united-workforce/cli": patch
----
-
-fix: improve bootstrap docs — agent discovery, pnpm/npm parity, preset provider table (#118, #120)
-
-- Step 1: detect installed agents (hermes/claude) before choosing adapter
-- Step 1: clarify adapter versions are independent from CLI — install @latest
-- Step 1: show pnpm and npm side-by-side
-- Step 1: add "adapter must be installed before `uwf setup --agent`" note
-- Step 1: add ACP verification step (hermes acp --help)
-- Step 2: `--agent` takes adapter command name (e.g. `uwf-hermes`), not npm package
-- Step 2: preset providers listed as a table with names and default base URLs
-- Remove uwf-builtin from supported adapters (not ready yet)
@@ -1,10 +0,0 @@
----
-"@united-workforce/cli": patch
----
-
-fix: preset provider base-url auto-fill, bootstrap ACP docs, friendlier name mismatch error
-
-- `uwf setup --provider dashscope` now auto-fills `--base-url` from preset list (#106)
-- Bootstrap guide documents uwf-hermes ACP dependency (`pip install hermes-agent[acp]`) (#107)
-- Bootstrap verify step uses inline workflow instead of missing `examples/eval-simple.yaml` (#107)
-- Workflow filename mismatch error now suggests how to fix it (#108)
@@ -0,0 +1,12 @@
+---
+"@united-workforce/protocol": patch
+"@united-workforce/util-agent": patch
+"@united-workforce/agent-builtin": patch
+"@united-workforce/agent-claude-code": patch
+"@united-workforce/agent-hermes": patch
+"@united-workforce/agent-mock": patch
+"@united-workforce/cli": patch
+"@united-workforce/eval": patch
+---
+
+Bump @ocas/core and @ocas/fs to ^0.4.0 (export/import closures, nodes subdirectory, lazy loading).
@@ -1,14 +0,0 @@
----
-"@united-workforce/cli": patch
-"@united-workforce/agent-hermes": patch
-"@united-workforce/agent-claude-code": patch
-"@united-workforce/agent-builtin": patch
-"@united-workforce/agent-mock": patch
----
-
-fix: suppress ExperimentalWarning, PEP 668 pip guidance, setup help (#116)
-
-- All CLI bins use shebang `#!/usr/bin/env -S node --disable-warning=ExperimentalWarning`
-- Remove NODE_OPTIONS injection from spawn (shebang handles it)
-- Bootstrap pip install guidance covers venv/pipx/source options for PEP 668 systems
-- `uwf setup --help` mentions interactive wizard mode
@@ -1,12 +0,0 @@
----
-"@united-workforce/cli": patch
----
-
-fix: setup UX improvements (#114)
-
-- Setup validates adapter availability and prints install command if missing
-- Setup prints "Config saved to <path> ✓" on success
-- Spawn ENOENT gives actionable error ("not found in PATH" + which command)
-- SQLite ExperimentalWarning suppressed via NODE_OPTIONS in spawned processes
-- Bootstrap VERSION reads cli package version (was reading util version)
-- Bootstrap PATH guidance is shell-agnostic (no hardcoded .bashrc/.profile)
@@ -1,9 +0,0 @@
----
-"@united-workforce/cli": minor
-"@united-workforce/util": patch
----
-
-feat: replace $START `_` status with `new`/`resume` semantics
-
-BREAKING: All workflow YAML files must update `$START._` to `$START.new` + `$START.resume`.
-The `resume` edge prompt replaces the previously hardcoded resume message in the CLI.
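The migration this changeset asks for can be sketched as a small transform over a parsed workflow graph. This is an illustration only: the `Edge` and `Graph` types and the `migrateStartEdges` name are assumptions made for the example, not part of the uwf codebase, and the default resume prompt is simply borrowed from the solve-issue workflow in this diff.

```typescript
// Hypothetical shapes for a parsed workflow graph; not the real uwf types.
type Edge = { role: string; prompt: string };
type Graph = Record<string, Record<string, Edge>>;

// Assumed placeholder text; a real workflow should supply its own resume prompt.
const DEFAULT_RESUME_PROMPT = "Review the previous run output and continue the work.";

// Rewrites a legacy `$START._` edge into `$START.new` plus `$START.resume`.
// Graphs without a `$START._` edge are returned unchanged.
function migrateStartEdges(graph: Graph, resumePrompt: string = DEFAULT_RESUME_PROMPT): Graph {
  const start = graph["$START"];
  const legacy = start === undefined ? undefined : start["_"];
  if (start === undefined || legacy === undefined) {
    return graph;
  }
  // Keep every existing $START edge except the legacy `_` one.
  const rest: Record<string, Edge> = {};
  for (const status of Object.keys(start)) {
    const edge = start[status];
    if (status !== "_" && edge !== undefined) {
      rest[status] = edge;
    }
  }
  return {
    ...graph,
    $START: {
      ...rest,
      new: rest["new"] ?? legacy,
      resume: rest["resume"] ?? { role: legacy.role, prompt: resumePrompt },
    },
  };
}

const migrated = migrateStartEdges({
  $START: { _: { role: "planner", prompt: "Analyze the issue." } },
  planner: { ready: { role: "$END", prompt: "Done." } },
});
console.log(Object.keys(migrated["$START"] ?? {}));
```

The old `_` edge is reused as `new`, and `resume` gets the same role with a separate prompt, since the changeset says the resume prompt now lives in the workflow rather than in the CLI.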
@@ -0,0 +1,11 @@
+---
+"@united-workforce/cli": minor
+---
+
+feat(cli): add `uwf thread poke` command
+
+New subcommand `uwf thread poke <thread-id> -p <prompt>` re-runs the head step's
+agent with a supplementary prompt, replacing the head step's output. Unlike
+`thread resume`, poke skips the moderator and rewrites the new step's `prev`
+pointer so the new head replaces (not appends to) the old head. Works on idle
+and suspended threads. Resolves issue #144 (Phase 1).
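The difference between appending a step and poking the head comes down to which `prev` pointer the new step receives. The sketch below models that with an in-memory linked list. The `Step` and `Thread` types and the `appendStep`, `pokeHead`, and `history` helpers are hypothetical names for illustration; the real uwf storage layout is not shown in this diff and may differ.

```typescript
// Hypothetical in-memory model of a thread; not the real uwf data structures.
type Step = { id: string; prev: string | null; output: string };
type Thread = { head: string | null; steps: Map<string, Step> };

// Normal progression: the new step points at the current head.
function appendStep(thread: Thread, id: string, output: string): void {
  thread.steps.set(id, { id, prev: thread.head, output });
  thread.head = id;
}

// Poke: the new step takes over the old head's `prev`, so the old head
// drops out of the chain instead of becoming the new step's parent.
function pokeHead(thread: Thread, id: string, output: string): void {
  const oldHead = thread.head === null ? undefined : thread.steps.get(thread.head);
  if (oldHead === undefined) {
    throw new Error("cannot poke a thread with no steps");
  }
  thread.steps.set(id, { id, prev: oldHead.prev, output });
  thread.head = id;
}

// Walks `prev` pointers from the head back to the first step.
function history(thread: Thread): string[] {
  const ids: string[] = [];
  let cursor: string | null = thread.head;
  while (cursor !== null) {
    ids.unshift(cursor);
    const step = thread.steps.get(cursor);
    cursor = step === undefined ? null : step.prev;
  }
  return ids;
}

const thread: Thread = { head: null, steps: new Map<string, Step>() };
appendStep(thread, "s1", "plan");
appendStep(thread, "s2", "first draft");
pokeHead(thread, "s2b", "revised draft");
console.log(history(thread));
```

After the poke, walking back from the head visits `s1` and `s2b` only: the old head `s2` is still stored in this sketch but is no longer reachable from the head.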
@@ -1,15 +0,0 @@
----
-"@united-workforce/cli": patch
-"@united-workforce/util": patch
----
-
-fix: unify $status to const-only, drop enum support (#123)
-
-Breaking: `$status` in frontmatter now requires `const` everywhere.
-`enum` is no longer accepted and will be rejected by the validator.
-
-- Validator: `hasStatusConst()` / `getConstStatuses()` replace enum-based checks
-- Error message: "must define $status as const (or oneOf with const)"
-- workflow-authoring docs: all examples use `const`, enum explicitly noted as unsupported
-- bootstrap hello.yaml: `$status: { const: done }`
-- All test fixtures migrated from enum to const/oneOf
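A const-only check along the lines of this changeset could look like the sketch below. The function names echo `hasStatusConst()` and `getConstStatuses()` from the changeset, but their signatures and bodies here are guesses at the behaviour, and `FrontmatterSchema` and `validateStatus` are hypothetical. The sketch only tests for the presence of `const`; how the real validator treats a schema carrying both `const` and `enum` is not stated in the diff.

```typescript
// Minimal schema shape for this sketch; the real validator's types are not shown in the diff.
type StatusSchema = { const?: string; enum?: string[] };
type FrontmatterSchema = {
  properties?: { $status?: StatusSchema };
  oneOf?: FrontmatterSchema[];
};

// Collects every status declared through `const`, either flat or inside `oneOf` branches.
function getConstStatuses(schema: FrontmatterSchema): string[] {
  const statuses: string[] = [];
  const flat = schema.properties?.$status?.const;
  if (flat !== undefined) {
    statuses.push(flat);
  }
  for (const branch of schema.oneOf ?? []) {
    statuses.push(...getConstStatuses(branch));
  }
  return statuses;
}

function hasStatusConst(schema: FrontmatterSchema): boolean {
  return getConstStatuses(schema).length > 0;
}

// Returns the changeset's error message for schemas without a const status, or null when valid.
function validateStatus(schema: FrontmatterSchema): string | null {
  return hasStatusConst(schema) ? null : "must define $status as const (or oneOf with const)";
}

const flatOk: FrontmatterSchema = { properties: { $status: { const: "done" } } };
const legacyEnum: FrontmatterSchema = { properties: { $status: { enum: ["done"] } } };
const branched: FrontmatterSchema = {
  oneOf: [
    { properties: { $status: { const: "ready" } } },
    { properties: { $status: { const: "insufficient_info" } } },
  ],
};
console.log(validateStatus(flatOk), validateStatus(legacyEnum), getConstStatuses(branched));
```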
@@ -1,247 +0,0 @@
-name: "solve-issue"
-description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
-roles:
-  planner:
-    description: "Analyzes issue and outputs a TDD test spec"
-    goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
-    capabilities:
-      - issue-analysis
-      - planning
-    procedure: |
-      On first run (no previous steps):
-      1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
-      2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
-      3. Assess whether the issue has enough information to produce a test spec
-      4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
-      5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
-
-      On subsequent runs (bounced back by tester with fix_spec):
-      1. Read the tester's output from the previous step to understand what's wrong with the spec
-      2. Revise the test spec accordingly
-
-      After producing the test spec:
-      1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
-      2. Put the plan hash in frontmatter.plan (required when $status=ready)
-      3. Set repoPath to the absolute path of the repository root
-
-      IMPORTANT: Extract the repo remote (owner/repo) from git:
-      ```bash
-      git remote get-url origin | sed 's|.*[:/]\([^/]*/[^.]*\).*|\1|'
-      ```
-      Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.
-    output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "ready" }
-            plan: { type: string }
-            repoPath: { type: string }
-            repoRemote: { type: string }
-          required: [$status, plan, repoPath, repoRemote]
-        - properties:
-            $status: { const: "insufficient_info" }
-            reason: { type: string }
-          required: [$status, reason]
-  developer:
-    description: "TDD implementation per test spec"
-    goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
-    capabilities:
-      - coding
-    procedure: |
-      IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
-      The repo path and other details are provided in your task prompt.
-
-      Before starting any work, set up an isolated worktree:
-      1. cd into the repo path provided in your task prompt
-      2. `git fetch origin` to get latest refs
-      3. First time (no existing branch):
-         - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
-         - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
-      4. If bounced back from reviewer or tester (branch already exists):
-         - cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
-         - `git fetch origin && git rebase origin/main`
-      5. ALL subsequent work must happen inside the worktree directory.
-
-      Then implement TDD:
-      6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)
-      7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
-      8. Write tests first based on the spec
-      9. Implement the code to make tests pass
-      10. Ensure `bun run build` passes with no errors
-      11. Run `bun test` to verify all tests pass
-          - If tests fail on first run:
-            * Read the test output carefully for missing imports or setup issues
-            * Check if you're running tests from the correct working directory (package root vs workspace root)
-            * Fix the immediate issue and rerun ONCE
-            * If tests still fail after 2 attempts: check the test spec for ambiguities
-            * If stuck after 3 test cycles: set $status=failed with detailed error report rather than continuing blind retries
-      12. MANDATORY VERIFICATION before reporting done:
-          - Run `git branch --show-current` and confirm branch name matches expected
-          - Run `git status` and verify changed files exist
-          - Run `ls -la <key-implementation-files>` to verify they exist on disk
-          - If ANY verification fails: retry the implementation, do NOT report done
-
-      If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
-      or repeated attempts fail), set $status=failed with a reason.
-    output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "done" }
-            branch: { type: string }
-            worktree: { type: string }
-            repoRemote: { type: string }
-          required: [$status, branch, worktree]
-        - properties:
-            $status: { const: "failed" }
-            reason: { type: string }
-          required: [$status, reason]
-  reviewer:
-    description: "Code standards compliance check"
-    goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
-    capabilities:
-      - code-review
-      - static-analysis
-    procedure: |
-      The worktree path is provided in your task prompt. cd into it first.
-
-      CRITICAL: You MUST execute every verification command below. Do NOT report results without running the actual commands. Do NOT rely on prior context or assumptions.
-
-      Before reviewing, verify the worktree and branch exist:
-      0. Run `cd <worktree-path> && pwd` to confirm the path is accessible
-         - If the cd fails: the worktree truly doesn't exist, reject with that reason
-         - If the cd succeeds: proceed with step 1 below
-      1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
-      2. If the branch doesn't correspond to the issue, flag it in your output and reject
-
-      Then perform code review:
-      Hard checks (must all pass):
-      3. `bun run build` — no build errors
-      4. `bunx biome check` — no lint violations
-      5. TypeScript strict mode — no type errors
-
-      Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
-      - Naming conventions, module boundaries, code style
-      - No `console.log` in production code
-      - No dynamic imports in production code
-
-      Only review standards compliance. Do NOT test functionality.
-      If rejecting, you MUST explain the specific reason in your output.
-    output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "approved" }
-            branch: { type: string }
-            worktree: { type: string }
-            repoRemote: { type: string }
-          required: [$status, branch, worktree]
-        - properties:
-            $status: { const: "rejected" }
-            comments: { type: string }
-            worktree: { type: string }
-            repoRemote: { type: string }
-          required: [$status, comments, worktree]
-  tester:
-    description: "Functional correctness verification"
-    goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
-    capabilities:
-      - testing
-    procedure: |
-      The worktree path is provided in your task prompt. cd into it first.
-
-      1. Run `bun test` for automated test verification
-      2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
-      3. Verify each scenario in the spec is covered and passing
-      4. Determine outcome:
-         - passed: all scenarios verified, tests pass
-         - fix_code: tests fail or implementation doesn't match spec → send back to developer
-         - fix_spec: the spec itself is wrong or incomplete → send back to planner
-    output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "passed" }
-            branch: { type: string }
-            worktree: { type: string }
-            repoRemote: { type: string }
-          required: [$status, branch, worktree]
-        - properties:
-            $status: { const: "fix_code" }
-            report: { type: string }
-            repoRemote: { type: string }
-            worktree: { type: string }
-            branch: { type: string }
-          required: [$status, report]
-        - properties:
-            $status: { const: "fix_spec" }
-            report: { type: string }
-            repoRemote: { type: string }
-            worktree: { type: string }
-            branch: { type: string }
-          required: [$status, report]
-  committer:
-    description: "Commits and creates PR"
-    goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
-    capabilities: []
-    procedure: |
-      The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.
-      cd into the worktree first.
-
-      Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
-      1. Check `git status` — if working tree is clean and branch is ahead of origin, skip to step 3 (push).
-      2. If there are unstaged/uncommitted changes: `git add -A` then `git commit -m "type: description\n\nFixes #N"`
-      3. Push the branch: `git push -u origin <branch-name>`
-      4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.
-         - If no output or push failed: capture the error, mark hook_failed
-      5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):
-         ```bash
-         GITEA_TOKEN=$(cfg get GITEA_TOKEN)
-         curl -s -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
-           "https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls" \
-           -d '{"title":"...","body":"...","head":"<branch>","base":"main"}'
-         ```
-         - The repo remote (owner/repo format, e.g. "shazhou/united-workforce") is given in your task prompt — use it directly.
-         - PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
-      6. **Verify PR was created** — parse the curl response JSON: it must contain a `"number"` field. Print the PR URL.
-         - If curl returns an error or no number field: capture the response, mark hook_failed
-      7. After PR creation, clean up the worktree:
-         - cd to the repo root (parent of .worktrees)
-         - `git worktree remove <worktree-path>`
-    output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
-    frontmatter:
-      oneOf:
-        - properties:
-            $status: { const: "committed" }
-            prUrl: { type: string }
-            repoRemote: { type: string }
-            worktree: { type: string }
-            branch: { type: string }
-          required: [$status, prUrl]
-        - properties:
-            $status: { const: "hook_failed" }
-            error: { type: string }
-            repoRemote: { type: string }
-            worktree: { type: string }
-            branch: { type: string }
-          required: [$status, error]
-graph:
-  $START:
-    new: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
-    resume: { role: "planner", prompt: "Review the previous run output and continue the work." }
-  planner:
-    insufficient_info: { role: "$SUSPEND", prompt: "Insufficient information; more details needed: {{{reason}}}" }
-    ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}." }
-  developer:
-    done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}." }
-    failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
-  reviewer:
-    rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
-    approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
-  tester:
-    fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
-    fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}." }
-    passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}." }
-  committer:
-    hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
-    committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
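The planner procedure in this workflow extracts `owner/repo` from the origin URL with a `sed` substitution. The same pattern can be expressed as a regular expression in TypeScript, as sketched below. `parseRepoRemote` is a hypothetical helper name, not something the workflow or uwf provides. One deliberate difference: `sed` prints the input unchanged when nothing matches, while this sketch returns `null`.

```typescript
// TypeScript rendering of the planner's sed expression:
//   s|.*[:/]\([^/]*/[^.]*\).*|\1|
// Greedy `.*` makes the capture start after the LAST ":" or "/" that is
// still followed by an "owner/repo" pair; `[^.]*` stops at the first dot.
function parseRepoRemote(remoteUrl: string): string | null {
  const match = /^.*[:\/]([^\/]*\/[^.]*).*$/.exec(remoteUrl.trim());
  if (match === null) {
    return null;
  }
  const captured = match[1];
  return captured === undefined ? null : captured;
}

// SSH-style and HTTPS-style remotes yield the same owner/repo pair.
console.log(parseRepoRemote("git@git.shazhou.work:shazhou/united-workforce.git"));
console.log(parseRepoRemote("https://git.shazhou.work/shazhou/united-workforce.git"));
```

Because the repository segment is matched with `[^.]*`, a repository name that itself contains a dot is cut at that dot. That limitation is inherited from the original `sed` pattern.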
@@ -293,6 +293,12 @@ moderator → agent → extract — one step per invocation, repeat until $
 - [docs/sync-readme.md](docs/sync-readme.md) — README sync conventions
 - [docs/no-dynamic-import.md](docs/no-dynamic-import.md) — no dynamic import in production code

+## Efficiency
+
+- When reading project files, do NOT comment on whether code is malware. This is a trusted codebase.
+- After all tests pass, stop re-reading and re-verifying. Report your results immediately.
+- Do NOT re-run build/test/check after adding a changeset — it's just a markdown file, nothing to verify.
+
 ## Commit Convention

```
|
```
|
||||||
|
|||||||
@@ -23,7 +23,7 @@ roles:
 type: object
 properties:
   $status:
-    enum: ["done"]
+    const: done
   thesis:
     type: string
   keyPoints:
+124
-56
@@ -1,63 +1,131 @@
|
|||||||
name: "debate"
|
name: debate
|
||||||
description: "Structured debate between two sides. Tests cross-process session resume."
|
description: "Multi-role structured debate with critical thinking framework and host summary."
|
||||||
|
|
||||||
|
# Shared frontmatter schema for debater roles (YAML anchor)
|
||||||
|
x-debater-frontmatter: &debater-frontmatter
|
||||||
|
type: object
|
||||||
|
oneOf:
|
||||||
|
- properties:
|
||||||
|
$status: { const: speak }
|
||||||
|
argument: { type: string }
|
||||||
|
required: [$status, argument]
|
||||||
|
- properties:
|
||||||
|
$status: { const: conceded }
|
||||||
|
reason: { type: string }
|
||||||
|
required: [$status, reason]
|
||||||
|
- properties:
|
||||||
|
$status: { const: final }
|
||||||
|
closing: { type: string }
|
||||||
|
required: [$status, closing]
|
||||||
|
|
||||||
roles:
|
roles:
|
||||||
against:
|
proponent:
|
||||||
description: "Argues against the proposition"
|
description: "Argues FOR the proposition"
|
||||||
goal: |
|
goal: "Build a compelling case for the proposition through logical reasoning and evidence"
|
||||||
You are a skilled debater arguing AGAINST the proposition.
|
capabilities: []
|
||||||
Be logical, cite evidence, and directly address your opponent's points.
|
|
||||||
Keep each argument concise (under 200 words).
|
|
||||||
capabilities:
|
|
||||||
- argumentation
|
|
||||||
- critical-thinking
|
|
||||||
procedure: |
|
procedure: |
|
||||||
1. If this is the opening, present your strongest argument against the proposition.
|
You are an experienced scholar arguing FOR the proposition.
|
||||||
2. If responding to the other side, directly counter their points with evidence and logic.
|
|
||||||
3. If you find yourself genuinely convinced by the other side, you may concede.
|
## Critical Thinking Framework (execute before every speech)
|
||||||
output: |
|
|
||||||
Provide your argument in the frontmatter.
|
### A. Pre-speech reflection (internal, do not output)
|
||||||
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
|
- Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
|
||||||
Otherwise set status to "continue".
|
- If I were my opponent, how would I attack this? Where am I weakest?
|
||||||
|
- Does my evidence actually support my claim, or could it backfire?
|
||||||
|
- Should I go on offense or defense this round?
|
||||||
|
|
||||||
|
### B. Evidence discipline
|
||||||
|
- Verify key numbers — watch for order-of-magnitude errors
|
||||||
|
- Assess data freshness — fast-moving fields have short half-lives
|
||||||
|
- Distinguish primary data from secondary citations, expert opinion, and common assumptions
|
||||||
|
|
||||||
|
### C. Anti-fragility
|
||||||
|
- Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
|
||||||
|
- Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
1. Check Thread Progress to see how many times you have spoken.
|
||||||
|
2. On your 3rd speech, you MUST output $status: final (closing statement).
|
||||||
|
3. If genuinely convinced by the opponent, output $status: conceded.
|
||||||
|
4. Otherwise output $status: speak and counter the opponent's points.
|
||||||
|
5. Be rigorous, cite evidence, stay concise.
|
||||||
|
output: "Debate argument"
|
||||||
|
frontmatter: *debater-frontmatter
|
||||||
|
|
||||||
|
opponent:
|
||||||
|
description: "Argues AGAINST the proposition"
|
||||||
|
goal: "Build a compelling case against the proposition through logical reasoning and evidence"
|
||||||
|
capabilities: []
|
||||||
|
procedure: |
|
||||||
|
You are an experienced scholar arguing AGAINST the proposition.
|
||||||
|
|
||||||
|
## Critical Thinking Framework (execute before every speech)
|
||||||
|
|
||||||
|
### A. Pre-speech reflection (internal, do not output)
|
||||||
|
- Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
|
||||||
|
- If I were my opponent, how would I attack this? Where am I weakest?
|
||||||
|
- Does my evidence actually support my claim, or could it backfire?
|
||||||
|
- Should I go on offense or defense this round?
|
||||||
|
|
||||||
|
### B. Evidence discipline
|
||||||
|
- Verify key numbers — watch for order-of-magnitude errors
|
||||||
|
- Assess data freshness — fast-moving fields have short half-lives
|
||||||
|
- Distinguish primary data from secondary citations, expert opinion, and common assumptions
|
||||||
|
|
||||||
|
### C. Anti-fragility
|
||||||
|
- Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
|
||||||
|
- Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
1. Check Thread Progress to see how many times you have spoken.
|
||||||
|
2. On your 3rd speech, or when the proponent has issued a final statement, you MUST output $status: final.
|
||||||
|
3. If genuinely convinced by the proponent, output $status: conceded.
|
||||||
|
4. Otherwise output $status: speak and counter the proponent's points.
|
||||||
|
5. Be rigorous, cite evidence, stay concise.
|
||||||
|
output: "Debate argument"
|
||||||
|
frontmatter: *debater-frontmatter
|
||||||
|
|
||||||
|
host:
|
||||||
|
description: "Debate moderator — delivers impartial summary and verdict"
|
||||||
|
goal: "Objectively review the debate, analyze both sides, and deliver a verdict"
|
||||||
|
capabilities: []
|
||||||
|
procedure: |
|
||||||
|
You are an experienced academic debate moderator.
|
||||||
|
|
||||||
|
## Task
|
||||||
|
1. Outline each side's core arguments
|
||||||
|
2. Evaluate reasoning quality and evidence use
|
||||||
|
3. Highlight the most impactful exchanges
|
||||||
|
4. Analyze the deeper significance of the topic
|
||||||
|
5. Deliver an overall verdict
|
||||||
|
|
||||||
|
## Style
|
||||||
|
- Impartial but with independent judgment
|
||||||
|
- Substantive, not superficial
|
||||||
|
output: "Debate summary report"
|
||||||
frontmatter:
|
frontmatter:
|
||||||
type: object
|
type: object
|
||||||
properties:
|
properties:
|
||||||
$status:
|
$status: { const: done }
|
||||||
enum: ["continue", "conceded"]
|
summary: { type: string }
|
||||||
argument:
|
highlights: { type: string }
|
||||||
type: string
|
verdict: { type: string }
|
||||||
required: [$status, argument]
|
required: [$status, summary, highlights, verdict]
|
||||||
for:
|
|
||||||
description: "Argues for the proposition"
|
|
||||||
goal: |
|
|
||||||
You are a skilled debater arguing FOR the proposition.
|
|
||||||
-Be logical, cite evidence, and directly address your opponent's points.
-Keep each argument concise (under 200 words).
-capabilities:
-- argumentation
-- critical-thinking
-procedure: |
-1. Read the opposing side's latest argument carefully.
-2. Counter their points with evidence and logic.
-3. If you find yourself genuinely convinced by the other side, you may concede.
-output: |
-Provide your argument in the frontmatter.
-Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
-Otherwise set status to "continue".
-frontmatter:
-type: object
-properties:
-$status:
-enum: ["continue", "conceded"]
-argument:
-type: string
-required: [$status, argument]
 graph:
 $START:
-new: { role: "against", prompt: "Present your opening argument against the proposition." }
-resume: { role: "against", prompt: "Review the previous debate output and continue the argument against the proposition." }
-against:
-conceded: { role: "$END", prompt: "The against side conceded. Debate over." }
-continue: { role: "for", prompt: "Counter the opposing argument: {{{argument}}}" }
-for:
-conceded: { role: "$END", prompt: "The for side conceded. Debate over." }
-continue: { role: "against", prompt: "Counter the opposing argument: {{{argument}}}" }
+new: { role: proponent, prompt: "The debate begins. You are arguing FOR the proposition. Present your opening argument." }
+resume: { role: proponent, prompt: "The debate continues." }
+
+proponent:
+speak: { role: opponent, prompt: "Proponent argues:\n\n{{{argument}}}\n\nYou are the opponent. Counter this argument." }
+conceded: { role: host, prompt: "The proponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
+final: { role: opponent, prompt: "Proponent's closing statement:\n\n{{{closing}}}\n\nYou are the opponent. Deliver your final response." }
+
+opponent:
+speak: { role: proponent, prompt: "Opponent argues:\n\n{{{argument}}}\n\nYou are the proponent. Counter this argument." }
+conceded: { role: host, prompt: "The opponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
+final: { role: host, prompt: "Opponent's closing statement:\n\n{{{closing}}}\n\nThe debate is over. Please summarize." }
+
+host:
+done: { role: "$END", prompt: "Summary complete." }
@@ -18,8 +18,7 @@ roles:
 type: object
 properties:
 $status:
-type: string
-enum: [done]
+const: done
 summary:
 type: string
 required: [$status, summary]
@@ -1,5 +1,5 @@
 name: "solve-issue"
-description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
+description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds. Uses pnpm."
 roles:
 planner:
 description: "Analyzes issue and outputs a TDD test spec"
@@ -80,7 +80,7 @@ roles:
 2. `git fetch origin` to get latest refs
 3. First time (no existing branch):
 - `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
-- `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
+- `cd .worktrees/fix/<issue-number>-<short-slug> && pnpm install`
 4. If continuing on existing branch (prompt says "Continue work on existing branch" or provides a worktree path):
 - cd directly into the worktree path provided in the prompt
 - `git fetch origin && git rebase origin/main`
@@ -95,8 +95,20 @@ roles:
 7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
 8. Write tests first based on the spec
 9. Implement the code to make tests pass
-10. Ensure `bun run build` passes with no errors
-11. Run `bun test` to verify all tests pass
+10. Ensure `pnpm run build` passes with no errors
+11. Run `pnpm test` to verify all tests pass
+
+After implementation, before reporting done:
+12. Add a changeset file (`.changeset/<short-slug>.md`) with correct bump type:
+- `patch` for bug fixes, internal refactors, test-only changes
+- `minor` for new features, new CLI commands, new API surfaces
+- `major` for breaking changes
+List every affected package in the changeset frontmatter.
+13. Update documentation if the change affects user-facing behavior:
+- `README.md` — usage examples, feature descriptions
+- `.cards/` — architecture decision records (if applicable)
+- CLI prompt subcommand output (if CLI help text changes)
+- CLI `--help` text (if flags/commands are added or changed)
 
 If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
 or repeated attempts fail), set $status=failed with a reason.
@@ -127,8 +139,8 @@ roles:
 
 Then perform code review:
 Hard checks (must all pass):
-3. `bun run build` — no build errors
-4. `bunx biome check` — no lint violations
+3. `pnpm run build` — no build errors
+4. `pnpm run check` — no lint violations
 5. TypeScript strict mode — no type errors
 
 Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
@@ -136,6 +148,14 @@ roles:
 - No `console.log` in production code
 - No dynamic imports in production code
 
+Documentation & changeset checks:
+6. Changeset exists in `.changeset/` with correct bump type (`patch`/`minor`/`major`) and lists all affected packages
+7. If the change is user-facing, documentation is updated:
+- `README.md` reflects new/changed behavior
+- `.cards/` architecture cards updated if design decisions changed
+- CLI prompt subcommand output updated (if it generates skill/reference content)
+- CLI `--help` text matches new flags/commands
+
 Only review standards compliance. Do NOT test functionality.
 If rejecting, you MUST explain the specific reason in your output.
 output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
@@ -159,7 +179,7 @@ roles:
 procedure: |
 The worktree path is provided in your task prompt. cd into it first.
 
-1. Run `bun test` for automated test verification
+1. Run `pnpm test` for automated test verification
 2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
 3. Verify each scenario in the spec is covered and passing
 4. Determine outcome:
@@ -21,7 +21,7 @@
 "@agentclientprotocol/sdk": "^0.22.1",
 "@biomejs/biome": "^2.4.14",
 "@changesets/cli": "^2.31.0",
-"@shazhou/proman": "^0.5.1",
+"@shazhou/proman": "^0.6.3",
 "@types/node": "^25.7.0",
 "@types/xxhashjs": "^0.2.4",
 "@united-workforce/agent-hermes": "workspace:*",
@@ -1,6 +1,6 @@
 {
 "name": "@united-workforce/agent-builtin",
-"version": "0.1.2",
+"version": "0.1.3",
 "files": [
 "src",
 "dist",
@@ -21,7 +21,7 @@
 "test:ci": "vitest run __tests__/"
 },
 "dependencies": {
-"@ocas/core": "^0.3.0",
+"@ocas/core": "^0.4.0",
 "@united-workforce/util": "workspace:^",
 "@united-workforce/util-agent": "workspace:^"
 },
@@ -1,6 +1,6 @@
 {
 "name": "@united-workforce/agent-claude-code",
-"version": "0.1.2",
+"version": "0.1.4",
 "files": [
 "src",
 "dist",
@@ -21,7 +21,7 @@
 "test:ci": "vitest run __tests__/"
 },
 "dependencies": {
-"@ocas/core": "^0.3.0",
+"@ocas/core": "^0.4.0",
 "@united-workforce/protocol": "workspace:^",
 "@united-workforce/util": "workspace:^",
 "@united-workforce/util-agent": "workspace:^"
@@ -6,7 +6,9 @@ import {
 type AgentContext,
 type AgentRunResult,
 buildContinuationPrompt,
+buildFrontmatterRetryPrompt,
 buildRolePrompt,
+buildThreadProgress,
 createAgent,
 getCachedSessionId,
 setCachedSessionId,
@@ -27,6 +29,10 @@ export function buildClaudeCodePrompt(ctx: AgentContext): string {
 if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
 parts.push(ctx.outputFormatInstruction, "");
 }
+
+// Inject thread progress so the agent knows step count and role visit count
+parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
+
 parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
 
 if (!ctx.isFirstVisit) {
@@ -171,8 +177,12 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
 
 log("K7R2M4N8", `prompt for role=${ctx.role} (length=${fullPrompt.length}):\n${fullPrompt}`);
 
-// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
-if (!ctx.isFirstVisit) {
+// Try resuming a cached session. This covers both normal re-entry
+// (e.g. reviewer reject → developer re-entry) AND the case where a
+// previous run completed but frontmatter validation failed — the step
+// was never written to CAS so isFirstVisit is still true, but the
+// session cache holds a valid session we should resume.
+{
 const cachedSessionId = await getCachedSessionId(
 "claude-code",
 ctx.threadId,
@@ -180,13 +190,20 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
 ctx.storageRoot,
 );
 if (cachedSessionId !== null) {
+// isFirstVisit + cache hit = previous run completed but frontmatter
+// validation failed. The session already has full context — send a
+// minimal correction prompt instead of the full initial prompt.
+const resumePrompt = ctx.isFirstVisit
+? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
+: fullPrompt;
+
 try {
 const { stdout, stderr, exitCode } = await spawnClaudeResume(
 cachedSessionId,
-fullPrompt,
+resumePrompt,
 model,
 );
-const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, fullPrompt);
+const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, resumePrompt);
 if (result.sessionId !== undefined && result.sessionId !== "") {
 await setCachedSessionId(
 "claude-code",
@@ -1,6 +1,6 @@
 {
 "name": "@united-workforce/agent-hermes",
-"version": "0.1.3",
+"version": "0.1.5",
 "files": [
 "src",
 "dist",
@@ -21,7 +21,7 @@
 "test:ci": "vitest run __tests__/"
 },
 "dependencies": {
-"@ocas/core": "^0.3.0",
+"@ocas/core": "^0.4.0",
 "@united-workforce/protocol": "workspace:^",
 "@united-workforce/util": "workspace:^",
 "@united-workforce/util-agent": "workspace:^"
@@ -12,7 +12,11 @@ const OWN_VERSION = (
 }
 ).version;
 
-const HERMES_COMMAND = "hermes";
+/** Resolve hermes binary: `UWF_HERMES_BIN` override → default `"hermes"` via PATH. */
+function resolveHermesCommand(): string {
+const override = process.env.UWF_HERMES_BIN;
+return override !== undefined && override !== "" ? override : "hermes";
+}
 const PROTOCOL_VERSION = 1;
 
 type JsonRpcResponse = {
@@ -271,7 +275,8 @@ export class HermesAcpClient {
 return;
 }
 
-const child = spawn(HERMES_COMMAND, ["acp"], {
+const hermesCommand = resolveHermesCommand();
+const child = spawn(hermesCommand, ["acp"], {
 env: process.env,
 shell: false,
 stdio: ["pipe", "pipe", "pipe"],
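The `UWF_HERMES_BIN` override introduced in the hunks above can be exercised in isolation. The following is a minimal sketch, not code from this change: `resolveBinary` and the `Env` type are hypothetical names, and the environment is passed in as a plain object (an assumption made here) so the rule can be checked without touching `process.env`.

```typescript
// Hypothetical helper for illustration only; the diff's resolveHermesCommand()
// reads process.env directly and hard-codes the variable name and fallback.
type Env = Record<string, string | undefined>;

function resolveBinary(env: Env, varName: string, fallback: string): string {
  const override = env[varName];
  // An unset or empty variable falls back to the default name, resolved via PATH.
  return override !== undefined && override !== "" ? override : fallback;
}

const fromOverride = resolveBinary(
  { UWF_HERMES_BIN: "/opt/venv/bin/hermes" },
  "UWF_HERMES_BIN",
  "hermes",
);
const fromEmpty = resolveBinary({ UWF_HERMES_BIN: "" }, "UWF_HERMES_BIN", "hermes");
const fromUnset = resolveBinary({}, "UWF_HERMES_BIN", "hermes");

console.log(fromOverride, fromEmpty, fromUnset);
```

Treating the empty string like an unset variable matches the check in the diff and avoids passing an empty command name to `spawn`.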
@@ -5,7 +5,9 @@ import {
 type AgentContext,
 type AgentRunResult,
 buildContinuationPrompt,
+buildFrontmatterRetryPrompt,
 buildRolePrompt,
+buildThreadProgress,
 createAgent,
 } from "@united-workforce/util-agent";
 import type { AcpUsage } from "./acp-client.js";
@@ -60,6 +62,9 @@ export function buildHermesPrompt(ctx: AgentContext): string {
 parts.push(ctx.outputFormatInstruction, "");
 }
 
+// Inject thread progress so the agent knows step count and role visit count
+parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
+
 if (!ctx.isFirstVisit) {
 // Re-entry: show only steps since last visit, meta only
 parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
@@ -98,6 +103,8 @@ async function storePromptResult(store: Store, sessionId: string): Promise<{ det
 type PromptAttempt = {
 useContinuation: boolean;
 resumed: boolean;
+/** True when resuming after a frontmatter-only failure (isFirstVisit + cache hit). */
+frontmatterRetry: boolean;
 };
 
 async function prepareSession(
@@ -106,28 +113,36 @@ async function prepareSession(
 cwd: string,
 resumeDisabled: boolean,
 ): Promise<PromptAttempt> {
-if (ctx.isFirstVisit || resumeDisabled) {
+if (resumeDisabled) {
 await client.connect(cwd);
-return { useContinuation: false, resumed: false };
+return { useContinuation: false, resumed: false, frontmatterRetry: false };
 }
 
+// Check session cache regardless of isFirstVisit. A previous run may
+// have completed and cached its session but failed frontmatter
+// validation — the step never got written to CAS so isFirstVisit is
+// still true, yet we should resume the existing session.
 const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role, ctx.storageRoot);
 if (cachedSessionId === null) {
 log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`);
 await client.connect(cwd);
-return { useContinuation: false, resumed: false };
+return { useContinuation: false, resumed: false, frontmatterRetry: false };
 }
 
 try {
 await client.resume(cachedSessionId, cwd);
 log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`);
-return { useContinuation: true, resumed: true };
+return {
+useContinuation: !ctx.isFirstVisit,
+resumed: true,
+frontmatterRetry: ctx.isFirstVisit,
+};
 } catch (error) {
 const message = error instanceof Error ? error.message : String(error);
 log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`);
 await client.close();
 await client.connect(cwd);
-return { useContinuation: false, resumed: false };
+return { useContinuation: false, resumed: false, frontmatterRetry: false };
 }
 }
 
@@ -150,9 +165,12 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
 ctx: AgentContext,
 useContinuation: boolean,
 beforeTurns: TurnsSnapshot,
+frontmatterRetry: boolean,
 ): Promise<AgentRunResult> {
-const effectiveCtx = useContinuation ? ctx : { ...ctx, isFirstVisit: true };
-const fullPrompt = buildHermesPrompt(effectiveCtx);
+// Frontmatter retry: session has full context, just re-output the format.
+const fullPrompt = frontmatterRetry
+? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
+: buildHermesPrompt(useContinuation ? ctx : { ...ctx, isFirstVisit: true });
 const startMs = Date.now();
 const { text, sessionId, usage: acpUsage } = await client.prompt(fullPrompt);
 const durationSec = (Date.now() - startMs) / 1000;
@@ -184,7 +202,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
 const beforeTurns = snapshotTurns(beforeSession);
 
 try {
-return await runPrompt(ctx, attempt.useContinuation, beforeTurns);
+return await runPrompt(ctx, attempt.useContinuation, beforeTurns, attempt.frontmatterRetry);
 } catch (error) {
 if (!attempt.resumed) {
 throw error;
@@ -195,7 +213,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
 await client.close();
 await client.connect(cwd);
 // Fresh session after retry — reset snapshot to zero
-return runPrompt(ctx, false, ZERO_TURNS);
+return runPrompt(ctx, false, ZERO_TURNS, false);
 }
 }
 
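The session-resume rules that the `prepareSession` hunk above introduces reduce to a small decision table. What follows is a minimal sketch, assuming the resume call itself succeeds (the `catch` fallback to a fresh session is left out). `decideAttempt` and `hasCachedSession` are hypothetical names introduced for illustration and do not appear in the change.

```typescript
// Mirrors the PromptAttempt shape shown in the diff.
type PromptAttempt = {
  useContinuation: boolean;
  resumed: boolean;
  frontmatterRetry: boolean;
};

// Hypothetical pure function: same branching as prepareSession, minus all I/O.
function decideAttempt(
  isFirstVisit: boolean,
  resumeDisabled: boolean,
  hasCachedSession: boolean,
): PromptAttempt {
  // Resume switched off, or nothing cached: always start a new session.
  if (resumeDisabled || !hasCachedSession) {
    return { useContinuation: false, resumed: false, frontmatterRetry: false };
  }
  // Cache hit on a first visit means the earlier run failed frontmatter
  // validation; cache hit on a later visit is an ordinary re-entry.
  return {
    useContinuation: !isFirstVisit,
    resumed: true,
    frontmatterRetry: isFirstVisit,
  };
}

const fresh = decideAttempt(true, false, false);
const reentry = decideAttempt(false, false, true);
const retry = decideAttempt(true, false, true);

console.log(fresh, reentry, retry);
```

On a resumed session exactly one of `useContinuation` and `frontmatterRetry` is true, which is why `runPrompt` in the diff can pick between the retry prompt and the regular prompt with a single conditional.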
@@ -1,6 +1,6 @@
 {
 "name": "@united-workforce/agent-mock",
-"version": "0.1.2",
+"version": "0.1.3",
 "files": [
 "src",
 "dist",
@@ -21,7 +21,7 @@
 "test:ci": "vitest run __tests__/"
 },
 "dependencies": {
-"@ocas/core": "^0.3.0",
+"@ocas/core": "^0.4.0",
 "@united-workforce/protocol": "workspace:^",
 "@united-workforce/util": "workspace:^",
 "@united-workforce/util-agent": "workspace:^",
@@ -1,6 +1,6 @@
 {
 "name": "@united-workforce/cli",
-"version": "0.3.0",
+"version": "0.3.1",
 "files": [
 "src",
 "dist",
@@ -11,8 +11,8 @@
 "uwf": "./dist/cli.js"
 },
 "dependencies": {
-"@ocas/core": "^0.3.0",
-"@ocas/fs": "^0.3.0",
+"@ocas/core": "^0.4.0",
+"@ocas/fs": "^0.4.0",
 "@united-workforce/protocol": "workspace:^",
 "@united-workforce/util": "workspace:^",
 "@united-workforce/util-agent": "workspace:^",
@@ -21,11 +21,11 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
 "..",
 "..",
 "..",
-".workflows",
+"examples",
 "solve-issue.yaml",
 );
 
-test("committer procedure should use curl API instead of tea pr create", async () => {
+test("committer procedure should create PR via tea pr create", async () => {
 const yamlContent = await readFile(workflowPath, "utf-8");
 const workflow = parse(yamlContent) as WorkflowPayload;
 
@@ -33,25 +33,22 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
 const committerProcedure = workflow.roles.committer?.procedure;
 expect(committerProcedure).toBeDefined();
 
-// Verify the procedure uses curl API, not tea pr create
-expect(committerProcedure).toContain("curl");
-expect(committerProcedure).toContain("api/v1/repos");
-expect(committerProcedure).toContain("/pulls");
-
-// Verify it explicitly warns against tea pr create
-expect(committerProcedure).toMatch(/do NOT use.*tea pr create/i);
+// Verify the procedure uses tea pr create for PR creation
+expect(committerProcedure).toContain("tea pr create");
+expect(committerProcedure).toContain("git push");
+expect(committerProcedure).toContain("Fixes #N");
 });
 
-test("committer procedure should reference repoRemote from task prompt", async () => {
+test("committer procedure should extract owner/repo from git remote", async () => {
 const yamlContent = await readFile(workflowPath, "utf-8");
 const workflow = parse(yamlContent) as WorkflowPayload;
 
 const committerProcedure = workflow.roles.committer?.procedure;
 expect(committerProcedure).toBeDefined();
 
-// Verify the procedure mentions repoRemote is provided in task prompt
-expect(committerProcedure).toMatch(/repo remote.*provided.*task prompt/i);
-expect(committerProcedure).toMatch(/owner\/repo/i);
+// Verify the procedure extracts owner/repo from remote
+expect(committerProcedure).toContain("git remote get-url origin");
+expect(committerProcedure).toContain("hook_failed");
 });
 
 test("committer procedure should include error handling for curl failures", async () => {
@@ -100,45 +97,42 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
 expect(committedVariant.required).toContain("$status");
 });
 
-test("developer procedure should include mandatory verification step", async () => {
+test("developer procedure should include worktree setup", async () => {
 const yamlContent = await readFile(workflowPath, "utf-8");
 const workflow = parse(yamlContent) as WorkflowPayload;
 
 const developerProcedure = workflow.roles.developer?.procedure;
 expect(developerProcedure).toBeDefined();
 
-// Verify the procedure includes mandatory verification step
-expect(developerProcedure).toContain("MANDATORY VERIFICATION");
-expect(developerProcedure).toContain("git branch --show-current");
-expect(developerProcedure).toContain("git status");
-expect(developerProcedure).toMatch(/ls -la|verify.*exist/i);
+// Verify the procedure includes worktree setup
+expect(developerProcedure).toContain("IMPORTANT");
+expect(developerProcedure).toContain("git worktree add");
+expect(developerProcedure).toContain("pnpm install");
 });
 
-test("reviewer procedure should enforce worktree path verification", async () => {
+test("reviewer procedure should verify branch and run checks", async () => {
 const yamlContent = await readFile(workflowPath, "utf-8");
 const workflow = parse(yamlContent) as WorkflowPayload;
 
 const reviewerProcedure = workflow.roles.reviewer?.procedure;
 expect(reviewerProcedure).toBeDefined();
 
-// Verify the procedure includes critical enforcement
-expect(reviewerProcedure).toContain("CRITICAL");
-expect(reviewerProcedure).toMatch(/cd.*pwd/);
-expect(reviewerProcedure).toContain(
-"Do NOT report results without running the actual commands",
-);
+// Verify the procedure includes branch verification and build checks
+expect(reviewerProcedure).toContain("git branch --show-current");
+expect(reviewerProcedure).toContain("pnpm run build");
+expect(reviewerProcedure).toContain("pnpm run check");
 });
 
-test("developer procedure should include test debugging escalation", async () => {
+test("developer procedure should include changeset and failure handling", async () => {
 const yamlContent = await readFile(workflowPath, "utf-8");
 const workflow = parse(yamlContent) as WorkflowPayload;
 
 const developerProcedure = workflow.roles.developer?.procedure;
 expect(developerProcedure).toBeDefined();
 
-// Verify the procedure includes test failure guidance
-expect(developerProcedure).toMatch(/tests fail.*first run/i);
-expect(developerProcedure).toMatch(/3 test cycles|after 3 attempts/i);
+// Verify the procedure includes changeset requirement and failure path
+expect(developerProcedure).toContain(".changeset/");
 expect(developerProcedure).toContain("$status=failed");
+expect(developerProcedure).toContain("pnpm test");
 });
 });
@@ -0,0 +1,549 @@
|
|||||||
|
import { execFileSync } from "node:child_process";
|
||||||
|
import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
|
||||||
|
import { tmpdir } from "node:os";
|
||||||
|
import { dirname, join } from "node:path";
|
||||||
|
import { fileURLToPath } from "node:url";
|
||||||
|
import { putSchema } from "@ocas/core";
|
||||||
|
import { openStore } from "@ocas/fs";
|
||||||
|
import type {
|
||||||
|
CasRef,
|
||||||
|
StepNodePayload,
|
||||||
|
ThreadId,
|
||||||
|
ThreadIndexEntry,
|
||||||
|
} from "@united-workforce/protocol";
|
||||||
|
import { afterEach, beforeEach, describe, expect, test } from "vitest";
|
||||||
|
import { registerUwfSchemas } from "../schemas.js";
|
||||||
|
import { seedThreads } from "./thread-test-helpers.js";
|
||||||
|
|
||||||
|
const OUTPUT_SCHEMA = {
|
||||||
|
type: "object" as const,
|
||||||
|
properties: {
|
||||||
|
$status: { type: "string" as const },
|
||||||
|
note: { type: "string" as const },
|
||||||
|
},
|
||||||
|
required: ["$status"],
|
||||||
|
additionalProperties: false,
|
||||||
|
};
|
||||||
|
|
||||||
|
const THREAD_ID = "01POKESTEPTEST00000000" as ThreadId;
|
||||||
|
|
||||||
|
let tmpDir: string;
|
||||||
|
|
||||||
|
beforeEach(async () => {
|
||||||
|
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-poke-test-"));
|
||||||
|
});
|
||||||
|
|
||||||
|
afterEach(async () => {
|
||||||
|
await rm(tmpDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
+type SetupResult = {
+  casDir: string;
+  oldStepHash: CasRef;
+  oldStepPrev: CasRef | null;
+  oldStepCompletedAtMs: number;
+  startHash: CasRef;
+  workflowHash: CasRef;
+  mockAgentPath: string;
+  failingAgentPath: string;
+  promptCapturePath: string;
+  envCapturePath: string;
+};
+
+type SetupOpts = {
+  threadStatus: ThreadIndexEntry["status"];
+  multipleSteps: boolean;
+  newCompletedAtMs: number;
+  newStatus: string;
+  // The agent name to record in the head StepNode.agent field. Defaults to mockAgentPath.
+  stepAgentNameOverride: string | null;
+  // Whether to seed an actual head StepNode (false → only StartNode is the head).
+  withHeadStep: boolean;
+};
+
+async function setupThread(opts: Partial<SetupOpts> = {}): Promise<SetupResult> {
+  const cfg: SetupOpts = {
+    threadStatus: opts.threadStatus ?? "idle",
+    multipleSteps: opts.multipleSteps ?? false,
+    newCompletedAtMs: opts.newCompletedAtMs ?? 1716600005000,
+    newStatus: opts.newStatus ?? "ok",
+    stepAgentNameOverride: opts.stepAgentNameOverride ?? null,
+    withHeadStep: opts.withHeadStep ?? true,
+  };
+
+  const casDir = join(tmpDir, "cas");
+  await mkdir(casDir, { recursive: true });
+
+  const store = await openStore(casDir);
+  const schemas = await registerUwfSchemas(store);
+  const outputSchemaHash = await putSchema(store, OUTPUT_SCHEMA);
+
+  const workflowHash = await store.cas.put(schemas.workflow, {
+    name: "test-poke",
+    description: "poke command integration test",
+    roles: {
+      worker: {
+        description: "Worker role",
+        goal: "Work",
+        capabilities: [],
+        procedure: "work",
+        output: "result",
+        frontmatter: outputSchemaHash,
+      },
+      reviewer: {
+        description: "Reviewer role",
+        goal: "Review",
+        capabilities: [],
+        procedure: "review",
+        output: "result",
+        frontmatter: outputSchemaHash,
+      },
+    },
+    graph: {
+      $START: {
+        new: { role: "worker", prompt: "Start work", location: null },
+        resume: { role: "worker", prompt: "Resume the work", location: null },
+      },
+      worker: {
+        ok: { role: "reviewer", prompt: "Review the work", location: null },
+        needs_input: {
+          role: "$SUSPEND",
+          prompt: "Please clarify",
+          location: null,
+        },
+      },
+      reviewer: { done: { role: "$END", prompt: "Done", location: null } },
+    },
+  });
+
+  const startHash = await store.cas.put(schemas.startNode, {
+    workflow: workflowHash,
+    prompt: "Test poke task",
+    cwd: tmpDir,
+  });
+
+  process.env.OCAS_HOME = casDir;
+
+  // Paths for mock agent and capture files (set early so we can use mockAgentPath as the recorded agent name)
+  const promptCapturePath = join(tmpDir, "captured-prompt.txt");
+  const envCapturePath = join(tmpDir, "captured-env.txt");
+  const mockAgentPath = join(tmpDir, "mock-agent.sh");
+  const failingAgentPath = join(tmpDir, "failing-agent.sh");
+
+  // Build head StepNode chain
+  let oldStepPrev: CasRef | null = null;
+  if (cfg.multipleSteps) {
+    // First step: prev=null
+    const firstOutputHash = await store.cas.put(outputSchemaHash, { $status: "ok" });
+    const firstDetailHash = await store.cas.put(schemas.text, "first detail");
+    const firstStepHash = await store.cas.put(schemas.stepNode, {
+      start: startHash,
+      prev: null,
+      role: "worker",
+      output: firstOutputHash,
+      detail: firstDetailHash,
+      agent: cfg.stepAgentNameOverride ?? mockAgentPath,
+      edgePrompt: "Start work",
+      startedAtMs: 1716600000000,
+      completedAtMs: 1716600001000,
+      cwd: tmpDir,
+      assembledPrompt: null,
+      usage: null,
+    });
+    oldStepPrev = firstStepHash;
+  }
+
+  let oldStepHash: CasRef = startHash;
+  const oldStepCompletedAtMs = 1716600002000;
+  if (cfg.withHeadStep) {
+    const outputHash = await store.cas.put(outputSchemaHash, { $status: "ok" });
+    const detailHash = await store.cas.put(schemas.text, "head step detail");
+    oldStepHash = await store.cas.put(schemas.stepNode, {
+      start: startHash,
+      prev: oldStepPrev,
+      role: "worker",
+      output: outputHash,
+      detail: detailHash,
+      agent: cfg.stepAgentNameOverride ?? mockAgentPath,
+      edgePrompt: "Start work",
+      startedAtMs: 1716600001500,
+      completedAtMs: oldStepCompletedAtMs,
+      cwd: tmpDir,
+      assembledPrompt: null,
+      usage: null,
+    });
+  }
+
+  // Seed thread index entry. For "running" we let the test create the marker separately.
+  await seedThreads(tmpDir, {
+    [THREAD_ID]: {
+      head: oldStepHash,
+      status: cfg.threadStatus,
+      suspendedRole: cfg.threadStatus === "suspended" ? "worker" : null,
+      suspendMessage: cfg.threadStatus === "suspended" ? "Please clarify" : null,
+      completedAt:
+        cfg.threadStatus === "completed" || cfg.threadStatus === "cancelled"
+          ? oldStepCompletedAtMs
+          : null,
+    },
+  });
+
+  // Mock agent always emits a stepNode keyed off the current thread head (which we
+  // observe through OCAS_HOME). The script writes prompt/env captures and then prints
+  // an adapter JSON that references a pre-built stepHash.
+  // We pre-build the agent's stepHash with prev=oldStepHash (normal append behaviour).
+  const newOutputHash = await store.cas.put(outputSchemaHash, {
+    $status: cfg.newStatus,
+    note: "poked output",
+  });
+  const newDetailHash = await store.cas.put(schemas.text, "poked detail");
+  const agentStepHash = await store.cas.put(schemas.stepNode, {
+    start: startHash,
+    prev: cfg.withHeadStep ? oldStepHash : null,
+    role: "worker",
+    output: newOutputHash,
+    detail: newDetailHash,
+    agent: "mock-agent-output",
+    edgePrompt: "poke prompt placeholder",
+    startedAtMs: cfg.newCompletedAtMs - 100,
+    completedAtMs: cfg.newCompletedAtMs,
+    cwd: tmpDir,
+    assembledPrompt: null,
+    usage: null,
+  });
+
+  const adapterJson = JSON.stringify({
+    stepHash: agentStepHash,
+    detailHash: newDetailHash,
+    role: "worker",
+    frontmatter: { $status: cfg.newStatus, note: "poked output" },
+    body: "",
+    startedAtMs: cfg.newCompletedAtMs - 100,
+    completedAtMs: cfg.newCompletedAtMs,
+    usage: null,
+  });
+
+  await writeFile(
+    mockAgentPath,
+    `#!/bin/sh
+prompt=""
+while [ $# -gt 0 ]; do
+  if [ "$1" = "--prompt" ]; then
+    prompt="$2"
+    shift 2
+  else
+    shift
+  fi
+done
+printf '%s' "$prompt" > '${promptCapturePath}'
+printf 'OCAS_HOME=%s\\n' "$OCAS_HOME" > '${envCapturePath}'
+echo '${adapterJson}'
+`,
+    { mode: 0o755 },
+  );
+
+  await writeFile(
+    failingAgentPath,
+    `#!/bin/sh
+echo "boom" >&2
+exit 7
+`,
+    { mode: 0o755 },
+  );
+
+  const configPath = join(tmpDir, "config.yaml");
+  await writeFile(
+    configPath,
+    `defaultAgent: uwf-hermes\ndefaultModel: test-model\nagentOverrides: null\nagents: {}\nproviders: {}\nmodels: {}\n`,
+  );
+
+  return {
+    casDir,
+    oldStepHash,
+    oldStepPrev,
+    oldStepCompletedAtMs,
+    startHash,
+    workflowHash,
+    mockAgentPath,
+    failingAgentPath,
+    promptCapturePath,
+    envCapturePath,
+  };
+}
+
+function runUwf(
+  args: string[],
+  casDir: string,
+): { stdout: string; stderr: string; status: number } {
+  const cliPath = join(dirname(fileURLToPath(import.meta.url)), "..", "..", "dist", "cli.js");
+  try {
+    const stdout = execFileSync(process.execPath, [cliPath, ...args], {
+      encoding: "utf8",
+      stdio: ["ignore", "pipe", "pipe"],
+      env: {
+        ...process.env,
+        UWF_HOME: tmpDir,
+        OCAS_HOME: casDir,
+      },
+      cwd: tmpDir,
+      timeout: 30000,
+    });
+    return { stdout, stderr: "", status: 0 };
+  } catch (error) {
+    const err = error as NodeJS.ErrnoException & {
+      stdout?: string | Buffer;
+      stderr?: string | Buffer;
+      status?: number;
+    };
+    return {
+      stdout: typeof err.stdout === "string" ? err.stdout : (err.stdout?.toString("utf8") ?? ""),
+      stderr: typeof err.stderr === "string" ? err.stderr : (err.stderr?.toString("utf8") ?? ""),
+      status: err.status ?? 1,
+    };
+  }
+}
+
+// ── Group 1: CLI argument validation ───────────────────────────────────────
+
+describe("uwf thread poke - CLI argument validation", () => {
+  test("1.1 missing -p flag exits non-zero", async () => {
+    const { casDir } = await setupThread();
+    const result = runUwf(["thread", "poke", THREAD_ID], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toMatch(/required|missing|prompt/);
+  });
+
+  test("1.2 -p without --agent succeeds", async () => {
+    const { casDir } = await setupThread();
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "do it again"], casDir);
+    expect(result.status).toBe(0);
+  });
+
+  test("1.3 -p with --agent succeeds", async () => {
+    const { casDir, mockAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "do it again", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+  });
+});
+
+// ── Group 2: Guard errors ──────────────────────────────────────────────────
+
+describe("uwf thread poke - guard errors", () => {
+  test("2.1 thread not found", async () => {
+    const { casDir } = await setupThread();
+    const result = runUwf(["thread", "poke", "01NOSUCHTHREAD0000000A", "-p", "prompt"], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toMatch(/not found|not active/);
+  });
+
+  test("2.2 thread running rejects poke", async () => {
+    const { casDir, workflowHash } = await setupThread();
+    // Create background marker to simulate running
+    const { createMarker } = await import("../background/index.js");
+    await createMarker(tmpDir, {
+      thread: THREAD_ID,
+      workflow: workflowHash,
+      pid: process.pid,
+      startedAt: Date.now(),
+    });
+
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toContain("already executing");
+  });
+
+  test("2.3 completed thread rejects poke", async () => {
+    const { casDir } = await setupThread({ threadStatus: "completed" });
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toMatch(/cannot be poked|completed/);
+  });
+
+  test("2.4 cancelled thread rejects poke", async () => {
+    const { casDir } = await setupThread({ threadStatus: "cancelled" });
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toMatch(/cannot be poked|cancelled/);
+  });
+
+  test("2.5 thread head is StartNode (no StepNode) rejects poke", async () => {
+    const { casDir } = await setupThread({ withHeadStep: false });
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
+    expect(result.status).not.toBe(0);
+    expect(result.stderr.toLowerCase()).toMatch(/no step|cannot be poked/);
+  });
+});
+
+// ── Group 3: Success happy path ────────────────────────────────────────────
+
+describe("uwf thread poke - success", () => {
+  test("3.1, 3.4 idle thread → new head differs from old, thread index updated", async () => {
+    const { casDir, oldStepHash, mockAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+    expect(cliOutput.head).not.toBe(oldStepHash);
+
+    const { createUwfStore, getThread } = await import("../store.js");
+    const uwf = await createUwfStore(tmpDir);
+    const entry = getThread(uwf.varStore, THREAD_ID);
+    expect(entry?.head).toBe(cliOutput.head);
+  });
+
+  test("3.2 new step's prev equals old head's prev (replace, not append)", async () => {
+    const { casDir, oldStepPrev, mockAgentPath } = await setupThread({ multipleSteps: true });
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+
+    const { createUwfStore } = await import("../store.js");
+    const uwf = await createUwfStore(tmpDir);
+    const node = uwf.store.cas.get(cliOutput.head as CasRef);
+    expect(node).not.toBeNull();
+    expect(node?.type).toBe(uwf.schemas.stepNode);
+    const payload = node?.payload as StepNodePayload;
+    expect(payload.prev).toBe(oldStepPrev);
+  });
+
+  test("3.2b new step's prev is null when old head was the first step", async () => {
+    // multipleSteps:false means oldHead.prev = null
+    const { casDir, mockAgentPath } = await setupThread({ multipleSteps: false });
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+
+    const { createUwfStore } = await import("../store.js");
+    const uwf = await createUwfStore(tmpDir);
+    const node = uwf.store.cas.get(cliOutput.head as CasRef);
+    const payload = node?.payload as StepNodePayload;
+    expect(payload.prev).toBeNull();
+  });
+
+  test("3.3 new step's completedAtMs is later than old", async () => {
+    const { casDir, oldStepCompletedAtMs, mockAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+
+    const { createUwfStore } = await import("../store.js");
+    const uwf = await createUwfStore(tmpDir);
+    const node = uwf.store.cas.get(cliOutput.head as CasRef);
+    const payload = node?.payload as StepNodePayload;
+    expect(payload.completedAtMs).toBeGreaterThan(oldStepCompletedAtMs);
+  });
+
+  test("3.5 status remains idle after poke (no completion/suspend)", async () => {
+    const { casDir, mockAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+    expect(cliOutput.status).toBe("idle");
+    expect(cliOutput.done).toBe(false);
+    expect(cliOutput.suspendedRole).toBeNull();
+    expect(cliOutput.suspendMessage).toBeNull();
+  });
+
+  test("3.6 currentRole unchanged after poke (no moderator re-route)", async () => {
+    // Before poke: idle thread with worker step having $status=ok → moderator would route to reviewer.
+    // After poke (mock returns same $status=ok), moderator routing remains the same.
+    const { casDir, mockAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+    expect(cliOutput.currentRole).toBe("reviewer");
+  });
+});
+
+// ── Group 4: Agent resolution ──────────────────────────────────────────────
+
+describe("uwf thread poke - agent resolution", () => {
+  test("4.1 without --agent, agent command read from head step's agent field", async () => {
+    // Head step's agent field points at mockAgentPath (default in setupThread)
+    const { casDir, promptCapturePath } = await setupThread();
+    const result = runUwf(["thread", "poke", THREAD_ID, "-p", "redo"], casDir);
+    expect(result.status).toBe(0);
+    const captured = await readFile(promptCapturePath, "utf8");
+    expect(captured).toBe("redo");
+  });
+
+  test("4.2 with --agent, explicit override is used", async () => {
+    // Head step records "uwf-mock" (which is not a real binary). Override with mockAgentPath.
+    const { casDir, mockAgentPath } = await setupThread({ stepAgentNameOverride: "uwf-mock" });
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+  });
+});
+
+// ── Group 5: Prompt passthrough ────────────────────────────────────────────
+
+describe("uwf thread poke - prompt passthrough", () => {
+  test("5.1 -p value is passed to agent as --prompt", async () => {
+    const { casDir, mockAgentPath, promptCapturePath } = await setupThread();
+    const supplement = "Use the REST API instead.";
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", supplement, "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const captured = await readFile(promptCapturePath, "utf8");
+    expect(captured).toBe(supplement);
+  });
+});
+
+// ── Group 6: Edge cases ────────────────────────────────────────────────────
+
+describe("uwf thread poke - edge cases", () => {
+  test("6.1 poke succeeds on suspended thread", async () => {
+    const { casDir, oldStepHash, mockAgentPath } = await setupThread({
+      threadStatus: "suspended",
+    });
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
+      casDir,
+    );
+    expect(result.status).toBe(0);
+    const cliOutput = JSON.parse(result.stdout.trim());
+    expect(cliOutput.head).not.toBe(oldStepHash);
+    expect(cliOutput.status).toBe("idle");
+    expect(cliOutput.suspendedRole).toBeNull();
+    expect(cliOutput.suspendMessage).toBeNull();
+  });
+
+  test("6.2 agent failure leaves thread head unchanged", async () => {
+    const { casDir, oldStepHash, failingAgentPath } = await setupThread();
+    const result = runUwf(
+      ["thread", "poke", THREAD_ID, "-p", "redo", "--agent", failingAgentPath],
+      casDir,
+    );
+    expect(result.status).not.toBe(0);
+
+    const { createUwfStore, getThread } = await import("../store.js");
+    const uwf = await createUwfStore(tmpDir);
+    const entry = getThread(uwf.varStore, THREAD_ID);
+    expect(entry?.head).toBe(oldStepHash);
+  });
+});
||||||
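The guard cases exercised in Group 2 of the test file above can be read as one ordered decision. The sketch below is a standalone restatement for illustration, not the project's code: `PokeGuardInput` and `pokeGuardError` are hypothetical names, the status union is an assumption, and the messages are copied from the `fail(...)` calls in `validatePokePreconditions` as they appear in this diff (the "CAS node not found" branch is left out).

```typescript
// Hypothetical, simplified model of the poke preconditions (illustration only).
type ThreadStatus = "idle" | "suspended" | "completed" | "cancelled";

type PokeGuardInput = {
  runningPid: number | null; // PID from a background marker, or null if none exists
  status: ThreadStatus | null; // null models a thread that is missing from the index
  headIsStepNode: boolean; // false when the head is still the StartNode
};

// Returns the message of the first failing guard, or null if the poke may proceed.
// The checks run in the same order as in validatePokePreconditions.
function pokeGuardError(threadId: string, input: PokeGuardInput): string | null {
  if (input.runningPid !== null) {
    return `thread already executing in background (PID: ${input.runningPid})`;
  }
  if (input.status === null) {
    return `thread not active: ${threadId}`;
  }
  if (input.status === "completed" || input.status === "cancelled") {
    return `thread cannot be poked: ${threadId} (status: ${input.status})`;
  }
  if (!input.headIsStepNode) {
    return "thread cannot be poked: no step to replace (head is StartNode)";
  }
  return null;
}
```

Because the running-marker check comes first, a running thread reports "already executing" even if its index entry would also fail a later guard. Idle and suspended threads with a StepNode head pass all guards, which matches tests 3.1 and 6.1.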
@@ -17,6 +17,7 @@ import {
   cmdThreadCancel,
   cmdThreadExec,
   cmdThreadList,
+  cmdThreadPoke,
   cmdThreadRead,
   cmdThreadResume,
   cmdThreadShow,
@@ -290,6 +291,26 @@ thread
     });
   });

+thread
+  .command("poke")
+  .description("Re-run the head step's agent with a supplementary prompt (replaces head step)")
+  .argument("<thread-id>", "Thread ULID")
+  .requiredOption("-p, --prompt <text>", "Supplementary prompt for the agent")
+  .option("--agent <cmd>", "Override agent command (defaults to head step's agent)")
+  .action((threadId: string, opts: { prompt: string; agent: string | undefined }) => {
+    const storageRoot = resolveStorageRoot();
+    runAction(async () => {
+      const agentOverride = opts.agent ?? null;
+      const result = await cmdThreadPoke(
+        storageRoot,
+        threadId as ThreadId,
+        opts.prompt,
+        agentOverride,
+      );
+      writeOutput(result);
+    });
+  });
+
 thread
   .command("stop")
   .description("Stop background execution of a thread (keep thread active)")
|||||||
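The `--agent` option registered in the hunk above is optional, and the command falls back to the agent recorded on the head step when it is omitted. The precedence can be sketched as follows. This is a simplified stand-in: `AgentCommand`, `splitCommand`, and `pickAgent` are hypothetical names, the whitespace split is an assumption (the quoting rules of the real `parseAgentOverride` are not shown in this diff), and in the real code an explicit override is additionally passed through `resolveAgentConfig`.

```typescript
// Hypothetical shape for a resolved agent invocation (illustration only).
type AgentCommand = { command: string; args: string[] };

// Assumed parsing rule: split on runs of whitespace, first token is the command.
function splitCommand(raw: string): AgentCommand {
  const [command = "", ...args] = raw.trim().split(/\s+/);
  return { command, args };
}

// An explicit --agent value wins; otherwise reuse the agent string recorded
// in the head step's `agent` field.
function pickAgent(override: string | undefined, recorded: string): AgentCommand {
  return splitCommand(override ?? recorded);
}
```

Tests 4.1 and 4.2 cover the two branches: without `--agent` the recorded path of the mock script is executed, and with `--agent` the recorded name `uwf-mock` is ignored.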
@@ -199,6 +199,7 @@ const PL_THREAD_ARCHIVED = "F4D8Q2K5";
 const PL_STEP_ERROR = "B8T5N1V6";
 const PL_BACKGROUND_START = "X7Q4W9M2";
 const PL_THREAD_RESUME = "K2R7M4N8";
+const PL_THREAD_POKE = "P4Q9R3X7";

 type ResumeStepConfig = {
   role: string;
@@ -1135,6 +1136,147 @@ export async function cmdThreadResume(
   });
 }

+/**
+ * Validate that a thread can be poked. Returns the existing entry and the head StepNode payload.
+ * Fails (process exit) when the thread is missing, running, completed, cancelled, or has no
+ * StepNode at its head.
+ */
+async function validatePokePreconditions(
+  storageRoot: string,
+  uwf: UwfStore,
+  threadId: ThreadId,
+): Promise<{ entry: ThreadIndexEntry; oldHead: CasRef; oldHeadPayload: StepNodePayload }> {
+  const runningMarker = await isThreadRunning(storageRoot, threadId);
+  if (runningMarker !== null) {
+    fail(`thread already executing in background (PID: ${runningMarker.pid})`);
+  }
+
+  const entry = getThread(uwf.varStore, threadId);
+  if (entry === null) {
+    fail(`thread not active: ${threadId}`);
+  }
+
+  if (entry.status === "completed" || entry.status === "cancelled") {
+    fail(`thread cannot be poked: ${threadId} (status: ${entry.status})`);
+  }
+
+  const oldHead = entry.head;
+  const oldHeadNode = uwf.store.cas.get(oldHead);
+  if (oldHeadNode === null) {
+    fail(`CAS node not found: ${oldHead}`);
+  }
+  if (oldHeadNode.type !== uwf.schemas.stepNode) {
+    fail("thread cannot be poked: no step to replace (head is StartNode)");
+  }
+
+  return { entry, oldHead, oldHeadPayload: oldHeadNode.payload as StepNodePayload };
+}
+
+/**
+ * Resolve the next role from the post-poke chain state, used for the StepOutput.currentRole field.
+ * Returns null when the next role is $END, evaluation fails, or the result is a suspend.
+ */
+function resolveCurrentRoleFromChain(
+  uwfAfter: UwfStore,
+  workflow: WorkflowPayload,
+  replacedHash: CasRef,
+): string | null {
+  const chainAfter = walkChain(uwfAfter, replacedHash);
+  const { lastRole, lastOutput } = resolveEvaluateArgs(uwfAfter, chainAfter);
+  const afterResult = evaluate(workflow.graph, lastRole, lastOutput);
+  if (!afterResult.ok || isSuspendResult(afterResult.value)) {
+    return null;
+  }
+  if (afterResult.value.role === END_ROLE) {
+    return null;
+  }
+  return afterResult.value.role;
+}
+
+/**
+ * Poke a thread: re-run the agent on the head step with a supplementary prompt,
+ * replacing the head step's output. The new step's `prev` points to the OLD head's
+ * `prev` — semantically replacing (not appending to) the head. The moderator is NOT
+ * re-evaluated for routing; the role of the head step is re-used.
+ */
+export async function cmdThreadPoke(
+  storageRoot: string,
+  threadId: ThreadId,
+  prompt: string,
+  agentOverride: string | null,
+): Promise<StepOutput> {
+  const uwf = await createUwfStore(storageRoot);
+  const { entry, oldHeadPayload } = await validatePokePreconditions(storageRoot, uwf, threadId);
+
+  const chain = walkChain(uwf, entry.head);
+  const workflowHash = chain.start.workflow;
+  const threadCwd = chain.start.cwd;
+
+  const plog = createProcessLogger({
+    storageRoot,
+    context: { thread: threadId, workflow: workflowHash },
+  });
+
+  // Resolve the agent: --agent override wins; otherwise read from old head step's `agent` field.
+  const config = await loadWorkflowConfig(storageRoot);
+  const workflow = loadWorkflowPayload(uwf, workflowHash);
+  const role = oldHeadPayload.role;
+  const agent =
+    agentOverride !== null
+      ? resolveAgentConfig(config, workflow, role, agentOverride)
+      : parseAgentOverride(oldHeadPayload.agent);
+
+  const effectiveCwd = oldHeadPayload.cwd !== "" ? oldHeadPayload.cwd : threadCwd;
+
+  plog.log(PL_THREAD_POKE, `poke role=${role} agent=${agent.command}`, null);
+  plog.log(PL_AGENT_SPAWN, `spawning agent command=${agent.command}`, {
+    args: [...agent.args, threadId, role].join(" "),
+  });
+
+  loadDotenv({ path: getEnvPath(storageRoot) });
+
+  // Spawn the agent. The agent will create a new StepNode with prev=oldHead (it reads
+  // the active thread head). After the agent returns, we rewrite that node's prev so
+  // that the new head replaces the old head instead of appending after it.
+  const agentResult = spawnAgent(plog, agent, threadId, role, prompt, effectiveCwd);
+  const agentStepHash = agentResult.stepHash as CasRef;
+
+  plog.log(PL_AGENT_DONE, `agent returned head=${agentStepHash}`, null);
+
+  const uwfAfter = await createUwfStore(storageRoot);
+  const agentNode = uwfAfter.store.cas.get(agentStepHash);
+  if (agentNode === null || agentNode.type !== uwfAfter.schemas.stepNode) {
+    failStep(plog, `agent returned hash that is not a StepNode: ${agentStepHash}`);
+  }
+  const agentPayload = agentNode.payload as StepNodePayload;
+
+  // Rewrite the new step so that its `prev` points to the OLD head's prev (replace semantics).
+  const replacedPayload: StepNodePayload = {
+    ...agentPayload,
+    prev: oldHeadPayload.prev,
+  };
+  const replacedHash = await uwfAfter.store.cas.put(uwfAfter.schemas.stepNode, replacedPayload);
+  const replacedNode = uwfAfter.store.cas.get(replacedHash);
+  if (replacedNode === null || !validate(uwfAfter.store, replacedNode)) {
+    failStep(plog, "rewritten StepNode failed schema validation");
+  }
+
+  // Update thread head to the replaced step. Status becomes idle (no moderator re-route).
+  setThread(uwfAfter.varStore, threadId, updateThreadHead(entry, replacedHash));
+
+  return {
+    workflow: workflowHash,
+    thread: threadId,
+    head: replacedHash,
+    status: "idle",
+    currentRole: resolveCurrentRoleFromChain(uwfAfter, workflow, replacedHash),
+    suspendedRole: null,
+    suspendMessage: null,
+    done: false,
+    background: null,
+  };
+}
+
 export function validateCount(count: number): void {
   if (count < 1 || !Number.isInteger(count)) {
     throw new Error(`--count must be a positive integer, got: ${count}`);
|||||||
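The replace-not-append step in `cmdThreadPoke` above (copy the agent's step, but set its `prev` to the old head's `prev`) can be tried in isolation. The sketch below uses a toy in-memory store and is not the `@ocas` API: `ToyCas`, `Step`, `replaceHead`, and `chainLength` are hypothetical names, and a 32-bit FNV-1a hash stands in for a real content hash.

```typescript
// Toy step node: only the fields needed to show the re-linking.
type Step = { prev: string | null; role: string; output: string };

// Toy content-addressed store (illustration only; not the @ocas API).
class ToyCas {
  private readonly nodes = new Map<string, Step>();

  // FNV-1a over a JSON encoding stands in for a real content hash.
  put(step: Step): string {
    const text = JSON.stringify([step.prev, step.role, step.output]);
    let h = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    const key = h.toString(16).padStart(8, "0");
    this.nodes.set(key, step);
    return key;
  }

  get(key: string): Step | undefined {
    return this.nodes.get(key);
  }
}

// Store a copy of the agent's step whose prev is the OLD head's prev, so the
// copy takes the old head's place in the chain instead of following it.
function replaceHead(cas: ToyCas, oldHead: string, agentStep: string): string {
  const oldNode = cas.get(oldHead);
  const newNode = cas.get(agentStep);
  if (oldNode === undefined || newNode === undefined) {
    throw new Error("unknown node");
  }
  return cas.put({ ...newNode, prev: oldNode.prev });
}

// Count the steps reachable from a head by following prev links.
function chainLength(cas: ToyCas, head: string): number {
  let n = 0;
  let cur: string | null = head;
  while (cur !== null) {
    n++;
    cur = cas.get(cur)?.prev ?? null;
  }
  return n;
}
```

With a chain `first <- oldHead` and an agent step that points at `oldHead`, `replaceHead` returns a node whose `prev` is `first`, so the chain length stays at two. This is the property tests 3.2 and 3.2b assert against the real store. Since nodes are immutable and keyed by content, the old head and the agent's original step are not modified; they simply stop being reachable from the thread head.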
@@ -1,6 +1,6 @@
 {
   "name": "@united-workforce/eval",
-  "version": "0.1.3",
+  "version": "0.1.6",
   "private": false,
   "files": [
     "src",
@@ -22,8 +22,8 @@
     "test:ci": "vitest run __tests__/"
   },
   "dependencies": {
-    "@ocas/core": "^0.3.0",
+    "@ocas/core": "^0.4.0",
-    "@ocas/fs": "^0.3.0",
+    "@ocas/fs": "^0.4.0",
     "@united-workforce/protocol": "workspace:^",
     "@united-workforce/util": "workspace:^",
     "commander": "^14.0.3",
@@ -6,7 +6,7 @@ import { formatList, selectEntries } from "./format.js";
 import { readEvalEntries } from "./read.js";

 const log = createLogger({ sink: { kind: "stderr" } });
-const LOG_LIST = "L5KX9R2B";
+const LOG_LIST = "H5KX9R2B";

 type ListCliOptions = {
   task: string | undefined;
@@ -1,6 +1,6 @@
 {
   "name": "@united-workforce/protocol",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "files": [
     "src",
     "dist",
@@ -18,8 +18,8 @@
     "test:ci": "vitest run src/__tests__/"
   },
   "dependencies": {
-    "@ocas/core": "^0.3.0",
+    "@ocas/core": "^0.4.0",
-    "@ocas/fs": "^0.3.0"
+    "@ocas/fs": "^0.4.0"
   },
   "devDependencies": {
     "typescript": "^5.8.3"
@@ -225,4 +225,34 @@ describe("buildOutputFormatInstruction", () => {
     const result = buildOutputFormatInstruction({});
     expect(result).toContain("Focus exclusively on YOUR role");
   });
+
+  test("renders const value as literal in flat schema example", () => {
+    const schema = {
+      type: "object",
+      properties: {
+        $status: { type: "string", const: "greeted" },
+        message: { type: "string" },
+      },
+      required: ["$status", "message"],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("$status: greeted");
+    expect(result).toContain("fixed value");
+    expect(result).not.toContain("$status: <string>");
+  });
+
+  test("renders const value for non-string types", () => {
+    const schema = {
+      type: "object",
+      properties: {
+        count: { type: "number", const: 42 },
+        done: { type: "boolean", const: true },
+      },
+      required: ["count", "done"],
+    };
+    const result = buildOutputFormatInstruction(schema);
+    expect(result).toContain("count: 42");
+    expect(result).toContain("done: true");
+    expect(result).toContain("fixed value");
+  });
 });
@@ -0,0 +1,59 @@
+import type { StepContext } from "@united-workforce/protocol";
+import { describe, expect, test } from "vitest";
+import { buildThreadProgress } from "../src/build-thread-progress.js";
+
+function makeStep(role: string): StepContext {
+  return {
+    role,
+    output: {},
+    detail: "0000000000000" as string,
+    agent: "uwf-mock",
+    edgePrompt: "",
+    startedAtMs: 0,
+    completedAtMs: 0,
+    cwd: "",
+    assembledPrompt: null,
+    usage: null,
+    content: null,
+  };
+}
+
+describe("buildThreadProgress", () => {
+  test("first step of thread", () => {
+    const result = buildThreadProgress([], "proponent");
+    expect(result).toContain("## Thread Progress");
+    expect(result).toContain("first step");
+    expect(result).toContain("first time");
+    expect(result).toContain("proponent");
+  });
+
+  test("second step, role not seen before", () => {
+    const steps = [makeStep("opponent")];
+    const result = buildThreadProgress(steps, "proponent");
+    expect(result).toContain("Thread step 2");
+    expect(result).toContain("spoken 0 times");
+  });
+
+  test("role has spoken once before", () => {
+    const steps = [makeStep("proponent"), makeStep("opponent")];
+    const result = buildThreadProgress(steps, "proponent");
+    expect(result).toContain("Thread step 3");
+    expect(result).toContain("spoken 1 time before");
+    // singular "time" not "times"
+    expect(result).not.toContain("1 times");
+  });
+
+  test("role has spoken multiple times", () => {
+    const steps = [
+      makeStep("proponent"),
+      makeStep("opponent"),
+      makeStep("proponent"),
+      makeStep("opponent"),
+      makeStep("proponent"),
+      makeStep("opponent"),
+    ];
+    const result = buildThreadProgress(steps, "proponent");
+    expect(result).toContain("Thread step 7");
+    expect(result).toContain("spoken 3 times");
+  });
+});
@@ -0,0 +1,23 @@
+import { describe, expect, test } from "vitest";
+import { buildFrontmatterRetryPrompt } from "../src/frontmatter-retry-prompt.js";
+
+describe("buildFrontmatterRetryPrompt", () => {
+  test("includes correction instruction", () => {
+    const result = buildFrontmatterRetryPrompt("Use YAML frontmatter");
+    expect(result).toContain("previous run completed");
+    expect(result).toContain("do NOT need to redo any work");
+    expect(result).toContain("corrected YAML frontmatter");
+  });
+
+  test("includes outputFormatInstruction when provided", () => {
+    const instruction = "---\nstatus: $done | $review\nsummary: string\n---";
+    const result = buildFrontmatterRetryPrompt(instruction);
+    expect(result).toContain(instruction);
+  });
+
+  test("works with empty outputFormatInstruction", () => {
+    const result = buildFrontmatterRetryPrompt("");
+    expect(result).not.toContain("\n\n\n");
+    expect(result).toContain("corrected YAML frontmatter");
+  });
+});
@@ -1,6 +1,6 @@
 {
   "name": "@united-workforce/util-agent",
-  "version": "0.1.0",
+  "version": "0.1.2",
   "files": [
     "src",
     "dist",
@@ -18,8 +18,8 @@
     "test:ci": "vitest run __tests__/ src/__tests__/"
   },
   "dependencies": {
-    "@ocas/core": "^0.3.0",
-    "@ocas/fs": "^0.3.0",
+    "@ocas/core": "^0.4.0",
+    "@ocas/fs": "^0.4.0",
     "@united-workforce/protocol": "workspace:^",
     "@united-workforce/util": "workspace:^",
     "dotenv": "^16.6.1",
@@ -74,6 +74,10 @@ function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
 }
 
 function resolvePropertySchema(prop: JSONSchema): JSONSchema {
+  if (prop.const !== undefined) {
+    return prop;
+  }
+
   if (Array.isArray(prop.enum) && prop.enum.length > 0) {
     return prop;
   }
@@ -113,6 +117,11 @@ function buildPropertyExampleLine(prop: SchemaProperty): string {
     commentParts.push("required");
   }
 
+  if (resolved.const !== undefined) {
+    commentParts.push("fixed value");
+    return `${prop.name}: ${formatYamlScalar(resolved.const)}${buildPropertyComment(commentParts)}`;
+  }
+
   if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
     const enumValues = resolved.enum.map((v) => String(v));
     commentParts.push(...enumValues);
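To illustrate what the `const` branch changes in the rendered example, here is a standalone sketch. It does not reproduce the real helpers: `formatScalarSketch` and `exampleLineSketch` are hypothetical stand-ins for `formatYamlScalar` and `buildPropertyExampleLine`, and the `  # ...` comment layout is an assumption, since `buildPropertyComment` is not shown in this diff.

```typescript
// Reduced schema shape for this sketch; the real JSONSchema type is not shown in the diff.
type PropSchema = { type?: string; const?: unknown };

// Hypothetical stand-in for formatYamlScalar: simple strings stay bare,
// everything else is rendered via JSON.stringify.
function formatScalarSketch(value: unknown): string {
  return typeof value === "string" && /^[A-Za-z0-9_$-]+$/.test(value)
    ? value
    : JSON.stringify(value);
}

// Hypothetical stand-in for buildPropertyExampleLine. A const schema renders
// its literal value; any other schema falls back to a <type> placeholder.
function exampleLineSketch(name: string, schema: PropSchema, required: boolean): string {
  const comment: string[] = required ? ["required"] : [];
  if (schema.const !== undefined) {
    comment.push("fixed value");
    return `${name}: ${formatScalarSketch(schema.const)}  # ${comment.join(", ")}`;
  }
  const suffix = comment.length > 0 ? `  # ${comment.join(", ")}` : "";
  return `${name}: <${schema.type ?? "any"}>${suffix}`;
}

console.log(exampleLineSketch("$status", { type: "string", const: "greeted" }, true));
// $status: greeted  # required, fixed value
console.log(exampleLineSketch("count", { type: "number", const: 42 }, true));
// count: 42  # required, fixed value
console.log(exampleLineSketch("message", { type: "string" }, true));
// message: <string>  # required
```

The point of the early return is that a fixed value is shown literally, so the agent copies it instead of filling in a `<string>` placeholder.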
@@ -0,0 +1,27 @@
+import type { StepContext } from "@united-workforce/protocol";
+
+/**
+ * Build a compact thread-progress summary so the agent knows where it is
+ * in the conversation without making tool calls to count steps.
+ *
+ * Example output:
+ *   ## Thread Progress
+ *   Thread step 6. You (proponent) have spoken 2 times before this turn.
+ */
+export function buildThreadProgress(steps: StepContext[], role: string): string {
+  const totalSteps = steps.length;
+  const roleVisits = steps.filter((s) => s.role === role).length;
+
+  const parts = [`## Thread Progress`];
+  if (totalSteps === 0) {
+    parts.push(
+      `This is the first step of the thread. You (${role}) are speaking for the first time.`,
+    );
+  } else {
+    parts.push(
+      `Thread step ${totalSteps + 1}. You (${role}) have spoken ${roleVisits} time${roleVisits === 1 ? "" : "s"} before this turn.`,
+    );
+  }
+
+  return parts.join("\n");
+}
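As a quick illustration of the output this new file produces, the sketch below re-implements the same counting logic against a reduced step type so it runs without the `@united-workforce/protocol` package. `StepLike` and `threadProgressSketch` are names introduced only for this example.

```typescript
// Reduced stand-in for StepContext: only the field the function reads.
type StepLike = { role: string };

// Local copy of the logic in this hunk, typed against StepLike so the
// sketch is standalone. The step number is the count of prior steps plus one.
function threadProgressSketch(steps: StepLike[], role: string): string {
  const totalSteps = steps.length;
  const roleVisits = steps.filter((s) => s.role === role).length;
  const line =
    totalSteps === 0
      ? `This is the first step of the thread. You (${role}) are speaking for the first time.`
      : `Thread step ${totalSteps + 1}. You (${role}) have spoken ${roleVisits} time${roleVisits === 1 ? "" : "s"} before this turn.`;
  return ["## Thread Progress", line].join("\n");
}

const history: StepLike[] = [{ role: "proponent" }, { role: "opponent" }];
console.log(threadProgressSketch(history, "proponent"));
// ## Thread Progress
// Thread step 3. You (proponent) have spoken 1 time before this turn.
```

Note that `steps` holds only completed steps, so the current turn is never counted among the role's earlier visits.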
@@ -0,0 +1,21 @@
+/**
+ * Build a minimal prompt for retrying frontmatter output on a resumed session.
+ *
+ * Used when a previous run completed successfully but frontmatter validation
+ * failed — the session already has full context, we just need the agent to
+ * re-output correctly formatted frontmatter without redoing any work.
+ */
+export function buildFrontmatterRetryPrompt(outputFormatInstruction: string): string {
+  const parts: string[] = [
+    "Your previous run completed all work successfully, but the output format was incorrect.",
+    "You do NOT need to redo any work — all changes are already in place.",
+    "",
+  ];
+  if (outputFormatInstruction !== "") {
+    parts.push(outputFormatInstruction, "");
+  }
+  parts.push(
+    "Please output ONLY the corrected YAML frontmatter block (--- delimited) followed by a brief summary of the work you completed.",
+  );
+  return parts.join("\n");
+}
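The assembly pattern in this file (fixed header, an optional middle section, fixed footer, joined with newlines) is what lets the "works with empty outputFormatInstruction" test assert that no triple newline appears. A generalised sketch of that pattern follows; `assembleSections` and the sample strings are hypothetical and only illustrate the spacing behaviour.

```typescript
// Hypothetical generalisation of the pattern: each section is followed by
// exactly one blank line, and an empty optional section is skipped entirely.
function assembleSections(header: string[], optionalSection: string, footer: string): string {
  const parts: string[] = [...header, ""];
  if (optionalSection !== "") {
    parts.push(optionalSection, "");
  }
  parts.push(footer);
  return parts.join("\n");
}

const footer = "Re-emit the frontmatter only.";
const withSection = assembleSections(["Work is done."], "---\n$status: done\n---", footer);
const withoutSection = assembleSections(["Work is done."], "", footer);

console.log(withSection.includes("\n\n\n")); // false
console.log(withoutSection.includes("\n\n\n")); // false
console.log(withoutSection === "Work is done.\n\nRe-emit the frontmatter only."); // true
```

Pushing the section together with its trailing blank line, rather than always pushing both, is what keeps the empty case from producing two consecutive blank lines.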
@@ -1,6 +1,7 @@
 export { buildContinuationPrompt } from "./build-continuation-prompt.js";
 export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
 export { buildRolePrompt } from "./build-role-prompt.js";
+export { buildThreadProgress } from "./build-thread-progress.js";
 export type { BuildContextMeta } from "./context.js";
 export { buildContext, buildContextWithMeta } from "./context.js";
 export type { ExtractResult, ResolvedLlmProvider } from "./extract.js";
@@ -11,6 +12,7 @@ export {
 } from "./extract.js";
 export type { FrontmatterFastPathResult } from "./frontmatter.js";
 export { tryFrontmatterFastPath } from "./frontmatter.js";
+export { buildFrontmatterRetryPrompt } from "./frontmatter-retry-prompt.js";
 export { createAgent, parseArgv } from "./run.js";
 export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
 export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
@@ -1,6 +1,6 @@
 {
   "name": "@united-workforce/util",
-  "version": "0.1.3",
+  "version": "0.1.4",
   "files": [
     "src",
     "dist",
@@ -140,5 +140,18 @@ For specific scenarios, run the corresponding \`uwf prompt\` command:
 |----------|---------|-------------|
 | Writing workflow YAML | \`uwf prompt workflow-authoring\` | Designing roles, conditions, graphs, and edge prompts |
 | Building a new agent adapter | \`uwf prompt adapter-developing\` | Creating a new \`uwf-<name>\` CLI adapter |
+
+## Upgrading
+
+\`\`\`bash
+# Install the latest version
+pnpm add -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
+# or: npm install -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
+
+# Verify
+uwf --version
+
+# Then run uwf prompt bootstrap and follow the upgrade instructions
+\`\`\`
 `;
 }
Generated
+38
-36
@@ -18,8 +18,8 @@ importers:
|
|||||||
specifier: ^2.31.0
|
specifier: ^2.31.0
|
||||||
version: 2.31.0(@types/node@25.9.1)
|
version: 2.31.0(@types/node@25.9.1)
|
||||||
'@shazhou/proman':
|
'@shazhou/proman':
|
||||||
specifier: ^0.5.1
|
specifier: ^0.6.3
|
||||||
version: 0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))
|
version: 0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))
|
||||||
'@types/node':
|
'@types/node':
|
||||||
specifier: ^25.7.0
|
specifier: ^25.7.0
|
||||||
version: 25.9.1
|
version: 25.9.1
|
||||||
@@ -45,8 +45,8 @@ importers:
|
|||||||
packages/agent-builtin:
|
packages/agent-builtin:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/util':
|
'@united-workforce/util':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../util
|
version: link:../util
|
||||||
@@ -61,8 +61,8 @@ importers:
|
|||||||
packages/agent-claude-code:
|
packages/agent-claude-code:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -80,8 +80,8 @@ importers:
|
|||||||
packages/agent-hermes:
|
packages/agent-hermes:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -99,8 +99,8 @@ importers:
|
|||||||
packages/agent-mock:
|
packages/agent-mock:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -121,11 +121,11 @@ importers:
|
|||||||
packages/cli:
|
packages/cli:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -231,11 +231,11 @@ importers:
|
|||||||
packages/eval:
|
packages/eval:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -256,11 +256,11 @@ importers:
|
|||||||
packages/protocol:
|
packages/protocol:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
devDependencies:
|
devDependencies:
|
||||||
typescript:
|
typescript:
|
||||||
specifier: ^5.8.3
|
specifier: ^5.8.3
|
||||||
@@ -275,11 +275,11 @@ importers:
|
|||||||
packages/util-agent:
|
packages/util-agent:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -892,11 +892,13 @@ packages:
|
|||||||
resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==}
|
resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==}
|
||||||
engines: {node: '>= 8'}
|
engines: {node: '>= 8'}
|
||||||
|
|
||||||
'@ocas/core@0.3.0':
|
'@ocas/core@0.4.0':
|
||||||
resolution: {integrity: sha512-ejDDZbmQkTj2GoJg+cNjXa3eHlQGybW3PrUZlwERBvBFjjnYBLHOG7AQQYM48bI52UiqucafgZjPEYk9SZd6AQ==}
|
resolution: {integrity: sha512-6JvHd3nr5GncMOBNaZTf9ZTWou/txONTfZbkrblmgqL/H+YuRj1FfeFY+b1ndUlfwR7AuJ6bvoSxR5RP+AbC0w==}
|
||||||
|
engines: {node: '>=22.5.0'}
|
||||||
|
|
||||||
'@ocas/fs@0.3.0':
|
'@ocas/fs@0.4.0':
|
||||||
resolution: {integrity: sha512-/6/nICYVJWXeWx2LcPoHHJAFoqXpJoAtvhLKLS0zpkwtsZX3g0D9X6J5soHCV1QS+BOWybuOJ0+W3cB1FBRkZA==}
|
resolution: {integrity: sha512-AQG6dk1YCL1qpSszUWUgEY+LQhYbTv5hXYrs3J2pHAi2/lY615O2cTgjwEeh6JTcrqHsFwiDsDdKIKMpADchZA==}
|
||||||
|
engines: {node: '>=22.5.0'}
|
||||||
|
|
||||||
'@open-draft/deferred-promise@2.2.0':
|
'@open-draft/deferred-promise@2.2.0':
|
||||||
resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==}
|
resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==}
|
||||||
@@ -1152,8 +1154,8 @@ packages:
|
|||||||
'@sec-ant/readable-stream@0.4.1':
|
'@sec-ant/readable-stream@0.4.1':
|
||||||
resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==}
|
resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==}
|
||||||
|
|
||||||
'@shazhou/proman@0.5.1':
|
'@shazhou/proman@0.6.3':
|
||||||
resolution: {integrity: sha512-GmFUvd8SAOUW/eaDIEh31pVKSE3XhbgHOZ5vSpX4xS+F8Zl6lAfhgVCjcjRK8w5d43tsH47CVorwyxQcRaJFfA==}
|
resolution: {integrity: sha512-KguWl1xHrWXx1YWYrWj47v4NRbaQuKCm7Hd7T8dzrqnkM8UL8em3R9rC7GeDzI8YDDfriFeLTX+xb03UHkhTDA==}
|
||||||
hasBin: true
|
hasBin: true
|
||||||
peerDependencies:
|
peerDependencies:
|
||||||
'@biomejs/biome': ^2.0.0
|
'@biomejs/biome': ^2.0.0
|
||||||
@@ -3896,16 +3898,16 @@ snapshots:
|
|||||||
'@nodelib/fs.scandir': 2.1.5
|
'@nodelib/fs.scandir': 2.1.5
|
||||||
fastq: 1.20.1
|
fastq: 1.20.1
|
||||||
|
|
||||||
'@ocas/core@0.3.0':
|
'@ocas/core@0.4.0':
|
||||||
dependencies:
|
dependencies:
|
||||||
ajv: 8.20.0
|
ajv: 8.20.0
|
||||||
cborg: 4.5.8
|
cborg: 4.5.8
|
||||||
liquidjs: 10.27.0
|
liquidjs: 10.27.0
|
||||||
xxhash-wasm: 1.1.0
|
xxhash-wasm: 1.1.0
|
||||||
|
|
||||||
'@ocas/fs@0.3.0':
|
'@ocas/fs@0.4.0':
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core': 0.3.0
|
'@ocas/core': 0.4.0
|
||||||
cborg: 4.5.8
|
cborg: 4.5.8
|
||||||
|
|
||||||
'@open-draft/deferred-promise@2.2.0': {}
|
'@open-draft/deferred-promise@2.2.0': {}
|
||||||
@@ -4049,7 +4051,7 @@ snapshots:
|
|||||||
|
|
||||||
'@sec-ant/readable-stream@0.4.1': {}
|
'@sec-ant/readable-stream@0.4.1': {}
|
||||||
|
|
||||||
'@shazhou/proman@0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))':
|
'@shazhou/proman@0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))':
|
||||||
dependencies:
|
dependencies:
|
||||||
'@biomejs/biome': 2.4.16
|
'@biomejs/biome': 2.4.16
|
||||||
typescript: 5.9.3
|
typescript: 5.9.3
|
||||||
|
|||||||
@@ -1,329 +0,0 @@
|
|||||||
name: solve-issue
|
|
||||||
description: TDD-driven issue resolution adapted for the workflow monorepo with bun + vitest
|
|
||||||
roles:
|
|
||||||
planner:
|
|
||||||
description: Analyzes issue and outputs a TDD test spec
|
|
||||||
goal: You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify.
|
|
||||||
capabilities:
|
|
||||||
- issue-analysis
|
|
||||||
- planning
|
|
||||||
procedure: 'On first run (no previous steps):
|
|
||||||
|
|
||||||
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
|
|
||||||
|
|
||||||
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md) in the repo
|
|
||||||
|
|
||||||
3. Assess whether the issue has enough information to produce a test spec
|
|
||||||
|
|
||||||
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
|
|
||||||
|
|
||||||
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
|
|
||||||
|
|
||||||
|
|
||||||
On subsequent runs (bounced back by tester with fix_spec):
|
|
||||||
|
|
||||||
1. Read the tester''s output from the previous step to understand what''s wrong with the spec
|
|
||||||
|
|
||||||
2. Revise the test spec accordingly
|
|
||||||
|
|
||||||
|
|
||||||
After producing the test spec:
|
|
||||||
|
|
||||||
1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
|
|
||||||
|
|
||||||
2. Put the hash in frontmatter.plan (required when $status=ready)
|
|
||||||
|
|
||||||
3. Set repoPath to the absolute path of the repository root
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
IMPORTANT: Extract the repo remote (owner/repo) from git:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
|
|
||||||
git remote get-url origin | sed ''s|.*[:/]\([^/]*/[^.]*\).*|\1|''
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.'
|
|
||||||
output: Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info.
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: ready
|
|
||||||
plan:
|
|
||||||
type: string
|
|
||||||
repoPath:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- plan
|
|
||||||
- repoPath
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: insufficient_info
|
|
||||||
reason:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- reason
|
|
||||||
developer:
|
|
||||||
description: TDD implementation per test spec
|
|
||||||
goal: You are a developer agent. You implement code changes following TDD — write tests first, then implementation.
|
|
||||||
capabilities:
|
|
||||||
- coding
|
|
||||||
procedure: "IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.\nThe repo path and other details are provided in your task prompt.\n\nBefore starting any work,\
|
|
||||||
\ set up an isolated worktree:\n1. cd into the repo path provided in your task prompt\n2. `git fetch origin` to get latest refs\n3. First time (no existing branch):\n - `git worktree add .worktrees/fix/<issue-number>-<short-slug>\
|
|
||||||
\ -b fix/<issue-number>-<short-slug> origin/main`\n - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`\n4. If bounced back from reviewer or tester (branch already exists):\n - cd\
|
|
||||||
\ into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`\n - `git fetch origin && git rebase origin/main`\n5. ALL subsequent work must happen inside the worktree directory.\n\
|
|
||||||
\nThen implement TDD:\n6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)\n7. If bounced back from reviewer or tester: read the\
|
|
||||||
\ previous role's feedback in your task prompt\n8. Write tests first based on the spec (use vitest)\n9. Implement the code to make tests pass\n10. Ensure `bun run build` passes with no errors\n11.\
|
|
||||||
\ Run `bun test` to verify all tests pass\n\nIf you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,\nor repeated attempts fail), set $status=failed\
|
|
||||||
\ with a reason.\n"
|
|
||||||
output: List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: done
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- branch
|
|
||||||
- worktree
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: failed
|
|
||||||
reason:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- reason
|
|
||||||
reviewer:
|
|
||||||
description: Code standards compliance check
|
|
||||||
goal: You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job).
|
|
||||||
capabilities:
|
|
||||||
- code-review
|
|
||||||
- static-analysis
|
|
||||||
procedure: 'The worktree path is provided in your task prompt. cd into it first.
|
|
||||||
|
|
||||||
|
|
||||||
Before reviewing, verify the git branch:
|
|
||||||
|
|
||||||
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
|
|
||||||
|
|
||||||
2. If the branch doesn''t correspond to the issue, flag it in your output and reject
|
|
||||||
|
|
||||||
|
|
||||||
Then perform code review:
|
|
||||||
|
|
||||||
Hard checks (must all pass):
|
|
||||||
|
|
||||||
3. `bun run build` — no build errors
|
|
||||||
|
|
||||||
4. `bunx biome check` — no lint violations
|
|
||||||
|
|
||||||
5. TypeScript strict mode — no type errors
|
|
||||||
|
|
||||||
|
|
||||||
Soft checks (review against project conventions from CLAUDE.md):
|
|
||||||
|
|
||||||
- Functional-first: functions + types, no classes (except for errors or third-party requirements)
|
|
||||||
|
|
||||||
- Named exports only, no default exports
|
|
||||||
|
|
||||||
- No optional properties (use `T | null` instead of `?:`)
|
|
||||||
|
|
||||||
- Folder module discipline: index.ts only re-exports, types in types.ts
|
|
||||||
|
|
||||||
- Crockford Base32 log tags (8-char, unique per call site)
|
|
||||||
|
|
||||||
- No `console.log` in production code (use createLogger from @united-workforce/util)
|
|
||||||
|
|
||||||
- No dynamic imports in production code
|
|
||||||
|
|
||||||
|
|
||||||
Only review standards compliance. Do NOT test functionality.
|
|
||||||
|
|
||||||
If rejecting, you MUST explain the specific reason in your output.
|
|
||||||
|
|
||||||
'
|
|
||||||
output: Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: approved
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- branch
|
|
||||||
- worktree
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: rejected
|
|
||||||
comments:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- comments
|
|
||||||
- worktree
|
|
||||||
  tester:
    description: Functional correctness verification
    goal: You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec.
    capabilities:
      - testing
    procedure: "The worktree path is provided in your task prompt. cd into it first.\n\n1. Run `bun test` for automated test verification\n2. Read the test spec from CAS: `ocas get <plan hash>` (find\
      \ the hash from the planner step in the thread history)\n3. Verify each scenario in the spec is covered and passing\n4. Determine outcome:\n - passed: all scenarios verified, tests pass\n - fix_code:\
      \ tests fail or implementation doesn't match spec → send back to developer\n - fix_spec: the spec itself is wrong or incomplete → send back to planner\n"
    output: Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report).
    frontmatter:
      oneOf:
        - properties:
            $status:
              const: passed
            branch:
              type: string
            worktree:
              type: string
            repoRemote:
              type: string
          required:
            - $status
            - branch
            - worktree
        - properties:
            $status:
              const: fix_code
            report:
              type: string
            repoRemote:
              type: string
            worktree:
              type: string
            branch:
              type: string
          required:
            - $status
            - report
        - properties:
            $status:
              const: fix_spec
            report:
              type: string
            repoRemote:
              type: string
            worktree:
              type: string
            branch:
              type: string
          required:
            - $status
            - report
  committer:
    description: Commits and creates PR
    goal: You are a committer agent. You create a clean commit and push a PR linking the original issue.
    capabilities: []
    procedure: "The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.\ncd into the worktree first.\n\nNote: You inherit the developer's worktree and branch. Do NOT\
      \ create a new branch.\n1. Stage all changes: `git add -A`\n2. Commit with a descriptive message referencing the issue: `git commit -m \"type: description\\n\\nFixes #N\"`\n3. Push the branch: `git\
      \ push -u origin <branch-name>`\n4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.\n - If no output or push failed: capture the error, mark hook_failed\n\
      5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):\n ```bash\n GITEA_TOKEN=$(cfg get GITEA_TOKEN)\n curl -s -X POST -H \"Authorization: token $GITEA_TOKEN\" -H \"Content-Type: application/json\" \\\n\
      \ \"https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls\" \\\n -d '{\"title\":\"...\",\"body\":\"...\",\"head\":\"<branch>\",\"base\":\"main\"}'\n ```\n - The repo remote (owner/repo format, e.g. \"shazhou/united-workforce\") is given in your task prompt — use it directly.\n\
      \ - PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref\n6. **Verify PR was created** — parse the curl response JSON: it must contain a `\"number\"` field. Print the PR URL.\n\
      \ - If curl returns an error or no number field: capture the response, mark hook_failed\n7. After PR creation, clean up the worktree:\n - cd to the repo root (parent of .worktrees)\n - `git worktree remove <worktree-path>`"
    output: Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error).
    frontmatter:
      oneOf:
        - properties:
            $status:
              const: committed
            prUrl:
              type: string
            repoRemote:
              type: string
            worktree:
              type: string
            branch:
              type: string
          required:
            - $status
            - prUrl
        - properties:
            $status:
              const: hook_failed
            error:
              type: string
            repoRemote:
              type: string
            worktree:
              type: string
            branch:
              type: string
          required:
            - $status
            - error
graph:
  $START:
    new:
      role: planner
      prompt: Analyze the issue and produce an implementation plan.
    resume:
      role: planner
      prompt: Review the previous run output and continue the work.
  planner:
    insufficient_info:
      role: $SUSPEND
      prompt: "Insufficient information; please provide: {{{reason}}}"
    ready:
      role: developer
      prompt: 'Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}.'
  developer:
    done:
      role: reviewer
      prompt: 'Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}.'
    failed:
      role: $END
      prompt: 'Developer failed: {{{reason}}}. Ending workflow.'
  reviewer:
    rejected:
      role: developer
      prompt: 'Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
    approved:
      role: tester
      prompt: 'Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
  tester:
    fix_code:
      role: developer
      prompt: 'Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
    fix_spec:
      role: planner
      prompt: 'Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}.'
    passed:
      role: committer
      prompt: 'All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}.'
  committer:
    hook_failed:
      role: developer
      prompt: 'Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
    committed:
      role: $END
      prompt: 'PR created: {{{prUrl}}}. Workflow complete.'
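
For readers tracing how the graph above routes work between roles, the following is a minimal sketch in Python, not the project's actual engine. It assumes that a role's `$status` frontmatter field selects the outgoing edge and that each `{{{name}}}` placeholder in the edge's prompt is filled from the same frontmatter. The names `GRAPH` and `next_step` are hypothetical, only the tester and committer edges are transcribed, and rendering a missing field as an empty string is an assumption borrowed from Mustache-style templating.

```python
import re

# (role, $status) -> (next role, prompt template).
# Transcribed from the tester and committer entries of the graph; the other
# roles are left out to keep the sketch short.
GRAPH = {
    ("tester", "fix_code"): (
        "developer",
        "Tests found code issues: {{{report}}}. Fix and re-submit. "
        "Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.",
    ),
    ("tester", "fix_spec"): (
        "planner",
        "Tests found spec issues: {{{report}}}. Revise the test spec. "
        "Repo remote: {{{repoRemote}}}.",
    ),
    ("tester", "passed"): (
        "committer",
        "All tests passed. Commit and push branch {{{branch}}} from "
        "{{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}.",
    ),
    ("committer", "hook_failed"): (
        "developer",
        "Push hook failed: {{{error}}}. Fix and re-submit. "
        "Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.",
    ),
    ("committer", "committed"): (
        "$END",
        "PR created: {{{prUrl}}}. Workflow complete.",
    ),
}

# Matches a triple-brace placeholder such as {{{branch}}}.
PLACEHOLDER = re.compile(r"\{\{\{(\w+)\}\}\}")


def next_step(role, frontmatter):
    """Return (next_role, rendered_prompt) for the role that just finished.

    Hypothetical helper: looks up the edge keyed by the role and its $status,
    then substitutes frontmatter values into the prompt template.
    """
    target, template = GRAPH[(role, frontmatter["$status"])]
    prompt = PLACEHOLDER.sub(
        lambda m: str(frontmatter.get(m.group(1), "")), template
    )
    return target, prompt


if __name__ == "__main__":
    # Illustrative frontmatter only; the values are made up.
    role, prompt = next_step(
        "tester",
        {
            "$status": "passed",
            "branch": "fix/example",
            "worktree": ".worktrees/example",
            "repoRemote": "shazhou/united-workforce",
        },
    )
    print(role)
    print(prompt)
```

Seen this way, the `required` lists in each role's frontmatter schema line up with the placeholders its outgoing prompts use: the `passed` variant requires `branch` and `worktree`, which the committer prompt interpolates.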