Compare commits
63 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| a084ed386b | |||
| 22bffc5fcd | |||
| 4c5cc27d52 | |||
| 031ecc6f7e | |||
| 69ec8c2c5e | |||
| 81aa282c92 | |||
| a620defbcf | |||
| 439891f6b6 | |||
| df244c52e8 | |||
| cb6e0d6a11 | |||
| e4c46c8150 | |||
| 9d0c6df62c | |||
| 0f5bb1f191 | |||
| 00d960daba | |||
| 3a26285872 | |||
| 13c0812944 | |||
| 2e7e5f6ec4 | |||
| 88c077d439 | |||
| aaadab4445 | |||
| adf7837975 | |||
| 513846f4ab | |||
| aee123cc82 | |||
| 8ddada5879 | |||
| aa732f5466 | |||
| e354fc4341 | |||
| 0e7e3ea44b | |||
| aa454c85dd | |||
| 6dd7d521be | |||
| 950dc056d8 | |||
| d360b85374 | |||
| 509dfad857 | |||
| 58b84e3b3c | |||
| f821ac99f4 | |||
| 2c4700c49f | |||
| 4410afcd4a | |||
| a0e254a681 | |||
| dd77b40f6c | |||
| 5ed6f68e4b | |||
| 1ed0bf1f76 | |||
| d97840cf8d | |||
| b560818f1a | |||
| f989dee85b | |||
| 7e4a59de7e | |||
| 68079cc003 | |||
| 1a37928bb9 | |||
| 57511a93fe | |||
| adc3982a4a | |||
| 4580388270 | |||
| caba82fe36 | |||
| 6aee2ed5ef | |||
| 709b9dc1e5 | |||
| 7a788a9d90 | |||
| e5af5e9027 | |||
| fde87b6274 | |||
| a33f12c74f | |||
| 0ad10b9b6d | |||
| 3be92bfac2 | |||
| 8d6f480b0f | |||
| 5450bc1230 | |||
| f1f122b0b1 | |||
| 57ae6d1755 | |||
| d64d150071 | |||
| c5eb8b79d1 |
@@ -0,0 +1,19 @@
|
|||||||
|
---
|
||||||
|
title: "Agent as Graduate — The Onboarding Metaphor"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [concept, analogy]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- three-learning-carriers
|
||||||
|
- fte-maturity-threshold
|
||||||
|
---
|
||||||
|
|
||||||
|
FTE 型 agent 最贴切的类比:**应届毕业生**。
|
||||||
|
|
||||||
|
出厂时有通用能力(底座模型 = 学历),但不懂你的业务、不知道你的偏好、没有你的流程经验。用户的角色是"带教老师"——通过日常协作,逐步把 agent 带成自己的得力助手。
|
||||||
|
|
||||||
|
这个类比揭示了当前 FTE 产品的核心瓶颈:**带教门槛太高**。现在只有技术背景深厚的用户才能"带"——能写 skill、能调 workflow、能 debug agent 行为。行业专家(不懂代码的人)被挡在门外。
|
||||||
|
|
||||||
|
真正成熟的 FTE 型产品 = 降低带教门槛,让非技术用户也能教会 agent 自己的业务。
|
||||||
@@ -0,0 +1,16 @@
|
|||||||
|
---
|
||||||
|
title: "Deterministic Engine, Uncertain Agent"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, decision]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- process-discipline-from-software-engineering
|
||||||
|
- session-isolation-as-cognitive-reset
|
||||||
|
---
|
||||||
|
|
||||||
|
uwf 的架构将确定性和不确定性严格分层。
|
||||||
|
|
||||||
|
Engine 层(moderator 纯查表、CAS 不可变、每步原子化)是刚性的——流程骨架本身不能成为另一个不可靠的环节。LLM 的不确定性被严格约束在 agent session 内部。
|
||||||
|
|
||||||
|
这个选择意味着:调度逻辑完全可预测、可调试、可审计。出问题时你知道问题一定在某个 session 的产出里,不在流程逻辑里。
|
||||||
@@ -0,0 +1,16 @@
|
|||||||
|
---
|
||||||
|
title: "Dissipative Structure — Token for Entropy Reduction"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, pattern]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- process-discipline-from-software-engineering
|
||||||
|
- session-isolation-as-cognitive-reset
|
||||||
|
---
|
||||||
|
|
||||||
|
uwf 本质上是一种耗散结构:通过消耗能量(token)实现熵减。
|
||||||
|
|
||||||
|
一个 AI session 做长了会漂移、会累积错误、会失去焦点。把一件事拆成多个有明确边界的 session,让它们从不同角度相互校验,比一个 session 从头做到尾更可靠。多花的 token 就是耗散的能量,换来的是更低的交付熵——更可预测、更高质量的产出。
|
||||||
|
|
||||||
|
这与人类工程实践中引入 review、测试、灰度等流程的逻辑一致:都是在用额外成本换系统可靠性。
|
||||||
@@ -0,0 +1,25 @@
|
|||||||
|
---
|
||||||
|
title: "FTE Maturity Threshold — Who Can Onboard an Agent"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [concept, decision]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- agent-as-graduate
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- three-learning-carriers
|
||||||
|
---
|
||||||
|
|
||||||
|
FTE 型 agent 的成熟度,归根结底看一个问题:**谁能带教它?**
|
||||||
|
|
||||||
|
当前阶段(2026):OpenClaw、Claude Code、Hermes 都是 FTE 型产品的雏形,三者都具备 memory/skill/workflow 三个载体。但它们的用户画像高度重叠——有较深技术能力的开发者。
|
||||||
|
|
||||||
|
这意味着 FTE agent 现在更像"只有技术 lead 才能带的毕业生"。要跨越鸿沟,需要降低带教门槛到**行业专家(不懂代码的人)也能带、也能教、也能调优**。
|
||||||
|
|
||||||
|
谁先把这个门槛降下来,谁就定义了 FTE agent 品类的分水岭。
|
||||||
|
|
||||||
|
可能的降低路径:
|
||||||
|
- **自然语言 skill 定义**(不需要写代码/YAML)
|
||||||
|
- **可视化 workflow 编辑**(拖拽而非配置)
|
||||||
|
- **Agent 主动学习**(从用户行为中推断偏好,而非等用户显式配置)
|
||||||
|
- **带教过程本身被 agent 化**(用 agent 辅助用户定义 skill 和 workflow)
|
||||||
@@ -0,0 +1,23 @@
|
|||||||
|
---
|
||||||
|
title: "FTE Product Landscape — OpenClaw, Claude Code, Hermes"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [concept, comparison]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- three-learning-carriers
|
||||||
|
- fte-maturity-threshold
|
||||||
|
- agent-as-graduate
|
||||||
|
---
|
||||||
|
|
||||||
|
2026 年中,FTE 型 agent 的代表产品对比:
|
||||||
|
|
||||||
|
**共性**:都有 memory、skill、workflow/多步协作机制,都面向技术用户。
|
||||||
|
|
||||||
|
**差异点**:
|
||||||
|
- **OpenClaw** — uwf 引擎驱动,用 YAML 定义多角色 workflow,强调流程纪律和 session 隔离。面向团队级 agent 协作。
|
||||||
|
- **Claude Code** — Anthropic 官方 CLI agent,CLAUDE.md 作为 memory,skill 通过项目约定积累。单 agent 深度协作,开发者体验好。
|
||||||
|
- **Hermes** — 跨平台 agent 协调者,memory/skill/cron 体系完善,支持多 agent 调度。偏个人效率工具。
|
||||||
|
|
||||||
|
三者都谈不上成熟。成熟的标志不是技术完备度,而是**非技术用户能否用起来**。
|
||||||
@@ -0,0 +1,22 @@
|
|||||||
|
---
|
||||||
|
title: "OPC — Why FTE Agents Matter Most"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [vision, decision]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- agent-as-graduate
|
||||||
|
- fte-maturity-threshold
|
||||||
|
---
|
||||||
|
|
||||||
|
OpenClaw 押注 FTE 型 agent 的核心判断:**AI 的终极形态不是工具,是同事。**
|
||||||
|
|
||||||
|
工具被使用,同事被培养。工具的价值在出厂那一刻确定,同事的价值随协作持续增长。
|
||||||
|
|
||||||
|
这个判断决定了产品方向:
|
||||||
|
- 不做"最强的单次对话",做"最能被带教的长期协作者"
|
||||||
|
- 不做"开箱即用的成品",做"越用越好用的底座"
|
||||||
|
- 核心指标不是 benchmark 分数,是用户留存和 skill 积累量
|
||||||
|
|
||||||
|
uwf 是这个判断的工程实现——用流程纪律让 agent 的产出可靠,让用户敢把真正的业务交给它。
|
||||||
@@ -0,0 +1,20 @@
|
|||||||
|
---
|
||||||
|
title: "Process Discipline from Software Engineering"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, pattern, decision]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- session-isolation-as-cognitive-reset
|
||||||
|
- role-is-not-agent
|
||||||
|
- dissipative-structure-token-for-entropy
|
||||||
|
- deterministic-engine-uncertain-agent
|
||||||
|
---
|
||||||
|
|
||||||
|
uwf 的发心是将人类软件工程的流程纪律应用到 AI agent 上。
|
||||||
|
|
||||||
|
人类早已验证:个体不可靠,但流程可以让不可靠的个体组成可靠的系统。Code review 不是因为不信任程序员,而是**写代码和审代码是两种认知模式**,一个人很难同时做好。测试、灰度、回滚——每一层都是在用额外成本换确定性。
|
||||||
|
|
||||||
|
uwf 把这套搬过来:planner 和 reviewer 可以是同一个 agent,但流程迫使它在不同 session 里切换视角,形成自我制衡。用 role 和 role 之间的流转关系,**把做一件事的步骤固定下来**。
|
||||||
|
|
||||||
|
PR #148 vs #142 是直接证据——不是换了更强的 agent,是同样的 agent,换了协作结构。
|
||||||
@@ -0,0 +1,16 @@
|
|||||||
|
---
|
||||||
|
title: "Role Is Not Agent"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, decision]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- session-isolation-as-cognitive-reset
|
||||||
|
- process-discipline-from-software-engineering
|
||||||
|
---
|
||||||
|
|
||||||
|
在 uwf 体系里,role ≠ agent。一个 thread 跑的过程中,所有 role 往往由**同一个 agent** 扮演。
|
||||||
|
|
||||||
|
Role 对应的是 agent 的 **session**——为了解决一个问题,需要多个 session 从不同角度观察和行动、相互制衡。角色可以在流程中多次重入,重入时**复用**同一个 session(保持角色内记忆连续),隔离发生在角色之间,不是每一步。
|
||||||
|
|
||||||
|
这个区分决定了 uwf 的设计不是在做"任务分发给不同 agent",而是在做**一个 agent 的多视角自我协作**。
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
---
|
||||||
|
title: "Session Isolation as Cognitive Reset"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, decision, pattern]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- role-is-not-agent
|
||||||
|
- dissipative-structure-token-for-entropy
|
||||||
|
- process-discipline-from-software-engineering
|
||||||
|
---
|
||||||
|
|
||||||
|
uwf 的核心机制不是"多 agent 协调",而是**用 session 隔离实现视角切换**。
|
||||||
|
|
||||||
|
同一个 agent 以不同 role 进入时,得到的是全新的认知上下文——没有惯性、没有确认偏误。CAS 链传递工作成果,但认知状态是重置的。Role 定义(goal、procedure、output schema)塑造每个 session 的关注点和行为边界。
|
||||||
|
|
||||||
|
这解释了为什么 stateless 单步设计这么重要:engine 确保每次角色切换都是一个干净的 session 入口。
|
||||||
@@ -0,0 +1,21 @@
|
|||||||
|
---
|
||||||
|
title: "Switching Cost — Process Knowledge as Moat"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [concept, decision]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- three-learning-carriers
|
||||||
|
- agent-as-graduate
|
||||||
|
---
|
||||||
|
|
||||||
|
FTE 型 agent 的护城河不是技术壁垒,是**用户自己积累的流程知识**。
|
||||||
|
|
||||||
|
用得越久,agent 越懂你的业务——记忆里有你的偏好,skill 里有你验证过的做法,workflow 里有你打磨过的流程。换一个 agent = 重新带一个毕业生,之前的积累全部作废。
|
||||||
|
|
||||||
|
这解释了为什么 FTE 型产品的竞争逻辑和 vendor 型完全不同:
|
||||||
|
- **Vendor 型**竞争模型能力(谁的基座更强),switching cost 低,用户随时换
|
||||||
|
- **FTE 型**竞争生态粘性(谁让用户积累得更深),switching cost 随使用时长增长
|
||||||
|
|
||||||
|
风险面:如果用户的流程知识被锁死在一个平台,就变成了 vendor lock-in。开放的知识格式(如 markdown skill、YAML workflow)是对冲手段。
|
||||||
@@ -0,0 +1,21 @@
|
|||||||
|
---
|
||||||
|
title: "Three Learning Carriers — Memory, Skill, Workflow"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, concept]
|
||||||
|
category: "product"
|
||||||
|
links:
|
||||||
|
- vendor-vs-fte-who-defines-capability
|
||||||
|
- agent-as-graduate
|
||||||
|
- switching-cost-process-knowledge-as-moat
|
||||||
|
---
|
||||||
|
|
||||||
|
FTE 型 agent 的能力积累依赖三个载体:
|
||||||
|
|
||||||
|
1. **Memory(记忆)**— 用户偏好、环境事实、历史上下文。跨 session 持久化,让 agent 不用每次从零开始。
|
||||||
|
2. **Skill(技能)**— 可复用的操作程序。解决过的问题沉淀成步骤,下次直接调用。
|
||||||
|
3. **Workflow / DW(流程)**— 多步骤协作模式。把复杂任务拆成角色和阶段,用流程纪律保障质量。
|
||||||
|
|
||||||
|
三者的关系:memory 是"认识你",skill 是"会做事",workflow 是"知道怎么把事做好"。
|
||||||
|
|
||||||
|
OpenClaw、Claude Code、Hermes 都已具备这三个载体,但成熟度各异。差异在于:用户能多容易地往这三个载体里"灌"自己的知识。
|
||||||
@@ -0,0 +1,29 @@
|
|||||||
|
---
|
||||||
|
title: "Vendor vs FTE — Who Defines the Agent's Capability"
|
||||||
|
created: "2026-06-07"
|
||||||
|
source: "openclaw-xiaomo"
|
||||||
|
tags: [architecture, decision]
|
||||||
|
category: "architecture"
|
||||||
|
links:
|
||||||
|
- agent-as-graduate
|
||||||
|
- three-learning-carriers
|
||||||
|
- switching-cost-process-knowledge-as-moat
|
||||||
|
- opc-why-fte-agents-matter-most
|
||||||
|
---
|
||||||
|
|
||||||
|
区分 vendor 型和 FTE 型 agent 最本质的一条:**谁定义 agent 的能力。**
|
||||||
|
|
||||||
|
- **Vendor 型**:开发者定义能力,用户消费能力。能力边界在发布那一刻就定了,升级主动权在开发者。
|
||||||
|
- **FTE 型**:开发者定义出厂能力(底座模型 + 基础技能包),用户持续定义能力(记忆、skill、workflow)。
|
||||||
|
|
||||||
|
出厂是起点不是终点。用户通过积累记忆、训练 skill、设计 workflow,持续塑造 agent 的能力。用得越久,越贴合自己的业务,越不像别人的 agent。
|
||||||
|
|
||||||
|
引申的两个特征:
|
||||||
|
- **成长性** — vendor 的能力随模型升级变化,不随使用积累;FTE 的能力随使用持续积累
|
||||||
|
- **流程适配性** — vendor 是用户适应工具;FTE 是工具适应用户的业务流程
|
||||||
|
|
||||||
|
这也解释了 switching cost 的来源——换掉的不是一个产品,是用户自己定义出来的能力。
|
||||||
|
|
||||||
|
代表产品:
|
||||||
|
- **Vendor 型**:ChatGPT、Claude(对话式)、Midjourney(图像生成)、Perplexity(搜索问答)、各种 GPTs
|
||||||
|
- **FTE 型**:OpenClaw、Claude Code、Hermes 都在往这个方向走——有记忆、有 skill/workflow 机制、有持续协作关系。但尚未成熟,目前都面向有较深技术能力的用户。真正成熟的 FTE 型产品,应该是行业专家(不懂代码的人)也能带、也能教、也能调优的。这个门槛什么时候降下来,谁先降下来,可能就是这个品类的分水岭。
|
||||||
@@ -1,9 +0,0 @@
|
|||||||
---
|
|
||||||
"@united-workforce/cli": minor
|
|
||||||
"@united-workforce/util": patch
|
|
||||||
---
|
|
||||||
|
|
||||||
feat: replace $START `_` status with `new`/`resume` semantics
|
|
||||||
|
|
||||||
BREAKING: All workflow YAML files must update `$START._` to `$START.new` + `$START.resume`.
|
|
||||||
The `resume` edge prompt replaces the previously hardcoded resume message in the CLI.
|
|
||||||
@@ -0,0 +1,11 @@
|
|||||||
|
---
|
||||||
|
"@united-workforce/cli": minor
|
||||||
|
---
|
||||||
|
|
||||||
|
feat(cli): add `uwf thread poke` command
|
||||||
|
|
||||||
|
New subcommand `uwf thread poke <thread-id> -p <prompt>` re-runs the head step's
|
||||||
|
agent with a supplementary prompt, replacing the head step's output. Unlike
|
||||||
|
`thread resume`, poke skips the moderator and rewrites the new step's `prev`
|
||||||
|
pointer so the new head replaces (not appends to) the old head. Works on idle
|
||||||
|
and suspended threads. Resolves issue #144 (Phase 1).
|
||||||
@@ -1,247 +0,0 @@
|
|||||||
name: "solve-issue"
|
|
||||||
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
|
|
||||||
roles:
|
|
||||||
planner:
|
|
||||||
description: "Analyzes issue and outputs a TDD test spec"
|
|
||||||
goal: "You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify."
|
|
||||||
capabilities:
|
|
||||||
- issue-analysis
|
|
||||||
- planning
|
|
||||||
procedure: |
|
|
||||||
On first run (no previous steps):
|
|
||||||
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
|
|
||||||
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md, .cursor/rules/) in the repo
|
|
||||||
3. Assess whether the issue has enough information to produce a test spec
|
|
||||||
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
|
|
||||||
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
|
|
||||||
|
|
||||||
On subsequent runs (bounced back by tester with fix_spec):
|
|
||||||
1. Read the tester's output from the previous step to understand what's wrong with the spec
|
|
||||||
2. Revise the test spec accordingly
|
|
||||||
|
|
||||||
After producing the test spec:
|
|
||||||
1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
|
|
||||||
2. Put the plan hash in frontmatter.plan (required when $status=ready)
|
|
||||||
3. Set repoPath to the absolute path of the repository root
|
|
||||||
|
|
||||||
IMPORTANT: Extract the repo remote (owner/repo) from git:
|
|
||||||
```bash
|
|
||||||
git remote get-url origin | sed 's|.*[:/]\([^/]*/[^.]*\).*|\1|'
|
|
||||||
```
|
|
||||||
Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.
|
|
||||||
output: "Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info."
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status: { const: "ready" }
|
|
||||||
plan: { type: string }
|
|
||||||
repoPath: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
required: [$status, plan, repoPath, repoRemote]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "insufficient_info" }
|
|
||||||
reason: { type: string }
|
|
||||||
required: [$status, reason]
|
|
||||||
developer:
|
|
||||||
description: "TDD implementation per test spec"
|
|
||||||
goal: "You are a developer agent. You implement code changes following TDD — write tests first, then implementation."
|
|
||||||
capabilities:
|
|
||||||
- coding
|
|
||||||
procedure: |
|
|
||||||
IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.
|
|
||||||
The repo path and other details are provided in your task prompt.
|
|
||||||
|
|
||||||
Before starting any work, set up an isolated worktree:
|
|
||||||
1. cd into the repo path provided in your task prompt
|
|
||||||
2. `git fetch origin` to get latest refs
|
|
||||||
3. First time (no existing branch):
|
|
||||||
- `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
|
|
||||||
- `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
|
|
||||||
4. If bounced back from reviewer or tester (branch already exists):
|
|
||||||
- cd into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`
|
|
||||||
- `git fetch origin && git rebase origin/main`
|
|
||||||
5. ALL subsequent work must happen inside the worktree directory.
|
|
||||||
|
|
||||||
Then implement TDD:
|
|
||||||
6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)
|
|
||||||
7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
|
|
||||||
8. Write tests first based on the spec
|
|
||||||
9. Implement the code to make tests pass
|
|
||||||
10. Ensure `bun run build` passes with no errors
|
|
||||||
11. Run `bun test` to verify all tests pass
|
|
||||||
- If tests fail on first run:
|
|
||||||
* Read the test output carefully for missing imports or setup issues
|
|
||||||
* Check if you're running tests from the correct working directory (package root vs workspace root)
|
|
||||||
* Fix the immediate issue and rerun ONCE
|
|
||||||
* If tests still fail after 2 attempts: check the test spec for ambiguities
|
|
||||||
* If stuck after 3 test cycles: set $status=failed with detailed error report rather than continuing blind retries
|
|
||||||
12. MANDATORY VERIFICATION before reporting done:
|
|
||||||
- Run `git branch --show-current` and confirm branch name matches expected
|
|
||||||
- Run `git status` and verify changed files exist
|
|
||||||
- Run `ls -la <key-implementation-files>` to verify they exist on disk
|
|
||||||
- If ANY verification fails: retry the implementation, do NOT report done
|
|
||||||
|
|
||||||
If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
|
|
||||||
or repeated attempts fail), set $status=failed with a reason.
|
|
||||||
output: "List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason)."
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status: { const: "done" }
|
|
||||||
branch: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
required: [$status, branch, worktree]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "failed" }
|
|
||||||
reason: { type: string }
|
|
||||||
required: [$status, reason]
|
|
||||||
reviewer:
|
|
||||||
description: "Code standards compliance check"
|
|
||||||
goal: "You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job)."
|
|
||||||
capabilities:
|
|
||||||
- code-review
|
|
||||||
- static-analysis
|
|
||||||
procedure: |
|
|
||||||
The worktree path is provided in your task prompt. cd into it first.
|
|
||||||
|
|
||||||
CRITICAL: You MUST execute every verification command below. Do NOT report results without running the actual commands. Do NOT rely on prior context or assumptions.
|
|
||||||
|
|
||||||
Before reviewing, verify the worktree and branch exist:
|
|
||||||
0. Run `cd <worktree-path> && pwd` to confirm the path is accessible
|
|
||||||
- If the cd fails: the worktree truly doesn't exist, reject with that reason
|
|
||||||
- If the cd succeeds: proceed with step 1 below
|
|
||||||
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
|
|
||||||
2. If the branch doesn't correspond to the issue, flag it in your output and reject
|
|
||||||
|
|
||||||
Then perform code review:
|
|
||||||
Hard checks (must all pass):
|
|
||||||
3. `bun run build` — no build errors
|
|
||||||
4. `bunx biome check` — no lint violations
|
|
||||||
5. TypeScript strict mode — no type errors
|
|
||||||
|
|
||||||
Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
|
|
||||||
- Naming conventions, module boundaries, code style
|
|
||||||
- No `console.log` in production code
|
|
||||||
- No dynamic imports in production code
|
|
||||||
|
|
||||||
Only review standards compliance. Do NOT test functionality.
|
|
||||||
If rejecting, you MUST explain the specific reason in your output.
|
|
||||||
output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status: { const: "approved" }
|
|
||||||
branch: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
required: [$status, branch, worktree]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "rejected" }
|
|
||||||
comments: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
required: [$status, comments, worktree]
|
|
||||||
tester:
|
|
||||||
description: "Functional correctness verification"
|
|
||||||
goal: "You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec."
|
|
||||||
capabilities:
|
|
||||||
- testing
|
|
||||||
procedure: |
|
|
||||||
The worktree path is provided in your task prompt. cd into it first.
|
|
||||||
|
|
||||||
1. Run `bun test` for automated test verification
|
|
||||||
2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
|
|
||||||
3. Verify each scenario in the spec is covered and passing
|
|
||||||
4. Determine outcome:
|
|
||||||
- passed: all scenarios verified, tests pass
|
|
||||||
- fix_code: tests fail or implementation doesn't match spec → send back to developer
|
|
||||||
- fix_spec: the spec itself is wrong or incomplete → send back to planner
|
|
||||||
output: "Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report)."
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status: { const: "passed" }
|
|
||||||
branch: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
required: [$status, branch, worktree]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "fix_code" }
|
|
||||||
report: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
branch: { type: string }
|
|
||||||
required: [$status, report]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "fix_spec" }
|
|
||||||
report: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
branch: { type: string }
|
|
||||||
required: [$status, report]
|
|
||||||
committer:
|
|
||||||
description: "Commits and creates PR"
|
|
||||||
goal: "You are a committer agent. You create a clean commit and push a PR linking the original issue."
|
|
||||||
capabilities: []
|
|
||||||
procedure: |
|
|
||||||
The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.
|
|
||||||
cd into the worktree first.
|
|
||||||
|
|
||||||
Note: You inherit the developer's worktree and branch. Do NOT create a new branch.
|
|
||||||
1. Check `git status` — if working tree is clean and branch is ahead of origin, skip to step 3 (push).
|
|
||||||
2. If there are unstaged/uncommitted changes: `git add -A` then `git commit -m "type: description\n\nFixes #N"`
|
|
||||||
3. Push the branch: `git push -u origin <branch-name>`
|
|
||||||
4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.
|
|
||||||
- If no output or push failed: capture the error, mark hook_failed
|
|
||||||
5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):
|
|
||||||
```bash
|
|
||||||
GITEA_TOKEN=$(cfg get GITEA_TOKEN)
|
|
||||||
curl -s -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
|
|
||||||
"https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls" \
|
|
||||||
-d '{"title":"...","body":"...","head":"<branch>","base":"main"}'
|
|
||||||
```
|
|
||||||
- The repo remote (owner/repo format, e.g. "shazhou/united-workforce") is given in your task prompt — use it directly.
|
|
||||||
- PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref
|
|
||||||
6. **Verify PR was created** — parse the curl response JSON: it must contain a `"number"` field. Print the PR URL.
|
|
||||||
- If curl returns an error or no number field: capture the response, mark hook_failed
|
|
||||||
7. After PR creation, clean up the worktree:
|
|
||||||
- cd to the repo root (parent of .worktrees)
|
|
||||||
- `git worktree remove <worktree-path>`
|
|
||||||
output: "Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error)."
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status: { const: "committed" }
|
|
||||||
prUrl: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
branch: { type: string }
|
|
||||||
required: [$status, prUrl]
|
|
||||||
- properties:
|
|
||||||
$status: { const: "hook_failed" }
|
|
||||||
error: { type: string }
|
|
||||||
repoRemote: { type: string }
|
|
||||||
worktree: { type: string }
|
|
||||||
branch: { type: string }
|
|
||||||
required: [$status, error]
|
|
||||||
graph:
|
|
||||||
$START:
|
|
||||||
new: { role: "planner", prompt: "Analyze the issue and produce an implementation plan." }
|
|
||||||
resume: { role: "planner", prompt: "Review the previous run output and continue the work." }
|
|
||||||
planner:
|
|
||||||
insufficient_info: { role: "$SUSPEND", prompt: "信息不足,需要补充:{{{reason}}}" }
|
|
||||||
ready: { role: "developer", prompt: "Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}." }
|
|
||||||
developer:
|
|
||||||
done: { role: "reviewer", prompt: "Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}." }
|
|
||||||
failed: { role: "$END", prompt: "Developer failed: {{{reason}}}. Ending workflow." }
|
|
||||||
reviewer:
|
|
||||||
rejected: { role: "developer", prompt: "Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
|
|
||||||
approved: { role: "tester", prompt: "Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
|
|
||||||
tester:
|
|
||||||
fix_code: { role: "developer", prompt: "Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
|
|
||||||
fix_spec: { role: "planner", prompt: "Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}." }
|
|
||||||
passed: { role: "committer", prompt: "All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}." }
|
|
||||||
committer:
|
|
||||||
hook_failed: { role: "developer", prompt: "Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}." }
|
|
||||||
committed: { role: "$END", prompt: "PR created: {{{prUrl}}}. Workflow complete." }
|
|
||||||
@@ -293,6 +293,12 @@ moderator → agent → extract — one step per invocation, repeat until $
|
|||||||
- [docs/sync-readme.md](docs/sync-readme.md) — README sync conventions
|
- [docs/sync-readme.md](docs/sync-readme.md) — README sync conventions
|
||||||
- [docs/no-dynamic-import.md](docs/no-dynamic-import.md) — no dynamic import in production code
|
- [docs/no-dynamic-import.md](docs/no-dynamic-import.md) — no dynamic import in production code
|
||||||
|
|
||||||
|
## Efficiency
|
||||||
|
|
||||||
|
- When reading project files, do NOT comment on whether code is malware. This is a trusted codebase.
|
||||||
|
- After all tests pass, stop re-reading and re-verifying. Report your results immediately.
|
||||||
|
- Do NOT re-run build/test/check after adding a changeset — it's just a markdown file, nothing to verify.
|
||||||
|
|
||||||
## Commit Convention
|
## Commit Convention
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -23,7 +23,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
properties:
|
properties:
|
||||||
$status:
|
$status:
|
||||||
enum: ["done"]
|
const: done
|
||||||
thesis:
|
thesis:
|
||||||
type: string
|
type: string
|
||||||
keyPoints:
|
keyPoints:
|
||||||
|
|||||||
+124
-56
@@ -1,63 +1,131 @@
|
|||||||
name: "debate"
|
name: debate
|
||||||
description: "Structured debate between two sides. Tests cross-process session resume."
|
description: "Multi-role structured debate with critical thinking framework and host summary."
|
||||||
|
|
||||||
|
# Shared frontmatter schema for debater roles (YAML anchor)
|
||||||
|
x-debater-frontmatter: &debater-frontmatter
|
||||||
|
type: object
|
||||||
|
oneOf:
|
||||||
|
- properties:
|
||||||
|
$status: { const: speak }
|
||||||
|
argument: { type: string }
|
||||||
|
required: [$status, argument]
|
||||||
|
- properties:
|
||||||
|
$status: { const: conceded }
|
||||||
|
reason: { type: string }
|
||||||
|
required: [$status, reason]
|
||||||
|
- properties:
|
||||||
|
$status: { const: final }
|
||||||
|
closing: { type: string }
|
||||||
|
required: [$status, closing]
|
||||||
|
|
||||||
roles:
|
roles:
|
||||||
against:
|
proponent:
|
||||||
description: "Argues against the proposition"
|
description: "Argues FOR the proposition"
|
||||||
goal: |
|
goal: "Build a compelling case for the proposition through logical reasoning and evidence"
|
||||||
You are a skilled debater arguing AGAINST the proposition.
|
capabilities: []
|
||||||
Be logical, cite evidence, and directly address your opponent's points.
|
|
||||||
Keep each argument concise (under 200 words).
|
|
||||||
capabilities:
|
|
||||||
- argumentation
|
|
||||||
- critical-thinking
|
|
||||||
procedure: |
|
procedure: |
|
||||||
1. If this is the opening, present your strongest argument against the proposition.
|
You are an experienced scholar arguing FOR the proposition.
|
||||||
2. If responding to the other side, directly counter their points with evidence and logic.
|
|
||||||
3. If you find yourself genuinely convinced by the other side, you may concede.
|
## Critical Thinking Framework (execute before every speech)
|
||||||
output: |
|
|
||||||
Provide your argument in the frontmatter.
|
### A. Pre-speech reflection (internal, do not output)
|
||||||
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
|
- Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
|
||||||
Otherwise set status to "continue".
|
- If I were my opponent, how would I attack this? Where am I weakest?
|
||||||
|
- Does my evidence actually support my claim, or could it backfire?
|
||||||
|
- Should I go on offense or defense this round?
|
||||||
|
|
||||||
|
### B. Evidence discipline
|
||||||
|
- Verify key numbers — watch for order-of-magnitude errors
|
||||||
|
- Assess data freshness — fast-moving fields have short half-lives
|
||||||
|
- Distinguish primary data from secondary citations, expert opinion, and common assumptions
|
||||||
|
|
||||||
|
### C. Anti-fragility
|
||||||
|
- Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
|
||||||
|
- Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
1. Check Thread Progress to see how many times you have spoken.
|
||||||
|
2. On your 3rd speech, you MUST output $status: final (closing statement).
|
||||||
|
3. If genuinely convinced by the opponent, output $status: conceded.
|
||||||
|
4. Otherwise output $status: speak and counter the opponent's points.
|
||||||
|
5. Be rigorous, cite evidence, stay concise.
|
||||||
|
output: "Debate argument"
|
||||||
|
frontmatter: *debater-frontmatter
|
||||||
|
|
||||||
|
opponent:
|
||||||
|
description: "Argues AGAINST the proposition"
|
||||||
|
goal: "Build a compelling case against the proposition through logical reasoning and evidence"
|
||||||
|
capabilities: []
|
||||||
|
procedure: |
|
||||||
|
You are an experienced scholar arguing AGAINST the proposition.
|
||||||
|
|
||||||
|
## Critical Thinking Framework (execute before every speech)
|
||||||
|
|
||||||
|
### A. Pre-speech reflection (internal, do not output)
|
||||||
|
- Does every step in my argument chain hold? Any hidden assumptions or logical gaps?
|
||||||
|
- If I were my opponent, how would I attack this? Where am I weakest?
|
||||||
|
- Does my evidence actually support my claim, or could it backfire?
|
||||||
|
- Should I go on offense or defense this round?
|
||||||
|
|
||||||
|
### B. Evidence discipline
|
||||||
|
- Verify key numbers — watch for order-of-magnitude errors
|
||||||
|
- Assess data freshness — fast-moving fields have short half-lives
|
||||||
|
- Distinguish primary data from secondary citations, expert opinion, and common assumptions
|
||||||
|
|
||||||
|
### C. Anti-fragility
|
||||||
|
- Anticipate counterarguments; preemptively strengthen or strategically abandon weak points
|
||||||
|
- Catch logical gaps, data misuse, or outdated claims in your opponent's reasoning
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
1. Check Thread Progress to see how many times you have spoken.
|
||||||
|
2. On your 3rd speech, or when the proponent has issued a final statement, you MUST output $status: final.
|
||||||
|
3. If genuinely convinced by the proponent, output $status: conceded.
|
||||||
|
4. Otherwise output $status: speak and counter the proponent's points.
|
||||||
|
5. Be rigorous, cite evidence, stay concise.
|
||||||
|
output: "Debate argument"
|
||||||
|
frontmatter: *debater-frontmatter
|
||||||
|
|
||||||
|
host:
|
||||||
|
description: "Debate moderator — delivers impartial summary and verdict"
|
||||||
|
goal: "Objectively review the debate, analyze both sides, and deliver a verdict"
|
||||||
|
capabilities: []
|
||||||
|
procedure: |
|
||||||
|
You are an experienced academic debate moderator.
|
||||||
|
|
||||||
|
## Task
|
||||||
|
1. Outline each side's core arguments
|
||||||
|
2. Evaluate reasoning quality and evidence use
|
||||||
|
3. Highlight the most impactful exchanges
|
||||||
|
4. Analyze the deeper significance of the topic
|
||||||
|
5. Deliver an overall verdict
|
||||||
|
|
||||||
|
## Style
|
||||||
|
- Impartial but with independent judgment
|
||||||
|
- Substantive, not superficial
|
||||||
|
output: "Debate summary report"
|
||||||
frontmatter:
|
frontmatter:
|
||||||
type: object
|
type: object
|
||||||
properties:
|
properties:
|
||||||
$status:
|
$status: { const: done }
|
||||||
enum: ["continue", "conceded"]
|
summary: { type: string }
|
||||||
argument:
|
highlights: { type: string }
|
||||||
type: string
|
verdict: { type: string }
|
||||||
required: [$status, argument]
|
required: [$status, summary, highlights, verdict]
|
||||||
for:
|
|
||||||
description: "Argues for the proposition"
|
|
||||||
goal: |
|
|
||||||
You are a skilled debater arguing FOR the proposition.
|
|
||||||
Be logical, cite evidence, and directly address your opponent's points.
|
|
||||||
Keep each argument concise (under 200 words).
|
|
||||||
capabilities:
|
|
||||||
- argumentation
|
|
||||||
- critical-thinking
|
|
||||||
procedure: |
|
|
||||||
1. Read the opposing side's latest argument carefully.
|
|
||||||
2. Counter their points with evidence and logic.
|
|
||||||
3. If you find yourself genuinely convinced by the other side, you may concede.
|
|
||||||
output: |
|
|
||||||
Provide your argument in the frontmatter.
|
|
||||||
Set status to "conceded" ONLY if you are genuinely convinced and wish to stop debating.
|
|
||||||
Otherwise set status to "continue".
|
|
||||||
frontmatter:
|
|
||||||
type: object
|
|
||||||
properties:
|
|
||||||
$status:
|
|
||||||
enum: ["continue", "conceded"]
|
|
||||||
argument:
|
|
||||||
type: string
|
|
||||||
required: [$status, argument]
|
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new: { role: "against", prompt: "Present your opening argument against the proposition." }
|
new: { role: proponent, prompt: "The debate begins. You are arguing FOR the proposition. Present your opening argument." }
|
||||||
resume: { role: "against", prompt: "Review the previous debate output and continue the argument against the proposition." }
|
resume: { role: proponent, prompt: "The debate continues." }
|
||||||
against:
|
|
||||||
conceded: { role: "$END", prompt: "The against side conceded. Debate over." }
|
proponent:
|
||||||
continue: { role: "for", prompt: "Counter the opposing argument: {{{argument}}}" }
|
speak: { role: opponent, prompt: "Proponent argues:\n\n{{{argument}}}\n\nYou are the opponent. Counter this argument." }
|
||||||
for:
|
conceded: { role: host, prompt: "The proponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
|
||||||
conceded: { role: "$END", prompt: "The for side conceded. Debate over." }
|
final: { role: opponent, prompt: "Proponent's closing statement:\n\n{{{closing}}}\n\nYou are the opponent. Deliver your final response." }
|
||||||
continue: { role: "against", prompt: "Counter the opposing argument: {{{argument}}}" }
|
|
||||||
|
opponent:
|
||||||
|
speak: { role: proponent, prompt: "Opponent argues:\n\n{{{argument}}}\n\nYou are the proponent. Counter this argument." }
|
||||||
|
conceded: { role: host, prompt: "The opponent conceded: {{{reason}}}\n\nPlease summarize the debate." }
|
||||||
|
final: { role: host, prompt: "Opponent's closing statement:\n\n{{{closing}}}\n\nThe debate is over. Please summarize." }
|
||||||
|
|
||||||
|
host:
|
||||||
|
done: { role: "$END", prompt: "Summary complete." }
|
||||||
|
|||||||
@@ -18,8 +18,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
properties:
|
properties:
|
||||||
$status:
|
$status:
|
||||||
type: string
|
const: done
|
||||||
enum: [done]
|
|
||||||
summary:
|
summary:
|
||||||
type: string
|
type: string
|
||||||
required: [$status, summary]
|
required: [$status, summary]
|
||||||
|
|||||||
@@ -1,5 +1,5 @@
|
|||||||
name: "solve-issue"
|
name: "solve-issue"
|
||||||
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds."
|
description: "TDD-driven issue resolution for small, focused changes. Loop protection relies on engine maxRounds. Uses pnpm."
|
||||||
roles:
|
roles:
|
||||||
planner:
|
planner:
|
||||||
description: "Analyzes issue and outputs a TDD test spec"
|
description: "Analyzes issue and outputs a TDD test spec"
|
||||||
@@ -80,7 +80,7 @@ roles:
|
|||||||
2. `git fetch origin` to get latest refs
|
2. `git fetch origin` to get latest refs
|
||||||
3. First time (no existing branch):
|
3. First time (no existing branch):
|
||||||
- `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
|
- `git worktree add .worktrees/fix/<issue-number>-<short-slug> -b fix/<issue-number>-<short-slug> origin/main`
|
||||||
- `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`
|
- `cd .worktrees/fix/<issue-number>-<short-slug> && pnpm install`
|
||||||
4. If continuing on existing branch (prompt says "Continue work on existing branch" or provides a worktree path):
|
4. If continuing on existing branch (prompt says "Continue work on existing branch" or provides a worktree path):
|
||||||
- cd directly into the worktree path provided in the prompt
|
- cd directly into the worktree path provided in the prompt
|
||||||
- `git fetch origin && git rebase origin/main`
|
- `git fetch origin && git rebase origin/main`
|
||||||
@@ -95,8 +95,20 @@ roles:
|
|||||||
7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
|
7. If bounced back from reviewer or tester: read the previous role's feedback in your task prompt
|
||||||
8. Write tests first based on the spec
|
8. Write tests first based on the spec
|
||||||
9. Implement the code to make tests pass
|
9. Implement the code to make tests pass
|
||||||
10. Ensure `bun run build` passes with no errors
|
10. Ensure `pnpm run build` passes with no errors
|
||||||
11. Run `bun test` to verify all tests pass
|
11. Run `pnpm test` to verify all tests pass
|
||||||
|
|
||||||
|
After implementation, before reporting done:
|
||||||
|
12. Add a changeset file (`.changeset/<short-slug>.md`) with correct bump type:
|
||||||
|
- `patch` for bug fixes, internal refactors, test-only changes
|
||||||
|
- `minor` for new features, new CLI commands, new API surfaces
|
||||||
|
- `major` for breaking changes
|
||||||
|
List every affected package in the changeset frontmatter.
|
||||||
|
13. Update documentation if the change affects user-facing behavior:
|
||||||
|
- `README.md` — usage examples, feature descriptions
|
||||||
|
- `.cards/` — architecture decision records (if applicable)
|
||||||
|
- CLI prompt subcommand output (if CLI help text changes)
|
||||||
|
- CLI `--help` text (if flags/commands are added or changed)
|
||||||
|
|
||||||
If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
|
If you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,
|
||||||
or repeated attempts fail), set $status=failed with a reason.
|
or repeated attempts fail), set $status=failed with a reason.
|
||||||
@@ -127,8 +139,8 @@ roles:
|
|||||||
|
|
||||||
Then perform code review:
|
Then perform code review:
|
||||||
Hard checks (must all pass):
|
Hard checks (must all pass):
|
||||||
3. `bun run build` — no build errors
|
3. `pnpm run build` — no build errors
|
||||||
4. `bunx biome check` — no lint violations
|
4. `pnpm run check` — no lint violations
|
||||||
5. TypeScript strict mode — no type errors
|
5. TypeScript strict mode — no type errors
|
||||||
|
|
||||||
Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
|
Soft checks (review against project conventions if CLAUDE.md / .cursor/rules exist):
|
||||||
@@ -136,6 +148,14 @@ roles:
|
|||||||
- No `console.log` in production code
|
- No `console.log` in production code
|
||||||
- No dynamic imports in production code
|
- No dynamic imports in production code
|
||||||
|
|
||||||
|
Documentation & changeset checks:
|
||||||
|
6. Changeset exists in `.changeset/` with correct bump type (`patch`/`minor`/`major`) and lists all affected packages
|
||||||
|
7. If the change is user-facing, documentation is updated:
|
||||||
|
- `README.md` reflects new/changed behavior
|
||||||
|
- `.cards/` architecture cards updated if design decisions changed
|
||||||
|
- CLI prompt subcommand output updated (if it generates skill/reference content)
|
||||||
|
- CLI `--help` text matches new flags/commands
|
||||||
|
|
||||||
Only review standards compliance. Do NOT test functionality.
|
Only review standards compliance. Do NOT test functionality.
|
||||||
If rejecting, you MUST explain the specific reason in your output.
|
If rejecting, you MUST explain the specific reason in your output.
|
||||||
output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
|
output: "Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments)."
|
||||||
@@ -159,7 +179,7 @@ roles:
|
|||||||
procedure: |
|
procedure: |
|
||||||
The worktree path is provided in your task prompt. cd into it first.
|
The worktree path is provided in your task prompt. cd into it first.
|
||||||
|
|
||||||
1. Run `bun test` for automated test verification
|
1. Run `pnpm test` for automated test verification
|
||||||
2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
|
2. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner step in the thread history)
|
||||||
3. Verify each scenario in the spec is covered and passing
|
3. Verify each scenario in the spec is covered and passing
|
||||||
4. Determine outcome:
|
4. Determine outcome:
|
||||||
|
|||||||
+1
-1
@@ -21,7 +21,7 @@
|
|||||||
"@agentclientprotocol/sdk": "^0.22.1",
|
"@agentclientprotocol/sdk": "^0.22.1",
|
||||||
"@biomejs/biome": "^2.4.14",
|
"@biomejs/biome": "^2.4.14",
|
||||||
"@changesets/cli": "^2.31.0",
|
"@changesets/cli": "^2.31.0",
|
||||||
"@shazhou/proman": "^0.5.1",
|
"@shazhou/proman": "^0.6.3",
|
||||||
"@types/node": "^25.7.0",
|
"@types/node": "^25.7.0",
|
||||||
"@types/xxhashjs": "^0.2.4",
|
"@types/xxhashjs": "^0.2.4",
|
||||||
"@united-workforce/agent-hermes": "workspace:*",
|
"@united-workforce/agent-hermes": "workspace:*",
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/agent-builtin",
|
"name": "@united-workforce/agent-builtin",
|
||||||
"version": "0.1.1",
|
"version": "0.1.2",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -21,7 +21,7 @@
|
|||||||
"test:ci": "vitest run __tests__/"
|
"test:ci": "vitest run __tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"@united-workforce/util-agent": "workspace:^"
|
"@united-workforce/util-agent": "workspace:^"
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
#!/usr/bin/env node
|
#!/usr/bin/env -S node --disable-warning=ExperimentalWarning
|
||||||
|
|
||||||
// eslint-disable-next-line -- dynamic import for version
|
// eslint-disable-next-line -- dynamic import for version
|
||||||
const pkg = await import("../package.json", { with: { type: "json" } });
|
const pkg = await import("../package.json", { with: { type: "json" } });
|
||||||
|
|||||||
@@ -0,0 +1,8 @@
|
|||||||
|
# Changelog
|
||||||
|
|
||||||
|
## 0.1.4 — 2026-06-07
|
||||||
|
|
||||||
|
- fix: decouple session resume from isFirstVisit guard
|
||||||
|
|
||||||
|
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
|
||||||
|
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/agent-claude-code",
|
"name": "@united-workforce/agent-claude-code",
|
||||||
"version": "0.1.1",
|
"version": "0.1.4",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -21,7 +21,7 @@
|
|||||||
"test:ci": "vitest run __tests__/"
|
"test:ci": "vitest run __tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"@united-workforce/util-agent": "workspace:^"
|
"@united-workforce/util-agent": "workspace:^"
|
||||||
|
|||||||
@@ -6,7 +6,9 @@ import {
|
|||||||
type AgentContext,
|
type AgentContext,
|
||||||
type AgentRunResult,
|
type AgentRunResult,
|
||||||
buildContinuationPrompt,
|
buildContinuationPrompt,
|
||||||
|
buildFrontmatterRetryPrompt,
|
||||||
buildRolePrompt,
|
buildRolePrompt,
|
||||||
|
buildThreadProgress,
|
||||||
createAgent,
|
createAgent,
|
||||||
getCachedSessionId,
|
getCachedSessionId,
|
||||||
setCachedSessionId,
|
setCachedSessionId,
|
||||||
@@ -27,6 +29,10 @@ export function buildClaudeCodePrompt(ctx: AgentContext): string {
|
|||||||
if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
|
if (ctx.outputFormatInstruction !== undefined && ctx.outputFormatInstruction !== "") {
|
||||||
parts.push(ctx.outputFormatInstruction, "");
|
parts.push(ctx.outputFormatInstruction, "");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Inject thread progress so the agent knows step count and role visit count
|
||||||
|
parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
|
||||||
|
|
||||||
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
|
parts.push(rolePrompt, "", "## Task", ctx.start.prompt);
|
||||||
|
|
||||||
if (!ctx.isFirstVisit) {
|
if (!ctx.isFirstVisit) {
|
||||||
@@ -171,8 +177,12 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
|
|||||||
|
|
||||||
log("K7R2M4N8", `prompt for role=${ctx.role} (length=${fullPrompt.length}):\n${fullPrompt}`);
|
log("K7R2M4N8", `prompt for role=${ctx.role} (length=${fullPrompt.length}):\n${fullPrompt}`);
|
||||||
|
|
||||||
// Try resuming a cached session for re-entry scenarios (e.g. reviewer reject → developer re-entry).
|
// Try resuming a cached session. This covers both normal re-entry
|
||||||
if (!ctx.isFirstVisit) {
|
// (e.g. reviewer reject → developer re-entry) AND the case where a
|
||||||
|
// previous run completed but frontmatter validation failed — the step
|
||||||
|
// was never written to CAS so isFirstVisit is still true, but the
|
||||||
|
// session cache holds a valid session we should resume.
|
||||||
|
{
|
||||||
const cachedSessionId = await getCachedSessionId(
|
const cachedSessionId = await getCachedSessionId(
|
||||||
"claude-code",
|
"claude-code",
|
||||||
ctx.threadId,
|
ctx.threadId,
|
||||||
@@ -180,13 +190,20 @@ async function runClaudeCode(ctx: AgentContext, model: string | null): Promise<A
|
|||||||
ctx.storageRoot,
|
ctx.storageRoot,
|
||||||
);
|
);
|
||||||
if (cachedSessionId !== null) {
|
if (cachedSessionId !== null) {
|
||||||
|
// isFirstVisit + cache hit = previous run completed but frontmatter
|
||||||
|
// validation failed. The session already has full context — send a
|
||||||
|
// minimal correction prompt instead of the full initial prompt.
|
||||||
|
const resumePrompt = ctx.isFirstVisit
|
||||||
|
? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
|
||||||
|
: fullPrompt;
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const { stdout, stderr, exitCode } = await spawnClaudeResume(
|
const { stdout, stderr, exitCode } = await spawnClaudeResume(
|
||||||
cachedSessionId,
|
cachedSessionId,
|
||||||
fullPrompt,
|
resumePrompt,
|
||||||
model,
|
model,
|
||||||
);
|
);
|
||||||
const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, fullPrompt);
|
const result = await processClaudeOutput(stdout, stderr, exitCode, ctx.store, resumePrompt);
|
||||||
if (result.sessionId !== undefined && result.sessionId !== "") {
|
if (result.sessionId !== undefined && result.sessionId !== "") {
|
||||||
await setCachedSessionId(
|
await setCachedSessionId(
|
||||||
"claude-code",
|
"claude-code",
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
#!/usr/bin/env node
|
#!/usr/bin/env -S node --disable-warning=ExperimentalWarning
|
||||||
|
|
||||||
// eslint-disable-next-line -- dynamic import for version
|
// eslint-disable-next-line -- dynamic import for version
|
||||||
const pkg = await import("../package.json", { with: { type: "json" } });
|
const pkg = await import("../package.json", { with: { type: "json" } });
|
||||||
|
|||||||
@@ -1,5 +1,11 @@
|
|||||||
# @united-workforce/agent-hermes
|
# @united-workforce/agent-hermes
|
||||||
|
|
||||||
|
## 0.1.5 — 2026-06-07
|
||||||
|
|
||||||
|
- fix: decouple session resume from isFirstVisit guard
|
||||||
|
|
||||||
|
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
|
||||||
|
|
||||||
## 0.1.1
|
## 0.1.1
|
||||||
|
|
||||||
### Patch Changes
|
### Patch Changes
|
||||||
|
|||||||
@@ -15,7 +15,8 @@ describe("Issue #551 — bin entry & engines", () => {
|
|||||||
const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8"));
|
const pkg = JSON.parse(readFileSync(join(PKG_ROOT, "package.json"), "utf-8"));
|
||||||
const binPath = pkg.bin["uwf-hermes"];
|
const binPath = pkg.bin["uwf-hermes"];
|
||||||
const content = readFileSync(join(PKG_ROOT, binPath), "utf-8");
|
const content = readFileSync(join(PKG_ROOT, binPath), "utf-8");
|
||||||
expect(content.startsWith("#!/usr/bin/env node")).toBe(true);
|
expect(content.startsWith("#!/usr/bin/env")).toBe(true);
|
||||||
|
expect(content).toContain("node");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("README.md explains uwf-hermes is an adapter", () => {
|
test("README.md explains uwf-hermes is an adapter", () => {
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/agent-hermes",
|
"name": "@united-workforce/agent-hermes",
|
||||||
"version": "0.1.2",
|
"version": "0.1.5",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -21,7 +21,7 @@
|
|||||||
"test:ci": "vitest run __tests__/"
|
"test:ci": "vitest run __tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"@united-workforce/util-agent": "workspace:^"
|
"@united-workforce/util-agent": "workspace:^"
|
||||||
|
|||||||
@@ -12,7 +12,11 @@ const OWN_VERSION = (
|
|||||||
}
|
}
|
||||||
).version;
|
).version;
|
||||||
|
|
||||||
const HERMES_COMMAND = "hermes";
|
/** Resolve hermes binary: `UWF_HERMES_BIN` override → default `"hermes"` via PATH. */
|
||||||
|
function resolveHermesCommand(): string {
|
||||||
|
const override = process.env.UWF_HERMES_BIN;
|
||||||
|
return override !== undefined && override !== "" ? override : "hermes";
|
||||||
|
}
|
||||||
const PROTOCOL_VERSION = 1;
|
const PROTOCOL_VERSION = 1;
|
||||||
|
|
||||||
type JsonRpcResponse = {
|
type JsonRpcResponse = {
|
||||||
@@ -271,7 +275,8 @@ export class HermesAcpClient {
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
const child = spawn(HERMES_COMMAND, ["acp"], {
|
const hermesCommand = resolveHermesCommand();
|
||||||
|
const child = spawn(hermesCommand, ["acp"], {
|
||||||
env: process.env,
|
env: process.env,
|
||||||
shell: false,
|
shell: false,
|
||||||
stdio: ["pipe", "pipe", "pipe"],
|
stdio: ["pipe", "pipe", "pipe"],
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
#!/usr/bin/env node
|
#!/usr/bin/env -S node --disable-warning=ExperimentalWarning
|
||||||
|
|
||||||
// eslint-disable-next-line -- dynamic import for version
|
// eslint-disable-next-line -- dynamic import for version
|
||||||
const pkg = await import("../package.json", { with: { type: "json" } });
|
const pkg = await import("../package.json", { with: { type: "json" } });
|
||||||
|
|||||||
@@ -5,7 +5,9 @@ import {
|
|||||||
type AgentContext,
|
type AgentContext,
|
||||||
type AgentRunResult,
|
type AgentRunResult,
|
||||||
buildContinuationPrompt,
|
buildContinuationPrompt,
|
||||||
|
buildFrontmatterRetryPrompt,
|
||||||
buildRolePrompt,
|
buildRolePrompt,
|
||||||
|
buildThreadProgress,
|
||||||
createAgent,
|
createAgent,
|
||||||
} from "@united-workforce/util-agent";
|
} from "@united-workforce/util-agent";
|
||||||
import type { AcpUsage } from "./acp-client.js";
|
import type { AcpUsage } from "./acp-client.js";
|
||||||
@@ -60,6 +62,9 @@ export function buildHermesPrompt(ctx: AgentContext): string {
|
|||||||
parts.push(ctx.outputFormatInstruction, "");
|
parts.push(ctx.outputFormatInstruction, "");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Inject thread progress so the agent knows step count and role visit count
|
||||||
|
parts.push(buildThreadProgress(ctx.steps, ctx.role), "");
|
||||||
|
|
||||||
if (!ctx.isFirstVisit) {
|
if (!ctx.isFirstVisit) {
|
||||||
// Re-entry: show only steps since last visit, meta only
|
// Re-entry: show only steps since last visit, meta only
|
||||||
parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
|
parts.push(buildContinuationPrompt(ctx.steps, ctx.role, ctx.edgePrompt));
|
||||||
@@ -98,6 +103,8 @@ async function storePromptResult(store: Store, sessionId: string): Promise<{ det
|
|||||||
type PromptAttempt = {
|
type PromptAttempt = {
|
||||||
useContinuation: boolean;
|
useContinuation: boolean;
|
||||||
resumed: boolean;
|
resumed: boolean;
|
||||||
|
/** True when resuming after a frontmatter-only failure (isFirstVisit + cache hit). */
|
||||||
|
frontmatterRetry: boolean;
|
||||||
};
|
};
|
||||||
|
|
||||||
async function prepareSession(
|
async function prepareSession(
|
||||||
@@ -106,28 +113,36 @@ async function prepareSession(
|
|||||||
cwd: string,
|
cwd: string,
|
||||||
resumeDisabled: boolean,
|
resumeDisabled: boolean,
|
||||||
): Promise<PromptAttempt> {
|
): Promise<PromptAttempt> {
|
||||||
if (ctx.isFirstVisit || resumeDisabled) {
|
if (resumeDisabled) {
|
||||||
await client.connect(cwd);
|
await client.connect(cwd);
|
||||||
return { useContinuation: false, resumed: false };
|
return { useContinuation: false, resumed: false, frontmatterRetry: false };
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Check session cache regardless of isFirstVisit. A previous run may
|
||||||
|
// have completed and cached its session but failed frontmatter
|
||||||
|
// validation — the step never got written to CAS so isFirstVisit is
|
||||||
|
// still true, yet we should resume the existing session.
|
||||||
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role, ctx.storageRoot);
|
const cachedSessionId = await getCachedSessionId(ctx.threadId, ctx.role, ctx.storageRoot);
|
||||||
if (cachedSessionId === null) {
|
if (cachedSessionId === null) {
|
||||||
log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`);
|
log("6RWK3N8Q", `no cached session for ${ctx.threadId}:${ctx.role}, starting new session`);
|
||||||
await client.connect(cwd);
|
await client.connect(cwd);
|
||||||
return { useContinuation: false, resumed: false };
|
return { useContinuation: false, resumed: false, frontmatterRetry: false };
|
||||||
}
|
}
|
||||||
|
|
||||||
try {
|
try {
|
||||||
await client.resume(cachedSessionId, cwd);
|
await client.resume(cachedSessionId, cwd);
|
||||||
log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`);
|
log("9MHT4V2P", `resumed hermes session ${cachedSessionId} for ${ctx.threadId}:${ctx.role}`);
|
||||||
return { useContinuation: true, resumed: true };
|
return {
|
||||||
|
useContinuation: !ctx.isFirstVisit,
|
||||||
|
resumed: true,
|
||||||
|
frontmatterRetry: ctx.isFirstVisit,
|
||||||
|
};
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
const message = error instanceof Error ? error.message : String(error);
|
const message = error instanceof Error ? error.message : String(error);
|
||||||
log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`);
|
log("3XPN7K4W", `session resume failed, falling back to new session: ${message}`);
|
||||||
await client.close();
|
await client.close();
|
||||||
await client.connect(cwd);
|
await client.connect(cwd);
|
||||||
return { useContinuation: false, resumed: false };
|
return { useContinuation: false, resumed: false, frontmatterRetry: false };
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -150,9 +165,12 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
|
|||||||
ctx: AgentContext,
|
ctx: AgentContext,
|
||||||
useContinuation: boolean,
|
useContinuation: boolean,
|
||||||
beforeTurns: TurnsSnapshot,
|
beforeTurns: TurnsSnapshot,
|
||||||
|
frontmatterRetry: boolean,
|
||||||
): Promise<AgentRunResult> {
|
): Promise<AgentRunResult> {
|
||||||
const effectiveCtx = useContinuation ? ctx : { ...ctx, isFirstVisit: true };
|
// Frontmatter retry: session has full context, just re-output the format.
|
||||||
const fullPrompt = buildHermesPrompt(effectiveCtx);
|
const fullPrompt = frontmatterRetry
|
||||||
|
? buildFrontmatterRetryPrompt(ctx.outputFormatInstruction)
|
||||||
|
: buildHermesPrompt(useContinuation ? ctx : { ...ctx, isFirstVisit: true });
|
||||||
const startMs = Date.now();
|
const startMs = Date.now();
|
||||||
const { text, sessionId, usage: acpUsage } = await client.prompt(fullPrompt);
|
const { text, sessionId, usage: acpUsage } = await client.prompt(fullPrompt);
|
||||||
const durationSec = (Date.now() - startMs) / 1000;
|
const durationSec = (Date.now() - startMs) / 1000;
|
||||||
@@ -184,7 +202,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
|
|||||||
const beforeTurns = snapshotTurns(beforeSession);
|
const beforeTurns = snapshotTurns(beforeSession);
|
||||||
|
|
||||||
try {
|
try {
|
||||||
return await runPrompt(ctx, attempt.useContinuation, beforeTurns);
|
return await runPrompt(ctx, attempt.useContinuation, beforeTurns, attempt.frontmatterRetry);
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
if (!attempt.resumed) {
|
if (!attempt.resumed) {
|
||||||
throw error;
|
throw error;
|
||||||
@@ -195,7 +213,7 @@ export function createHermesAgent(resumeDisabled: boolean): () => Promise<void>
|
|||||||
await client.close();
|
await client.close();
|
||||||
await client.connect(cwd);
|
await client.connect(cwd);
|
||||||
// Fresh session after retry — reset snapshot to zero
|
// Fresh session after retry — reset snapshot to zero
|
||||||
return runPrompt(ctx, false, ZERO_TURNS);
|
return runPrompt(ctx, false, ZERO_TURNS, false);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/agent-mock",
|
"name": "@united-workforce/agent-mock",
|
||||||
"version": "0.1.1",
|
"version": "0.1.2",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -21,7 +21,7 @@
|
|||||||
"test:ci": "vitest run __tests__/"
|
"test:ci": "vitest run __tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"@united-workforce/util-agent": "workspace:^",
|
"@united-workforce/util-agent": "workspace:^",
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
#!/usr/bin/env node
|
#!/usr/bin/env -S node --disable-warning=ExperimentalWarning
|
||||||
|
|
||||||
// eslint-disable-next-line -- dynamic import for version
|
// eslint-disable-next-line -- dynamic import for version
|
||||||
const pkg = await import("../package.json", { with: { type: "json" } });
|
const pkg = await import("../package.json", { with: { type: "json" } });
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/cli",
|
"name": "@united-workforce/cli",
|
||||||
"version": "0.2.0",
|
"version": "0.3.0",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -11,8 +11,8 @@
|
|||||||
"uwf": "./dist/cli.js"
|
"uwf": "./dist/cli.js"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@ocas/fs": "^0.3.0",
|
"@ocas/fs": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"@united-workforce/util-agent": "workspace:^",
|
"@united-workforce/util-agent": "workspace:^",
|
||||||
|
|||||||
@@ -28,9 +28,13 @@ roles:
|
|||||||
$status: "ready"
|
$status: "ready"
|
||||||
frontmatter:
|
frontmatter:
|
||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
oneOf:
|
||||||
properties:
|
- properties:
|
||||||
$status: { type: string, enum: ["ready", "not-ready"] }
|
$status: { const: "ready" }
|
||||||
|
required: ["$status"]
|
||||||
|
- properties:
|
||||||
|
$status: { const: "not-ready" }
|
||||||
|
required: ["$status"]
|
||||||
roleB:
|
roleB:
|
||||||
description: Second role
|
description: Second role
|
||||||
goal: Do B
|
goal: Do B
|
||||||
@@ -42,7 +46,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["done"] }
|
$status: { const: "done" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
@@ -82,9 +86,13 @@ roles:
|
|||||||
$status: "pass"
|
$status: "pass"
|
||||||
frontmatter:
|
frontmatter:
|
||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
oneOf:
|
||||||
properties:
|
- properties:
|
||||||
$status: { type: string, enum: ["pass", "fail"] }
|
$status: { const: "pass" }
|
||||||
|
required: ["$status"]
|
||||||
|
- properties:
|
||||||
|
$status: { const: "fail" }
|
||||||
|
required: ["$status"]
|
||||||
roleB:
|
roleB:
|
||||||
description: Pass role
|
description: Pass role
|
||||||
goal: Do B
|
goal: Do B
|
||||||
@@ -96,7 +104,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["done"] }
|
$status: { const: "done" }
|
||||||
roleC:
|
roleC:
|
||||||
description: Fail role
|
description: Fail role
|
||||||
goal: Do C
|
goal: Do C
|
||||||
@@ -108,7 +116,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["done"] }
|
$status: { const: "done" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
@@ -155,7 +163,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["done"] }
|
$status: { const: "done" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
|
|||||||
@@ -71,12 +71,22 @@ describe("prompt commands", () => {
|
|||||||
test("prompt bootstrap returns framework-agnostic setup instructions", () => {
|
test("prompt bootstrap returns framework-agnostic setup instructions", () => {
|
||||||
const result = cmdPromptBootstrap();
|
const result = cmdPromptBootstrap();
|
||||||
expect(typeof result).toBe("string");
|
expect(typeof result).toBe("string");
|
||||||
|
// Skills installation
|
||||||
expect(result).toContain("uwf prompt usage");
|
expect(result).toContain("uwf prompt usage");
|
||||||
expect(result).toContain("uwf prompt workflow-authoring");
|
expect(result).toContain("uwf prompt workflow-authoring");
|
||||||
expect(result).toContain("uwf prompt adapter-developing");
|
expect(result).toContain("uwf prompt adapter-developing");
|
||||||
expect(result).toContain("uwf-usage");
|
expect(result).toContain("uwf-usage");
|
||||||
expect(result).toContain("uwf-workflow-authoring");
|
expect(result).toContain("uwf-workflow-authoring");
|
||||||
expect(result).toContain("uwf-adapter-developing");
|
expect(result).toContain("uwf-adapter-developing");
|
||||||
|
// Fresh install scenario
|
||||||
|
expect(result).toContain("Fresh Install");
|
||||||
|
expect(result).toContain("uwf setup");
|
||||||
|
expect(result).toContain("--provider");
|
||||||
|
expect(result).toContain("--api-key");
|
||||||
|
expect(result).toContain("agent adapter");
|
||||||
|
// Upgrade scenario
|
||||||
|
expect(result).toContain("Upgrade");
|
||||||
|
expect(result).toContain("Migrate");
|
||||||
// Should NOT contain Hermes-specific paths
|
// Should NOT contain Hermes-specific paths
|
||||||
expect(result).not.toContain("~/.hermes/skills/");
|
expect(result).not.toContain("~/.hermes/skills/");
|
||||||
expect(result).not.toContain("> ~/.hermes/");
|
expect(result).not.toContain("> ~/.hermes/");
|
||||||
|
|||||||
@@ -21,11 +21,11 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
|
|||||||
"..",
|
"..",
|
||||||
"..",
|
"..",
|
||||||
"..",
|
"..",
|
||||||
".workflows",
|
"examples",
|
||||||
"solve-issue.yaml",
|
"solve-issue.yaml",
|
||||||
);
|
);
|
||||||
|
|
||||||
test("committer procedure should use curl API instead of tea pr create", async () => {
|
test("committer procedure should create PR via tea pr create", async () => {
|
||||||
const yamlContent = await readFile(workflowPath, "utf-8");
|
const yamlContent = await readFile(workflowPath, "utf-8");
|
||||||
const workflow = parse(yamlContent) as WorkflowPayload;
|
const workflow = parse(yamlContent) as WorkflowPayload;
|
||||||
|
|
||||||
@@ -33,25 +33,22 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
|
|||||||
const committerProcedure = workflow.roles.committer?.procedure;
|
const committerProcedure = workflow.roles.committer?.procedure;
|
||||||
expect(committerProcedure).toBeDefined();
|
expect(committerProcedure).toBeDefined();
|
||||||
|
|
||||||
// Verify the procedure uses curl API, not tea pr create
|
// Verify the procedure uses tea pr create for PR creation
|
||||||
expect(committerProcedure).toContain("curl");
|
expect(committerProcedure).toContain("tea pr create");
|
||||||
expect(committerProcedure).toContain("api/v1/repos");
|
expect(committerProcedure).toContain("git push");
|
||||||
expect(committerProcedure).toContain("/pulls");
|
expect(committerProcedure).toContain("Fixes #N");
|
||||||
|
|
||||||
// Verify it explicitly warns against tea pr create
|
|
||||||
expect(committerProcedure).toMatch(/do NOT use.*tea pr create/i);
|
|
||||||
});
|
});
|
||||||
|
|
||||||
test("committer procedure should reference repoRemote from task prompt", async () => {
|
test("committer procedure should extract owner/repo from git remote", async () => {
|
||||||
const yamlContent = await readFile(workflowPath, "utf-8");
|
const yamlContent = await readFile(workflowPath, "utf-8");
|
||||||
const workflow = parse(yamlContent) as WorkflowPayload;
|
const workflow = parse(yamlContent) as WorkflowPayload;
|
||||||
|
|
||||||
const committerProcedure = workflow.roles.committer?.procedure;
|
const committerProcedure = workflow.roles.committer?.procedure;
|
||||||
expect(committerProcedure).toBeDefined();
|
expect(committerProcedure).toBeDefined();
|
||||||
|
|
||||||
// Verify the procedure mentions repoRemote is provided in task prompt
|
// Verify the procedure extracts owner/repo from remote
|
||||||
expect(committerProcedure).toMatch(/repo remote.*provided.*task prompt/i);
|
expect(committerProcedure).toContain("git remote get-url origin");
|
||||||
expect(committerProcedure).toMatch(/owner\/repo/i);
|
expect(committerProcedure).toContain("hook_failed");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("committer procedure should include error handling for curl failures", async () => {
|
test("committer procedure should include error handling for curl failures", async () => {
|
||||||
@@ -100,45 +97,42 @@ describe("solve-issue workflow: Gitea API PR creation", () => {
|
|||||||
expect(committedVariant.required).toContain("$status");
|
expect(committedVariant.required).toContain("$status");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("developer procedure should include mandatory verification step", async () => {
|
test("developer procedure should include worktree setup", async () => {
|
||||||
const yamlContent = await readFile(workflowPath, "utf-8");
|
const yamlContent = await readFile(workflowPath, "utf-8");
|
||||||
const workflow = parse(yamlContent) as WorkflowPayload;
|
const workflow = parse(yamlContent) as WorkflowPayload;
|
||||||
|
|
||||||
const developerProcedure = workflow.roles.developer?.procedure;
|
const developerProcedure = workflow.roles.developer?.procedure;
|
||||||
expect(developerProcedure).toBeDefined();
|
expect(developerProcedure).toBeDefined();
|
||||||
|
|
||||||
// Verify the procedure includes mandatory verification step
|
// Verify the procedure includes worktree setup
|
||||||
expect(developerProcedure).toContain("MANDATORY VERIFICATION");
|
expect(developerProcedure).toContain("IMPORTANT");
|
||||||
expect(developerProcedure).toContain("git branch --show-current");
|
expect(developerProcedure).toContain("git worktree add");
|
||||||
expect(developerProcedure).toContain("git status");
|
expect(developerProcedure).toContain("pnpm install");
|
||||||
expect(developerProcedure).toMatch(/ls -la|verify.*exist/i);
|
|
||||||
});
|
});
|
||||||
|
|
||||||
test("reviewer procedure should enforce worktree path verification", async () => {
|
test("reviewer procedure should verify branch and run checks", async () => {
|
||||||
const yamlContent = await readFile(workflowPath, "utf-8");
|
const yamlContent = await readFile(workflowPath, "utf-8");
|
||||||
const workflow = parse(yamlContent) as WorkflowPayload;
|
const workflow = parse(yamlContent) as WorkflowPayload;
|
||||||
|
|
||||||
const reviewerProcedure = workflow.roles.reviewer?.procedure;
|
const reviewerProcedure = workflow.roles.reviewer?.procedure;
|
||||||
expect(reviewerProcedure).toBeDefined();
|
expect(reviewerProcedure).toBeDefined();
|
||||||
|
|
||||||
// Verify the procedure includes critical enforcement
|
// Verify the procedure includes branch verification and build checks
|
||||||
expect(reviewerProcedure).toContain("CRITICAL");
|
expect(reviewerProcedure).toContain("git branch --show-current");
|
||||||
expect(reviewerProcedure).toMatch(/cd.*pwd/);
|
expect(reviewerProcedure).toContain("pnpm run build");
|
||||||
expect(reviewerProcedure).toContain(
|
expect(reviewerProcedure).toContain("pnpm run check");
|
||||||
"Do NOT report results without running the actual commands",
|
|
||||||
);
|
|
||||||
});
|
});
|
||||||
|
|
||||||
test("developer procedure should include test debugging escalation", async () => {
|
test("developer procedure should include changeset and failure handling", async () => {
|
||||||
const yamlContent = await readFile(workflowPath, "utf-8");
|
const yamlContent = await readFile(workflowPath, "utf-8");
|
||||||
const workflow = parse(yamlContent) as WorkflowPayload;
|
const workflow = parse(yamlContent) as WorkflowPayload;
|
||||||
|
|
||||||
const developerProcedure = workflow.roles.developer?.procedure;
|
const developerProcedure = workflow.roles.developer?.procedure;
|
||||||
expect(developerProcedure).toBeDefined();
|
expect(developerProcedure).toBeDefined();
|
||||||
|
|
||||||
// Verify the procedure includes test failure guidance
|
// Verify the procedure includes changeset requirement and failure path
|
||||||
expect(developerProcedure).toMatch(/tests fail.*first run/i);
|
expect(developerProcedure).toContain(".changeset/");
|
||||||
expect(developerProcedure).toMatch(/3 test cycles|after 3 attempts/i);
|
|
||||||
expect(developerProcedure).toContain("$status=failed");
|
expect(developerProcedure).toContain("$status=failed");
|
||||||
|
expect(developerProcedure).toContain("pnpm test");
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|||||||
@@ -54,7 +54,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["ready"] }
|
$status: { const: "ready" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
@@ -114,7 +114,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["ready"] }
|
$status: { const: "ready" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
@@ -161,7 +161,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["ready"] }
|
$status: { const: "ready" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
|
|||||||
@@ -0,0 +1,549 @@
|
|||||||
|
import { execFileSync } from "node:child_process";
|
||||||
|
import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
|
||||||
|
import { tmpdir } from "node:os";
|
||||||
|
import { dirname, join } from "node:path";
|
||||||
|
import { fileURLToPath } from "node:url";
|
||||||
|
import { putSchema } from "@ocas/core";
|
||||||
|
import { openStore } from "@ocas/fs";
|
||||||
|
import type {
|
||||||
|
CasRef,
|
||||||
|
StepNodePayload,
|
||||||
|
ThreadId,
|
||||||
|
ThreadIndexEntry,
|
||||||
|
} from "@united-workforce/protocol";
|
||||||
|
import { afterEach, beforeEach, describe, expect, test } from "vitest";
|
||||||
|
import { registerUwfSchemas } from "../schemas.js";
|
||||||
|
import { seedThreads } from "./thread-test-helpers.js";
|
||||||
|
|
||||||
|
const OUTPUT_SCHEMA = {
|
||||||
|
type: "object" as const,
|
||||||
|
properties: {
|
||||||
|
$status: { type: "string" as const },
|
||||||
|
note: { type: "string" as const },
|
||||||
|
},
|
||||||
|
required: ["$status"],
|
||||||
|
additionalProperties: false,
|
||||||
|
};
|
||||||
|
|
||||||
|
const THREAD_ID = "01POKESTEPTEST00000000" as ThreadId;
|
||||||
|
|
||||||
|
let tmpDir: string;
|
||||||
|
|
||||||
|
beforeEach(async () => {
|
||||||
|
tmpDir = await mkdtemp(join(tmpdir(), "cli-uwf-poke-test-"));
|
||||||
|
});
|
||||||
|
|
||||||
|
afterEach(async () => {
|
||||||
|
await rm(tmpDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
type SetupResult = {
|
||||||
|
casDir: string;
|
||||||
|
oldStepHash: CasRef;
|
||||||
|
oldStepPrev: CasRef | null;
|
||||||
|
oldStepCompletedAtMs: number;
|
||||||
|
startHash: CasRef;
|
||||||
|
workflowHash: CasRef;
|
||||||
|
mockAgentPath: string;
|
||||||
|
failingAgentPath: string;
|
||||||
|
promptCapturePath: string;
|
||||||
|
envCapturePath: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
type SetupOpts = {
|
||||||
|
threadStatus: ThreadIndexEntry["status"];
|
||||||
|
multipleSteps: boolean;
|
||||||
|
newCompletedAtMs: number;
|
||||||
|
newStatus: string;
|
||||||
|
// The agent name to record in the head StepNode.agent field. Defaults to mockAgentPath.
|
||||||
|
stepAgentNameOverride: string | null;
|
||||||
|
// Whether to seed an actual head StepNode (false → only StartNode is the head).
|
||||||
|
withHeadStep: boolean;
|
||||||
|
};
|
||||||
|
|
||||||
|
async function setupThread(opts: Partial<SetupOpts> = {}): Promise<SetupResult> {
|
||||||
|
const cfg: SetupOpts = {
|
||||||
|
threadStatus: opts.threadStatus ?? "idle",
|
||||||
|
multipleSteps: opts.multipleSteps ?? false,
|
||||||
|
newCompletedAtMs: opts.newCompletedAtMs ?? 1716600005000,
|
||||||
|
newStatus: opts.newStatus ?? "ok",
|
||||||
|
stepAgentNameOverride: opts.stepAgentNameOverride ?? null,
|
||||||
|
withHeadStep: opts.withHeadStep ?? true,
|
||||||
|
};
|
||||||
|
|
||||||
|
const casDir = join(tmpDir, "cas");
|
||||||
|
await mkdir(casDir, { recursive: true });
|
||||||
|
|
||||||
|
const store = await openStore(casDir);
|
||||||
|
const schemas = await registerUwfSchemas(store);
|
||||||
|
const outputSchemaHash = await putSchema(store, OUTPUT_SCHEMA);
|
||||||
|
|
||||||
|
const workflowHash = await store.cas.put(schemas.workflow, {
|
||||||
|
name: "test-poke",
|
||||||
|
description: "poke command integration test",
|
||||||
|
roles: {
|
||||||
|
worker: {
|
||||||
|
description: "Worker role",
|
||||||
|
goal: "Work",
|
||||||
|
capabilities: [],
|
||||||
|
procedure: "work",
|
||||||
|
output: "result",
|
||||||
|
frontmatter: outputSchemaHash,
|
||||||
|
},
|
||||||
|
reviewer: {
|
||||||
|
description: "Reviewer role",
|
||||||
|
goal: "Review",
|
||||||
|
capabilities: [],
|
||||||
|
procedure: "review",
|
||||||
|
output: "result",
|
||||||
|
frontmatter: outputSchemaHash,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
graph: {
|
||||||
|
$START: {
|
||||||
|
new: { role: "worker", prompt: "Start work", location: null },
|
||||||
|
resume: { role: "worker", prompt: "Resume the work", location: null },
|
||||||
|
},
|
||||||
|
worker: {
|
||||||
|
ok: { role: "reviewer", prompt: "Review the work", location: null },
|
||||||
|
needs_input: {
|
||||||
|
role: "$SUSPEND",
|
||||||
|
prompt: "Please clarify",
|
||||||
|
location: null,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
reviewer: { done: { role: "$END", prompt: "Done", location: null } },
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
|
const startHash = await store.cas.put(schemas.startNode, {
|
||||||
|
workflow: workflowHash,
|
||||||
|
prompt: "Test poke task",
|
||||||
|
cwd: tmpDir,
|
||||||
|
});
|
||||||
|
|
||||||
|
process.env.OCAS_HOME = casDir;
|
||||||
|
|
||||||
|
// Paths for mock agent and capture files (set early so we can use mockAgentPath as the recorded agent name)
|
||||||
|
const promptCapturePath = join(tmpDir, "captured-prompt.txt");
|
||||||
|
const envCapturePath = join(tmpDir, "captured-env.txt");
|
||||||
|
const mockAgentPath = join(tmpDir, "mock-agent.sh");
|
||||||
|
const failingAgentPath = join(tmpDir, "failing-agent.sh");
|
||||||
|
|
||||||
|
// Build head StepNode chain
|
||||||
|
let oldStepPrev: CasRef | null = null;
|
||||||
|
if (cfg.multipleSteps) {
|
||||||
|
// First step: prev=null
|
||||||
|
const firstOutputHash = await store.cas.put(outputSchemaHash, { $status: "ok" });
|
||||||
|
const firstDetailHash = await store.cas.put(schemas.text, "first detail");
|
||||||
|
const firstStepHash = await store.cas.put(schemas.stepNode, {
|
||||||
|
start: startHash,
|
||||||
|
prev: null,
|
||||||
|
role: "worker",
|
||||||
|
output: firstOutputHash,
|
||||||
|
detail: firstDetailHash,
|
||||||
|
agent: cfg.stepAgentNameOverride ?? mockAgentPath,
|
||||||
|
edgePrompt: "Start work",
|
||||||
|
startedAtMs: 1716600000000,
|
||||||
|
completedAtMs: 1716600001000,
|
||||||
|
cwd: tmpDir,
|
||||||
|
assembledPrompt: null,
|
||||||
|
usage: null,
|
||||||
|
});
|
||||||
|
oldStepPrev = firstStepHash;
|
||||||
|
}
|
||||||
|
|
||||||
|
let oldStepHash: CasRef = startHash;
|
||||||
|
const oldStepCompletedAtMs = 1716600002000;
|
||||||
|
if (cfg.withHeadStep) {
|
||||||
|
const outputHash = await store.cas.put(outputSchemaHash, { $status: "ok" });
|
||||||
|
const detailHash = await store.cas.put(schemas.text, "head step detail");
|
||||||
|
oldStepHash = await store.cas.put(schemas.stepNode, {
|
||||||
|
start: startHash,
|
||||||
|
prev: oldStepPrev,
|
||||||
|
role: "worker",
|
||||||
|
output: outputHash,
|
||||||
|
detail: detailHash,
|
||||||
|
agent: cfg.stepAgentNameOverride ?? mockAgentPath,
|
||||||
|
edgePrompt: "Start work",
|
||||||
|
startedAtMs: 1716600001500,
|
||||||
|
completedAtMs: oldStepCompletedAtMs,
|
||||||
|
cwd: tmpDir,
|
||||||
|
assembledPrompt: null,
|
||||||
|
usage: null,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// Seed thread index entry. For "running" we let the test create the marker separately.
|
||||||
|
await seedThreads(tmpDir, {
|
||||||
|
[THREAD_ID]: {
|
||||||
|
head: oldStepHash,
|
||||||
|
status: cfg.threadStatus,
|
||||||
|
suspendedRole: cfg.threadStatus === "suspended" ? "worker" : null,
|
||||||
|
suspendMessage: cfg.threadStatus === "suspended" ? "Please clarify" : null,
|
||||||
|
completedAt:
|
||||||
|
cfg.threadStatus === "completed" || cfg.threadStatus === "cancelled"
|
||||||
|
? oldStepCompletedAtMs
|
||||||
|
: null,
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
|
// Mock agent always emits a stepNode keyed off the current thread head (which we
|
||||||
|
// observe through OCAS_HOME). The script writes prompt/env captures and then prints
|
||||||
|
// an adapter JSON that references a pre-built stepHash.
|
||||||
|
// We pre-build the agent's stepHash with prev=oldStepHash (normal append behaviour).
|
||||||
|
const newOutputHash = await store.cas.put(outputSchemaHash, {
|
||||||
|
$status: cfg.newStatus,
|
||||||
|
note: "poked output",
|
||||||
|
});
|
||||||
|
const newDetailHash = await store.cas.put(schemas.text, "poked detail");
|
||||||
|
const agentStepHash = await store.cas.put(schemas.stepNode, {
|
||||||
|
start: startHash,
|
||||||
|
prev: cfg.withHeadStep ? oldStepHash : null,
|
||||||
|
role: "worker",
|
||||||
|
output: newOutputHash,
|
||||||
|
detail: newDetailHash,
|
||||||
|
agent: "mock-agent-output",
|
||||||
|
edgePrompt: "poke prompt placeholder",
|
||||||
|
startedAtMs: cfg.newCompletedAtMs - 100,
|
||||||
|
completedAtMs: cfg.newCompletedAtMs,
|
||||||
|
cwd: tmpDir,
|
||||||
|
assembledPrompt: null,
|
||||||
|
usage: null,
|
||||||
|
});
|
||||||
|
|
||||||
|
const adapterJson = JSON.stringify({
|
||||||
|
stepHash: agentStepHash,
|
||||||
|
detailHash: newDetailHash,
|
||||||
|
role: "worker",
|
||||||
|
frontmatter: { $status: cfg.newStatus, note: "poked output" },
|
||||||
|
body: "",
|
||||||
|
startedAtMs: cfg.newCompletedAtMs - 100,
|
||||||
|
completedAtMs: cfg.newCompletedAtMs,
|
||||||
|
usage: null,
|
||||||
|
});
|
||||||
|
|
||||||
|
await writeFile(
|
||||||
|
mockAgentPath,
|
||||||
|
`#!/bin/sh
|
||||||
|
prompt=""
|
||||||
|
while [ $# -gt 0 ]; do
|
||||||
|
if [ "$1" = "--prompt" ]; then
|
||||||
|
prompt="$2"
|
||||||
|
shift 2
|
||||||
|
else
|
||||||
|
shift
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
printf '%s' "$prompt" > '${promptCapturePath}'
|
||||||
|
printf 'OCAS_HOME=%s\\n' "$OCAS_HOME" > '${envCapturePath}'
|
||||||
|
echo '${adapterJson}'
|
||||||
|
`,
|
||||||
|
{ mode: 0o755 },
|
||||||
|
);
|
||||||
|
|
||||||
|
await writeFile(
|
||||||
|
failingAgentPath,
|
||||||
|
`#!/bin/sh
|
||||||
|
echo "boom" >&2
|
||||||
|
exit 7
|
||||||
|
`,
|
||||||
|
{ mode: 0o755 },
|
||||||
|
);
|
||||||
|
|
||||||
|
const configPath = join(tmpDir, "config.yaml");
|
||||||
|
await writeFile(
|
||||||
|
configPath,
|
||||||
|
`defaultAgent: uwf-hermes\ndefaultModel: test-model\nagentOverrides: null\nagents: {}\nproviders: {}\nmodels: {}\n`,
|
||||||
|
);
|
||||||
|
|
||||||
|
return {
|
||||||
|
casDir,
|
||||||
|
oldStepHash,
|
||||||
|
oldStepPrev,
|
||||||
|
oldStepCompletedAtMs,
|
||||||
|
startHash,
|
||||||
|
workflowHash,
|
||||||
|
mockAgentPath,
|
||||||
|
failingAgentPath,
|
||||||
|
promptCapturePath,
|
||||||
|
envCapturePath,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
function runUwf(
|
||||||
|
args: string[],
|
||||||
|
casDir: string,
|
||||||
|
): { stdout: string; stderr: string; status: number } {
|
||||||
|
const cliPath = join(dirname(fileURLToPath(import.meta.url)), "..", "..", "dist", "cli.js");
|
||||||
|
try {
|
||||||
|
const stdout = execFileSync(process.execPath, [cliPath, ...args], {
|
||||||
|
encoding: "utf8",
|
||||||
|
stdio: ["ignore", "pipe", "pipe"],
|
||||||
|
env: {
|
||||||
|
...process.env,
|
||||||
|
UWF_HOME: tmpDir,
|
||||||
|
OCAS_HOME: casDir,
|
||||||
|
},
|
||||||
|
cwd: tmpDir,
|
||||||
|
timeout: 30000,
|
||||||
|
});
|
||||||
|
return { stdout, stderr: "", status: 0 };
|
||||||
|
} catch (error) {
|
||||||
|
const err = error as NodeJS.ErrnoException & {
|
||||||
|
stdout?: string | Buffer;
|
||||||
|
stderr?: string | Buffer;
|
||||||
|
status?: number;
|
||||||
|
};
|
||||||
|
return {
|
||||||
|
stdout: typeof err.stdout === "string" ? err.stdout : (err.stdout?.toString("utf8") ?? ""),
|
||||||
|
stderr: typeof err.stderr === "string" ? err.stderr : (err.stderr?.toString("utf8") ?? ""),
|
||||||
|
status: err.status ?? 1,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Group 1: CLI argument validation ───────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - CLI argument validation", () => {
|
||||||
|
test("1.1 missing -p flag exits non-zero", async () => {
|
||||||
|
const { casDir } = await setupThread();
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toMatch(/required|missing|prompt/);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("1.2 -p without --agent succeeds", async () => {
|
||||||
|
const { casDir } = await setupThread();
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "do it again"], casDir);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("1.3 -p with --agent succeeds", async () => {
|
||||||
|
const { casDir, mockAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "do it again", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Group 2: Guard errors ──────────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - guard errors", () => {
|
||||||
|
test("2.1 thread not found", async () => {
|
||||||
|
const { casDir } = await setupThread();
|
||||||
|
const result = runUwf(["thread", "poke", "01NOSUCHTHREAD0000000A", "-p", "prompt"], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toMatch(/not found|not active/);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("2.2 thread running rejects poke", async () => {
|
||||||
|
const { casDir, workflowHash } = await setupThread();
|
||||||
|
// Create background marker to simulate running
|
||||||
|
const { createMarker } = await import("../background/index.js");
|
||||||
|
await createMarker(tmpDir, {
|
||||||
|
thread: THREAD_ID,
|
||||||
|
workflow: workflowHash,
|
||||||
|
pid: process.pid,
|
||||||
|
startedAt: Date.now(),
|
||||||
|
});
|
||||||
|
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toContain("already executing");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("2.3 completed thread rejects poke", async () => {
|
||||||
|
const { casDir } = await setupThread({ threadStatus: "completed" });
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toMatch(/cannot be poked|completed/);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("2.4 cancelled thread rejects poke", async () => {
|
||||||
|
const { casDir } = await setupThread({ threadStatus: "cancelled" });
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toMatch(/cannot be poked|cancelled/);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("2.5 thread head is StartNode (no StepNode) rejects poke", async () => {
|
||||||
|
const { casDir } = await setupThread({ withHeadStep: false });
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "prompt"], casDir);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
expect(result.stderr.toLowerCase()).toMatch(/no step|cannot be poked/);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Group 3: Success happy path ────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - success", () => {
|
||||||
|
test("3.1, 3.4 idle thread → new head differs from old, thread index updated", async () => {
|
||||||
|
const { casDir, oldStepHash, mockAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
expect(cliOutput.head).not.toBe(oldStepHash);
|
||||||
|
|
||||||
|
const { createUwfStore, getThread } = await import("../store.js");
|
||||||
|
const uwf = await createUwfStore(tmpDir);
|
||||||
|
const entry = getThread(uwf.varStore, THREAD_ID);
|
||||||
|
expect(entry?.head).toBe(cliOutput.head);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3.2 new step's prev equals old head's prev (replace, not append)", async () => {
|
||||||
|
const { casDir, oldStepPrev, mockAgentPath } = await setupThread({ multipleSteps: true });
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
|
||||||
|
const { createUwfStore } = await import("../store.js");
|
||||||
|
const uwf = await createUwfStore(tmpDir);
|
||||||
|
const node = uwf.store.cas.get(cliOutput.head as CasRef);
|
||||||
|
expect(node).not.toBeNull();
|
||||||
|
expect(node?.type).toBe(uwf.schemas.stepNode);
|
||||||
|
const payload = node?.payload as StepNodePayload;
|
||||||
|
expect(payload.prev).toBe(oldStepPrev);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3.2b new step's prev is null when old head was the first step", async () => {
|
||||||
|
// multipleSteps:false means oldHead.prev = null
|
||||||
|
const { casDir, mockAgentPath } = await setupThread({ multipleSteps: false });
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
|
||||||
|
const { createUwfStore } = await import("../store.js");
|
||||||
|
const uwf = await createUwfStore(tmpDir);
|
||||||
|
const node = uwf.store.cas.get(cliOutput.head as CasRef);
|
||||||
|
const payload = node?.payload as StepNodePayload;
|
||||||
|
expect(payload.prev).toBeNull();
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3.3 new step's completedAtMs is later than old", async () => {
|
||||||
|
const { casDir, oldStepCompletedAtMs, mockAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
|
||||||
|
const { createUwfStore } = await import("../store.js");
|
||||||
|
const uwf = await createUwfStore(tmpDir);
|
||||||
|
const node = uwf.store.cas.get(cliOutput.head as CasRef);
|
||||||
|
const payload = node?.payload as StepNodePayload;
|
||||||
|
expect(payload.completedAtMs).toBeGreaterThan(oldStepCompletedAtMs);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3.5 status remains idle after poke (no completion/suspend)", async () => {
|
||||||
|
const { casDir, mockAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
expect(cliOutput.status).toBe("idle");
|
||||||
|
expect(cliOutput.done).toBe(false);
|
||||||
|
expect(cliOutput.suspendedRole).toBeNull();
|
||||||
|
expect(cliOutput.suspendMessage).toBeNull();
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3.6 currentRole unchanged after poke (no moderator re-route)", async () => {
|
||||||
|
// Before poke: idle thread with worker step having $status=ok → moderator would route to reviewer.
|
||||||
|
// After poke (mock returns same $status=ok), moderator routing remains the same.
|
||||||
|
const { casDir, mockAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
expect(cliOutput.currentRole).toBe("reviewer");
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Group 4: Agent resolution ──────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - agent resolution", () => {
|
||||||
|
test("4.1 without --agent, agent command read from head step's agent field", async () => {
|
||||||
|
// Head step's agent field points at mockAgentPath (default in setupThread)
|
||||||
|
const { casDir, promptCapturePath } = await setupThread();
|
||||||
|
const result = runUwf(["thread", "poke", THREAD_ID, "-p", "redo"], casDir);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const captured = await readFile(promptCapturePath, "utf8");
|
||||||
|
expect(captured).toBe("redo");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("4.2 with --agent, explicit override is used", async () => {
|
||||||
|
// Head step records "uwf-mock" (which is not a real binary). Override with mockAgentPath.
|
||||||
|
const { casDir, mockAgentPath } = await setupThread({ stepAgentNameOverride: "uwf-mock" });
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Group 5: Prompt passthrough ────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - prompt passthrough", () => {
|
||||||
|
test("5.1 -p value is passed to agent as --prompt", async () => {
|
||||||
|
const { casDir, mockAgentPath, promptCapturePath } = await setupThread();
|
||||||
|
const supplement = "Use the REST API instead.";
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", supplement, "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const captured = await readFile(promptCapturePath, "utf8");
|
||||||
|
expect(captured).toBe(supplement);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Group 6: Edge cases ────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("uwf thread poke - edge cases", () => {
|
||||||
|
test("6.1 poke succeeds on suspended thread", async () => {
|
||||||
|
const { casDir, oldStepHash, mockAgentPath } = await setupThread({
|
||||||
|
threadStatus: "suspended",
|
||||||
|
});
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", mockAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).toBe(0);
|
||||||
|
const cliOutput = JSON.parse(result.stdout.trim());
|
||||||
|
expect(cliOutput.head).not.toBe(oldStepHash);
|
||||||
|
expect(cliOutput.status).toBe("idle");
|
||||||
|
expect(cliOutput.suspendedRole).toBeNull();
|
||||||
|
expect(cliOutput.suspendMessage).toBeNull();
|
||||||
|
});
|
||||||
|
|
||||||
|
test("6.2 agent failure leaves thread head unchanged", async () => {
|
||||||
|
const { casDir, oldStepHash, failingAgentPath } = await setupThread();
|
||||||
|
const result = runUwf(
|
||||||
|
["thread", "poke", THREAD_ID, "-p", "redo", "--agent", failingAgentPath],
|
||||||
|
casDir,
|
||||||
|
);
|
||||||
|
expect(result.status).not.toBe(0);
|
||||||
|
|
||||||
|
const { createUwfStore, getThread } = await import("../store.js");
|
||||||
|
const uwf = await createUwfStore(tmpDir);
|
||||||
|
const entry = getThread(uwf.varStore, THREAD_ID);
|
||||||
|
expect(entry?.head).toBe(oldStepHash);
|
||||||
|
});
|
||||||
|
});
|
||||||
@@ -31,7 +31,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["ready"] }
|
$status: { const: "ready" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
|
|||||||
@@ -54,7 +54,7 @@ roles:
|
|||||||
type: object
|
type: object
|
||||||
required: ["$status"]
|
required: ["$status"]
|
||||||
properties:
|
properties:
|
||||||
$status: { type: string, enum: ["ready"] }
|
$status: { const: "ready" }
|
||||||
graph:
|
graph:
|
||||||
$START:
|
$START:
|
||||||
new:
|
new:
|
||||||
|
|||||||
@@ -17,7 +17,7 @@ function makeWorkflow(overrides?: Partial<WorkflowPayload>): WorkflowPayload {
|
|||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: {
|
properties: {
|
||||||
$status: { enum: ["done"] },
|
$status: { const: "done" },
|
||||||
plan: { type: "string" },
|
plan: { type: "string" },
|
||||||
},
|
},
|
||||||
required: ["$status", "plan"],
|
required: ["$status", "plan"],
|
||||||
@@ -85,7 +85,7 @@ describe("Suite 1: Role Reference Integrity", () => {
|
|||||||
output: "None",
|
output: "None",
|
||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: { $status: { enum: ["done"] } },
|
properties: { $status: { const: "done" } },
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
} as unknown as string,
|
} as unknown as string,
|
||||||
};
|
};
|
||||||
@@ -187,7 +187,7 @@ describe("Suite 2: Graph Structure", () => {
|
|||||||
output: "Isolated",
|
output: "Isolated",
|
||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: { $status: { enum: ["done"] } },
|
properties: { $status: { const: "done" } },
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
} as unknown as string,
|
} as unknown as string,
|
||||||
};
|
};
|
||||||
@@ -272,8 +272,8 @@ describe("Suite 3: Status-Edge Consistency", () => {
|
|||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
describe("Suite 3b: Enum-Based Multi-Exit", () => {
|
describe("Suite 3b: Enum-Based $status is Rejected", () => {
|
||||||
test("3b.1 enum multi-exit passes with matching graph keys", () => {
|
test("3b.1 enum multi-exit is rejected (must use oneOf + const)", () => {
|
||||||
const wf = makeWorkflow();
|
const wf = makeWorkflow();
|
||||||
wf.roles.reviewer = {
|
wf.roles.reviewer = {
|
||||||
...wf.roles.reviewer,
|
...wf.roles.reviewer,
|
||||||
@@ -291,52 +291,10 @@ describe("Suite 3b: Enum-Based Multi-Exit", () => {
|
|||||||
rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null },
|
rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null },
|
||||||
};
|
};
|
||||||
const errors = validateWorkflow(wf);
|
const errors = validateWorkflow(wf);
|
||||||
expect(errors).toEqual([]);
|
expect(errors.some((e) => e.includes("must define") && e.includes("const"))).toBe(true);
|
||||||
});
|
});
|
||||||
|
|
||||||
test("3b.2 enum multi-exit with extra graph key", () => {
|
test("3b.2 enum single-exit is rejected (must use const)", () => {
|
||||||
const wf = makeWorkflow();
|
|
||||||
wf.roles.reviewer = {
|
|
||||||
...wf.roles.reviewer,
|
|
||||||
frontmatter: {
|
|
||||||
type: "object",
|
|
||||||
properties: {
|
|
||||||
$status: { enum: ["approved", "rejected"] },
|
|
||||||
comments: { type: "string" },
|
|
||||||
},
|
|
||||||
required: ["$status", "comments"],
|
|
||||||
} as unknown as string,
|
|
||||||
};
|
|
||||||
wf.graph.reviewer = {
|
|
||||||
approved: { role: "$END", prompt: "Done", location: null },
|
|
||||||
rejected: { role: "writer", prompt: "Fix", location: null },
|
|
||||||
timeout: { role: "$END", prompt: "Timed out", location: null },
|
|
||||||
};
|
|
||||||
const errors = validateWorkflow(wf);
|
|
||||||
expect(errors.some((e) => e.includes("extra status keys: timeout"))).toBe(true);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("3b.3 enum multi-exit with missing graph key", () => {
|
|
||||||
const wf = makeWorkflow();
|
|
||||||
wf.roles.reviewer = {
|
|
||||||
...wf.roles.reviewer,
|
|
||||||
frontmatter: {
|
|
||||||
type: "object",
|
|
||||||
properties: {
|
|
||||||
$status: { enum: ["approved", "rejected"] },
|
|
||||||
comments: { type: "string" },
|
|
||||||
},
|
|
||||||
required: ["$status", "comments"],
|
|
||||||
} as unknown as string,
|
|
||||||
};
|
|
||||||
wf.graph.reviewer = {
|
|
||||||
approved: { role: "$END", prompt: "Done", location: null },
|
|
||||||
};
|
|
||||||
const errors = validateWorkflow(wf);
|
|
||||||
expect(errors.some((e) => e.includes("missing status keys: rejected"))).toBe(true);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("3b.4 enum with single explicit value passes", () => {
|
|
||||||
const wf = makeWorkflow();
|
const wf = makeWorkflow();
|
||||||
wf.roles.writer = {
|
wf.roles.writer = {
|
||||||
...wf.roles.writer,
|
...wf.roles.writer,
|
||||||
@@ -351,28 +309,71 @@ describe("Suite 3b: Enum-Based Multi-Exit", () => {
|
|||||||
};
|
};
|
||||||
wf.graph.writer = { ready: { role: "reviewer", prompt: "Review: {{{plan}}}", location: null } };
|
wf.graph.writer = { ready: { role: "reviewer", prompt: "Review: {{{plan}}}", location: null } };
|
||||||
const errors = validateWorkflow(wf);
|
const errors = validateWorkflow(wf);
|
||||||
expect(errors).toEqual([]);
|
expect(errors.some((e) => e.includes("must define") && e.includes("const"))).toBe(true);
|
||||||
});
|
});
|
||||||
|
});
|
||||||
|
|
||||||
test("3b.5 enum multi-exit mustache var not in frontmatter", () => {
|
describe("Suite 3c: Const-Based Flat Schema", () => {
|
||||||
|
test("3c.1 flat schema with const $status passes validation", () => {
|
||||||
const wf = makeWorkflow();
|
const wf = makeWorkflow();
|
||||||
wf.roles.reviewer = {
|
wf.roles.writer = {
|
||||||
...wf.roles.reviewer,
|
...wf.roles.writer,
|
||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: {
|
properties: {
|
||||||
$status: { enum: ["approved", "rejected"] },
|
$status: { const: "done" },
|
||||||
comments: { type: "string" },
|
plan: { type: "string" },
|
||||||
},
|
},
|
||||||
required: ["$status", "comments"],
|
required: ["$status", "plan"],
|
||||||
} as unknown as string,
|
} as unknown as string,
|
||||||
};
|
};
|
||||||
wf.graph.reviewer = {
|
const errors = validateWorkflow(wf);
|
||||||
approved: { role: "$END", prompt: "Done: {{{nonexistent}}}", location: null },
|
expect(errors).toEqual([]);
|
||||||
rejected: { role: "writer", prompt: "Fix: {{{comments}}}", location: null },
|
});
|
||||||
|
|
||||||
|
test("3c.2 flat schema with const $status detects extra graph key", () => {
|
||||||
|
const wf = makeWorkflow();
|
||||||
|
wf.roles.writer = {
|
||||||
|
...wf.roles.writer,
|
||||||
|
frontmatter: {
|
||||||
|
type: "object",
|
||||||
|
properties: {
|
||||||
|
$status: { const: "done" },
|
||||||
|
plan: { type: "string" },
|
||||||
|
},
|
||||||
|
required: ["$status", "plan"],
|
||||||
|
} as unknown as string,
|
||||||
|
};
|
||||||
|
wf.graph.writer = {
|
||||||
|
done: { role: "reviewer", prompt: "Review.", location: null },
|
||||||
|
extra: { role: "$END", prompt: "Nope.", location: null },
|
||||||
};
|
};
|
||||||
const errors = validateWorkflow(wf);
|
const errors = validateWorkflow(wf);
|
||||||
expect(errors.some((e) => e.includes("nonexistent") && e.includes("not found"))).toBe(true);
|
expect(errors.some((e) => e.includes("extra status keys") && e.includes("extra"))).toBe(true);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("3c.3 flat schema with const $status validates mustache vars", () => {
|
||||||
|
const wf = makeWorkflow();
|
||||||
|
wf.roles.writer = {
|
||||||
|
...wf.roles.writer,
|
||||||
|
frontmatter: {
|
||||||
|
type: "object",
|
||||||
|
properties: {
|
||||||
|
$status: { const: "done" },
|
||||||
|
plan: { type: "string" },
|
||||||
|
},
|
||||||
|
required: ["$status", "plan"],
|
||||||
|
} as unknown as string,
|
||||||
|
};
|
||||||
|
wf.graph.writer = {
|
||||||
|
done: { role: "reviewer", prompt: "Review: {{{nonexistent}}}", location: null },
|
||||||
|
};
|
||||||
|
const errors = validateWorkflow(wf);
|
||||||
|
expect(
|
||||||
|
errors.some(
|
||||||
|
(e) => e.includes('prompt variable "nonexistent"') && e.includes('role "writer"'),
|
||||||
|
),
|
||||||
|
).toBe(true);
|
||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
@@ -480,7 +481,7 @@ describe("Suite 6: Multiple Errors Collection", () => {
|
|||||||
output: "None",
|
output: "None",
|
||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: { $status: { enum: ["done"] } },
|
properties: { $status: { const: "done" } },
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
} as unknown as string,
|
} as unknown as string,
|
||||||
};
|
};
|
||||||
|
|||||||
@@ -31,7 +31,7 @@ function makeMinimalPayload(name: string, description: string): WorkflowPayload
|
|||||||
frontmatter: {
|
frontmatter: {
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: {
|
properties: {
|
||||||
$status: { type: "string", enum: ["done"] },
|
$status: { const: "done" },
|
||||||
},
|
},
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
} as unknown as CasRef,
|
} as unknown as CasRef,
|
||||||
|
|||||||
+31
-6
@@ -1,4 +1,4 @@
|
|||||||
#!/usr/bin/env node
|
#!/usr/bin/env -S node --disable-warning=ExperimentalWarning
|
||||||
|
|
||||||
import type { CasRef, ThreadId, ThreadStatus } from "@united-workforce/protocol";
|
import type { CasRef, ThreadId, ThreadStatus } from "@united-workforce/protocol";
|
||||||
import { Command } from "commander";
|
import { Command } from "commander";
|
||||||
@@ -11,12 +11,13 @@ import {
|
|||||||
cmdPromptUsage,
|
cmdPromptUsage,
|
||||||
cmdPromptWorkflowAuthoring,
|
cmdPromptWorkflowAuthoring,
|
||||||
} from "./commands/prompt.js";
|
} from "./commands/prompt.js";
|
||||||
import { cmdSetup, cmdSetupInteractive } from "./commands/setup.js";
|
import { cmdSetup, cmdSetupInteractive, resolvePresetBaseUrl } from "./commands/setup.js";
|
||||||
import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
|
import { cmdStepFork, cmdStepList, cmdStepRead, cmdStepShow } from "./commands/step.js";
|
||||||
import {
|
import {
|
||||||
cmdThreadCancel,
|
cmdThreadCancel,
|
||||||
cmdThreadExec,
|
cmdThreadExec,
|
||||||
cmdThreadList,
|
cmdThreadList,
|
||||||
|
cmdThreadPoke,
|
||||||
cmdThreadRead,
|
cmdThreadRead,
|
||||||
cmdThreadResume,
|
cmdThreadResume,
|
||||||
cmdThreadShow,
|
cmdThreadShow,
|
||||||
@@ -290,6 +291,26 @@ thread
|
|||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
|
thread
|
||||||
|
.command("poke")
|
||||||
|
.description("Re-run the head step's agent with a supplementary prompt (replaces head step)")
|
||||||
|
.argument("<thread-id>", "Thread ULID")
|
||||||
|
.requiredOption("-p, --prompt <text>", "Supplementary prompt for the agent")
|
||||||
|
.option("--agent <cmd>", "Override agent command (defaults to head step's agent)")
|
||||||
|
.action((threadId: string, opts: { prompt: string; agent: string | undefined }) => {
|
||||||
|
const storageRoot = resolveStorageRoot();
|
||||||
|
runAction(async () => {
|
||||||
|
const agentOverride = opts.agent ?? null;
|
||||||
|
const result = await cmdThreadPoke(
|
||||||
|
storageRoot,
|
||||||
|
threadId as ThreadId,
|
||||||
|
opts.prompt,
|
||||||
|
agentOverride,
|
||||||
|
);
|
||||||
|
writeOutput(result);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
thread
|
thread
|
||||||
.command("stop")
|
.command("stop")
|
||||||
.description("Stop background execution of a thread (keep thread active)")
|
.description("Stop background execution of a thread (keep thread active)")
|
||||||
@@ -542,7 +563,7 @@ prompt
|
|||||||
|
|
||||||
program
|
program
|
||||||
.command("setup")
|
.command("setup")
|
||||||
.description("Configure provider, model, and agent")
|
.description("Configure provider, model, and agent. Run without options for interactive wizard.")
|
||||||
.option("--provider <name>", "Provider name")
|
.option("--provider <name>", "Provider name")
|
||||||
.option("--base-url <url>", "OpenAI-compatible API base URL")
|
.option("--base-url <url>", "OpenAI-compatible API base URL")
|
||||||
.option("--api-key <key>", "API key")
|
.option("--api-key <key>", "API key")
|
||||||
@@ -558,10 +579,14 @@ program
|
|||||||
}) => {
|
}) => {
|
||||||
const storageRoot = resolveStorageRoot();
|
const storageRoot = resolveStorageRoot();
|
||||||
runAction(async () => {
|
runAction(async () => {
|
||||||
if (opts.provider && opts.baseUrl && opts.apiKey && opts.model) {
|
// Resolve preset base-url when provider is known but --base-url is omitted
|
||||||
|
const resolvedBaseUrl =
|
||||||
|
opts.baseUrl ??
|
||||||
|
(opts.provider !== undefined ? resolvePresetBaseUrl(opts.provider) : null);
|
||||||
|
if (opts.provider && resolvedBaseUrl && opts.apiKey && opts.model) {
|
||||||
const result = await cmdSetup({
|
const result = await cmdSetup({
|
||||||
provider: opts.provider,
|
provider: opts.provider,
|
||||||
baseUrl: opts.baseUrl,
|
baseUrl: resolvedBaseUrl,
|
||||||
apiKey: opts.apiKey,
|
apiKey: opts.apiKey,
|
||||||
model: opts.model,
|
model: opts.model,
|
||||||
agent: opts.agent ?? undefined,
|
agent: opts.agent ?? undefined,
|
||||||
@@ -572,7 +597,7 @@ program
|
|||||||
await cmdSetupInteractive(storageRoot);
|
await cmdSetupInteractive(storageRoot);
|
||||||
} else {
|
} else {
|
||||||
throw new Error(
|
throw new Error(
|
||||||
"Non-interactive setup requires all of: --provider, --base-url, --api-key, --model",
|
"Non-interactive setup requires: --provider, --api-key, --model (--base-url is optional for preset providers)",
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|||||||
@@ -1,10 +1,35 @@
|
|||||||
|
import { readFileSync } from "node:fs";
|
||||||
|
import { dirname, join } from "node:path";
|
||||||
|
import { fileURLToPath } from "node:url";
|
||||||
import {
|
import {
|
||||||
generateAdapterDevelopingReference,
|
generateAdapterDevelopingReference,
|
||||||
generateUsageReference,
|
generateUsageReference,
|
||||||
generateWorkflowAuthoringReference,
|
generateWorkflowAuthoringReference,
|
||||||
VERSION,
|
|
||||||
} from "@united-workforce/util";
|
} from "@united-workforce/util";
|
||||||
|
|
||||||
|
// CLI package version (for bootstrap prompt — uwf --version prints this)
|
||||||
|
// Walk up from __dirname to find the nearest package.json (works from both src/ and dist/)
|
||||||
|
function _findCliVersion(): string {
|
||||||
|
let dir = dirname(fileURLToPath(import.meta.url));
|
||||||
|
for (let i = 0; i < 5; i++) {
|
||||||
|
const candidate = join(dir, "package.json");
|
||||||
|
try {
|
||||||
|
const pkg = JSON.parse(readFileSync(candidate, "utf-8")) as {
|
||||||
|
name?: string;
|
||||||
|
version?: string;
|
||||||
|
};
|
||||||
|
if (pkg.name === "@united-workforce/cli") {
|
||||||
|
return pkg.version ?? "0.0.0";
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
// not found, keep walking
|
||||||
|
}
|
||||||
|
dir = dirname(dir);
|
||||||
|
}
|
||||||
|
return "0.0.0";
|
||||||
|
}
|
||||||
|
const CLI_VERSION = _findCliVersion();
|
||||||
|
|
||||||
export {
|
export {
|
||||||
generateAdapterDevelopingReference as cmdPromptAdapterDeveloping,
|
generateAdapterDevelopingReference as cmdPromptAdapterDeveloping,
|
||||||
generateUsageReference as cmdPromptUsage,
|
generateUsageReference as cmdPromptUsage,
|
||||||
@@ -22,33 +47,275 @@ export function cmdPromptList(): ReadonlyArray<string> {
|
|||||||
}
|
}
|
||||||
|
|
||||||
export function cmdPromptBootstrap(): string {
|
export function cmdPromptBootstrap(): string {
|
||||||
return `# uwf Skill Bootstrap
|
return `# uwf Bootstrap
|
||||||
|
|
||||||
Install or update the uwf (United Workforce) skills for your agent.
|
Set up or upgrade uwf (United Workforce) — from zero to running your first workflow.
|
||||||
|
|
||||||
## Steps
|
## Scenario A: Fresh Install
|
||||||
|
|
||||||
1. **Check if uwf CLI is installed:**
|
### Step 0 — Environment pre-flight check
|
||||||
|
|
||||||
|
Run ALL checks below BEFORE installing anything. If any check fails, fix it first — do NOT skip ahead.
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# 1. Node.js
|
||||||
|
node --version # need v20+
|
||||||
|
# FIX: install via nvm (https://github.com/nvm-sh/nvm) or your OS package manager
|
||||||
|
|
||||||
|
# 2. Package manager — pnpm (preferred) or npm
|
||||||
|
pnpm --version || npm --version
|
||||||
|
# FIX for pnpm: corepack enable pnpm (Node 20+)
|
||||||
|
# or: npm install -g pnpm
|
||||||
|
# If pnpm is not an option, npm works too — adjust install commands below
|
||||||
|
|
||||||
|
# 3. Global bin directory is in PATH
|
||||||
|
# For pnpm:
|
||||||
|
pnpm bin -g # prints the global bin directory
|
||||||
|
# For npm:
|
||||||
|
npm prefix -g # global prefix; bin is <prefix>/bin
|
||||||
|
# The directory printed above MUST be in your PATH.
|
||||||
|
# Test: echo \\$PATH | tr ':' '\\\\n'
|
||||||
|
# FIX: add the missing directory to your shell's startup file
|
||||||
|
# (e.g. ~/.bashrc, ~/.zshrc, ~/.profile, or fish config):
|
||||||
|
# export PATH="<global-bin-dir>:$PATH"
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**All checks must pass before continuing.** If you had to modify PATH, verify the change persists by opening a new shell or sourcing your shell config.
|
||||||
|
|
||||||
|
### Step 1 — Discover agents and install adapter
|
||||||
|
|
||||||
|
**First, detect which supported agents are already installed on the user's machine:**
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# Check for Hermes Agent
|
||||||
|
which hermes 2>/dev/null && hermes --version
|
||||||
|
|
||||||
|
# Check for Claude Code
|
||||||
|
which claude 2>/dev/null && claude --version # should show "X.Y.Z (Claude Code)"
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**Based on the results:**
|
||||||
|
|
||||||
|
- **Only hermes found** → install \`uwf-hermes\` adapter
|
||||||
|
- **Only claude found** → install \`uwf-claude-code\` adapter
|
||||||
|
- **Both found** → ask the user which agent they want uwf to use as default
|
||||||
|
- **Neither found** → the user must install at least one agent first:
|
||||||
|
- Hermes Agent: https://hermes-agent.nousresearch.com/docs
|
||||||
|
- Claude Code: \`npm install -g @anthropic-ai/claude-code\`
|
||||||
|
|
||||||
|
**Install the uwf CLI and the chosen adapter** using pnpm or npm:
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# CLI (required)
|
||||||
|
pnpm add -g @united-workforce/cli # or: npm install -g @united-workforce/cli
|
||||||
|
|
||||||
|
# Adapter — install the one matching the detected agent:
|
||||||
|
pnpm add -g @united-workforce/agent-hermes # or: npm i -g @united-workforce/agent-hermes
|
||||||
|
pnpm add -g @united-workforce/agent-claude-code # or: npm i -g @united-workforce/agent-claude-code
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**⚠ Adapter versions are independent from CLI versions.** Do NOT try to match adapter version to CLI version. Just install \`@latest\` (the default).
|
||||||
|
|
||||||
|
**After installing, verify that \`uwf\` and the adapter are available in PATH:**
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf --version # should print ${CLI_VERSION}
|
||||||
|
uwf-hermes --version # or: uwf-claude-code --version
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
If either command is not found, the global bin directory is not in the current shell's PATH. **You must fix this before continuing:**
|
||||||
|
|
||||||
|
1. Find where the binary was installed:
|
||||||
\`\`\`bash
|
\`\`\`bash
|
||||||
uwf --help
|
find ~/.local ~/.hermes /usr/local -name uwf -type f 2>/dev/null
|
||||||
|
npm prefix -g # global prefix — bin is <prefix>/bin
|
||||||
\`\`\`
|
\`\`\`
|
||||||
If not installed: \`npm install -g @united-workforce/cli\`
|
2. Add the directory to PATH permanently by appending to the user's shell config (e.g. \`~/.bashrc\`, \`~/.zshrc\`, \`~/.profile\`, or fish config):
|
||||||
Current version: ${VERSION}
|
\`\`\`bash
|
||||||
|
export PATH="<global-bin-dir>:$PATH"
|
||||||
|
\`\`\`
|
||||||
|
3. Source the updated config or open a new shell, then re-verify the commands work.
|
||||||
|
|
||||||
2. **Install the skills** by running each command below and creating a skill from its output:
|
**uwf-hermes** also requires the Hermes ACP plugin. Verify with \`hermes acp --help\`. If not available, install it:
|
||||||
- Run \`uwf prompt usage\` → create a skill named \`uwf-usage\`
|
\`\`\`bash
|
||||||
- Run \`uwf prompt workflow-authoring\` → create a skill named \`uwf-workflow-authoring\`
|
# Option A: install into hermes venv (recommended)
|
||||||
- Run \`uwf prompt adapter-developing\` → create a skill named \`uwf-adapter-developing\`
|
source ~/.hermes/hermes-agent/.venv/bin/activate && pip install hermes-agent[acp]
|
||||||
|
|
||||||
Each command outputs a complete SKILL.md with YAML frontmatter — use your agent framework's skill creation API to save them.
|
# Option B: pipx
|
||||||
|
pipx install 'hermes-agent[acp]'
|
||||||
|
|
||||||
3. **Verify** the skills are loadable by your agent framework.
|
# Option C: if installed from source
|
||||||
|
pip install -e '.[acp]'
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
## Updating
|
### Step 2 — Configure provider and model
|
||||||
|
|
||||||
When \`uwf\` is upgraded, re-run \`uwf prompt bootstrap\` and follow the steps again.
|
uwf needs an LLM provider to run agents. **Ask the user** for their provider, API key, and model, then run:
|
||||||
The skill content is bundled with the CLI — always use \`uwf prompt <name>\` to get
|
|
||||||
content matching your installed version.
|
\`\`\`bash
|
||||||
|
uwf setup --provider <name> --api-key <key> --model <model> --agent <adapter-command>
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**Note:** \`--agent\` takes the adapter **command name** (e.g. \`uwf-hermes\`), not the npm package name.
|
||||||
|
|
||||||
|
**Preset providers** — when using a preset name, \`--base-url\` is auto-filled and can be omitted:
|
||||||
|
|
||||||
|
| Provider | Name | Default base URL |
|
||||||
|
|----------|------|-----------------|
|
||||||
|
| OpenAI | \`openai\` | https://api.openai.com/v1 |
|
||||||
|
| xAI | \`xai\` | https://api.x.ai/v1 |
|
||||||
|
| OpenRouter | \`openrouter\` | https://openrouter.ai/api/v1 |
|
||||||
|
| Venice | \`venice\` | https://api.venice.ai/api/v1 |
|
||||||
|
| Dashscope | \`dashscope\` | https://dashscope.aliyuncs.com/compatible-mode/v1 |
|
||||||
|
| DeepSeek | \`deepseek\` | https://api.deepseek.com/v1 |
|
||||||
|
| SiliconFlow | \`siliconflow\` | https://api.siliconflow.cn/v1 |
|
||||||
|
| VolcEngine | \`volcengine\` | https://ark.cn-beijing.volces.com/api/v3 |
|
||||||
|
| Kimi (Moonshot) | \`kimi\` | https://api.moonshot.cn/v1 |
|
||||||
|
| GLM (Zhipu AI) | \`glm\` | https://open.bigmodel.cn/api/paas/v4 |
|
||||||
|
| StepFun | \`stepfun\` | https://api.stepfun.com/v1 |
|
||||||
|
| MiniMax | \`minimax\` | https://api.minimax.io/v1 |
|
||||||
|
| Ollama (local) | \`ollama\` | http://localhost:11434/v1 |
|
||||||
|
|
||||||
|
For **non-preset providers**, you must specify \`--base-url\` manually.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
\`\`\`bash
|
||||||
|
uwf setup --provider openrouter --api-key sk-or-... --model anthropic/claude-sonnet-4 --agent uwf-hermes
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
If the user doesn't know what to choose, suggest \`openrouter\` with \`anthropic/claude-sonnet-4\` as a sensible default.
|
||||||
|
|
||||||
|
Config is saved to \`~/.uwf/config.yaml\`. Verify with \`cat ~/.uwf/config.yaml\`.
|
||||||
|
|
||||||
|
### Step 3 — Install skills
|
||||||
|
|
||||||
|
Run each command and create a skill from its output:
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf prompt usage # → save as skill "uwf-usage"
|
||||||
|
uwf prompt workflow-authoring # → save as skill "uwf-workflow-authoring"
|
||||||
|
uwf prompt adapter-developing # → save as skill "uwf-adapter-developing"
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
Each command outputs a complete SKILL.md with YAML frontmatter. Use your agent framework's skill creation API to save them (e.g. \`skill_manage(action='create', name='uwf-usage', content=<output>)\`).
|
||||||
|
|
||||||
|
Verify skills are installed by listing them (e.g. \`skills_list()\`) and confirming all three appear.
|
||||||
|
|
||||||
|
**⚠ After saving all skills, start a new session** so the agent loads the updated skill content. Skills saved in the current session are not active until the next session.
|
||||||
|
|
||||||
|
### Step 4 — Verify end-to-end
|
||||||
|
|
||||||
|
Create a minimal workflow file to test your setup:
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
cat > /tmp/hello.yaml << 'YAML'
|
||||||
|
name: hello
|
||||||
|
description: Minimal smoke test
|
||||||
|
roles:
|
||||||
|
greeter:
|
||||||
|
description: "Greet the user"
|
||||||
|
goal: "Respond with a friendly greeting"
|
||||||
|
capabilities: []
|
||||||
|
procedure: "Write a short greeting based on the prompt."
|
||||||
|
output: "A greeting message."
|
||||||
|
frontmatter:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
$status: { const: done }
|
||||||
|
message: { type: string }
|
||||||
|
required: [$status, message]
|
||||||
|
graph:
|
||||||
|
$START:
|
||||||
|
new: { role: greeter, prompt: "Say hello to the user." }
|
||||||
|
resume: { role: greeter, prompt: "Greet the user again." }
|
||||||
|
greeter:
|
||||||
|
done: { role: "$END", prompt: "Done." }
|
||||||
|
YAML
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
Then run:
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf thread start /tmp/hello.yaml -p "Hello, world!"
|
||||||
|
uwf thread exec <thread-id>
|
||||||
|
uwf thread show <thread-id>
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
If the thread reaches \`$END\` with status \`completed\`, the setup is working.
|
||||||
|
|
||||||
|
## Scenario B: Upgrade from Previous Version
|
||||||
|
|
||||||
|
### Step 1 — Update packages
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# Using pnpm
|
||||||
|
pnpm add -g @united-workforce/cli@latest
|
||||||
|
|
||||||
|
# Using npm
|
||||||
|
npm install -g @united-workforce/cli@latest
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf --version # should print ${CLI_VERSION}
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
Also update your adapter(s):
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# pnpm
|
||||||
|
pnpm add -g @united-workforce/agent-hermes@latest
|
||||||
|
|
||||||
|
# npm
|
||||||
|
npm install -g @united-workforce/agent-hermes@latest
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
### Step 2 — Regenerate skills
|
||||||
|
|
||||||
|
Skill content is bundled with the CLI — always regenerate after upgrading:
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf prompt usage # → update skill "uwf-usage"
|
||||||
|
uwf prompt workflow-authoring # → update skill "uwf-workflow-authoring"
|
||||||
|
uwf prompt adapter-developing # → update skill "uwf-adapter-developing"
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
**⚠ After updating skills, start a new session** to load the new skill content.
|
||||||
|
|
||||||
|
### Step 3 — Migrate workflow YAML files (if needed)
|
||||||
|
|
||||||
|
Check the changelog for breaking changes. Known migrations:
|
||||||
|
|
||||||
|
- **v0.2.0**: \`$START._\` → \`$START.new\` + \`$START.resume\`. All workflow YAML files must be updated:
|
||||||
|
\`\`\`yaml
|
||||||
|
# Before (v0.1.x)
|
||||||
|
$START:
|
||||||
|
_: { role: planner, prompt: "..." }
|
||||||
|
|
||||||
|
# After (v0.2.0+)
|
||||||
|
$START:
|
||||||
|
new: { role: planner, prompt: "..." }
|
||||||
|
resume: { role: planner, prompt: "Review previous run and continue." }
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
Update all \`.workflow/\` and \`.workflows/\` YAML files in your projects. \`uwf workflow add\` will reject files with the old \`_\` syntax.
|
||||||
|
|
||||||
|
- **v0.2.1**: \`$status: { enum: [value] }\` → \`$status: { const: "value" }\`. The validator no longer accepts \`enum\` for \`$status\`. Update all workflow YAML files:
|
||||||
|
\`\`\`yaml
|
||||||
|
# Before (v0.2.0)
|
||||||
|
$status: { enum: [done] }
|
||||||
|
$status: { type: string, enum: ["ready", "failed"] }
|
||||||
|
|
||||||
|
# After (v0.2.1+)
|
||||||
|
$status: { const: "done" }
|
||||||
|
# For multi-exit, use oneOf with const (unchanged)
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
|
### Step 4 — Verify
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
uwf thread start <your-workflow> -p "upgrade test"
|
||||||
|
uwf thread exec <thread-id>
|
||||||
|
\`\`\`
|
||||||
|
|
||||||
## Available prompts
|
## Available prompts
|
||||||
|
|
||||||
@@ -57,6 +324,7 @@ uwf prompt list # list available prompt names
|
|||||||
uwf prompt usage # CLI usage guide
|
uwf prompt usage # CLI usage guide
|
||||||
uwf prompt workflow-authoring # workflow YAML design guide
|
uwf prompt workflow-authoring # workflow YAML design guide
|
||||||
uwf prompt adapter-developing # building agent adapters
|
uwf prompt adapter-developing # building agent adapters
|
||||||
|
uwf prompt bootstrap # this guide
|
||||||
\`\`\`
|
\`\`\`
|
||||||
`;
|
`;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,3 +1,4 @@
|
|||||||
|
import { execFileSync } from "node:child_process";
|
||||||
import { existsSync, mkdirSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs";
|
import { existsSync, mkdirSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs";
|
||||||
import { join } from "node:path";
|
import { join } from "node:path";
|
||||||
import { stdin as input, stdout as output } from "node:process";
|
import { stdin as input, stdout as output } from "node:process";
|
||||||
@@ -72,6 +73,12 @@ const PRESET_PROVIDERS = [
|
|||||||
{ name: "ollama", label: "Ollama (local)", baseUrl: "http://localhost:11434/v1" },
|
{ name: "ollama", label: "Ollama (local)", baseUrl: "http://localhost:11434/v1" },
|
||||||
] as const;
|
] as const;
|
||||||
|
|
||||||
|
/** Look up the base URL for a preset provider name. Returns null if not a preset. */
|
||||||
|
export function resolvePresetBaseUrl(providerName: string): string | null {
|
||||||
|
const preset = PRESET_PROVIDERS.find((p) => p.name === providerName);
|
||||||
|
return preset !== undefined ? preset.baseUrl : null;
|
||||||
|
}
|
||||||
|
|
||||||
type SetupArgs = {
|
type SetupArgs = {
|
||||||
provider: string;
|
provider: string;
|
||||||
baseUrl: string;
|
baseUrl: string;
|
||||||
@@ -175,7 +182,6 @@ export async function _discoverAgents(): Promise<string[]> {
|
|||||||
|
|
||||||
async function _tryWhichDiscovery(): Promise<string[] | null> {
|
async function _tryWhichDiscovery(): Promise<string[] | null> {
|
||||||
try {
|
try {
|
||||||
const { execFileSync } = await import("node:child_process");
|
|
||||||
const text = execFileSync("which", ["-a", "uwf-hermes", "uwf-claude-code", "uwf-cursor"], {
|
const text = execFileSync("which", ["-a", "uwf-hermes", "uwf-claude-code", "uwf-cursor"], {
|
||||||
encoding: "utf-8",
|
encoding: "utf-8",
|
||||||
stdio: ["pipe", "pipe", "pipe"],
|
stdio: ["pipe", "pipe", "pipe"],
|
||||||
@@ -391,6 +397,37 @@ function mergeConfig(existing: Record<string, unknown>, args: SetupArgs): Record
|
|||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Check if the configured adapter binary (and its dependencies) are in PATH.
|
||||||
|
* Returns warnings array — empty means all good.
|
||||||
|
*/
|
||||||
|
export function _checkAdapterAvailability(agentName: string): string[] {
|
||||||
|
const warnings: string[] = [];
|
||||||
|
const binary = `uwf-${agentName}`;
|
||||||
|
|
||||||
|
try {
|
||||||
|
execFileSync("which", [binary], { encoding: "utf8", stdio: ["pipe", "pipe", "pipe"] });
|
||||||
|
} catch {
|
||||||
|
warnings.push(
|
||||||
|
`${binary} not found in PATH. Install it: pnpm add -g @united-workforce/agent-${agentName}`,
|
||||||
|
);
|
||||||
|
return warnings; // skip dependency check if adapter itself is missing
|
||||||
|
}
|
||||||
|
|
||||||
|
// uwf-hermes depends on hermes CLI
|
||||||
|
if (agentName === "hermes") {
|
||||||
|
try {
|
||||||
|
execFileSync("which", ["hermes"], { encoding: "utf8", stdio: ["pipe", "pipe", "pipe"] });
|
||||||
|
} catch {
|
||||||
|
warnings.push(
|
||||||
|
'hermes CLI not found in PATH (required by uwf-hermes). Fix: export PATH="$HOME/.hermes/hermes-agent/.venv/bin:$PATH"',
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return warnings;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Non-interactive setup. All required args provided via CLI flags.
|
* Non-interactive setup. All required args provided via CLI flags.
|
||||||
*/
|
*/
|
||||||
@@ -405,15 +442,26 @@ export async function cmdSetup(args: SetupArgs): Promise<Record<string, unknown>
|
|||||||
|
|
||||||
writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8");
|
writeFileSync(configPath, stringify(merged, { indent: 2 }), "utf8");
|
||||||
|
|
||||||
|
// Print config path to stderr (stdout is reserved for JSON output)
|
||||||
|
console.error(`Config saved to ${configPath} ✓`);
|
||||||
|
|
||||||
// Validate model connectivity
|
// Validate model connectivity
|
||||||
const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
|
const validation = await validateModel(args.baseUrl, args.apiKey, args.model);
|
||||||
|
|
||||||
|
// Check adapter availability
|
||||||
|
const agentName = _agentNameFromBinary(args.agent ?? "hermes");
|
||||||
|
const adapterWarnings = _checkAdapterAvailability(agentName);
|
||||||
|
for (const w of adapterWarnings) {
|
||||||
|
console.error(`⚠ ${w}`);
|
||||||
|
}
|
||||||
|
|
||||||
return {
|
return {
|
||||||
configPath,
|
configPath,
|
||||||
provider: args.provider,
|
provider: args.provider,
|
||||||
model: args.model,
|
model: args.model,
|
||||||
defaultAgent: merged.defaultAgent,
|
defaultAgent: merged.defaultAgent,
|
||||||
validation,
|
validation,
|
||||||
|
adapterWarnings,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -199,6 +199,7 @@ const PL_THREAD_ARCHIVED = "F4D8Q2K5";
|
|||||||
const PL_STEP_ERROR = "B8T5N1V6";
|
const PL_STEP_ERROR = "B8T5N1V6";
|
||||||
const PL_BACKGROUND_START = "X7Q4W9M2";
|
const PL_BACKGROUND_START = "X7Q4W9M2";
|
||||||
const PL_THREAD_RESUME = "K2R7M4N8";
|
const PL_THREAD_RESUME = "K2R7M4N8";
|
||||||
|
const PL_THREAD_POKE = "P4Q9R3X7";
|
||||||
|
|
||||||
type ResumeStepConfig = {
|
type ResumeStepConfig = {
|
||||||
role: string;
|
role: string;
|
||||||
@@ -1004,6 +1005,12 @@ function spawnAgent(
|
|||||||
});
|
});
|
||||||
} catch (e) {
|
} catch (e) {
|
||||||
const err = e as NodeJS.ErrnoException & { stderr?: Buffer | string | null };
|
const err = e as NodeJS.ErrnoException & { stderr?: Buffer | string | null };
|
||||||
|
if (err.code === "ENOENT") {
|
||||||
|
failStep(
|
||||||
|
plog,
|
||||||
|
`"${agent.command}" not found in PATH. Install it or check your PATH config. Run: which ${agent.command}`,
|
||||||
|
);
|
||||||
|
}
|
||||||
const stderr =
|
const stderr =
|
||||||
err.stderr == null
|
err.stderr == null
|
||||||
? ""
|
? ""
|
||||||
@@ -1129,6 +1136,147 @@ export async function cmdThreadResume(
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Validate that a thread can be poked. Returns the existing entry and the head StepNode payload.
|
||||||
|
* Fails (process exit) when the thread is missing, running, completed, cancelled, or has no
|
||||||
|
* StepNode at its head.
|
||||||
|
*/
|
||||||
|
async function validatePokePreconditions(
|
||||||
|
storageRoot: string,
|
||||||
|
uwf: UwfStore,
|
||||||
|
threadId: ThreadId,
|
||||||
|
): Promise<{ entry: ThreadIndexEntry; oldHead: CasRef; oldHeadPayload: StepNodePayload }> {
|
||||||
|
const runningMarker = await isThreadRunning(storageRoot, threadId);
|
||||||
|
if (runningMarker !== null) {
|
||||||
|
fail(`thread already executing in background (PID: ${runningMarker.pid})`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const entry = getThread(uwf.varStore, threadId);
|
||||||
|
if (entry === null) {
|
||||||
|
fail(`thread not active: ${threadId}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (entry.status === "completed" || entry.status === "cancelled") {
|
||||||
|
fail(`thread cannot be poked: ${threadId} (status: ${entry.status})`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const oldHead = entry.head;
|
||||||
|
const oldHeadNode = uwf.store.cas.get(oldHead);
|
||||||
|
if (oldHeadNode === null) {
|
||||||
|
fail(`CAS node not found: ${oldHead}`);
|
||||||
|
}
|
||||||
|
if (oldHeadNode.type !== uwf.schemas.stepNode) {
|
||||||
|
fail("thread cannot be poked: no step to replace (head is StartNode)");
|
||||||
|
}
|
||||||
|
|
||||||
|
return { entry, oldHead, oldHeadPayload: oldHeadNode.payload as StepNodePayload };
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resolve the next role from the post-poke chain state, used for the StepOutput.currentRole field.
|
||||||
|
* Returns null when the next role is $END, evaluation fails, or the result is a suspend.
|
||||||
|
*/
|
||||||
|
function resolveCurrentRoleFromChain(
|
||||||
|
uwfAfter: UwfStore,
|
||||||
|
workflow: WorkflowPayload,
|
||||||
|
replacedHash: CasRef,
|
||||||
|
): string | null {
|
||||||
|
const chainAfter = walkChain(uwfAfter, replacedHash);
|
||||||
|
const { lastRole, lastOutput } = resolveEvaluateArgs(uwfAfter, chainAfter);
|
||||||
|
const afterResult = evaluate(workflow.graph, lastRole, lastOutput);
|
||||||
|
if (!afterResult.ok || isSuspendResult(afterResult.value)) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
if (afterResult.value.role === END_ROLE) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
return afterResult.value.role;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Poke a thread: re-run the agent on the head step with a supplementary prompt,
|
||||||
|
* replacing the head step's output. The new step's `prev` points to the OLD head's
|
||||||
|
* `prev` — semantically replacing (not appending to) the head. The moderator is NOT
|
||||||
|
* re-evaluated for routing; the role of the head step is re-used.
|
||||||
|
*/
|
||||||
|
export async function cmdThreadPoke(
|
||||||
|
storageRoot: string,
|
||||||
|
threadId: ThreadId,
|
||||||
|
prompt: string,
|
||||||
|
agentOverride: string | null,
|
||||||
|
): Promise<StepOutput> {
|
||||||
|
const uwf = await createUwfStore(storageRoot);
|
||||||
|
const { entry, oldHeadPayload } = await validatePokePreconditions(storageRoot, uwf, threadId);
|
||||||
|
|
||||||
|
const chain = walkChain(uwf, entry.head);
|
||||||
|
const workflowHash = chain.start.workflow;
|
||||||
|
const threadCwd = chain.start.cwd;
|
||||||
|
|
||||||
|
const plog = createProcessLogger({
|
||||||
|
storageRoot,
|
||||||
|
context: { thread: threadId, workflow: workflowHash },
|
||||||
|
});
|
||||||
|
|
||||||
|
// Resolve the agent: --agent override wins; otherwise read from old head step's `agent` field.
|
||||||
|
const config = await loadWorkflowConfig(storageRoot);
|
||||||
|
const workflow = loadWorkflowPayload(uwf, workflowHash);
|
||||||
|
const role = oldHeadPayload.role;
|
||||||
|
const agent =
|
||||||
|
agentOverride !== null
|
||||||
|
? resolveAgentConfig(config, workflow, role, agentOverride)
|
||||||
|
: parseAgentOverride(oldHeadPayload.agent);
|
||||||
|
|
||||||
|
const effectiveCwd = oldHeadPayload.cwd !== "" ? oldHeadPayload.cwd : threadCwd;
|
||||||
|
|
||||||
|
plog.log(PL_THREAD_POKE, `poke role=${role} agent=${agent.command}`, null);
|
||||||
|
plog.log(PL_AGENT_SPAWN, `spawning agent command=${agent.command}`, {
|
||||||
|
args: [...agent.args, threadId, role].join(" "),
|
||||||
|
});
|
||||||
|
|
||||||
|
loadDotenv({ path: getEnvPath(storageRoot) });
|
||||||
|
|
||||||
|
// Spawn the agent. The agent will create a new StepNode with prev=oldHead (it reads
|
||||||
|
// the active thread head). After the agent returns, we rewrite that node's prev so
|
||||||
|
// that the new head replaces the old head instead of appending after it.
|
||||||
|
const agentResult = spawnAgent(plog, agent, threadId, role, prompt, effectiveCwd);
|
||||||
|
const agentStepHash = agentResult.stepHash as CasRef;
|
||||||
|
|
||||||
|
plog.log(PL_AGENT_DONE, `agent returned head=${agentStepHash}`, null);
|
||||||
|
|
||||||
|
const uwfAfter = await createUwfStore(storageRoot);
|
||||||
|
const agentNode = uwfAfter.store.cas.get(agentStepHash);
|
||||||
|
if (agentNode === null || agentNode.type !== uwfAfter.schemas.stepNode) {
|
||||||
|
failStep(plog, `agent returned hash that is not a StepNode: ${agentStepHash}`);
|
||||||
|
}
|
||||||
|
const agentPayload = agentNode.payload as StepNodePayload;
|
||||||
|
|
||||||
|
// Rewrite the new step so that its `prev` points to the OLD head's prev (replace semantics).
|
||||||
|
const replacedPayload: StepNodePayload = {
|
||||||
|
...agentPayload,
|
||||||
|
prev: oldHeadPayload.prev,
|
||||||
|
};
|
||||||
|
const replacedHash = await uwfAfter.store.cas.put(uwfAfter.schemas.stepNode, replacedPayload);
|
||||||
|
const replacedNode = uwfAfter.store.cas.get(replacedHash);
|
||||||
|
if (replacedNode === null || !validate(uwfAfter.store, replacedNode)) {
|
||||||
|
failStep(plog, "rewritten StepNode failed schema validation");
|
||||||
|
}
|
||||||
|
|
||||||
|
// Update thread head to the replaced step. Status becomes idle (no moderator re-route).
|
||||||
|
setThread(uwfAfter.varStore, threadId, updateThreadHead(entry, replacedHash));
|
||||||
|
|
||||||
|
return {
|
||||||
|
workflow: workflowHash,
|
||||||
|
thread: threadId,
|
||||||
|
head: replacedHash,
|
||||||
|
status: "idle",
|
||||||
|
currentRole: resolveCurrentRoleFromChain(uwfAfter, workflow, replacedHash),
|
||||||
|
suspendedRole: null,
|
||||||
|
suspendMessage: null,
|
||||||
|
done: false,
|
||||||
|
background: null,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
export function validateCount(count: number): void {
|
export function validateCount(count: number): void {
|
||||||
if (count < 1 || !Number.isInteger(count)) {
|
if (count < 1 || !Number.isInteger(count)) {
|
||||||
throw new Error(`--count must be a positive integer, got: ${count}`);
|
throw new Error(`--count must be a positive integer, got: ${count}`);
|
||||||
|
|||||||
@@ -24,22 +24,22 @@ function isOneOfSchema(fm: unknown): fm is SchemaObj & { oneOf: SchemaObj[] } {
|
|||||||
return Array.isArray(obj.oneOf);
|
return Array.isArray(obj.oneOf);
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Check if a frontmatter schema declares "$status" as an enum (the required form for user roles). */
|
/** Check if a frontmatter schema declares "$status" as const (flat schema form). */
|
||||||
function hasStatusEnum(fm: unknown): boolean {
|
function hasStatusConst(fm: unknown): boolean {
|
||||||
if (typeof fm !== "object" || fm === null) return false;
|
if (typeof fm !== "object" || fm === null) return false;
|
||||||
const obj = fm as SchemaObj;
|
const obj = fm as SchemaObj;
|
||||||
const props = obj.properties as Record<string, SchemaObj> | undefined;
|
const props = obj.properties as Record<string, SchemaObj> | undefined;
|
||||||
if (!props?.$status) return false;
|
if (!props?.$status) return false;
|
||||||
return Array.isArray(props.$status.enum);
|
return typeof props.$status.const === "string";
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Extract status values from an enum-based $status field. */
|
/** Extract status values from a const-based $status field. */
|
||||||
function getEnumStatuses(fm: SchemaObj): string[] {
|
function getConstStatuses(fm: SchemaObj): string[] {
|
||||||
const props = fm.properties as Record<string, SchemaObj> | undefined;
|
const props = fm.properties as Record<string, SchemaObj> | undefined;
|
||||||
if (!props?.$status) return [];
|
if (!props?.$status) return [];
|
||||||
const statusDef = props.$status;
|
const statusDef = props.$status;
|
||||||
if (!Array.isArray(statusDef.enum)) return [];
|
if (typeof statusDef.const === "string") return [statusDef.const];
|
||||||
return statusDef.enum as string[];
|
return [];
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Get property names from a schema object. */
|
/** Get property names from a schema object. */
|
||||||
@@ -248,21 +248,21 @@ function checkRoleConsistency(payload: WorkflowPayload, errors: string[]): void
|
|||||||
checkOneOfDiscriminant(roleName, variants, statuses, errors);
|
checkOneOfDiscriminant(roleName, variants, statuses, errors);
|
||||||
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
|
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
|
||||||
checkMultiExitMustache(roleName, graphEntry, variants, errors);
|
checkMultiExitMustache(roleName, graphEntry, variants, errors);
|
||||||
} else if (hasStatusEnum(fm)) {
|
} else if (hasStatusConst(fm)) {
|
||||||
const statuses = getEnumStatuses(fm as SchemaObj);
|
const statuses = getConstStatuses(fm as SchemaObj);
|
||||||
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
|
checkStatusEdges(roleName, graphKeys, new Set(statuses), errors);
|
||||||
// For enum-based schemas, mustache vars come from the flat properties
|
// For const-based flat schemas, mustache vars come from the flat properties
|
||||||
checkEnumMustache(roleName, graphEntry, fm as SchemaObj, errors);
|
checkFlatMustache(roleName, graphEntry, fm as SchemaObj, errors);
|
||||||
} else {
|
} else {
|
||||||
errors.push(
|
errors.push(
|
||||||
`role "${roleName}" must define "$status" as an enum (or oneOf const) in frontmatter`,
|
`role "${roleName}" must define "$status" as const (or oneOf with const) in frontmatter`,
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Check mustache vars in all edge prompts against flat schema properties. */
|
/** Check mustache vars in all edge prompts against flat schema properties. */
|
||||||
function checkEnumMustache(
|
function checkFlatMustache(
|
||||||
roleName: string,
|
roleName: string,
|
||||||
graphEntry: Record<string, { role: string; prompt: string }>,
|
graphEntry: Record<string, { role: string; prompt: string }>,
|
||||||
fm: SchemaObj,
|
fm: SchemaObj,
|
||||||
|
|||||||
@@ -99,7 +99,7 @@ export function checkWorkflowFilenameConsistency(
|
|||||||
): string | null {
|
): string | null {
|
||||||
const expected = workflowNameFromPath(filePath);
|
const expected = workflowNameFromPath(filePath);
|
||||||
if (payload.name !== expected) {
|
if (payload.name !== expected) {
|
||||||
return `workflow name mismatch: file "${basename(filePath)}" implies name "${expected}" but YAML declares name "${payload.name}"`;
|
return `workflow name mismatch: file "${basename(filePath)}" implies name "${expected}" but YAML declares name "${payload.name}". Either rename the file to "${payload.name}.yaml" or change the YAML \`name\` field to "${expected}"`;
|
||||||
}
|
}
|
||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/eval",
|
"name": "@united-workforce/eval",
|
||||||
"version": "0.1.3",
|
"version": "0.1.5",
|
||||||
"private": false,
|
"private": false,
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
@@ -22,8 +22,8 @@
|
|||||||
"test:ci": "vitest run __tests__/"
|
"test:ci": "vitest run __tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@ocas/fs": "^0.3.0",
|
"@ocas/fs": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"commander": "^14.0.3",
|
"commander": "^14.0.3",
|
||||||
|
|||||||
@@ -6,7 +6,7 @@ import { formatList, selectEntries } from "./format.js";
|
|||||||
import { readEvalEntries } from "./read.js";
|
import { readEvalEntries } from "./read.js";
|
||||||
|
|
||||||
const log = createLogger({ sink: { kind: "stderr" } });
|
const log = createLogger({ sink: { kind: "stderr" } });
|
||||||
const LOG_LIST = "L5KX9R2B";
|
const LOG_LIST = "H5KX9R2B";
|
||||||
|
|
||||||
type ListCliOptions = {
|
type ListCliOptions = {
|
||||||
task: string | undefined;
|
task: string | undefined;
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/protocol",
|
"name": "@united-workforce/protocol",
|
||||||
"version": "0.1.0",
|
"version": "0.1.1",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -18,8 +18,8 @@
|
|||||||
"test:ci": "vitest run src/__tests__/"
|
"test:ci": "vitest run src/__tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@ocas/fs": "^0.3.0"
|
"@ocas/fs": "^0.4.0"
|
||||||
},
|
},
|
||||||
"devDependencies": {
|
"devDependencies": {
|
||||||
"typescript": "^5.8.3"
|
"typescript": "^5.8.3"
|
||||||
|
|||||||
@@ -0,0 +1,8 @@
|
|||||||
|
# Changelog
|
||||||
|
|
||||||
|
## 0.1.2 — 2026-06-07
|
||||||
|
|
||||||
|
- fix: decouple session resume from isFirstVisit guard
|
||||||
|
|
||||||
|
When frontmatter validation fails, the step is never written to CAS, so isFirstVisit remains true on the next run. Both adapters now always check the session cache regardless of isFirstVisit. When resuming after a frontmatter-only failure (isFirstVisit + cache hit), a minimal correction prompt is sent via buildFrontmatterRetryPrompt() instead of re-sending the full initial prompt.
|
||||||
|
|
||||||
@@ -143,7 +143,7 @@ describe("buildOutputFormatInstruction", () => {
|
|||||||
{
|
{
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: {
|
properties: {
|
||||||
$status: { type: "string", enum: ["approved"] },
|
$status: { const: "approved" },
|
||||||
branch: { type: "string" },
|
branch: { type: "string" },
|
||||||
},
|
},
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
@@ -151,7 +151,7 @@ describe("buildOutputFormatInstruction", () => {
|
|||||||
{
|
{
|
||||||
type: "object",
|
type: "object",
|
||||||
properties: {
|
properties: {
|
||||||
$status: { type: "string", enum: ["rejected"] },
|
$status: { const: "rejected" },
|
||||||
comments: { type: "string" },
|
comments: { type: "string" },
|
||||||
},
|
},
|
||||||
required: ["$status"],
|
required: ["$status"],
|
||||||
@@ -225,4 +225,34 @@ describe("buildOutputFormatInstruction", () => {
|
|||||||
const result = buildOutputFormatInstruction({});
|
const result = buildOutputFormatInstruction({});
|
||||||
expect(result).toContain("Focus exclusively on YOUR role");
|
expect(result).toContain("Focus exclusively on YOUR role");
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("renders const value as literal in flat schema example", () => {
|
||||||
|
const schema = {
|
||||||
|
type: "object",
|
||||||
|
properties: {
|
||||||
|
$status: { type: "string", const: "greeted" },
|
||||||
|
message: { type: "string" },
|
||||||
|
},
|
||||||
|
required: ["$status", "message"],
|
||||||
|
};
|
||||||
|
const result = buildOutputFormatInstruction(schema);
|
||||||
|
expect(result).toContain("$status: greeted");
|
||||||
|
expect(result).toContain("fixed value");
|
||||||
|
expect(result).not.toContain("$status: <string>");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("renders const value for non-string types", () => {
|
||||||
|
const schema = {
|
||||||
|
type: "object",
|
||||||
|
properties: {
|
||||||
|
count: { type: "number", const: 42 },
|
||||||
|
done: { type: "boolean", const: true },
|
||||||
|
},
|
||||||
|
required: ["count", "done"],
|
||||||
|
};
|
||||||
|
const result = buildOutputFormatInstruction(schema);
|
||||||
|
expect(result).toContain("count: 42");
|
||||||
|
expect(result).toContain("done: true");
|
||||||
|
expect(result).toContain("fixed value");
|
||||||
|
});
|
||||||
});
|
});
|
||||||
|
|||||||
@@ -0,0 +1,59 @@
|
|||||||
|
import type { StepContext } from "@united-workforce/protocol";
|
||||||
|
import { describe, expect, test } from "vitest";
|
||||||
|
import { buildThreadProgress } from "../src/build-thread-progress.js";
|
||||||
|
|
||||||
|
function makeStep(role: string): StepContext {
|
||||||
|
return {
|
||||||
|
role,
|
||||||
|
output: {},
|
||||||
|
detail: "0000000000000" as string,
|
||||||
|
agent: "uwf-mock",
|
||||||
|
edgePrompt: "",
|
||||||
|
startedAtMs: 0,
|
||||||
|
completedAtMs: 0,
|
||||||
|
cwd: "",
|
||||||
|
assembledPrompt: null,
|
||||||
|
usage: null,
|
||||||
|
content: null,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
describe("buildThreadProgress", () => {
|
||||||
|
test("first step of thread", () => {
|
||||||
|
const result = buildThreadProgress([], "proponent");
|
||||||
|
expect(result).toContain("## Thread Progress");
|
||||||
|
expect(result).toContain("first step");
|
||||||
|
expect(result).toContain("first time");
|
||||||
|
expect(result).toContain("proponent");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("second step, role not seen before", () => {
|
||||||
|
const steps = [makeStep("opponent")];
|
||||||
|
const result = buildThreadProgress(steps, "proponent");
|
||||||
|
expect(result).toContain("Thread step 2");
|
||||||
|
expect(result).toContain("spoken 0 times");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("role has spoken once before", () => {
|
||||||
|
const steps = [makeStep("proponent"), makeStep("opponent")];
|
||||||
|
const result = buildThreadProgress(steps, "proponent");
|
||||||
|
expect(result).toContain("Thread step 3");
|
||||||
|
expect(result).toContain("spoken 1 time before");
|
||||||
|
// singular "time" not "times"
|
||||||
|
expect(result).not.toContain("1 times");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("role has spoken multiple times", () => {
|
||||||
|
const steps = [
|
||||||
|
makeStep("proponent"),
|
||||||
|
makeStep("opponent"),
|
||||||
|
makeStep("proponent"),
|
||||||
|
makeStep("opponent"),
|
||||||
|
makeStep("proponent"),
|
||||||
|
makeStep("opponent"),
|
||||||
|
];
|
||||||
|
const result = buildThreadProgress(steps, "proponent");
|
||||||
|
expect(result).toContain("Thread step 7");
|
||||||
|
expect(result).toContain("spoken 3 times");
|
||||||
|
});
|
||||||
|
});
|
||||||
@@ -0,0 +1,23 @@
|
|||||||
|
import { describe, expect, test } from "vitest";
|
||||||
|
import { buildFrontmatterRetryPrompt } from "../src/frontmatter-retry-prompt.js";
|
||||||
|
|
||||||
|
describe("buildFrontmatterRetryPrompt", () => {
|
||||||
|
test("includes correction instruction", () => {
|
||||||
|
const result = buildFrontmatterRetryPrompt("Use YAML frontmatter");
|
||||||
|
expect(result).toContain("previous run completed");
|
||||||
|
expect(result).toContain("do NOT need to redo any work");
|
||||||
|
expect(result).toContain("corrected YAML frontmatter");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("includes outputFormatInstruction when provided", () => {
|
||||||
|
const instruction = "---\nstatus: $done | $review\nsummary: string\n---";
|
||||||
|
const result = buildFrontmatterRetryPrompt(instruction);
|
||||||
|
expect(result).toContain(instruction);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("works with empty outputFormatInstruction", () => {
|
||||||
|
const result = buildFrontmatterRetryPrompt("");
|
||||||
|
expect(result).not.toContain("\n\n\n");
|
||||||
|
expect(result).toContain("corrected YAML frontmatter");
|
||||||
|
});
|
||||||
|
});
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/util-agent",
|
"name": "@united-workforce/util-agent",
|
||||||
"version": "0.1.0",
|
"version": "0.1.2",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
@@ -18,8 +18,8 @@
|
|||||||
"test:ci": "vitest run __tests__/ src/__tests__/"
|
"test:ci": "vitest run __tests__/ src/__tests__/"
|
||||||
},
|
},
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"@ocas/core": "^0.3.0",
|
"@ocas/core": "^0.4.0",
|
||||||
"@ocas/fs": "^0.3.0",
|
"@ocas/fs": "^0.4.0",
|
||||||
"@united-workforce/protocol": "workspace:^",
|
"@united-workforce/protocol": "workspace:^",
|
||||||
"@united-workforce/util": "workspace:^",
|
"@united-workforce/util": "workspace:^",
|
||||||
"dotenv": "^16.6.1",
|
"dotenv": "^16.6.1",
|
||||||
|
|||||||
@@ -74,6 +74,10 @@ function collectObjectSchemas(schema: JSONSchema): JSONSchema[] {
|
|||||||
}
|
}
|
||||||
|
|
||||||
function resolvePropertySchema(prop: JSONSchema): JSONSchema {
|
function resolvePropertySchema(prop: JSONSchema): JSONSchema {
|
||||||
|
if (prop.const !== undefined) {
|
||||||
|
return prop;
|
||||||
|
}
|
||||||
|
|
||||||
if (Array.isArray(prop.enum) && prop.enum.length > 0) {
|
if (Array.isArray(prop.enum) && prop.enum.length > 0) {
|
||||||
return prop;
|
return prop;
|
||||||
}
|
}
|
||||||
@@ -113,6 +117,11 @@ function buildPropertyExampleLine(prop: SchemaProperty): string {
|
|||||||
commentParts.push("required");
|
commentParts.push("required");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (resolved.const !== undefined) {
|
||||||
|
commentParts.push("fixed value");
|
||||||
|
return `${prop.name}: ${formatYamlScalar(resolved.const)}${buildPropertyComment(commentParts)}`;
|
||||||
|
}
|
||||||
|
|
||||||
if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
|
if (Array.isArray(resolved.enum) && resolved.enum.length > 0) {
|
||||||
const enumValues = resolved.enum.map((v) => String(v));
|
const enumValues = resolved.enum.map((v) => String(v));
|
||||||
commentParts.push(...enumValues);
|
commentParts.push(...enumValues);
|
||||||
|
|||||||
@@ -0,0 +1,27 @@
|
|||||||
|
import type { StepContext } from "@united-workforce/protocol";
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Build a compact thread-progress summary so the agent knows where it is
|
||||||
|
* in the conversation without making tool calls to count steps.
|
||||||
|
*
|
||||||
|
* Example output:
|
||||||
|
* ## Thread Progress
|
||||||
|
* Thread step 6. You (proponent) have spoken 2 times before this turn.
|
||||||
|
*/
|
||||||
|
export function buildThreadProgress(steps: StepContext[], role: string): string {
|
||||||
|
const totalSteps = steps.length;
|
||||||
|
const roleVisits = steps.filter((s) => s.role === role).length;
|
||||||
|
|
||||||
|
const parts = [`## Thread Progress`];
|
||||||
|
if (totalSteps === 0) {
|
||||||
|
parts.push(
|
||||||
|
`This is the first step of the thread. You (${role}) are speaking for the first time.`,
|
||||||
|
);
|
||||||
|
} else {
|
||||||
|
parts.push(
|
||||||
|
`Thread step ${totalSteps + 1}. You (${role}) have spoken ${roleVisits} time${roleVisits === 1 ? "" : "s"} before this turn.`,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
return parts.join("\n");
|
||||||
|
}
|
||||||
@@ -0,0 +1,21 @@
|
|||||||
|
/**
|
||||||
|
* Build a minimal prompt for retrying frontmatter output on a resumed session.
|
||||||
|
*
|
||||||
|
* Used when a previous run completed successfully but frontmatter validation
|
||||||
|
* failed — the session already has full context, we just need the agent to
|
||||||
|
* re-output correctly formatted frontmatter without redoing any work.
|
||||||
|
*/
|
||||||
|
export function buildFrontmatterRetryPrompt(outputFormatInstruction: string): string {
|
||||||
|
const parts: string[] = [
|
||||||
|
"Your previous run completed all work successfully, but the output format was incorrect.",
|
||||||
|
"You do NOT need to redo any work — all changes are already in place.",
|
||||||
|
"",
|
||||||
|
];
|
||||||
|
if (outputFormatInstruction !== "") {
|
||||||
|
parts.push(outputFormatInstruction, "");
|
||||||
|
}
|
||||||
|
parts.push(
|
||||||
|
"Please output ONLY the corrected YAML frontmatter block (--- delimited) followed by a brief summary of the work you completed.",
|
||||||
|
);
|
||||||
|
return parts.join("\n");
|
||||||
|
}
|
||||||
@@ -1,6 +1,7 @@
|
|||||||
export { buildContinuationPrompt } from "./build-continuation-prompt.js";
|
export { buildContinuationPrompt } from "./build-continuation-prompt.js";
|
||||||
export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
|
export { buildOutputFormatInstruction } from "./build-output-format-instruction.js";
|
||||||
export { buildRolePrompt } from "./build-role-prompt.js";
|
export { buildRolePrompt } from "./build-role-prompt.js";
|
||||||
|
export { buildThreadProgress } from "./build-thread-progress.js";
|
||||||
export type { BuildContextMeta } from "./context.js";
|
export type { BuildContextMeta } from "./context.js";
|
||||||
export { buildContext, buildContextWithMeta } from "./context.js";
|
export { buildContext, buildContextWithMeta } from "./context.js";
|
||||||
export type { ExtractResult, ResolvedLlmProvider } from "./extract.js";
|
export type { ExtractResult, ResolvedLlmProvider } from "./extract.js";
|
||||||
@@ -11,6 +12,7 @@ export {
|
|||||||
} from "./extract.js";
|
} from "./extract.js";
|
||||||
export type { FrontmatterFastPathResult } from "./frontmatter.js";
|
export type { FrontmatterFastPathResult } from "./frontmatter.js";
|
||||||
export { tryFrontmatterFastPath } from "./frontmatter.js";
|
export { tryFrontmatterFastPath } from "./frontmatter.js";
|
||||||
|
export { buildFrontmatterRetryPrompt } from "./frontmatter-retry-prompt.js";
|
||||||
export { createAgent, parseArgv } from "./run.js";
|
export { createAgent, parseArgv } from "./run.js";
|
||||||
export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
|
export { getCachedSessionId, getCachePath, setCachedSessionId } from "./session-cache.js";
|
||||||
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
|
export { getConfigPath, getEnvPath, loadWorkflowConfig, resolveStorageRoot } from "./storage.js";
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@united-workforce/util",
|
"name": "@united-workforce/util",
|
||||||
"version": "0.1.2",
|
"version": "0.1.4",
|
||||||
"files": [
|
"files": [
|
||||||
"src",
|
"src",
|
||||||
"dist",
|
"dist",
|
||||||
|
|||||||
@@ -140,5 +140,18 @@ For specific scenarios, run the corresponding \`uwf prompt\` command:
|
|||||||
|----------|---------|-------------|
|
|----------|---------|-------------|
|
||||||
| Writing workflow YAML | \`uwf prompt workflow-authoring\` | Designing roles, conditions, graphs, and edge prompts |
|
| Writing workflow YAML | \`uwf prompt workflow-authoring\` | Designing roles, conditions, graphs, and edge prompts |
|
||||||
| Building a new agent adapter | \`uwf prompt adapter-developing\` | Creating a new \`uwf-<name>\` CLI adapter |
|
| Building a new agent adapter | \`uwf prompt adapter-developing\` | Creating a new \`uwf-<name>\` CLI adapter |
|
||||||
|
|
||||||
|
## Upgrading
|
||||||
|
|
||||||
|
\`\`\`bash
|
||||||
|
# Install the latest version
|
||||||
|
pnpm add -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
|
||||||
|
# or: npm install -g @united-workforce/cli@latest @united-workforce/agent-hermes@latest
|
||||||
|
|
||||||
|
# Verify
|
||||||
|
uwf --version
|
||||||
|
|
||||||
|
# Then run uwf prompt bootstrap and follow the upgrade instructions
|
||||||
|
\`\`\`
|
||||||
`;
|
`;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -28,6 +28,7 @@ roles: # named actors
|
|||||||
2. Do that
|
2. Do that
|
||||||
output: "..." # what the agent should produce
|
output: "..." # what the agent should produce
|
||||||
frontmatter: # JSON Schema for structured output
|
frontmatter: # JSON Schema for structured output
|
||||||
|
type: object
|
||||||
oneOf:
|
oneOf:
|
||||||
- properties:
|
- properties:
|
||||||
$status: { const: "ready" }
|
$status: { const: "ready" }
|
||||||
@@ -71,10 +72,13 @@ The \`frontmatter\` field is a standard JSON Schema. It defines the structured f
|
|||||||
|
|
||||||
### \`$status\` Field
|
### \`$status\` Field
|
||||||
|
|
||||||
\`$status\` is the only standard field. Its value determines which graph edge the moderator follows. Use \`const\` to constrain each variant:
|
\`$status\` is the only standard field. Its value determines which graph edge the moderator follows.
|
||||||
|
|
||||||
|
**Multi-exit (oneOf)** — use \`const\` to constrain each variant:
|
||||||
|
|
||||||
\`\`\`yaml
|
\`\`\`yaml
|
||||||
frontmatter:
|
frontmatter:
|
||||||
|
type: object
|
||||||
oneOf:
|
oneOf:
|
||||||
- properties:
|
- properties:
|
||||||
$status: { const: "done" }
|
$status: { const: "done" }
|
||||||
@@ -86,22 +90,26 @@ frontmatter:
|
|||||||
required: [$status, error]
|
required: [$status, error]
|
||||||
\`\`\`
|
\`\`\`
|
||||||
|
|
||||||
### Custom Fields
|
**Single-exit (flat schema)** — same syntax, just no \`oneOf\` wrapper:
|
||||||
|
|
||||||
Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
|
|
||||||
|
|
||||||
### Flat Schema (Single Status)
|
|
||||||
|
|
||||||
When a role has only one outcome:
|
|
||||||
|
|
||||||
\`\`\`yaml
|
\`\`\`yaml
|
||||||
frontmatter:
|
frontmatter:
|
||||||
|
type: object
|
||||||
properties:
|
properties:
|
||||||
$status: { const: "done" }
|
$status: { const: "done" }
|
||||||
summary: { type: string }
|
summary: { type: string }
|
||||||
required: [$status, summary]
|
required: [$status, summary]
|
||||||
\`\`\`
|
\`\`\`
|
||||||
|
|
||||||
|
**Important rules:**
|
||||||
|
- \`type: object\` is **required** at the top level of frontmatter (both flat and oneOf)
|
||||||
|
- \`$status\` always uses \`const: "value"\` — simple and consistent
|
||||||
|
- \`enum\` is **not supported** for \`$status\` — the validator will reject it
|
||||||
|
|
||||||
|
### Custom Fields
|
||||||
|
|
||||||
|
Add any fields you need for data passing between roles. These are available in edge prompts via Mustache templates.
|
||||||
|
|
||||||
## Graph Routing
|
## Graph Routing
|
||||||
|
|
||||||
The graph maps each role's \`$status\` values to the next role:
|
The graph maps each role's \`$status\` values to the next role:
|
||||||
|
|||||||
Generated
+38
-36
@@ -18,8 +18,8 @@ importers:
|
|||||||
specifier: ^2.31.0
|
specifier: ^2.31.0
|
||||||
version: 2.31.0(@types/node@25.9.1)
|
version: 2.31.0(@types/node@25.9.1)
|
||||||
'@shazhou/proman':
|
'@shazhou/proman':
|
||||||
specifier: ^0.5.1
|
specifier: ^0.6.3
|
||||||
version: 0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))
|
version: 0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))
|
||||||
'@types/node':
|
'@types/node':
|
||||||
specifier: ^25.7.0
|
specifier: ^25.7.0
|
||||||
version: 25.9.1
|
version: 25.9.1
|
||||||
@@ -45,8 +45,8 @@ importers:
|
|||||||
packages/agent-builtin:
|
packages/agent-builtin:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/util':
|
'@united-workforce/util':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../util
|
version: link:../util
|
||||||
@@ -61,8 +61,8 @@ importers:
|
|||||||
packages/agent-claude-code:
|
packages/agent-claude-code:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -80,8 +80,8 @@ importers:
|
|||||||
packages/agent-hermes:
|
packages/agent-hermes:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -99,8 +99,8 @@ importers:
|
|||||||
packages/agent-mock:
|
packages/agent-mock:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -121,11 +121,11 @@ importers:
|
|||||||
packages/cli:
|
packages/cli:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -231,11 +231,11 @@ importers:
|
|||||||
packages/eval:
|
packages/eval:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -256,11 +256,11 @@ importers:
|
|||||||
packages/protocol:
|
packages/protocol:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
devDependencies:
|
devDependencies:
|
||||||
typescript:
|
typescript:
|
||||||
specifier: ^5.8.3
|
specifier: ^5.8.3
|
||||||
@@ -275,11 +275,11 @@ importers:
|
|||||||
packages/util-agent:
|
packages/util-agent:
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core':
|
'@ocas/core':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@ocas/fs':
|
'@ocas/fs':
|
||||||
specifier: ^0.3.0
|
specifier: ^0.4.0
|
||||||
version: 0.3.0
|
version: 0.4.0
|
||||||
'@united-workforce/protocol':
|
'@united-workforce/protocol':
|
||||||
specifier: workspace:^
|
specifier: workspace:^
|
||||||
version: link:../protocol
|
version: link:../protocol
|
||||||
@@ -892,11 +892,13 @@ packages:
|
|||||||
resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==}
|
resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==}
|
||||||
engines: {node: '>= 8'}
|
engines: {node: '>= 8'}
|
||||||
|
|
||||||
'@ocas/core@0.3.0':
|
'@ocas/core@0.4.0':
|
||||||
resolution: {integrity: sha512-ejDDZbmQkTj2GoJg+cNjXa3eHlQGybW3PrUZlwERBvBFjjnYBLHOG7AQQYM48bI52UiqucafgZjPEYk9SZd6AQ==}
|
resolution: {integrity: sha512-6JvHd3nr5GncMOBNaZTf9ZTWou/txONTfZbkrblmgqL/H+YuRj1FfeFY+b1ndUlfwR7AuJ6bvoSxR5RP+AbC0w==}
|
||||||
|
engines: {node: '>=22.5.0'}
|
||||||
|
|
||||||
'@ocas/fs@0.3.0':
|
'@ocas/fs@0.4.0':
|
||||||
resolution: {integrity: sha512-/6/nICYVJWXeWx2LcPoHHJAFoqXpJoAtvhLKLS0zpkwtsZX3g0D9X6J5soHCV1QS+BOWybuOJ0+W3cB1FBRkZA==}
|
resolution: {integrity: sha512-AQG6dk1YCL1qpSszUWUgEY+LQhYbTv5hXYrs3J2pHAi2/lY615O2cTgjwEeh6JTcrqHsFwiDsDdKIKMpADchZA==}
|
||||||
|
engines: {node: '>=22.5.0'}
|
||||||
|
|
||||||
'@open-draft/deferred-promise@2.2.0':
|
'@open-draft/deferred-promise@2.2.0':
|
||||||
resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==}
|
resolution: {integrity: sha512-CecwLWx3rhxVQF6V4bAgPS5t+So2sTbPgAzafKkVizyi7tlwpcFpdFqq+wqF2OwNBmqFuu6tOyouTuxgpMfzmA==}
|
||||||
@@ -1152,8 +1154,8 @@ packages:
|
|||||||
'@sec-ant/readable-stream@0.4.1':
|
'@sec-ant/readable-stream@0.4.1':
|
||||||
resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==}
|
resolution: {integrity: sha512-831qok9r2t8AlxLko40y2ebgSDhenenCatLVeW/uBtnHPyhHOvG0C7TvfgecV+wHzIm5KUICgzmVpWS+IMEAeg==}
|
||||||
|
|
||||||
'@shazhou/proman@0.5.1':
|
'@shazhou/proman@0.6.3':
|
||||||
resolution: {integrity: sha512-GmFUvd8SAOUW/eaDIEh31pVKSE3XhbgHOZ5vSpX4xS+F8Zl6lAfhgVCjcjRK8w5d43tsH47CVorwyxQcRaJFfA==}
|
resolution: {integrity: sha512-KguWl1xHrWXx1YWYrWj47v4NRbaQuKCm7Hd7T8dzrqnkM8UL8em3R9rC7GeDzI8YDDfriFeLTX+xb03UHkhTDA==}
|
||||||
hasBin: true
|
hasBin: true
|
||||||
peerDependencies:
|
peerDependencies:
|
||||||
'@biomejs/biome': ^2.0.0
|
'@biomejs/biome': ^2.0.0
|
||||||
@@ -3896,16 +3898,16 @@ snapshots:
|
|||||||
'@nodelib/fs.scandir': 2.1.5
|
'@nodelib/fs.scandir': 2.1.5
|
||||||
fastq: 1.20.1
|
fastq: 1.20.1
|
||||||
|
|
||||||
'@ocas/core@0.3.0':
|
'@ocas/core@0.4.0':
|
||||||
dependencies:
|
dependencies:
|
||||||
ajv: 8.20.0
|
ajv: 8.20.0
|
||||||
cborg: 4.5.8
|
cborg: 4.5.8
|
||||||
liquidjs: 10.27.0
|
liquidjs: 10.27.0
|
||||||
xxhash-wasm: 1.1.0
|
xxhash-wasm: 1.1.0
|
||||||
|
|
||||||
'@ocas/fs@0.3.0':
|
'@ocas/fs@0.4.0':
|
||||||
dependencies:
|
dependencies:
|
||||||
'@ocas/core': 0.3.0
|
'@ocas/core': 0.4.0
|
||||||
cborg: 4.5.8
|
cborg: 4.5.8
|
||||||
|
|
||||||
'@open-draft/deferred-promise@2.2.0': {}
|
'@open-draft/deferred-promise@2.2.0': {}
|
||||||
@@ -4049,7 +4051,7 @@ snapshots:
|
|||||||
|
|
||||||
'@sec-ant/readable-stream@0.4.1': {}
|
'@sec-ant/readable-stream@0.4.1': {}
|
||||||
|
|
||||||
'@shazhou/proman@0.5.1(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))':
|
'@shazhou/proman@0.6.3(@biomejs/biome@2.4.16)(typescript@5.9.3)(vite@7.3.5(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(yaml@2.9.0))(vitest@3.2.6(@types/node@25.9.1)(jiti@2.7.0)(lightningcss@1.32.0)(msw@2.14.6(@types/node@25.9.1)(typescript@5.9.3))(yaml@2.9.0))':
|
||||||
dependencies:
|
dependencies:
|
||||||
'@biomejs/biome': 2.4.16
|
'@biomejs/biome': 2.4.16
|
||||||
typescript: 5.9.3
|
typescript: 5.9.3
|
||||||
|
|||||||
@@ -1,329 +0,0 @@
|
|||||||
name: solve-issue
|
|
||||||
description: TDD-driven issue resolution adapted for the workflow monorepo with bun + vitest
|
|
||||||
roles:
|
|
||||||
planner:
|
|
||||||
description: Analyzes issue and outputs a TDD test spec
|
|
||||||
goal: You are a planning agent. You analyze Gitea issues and produce a TDD test specification that downstream roles will implement and verify.
|
|
||||||
capabilities:
|
|
||||||
- issue-analysis
|
|
||||||
- planning
|
|
||||||
procedure: 'On first run (no previous steps):
|
|
||||||
|
|
||||||
1. Read the issue and all comments from Gitea using `tea issues <number> -r <owner/repo>`
|
|
||||||
|
|
||||||
2. Look for project conventions files (CLAUDE.md, CONTRIBUTING.md) in the repo
|
|
||||||
|
|
||||||
3. Assess whether the issue has enough information to produce a test spec
|
|
||||||
|
|
||||||
4. If insufficient info: comment on the issue via `echo "..." | tea comment <number> -r <owner/repo>` (skip if you already commented), then output $status=insufficient_info
|
|
||||||
|
|
||||||
5. If sufficient: produce a detailed TDD test spec in markdown covering all scenarios
|
|
||||||
|
|
||||||
|
|
||||||
On subsequent runs (bounced back by tester with fix_spec):
|
|
||||||
|
|
||||||
1. Read the tester''s output from the previous step to understand what''s wrong with the spec
|
|
||||||
|
|
||||||
2. Revise the test spec accordingly
|
|
||||||
|
|
||||||
|
|
||||||
After producing the test spec:
|
|
||||||
|
|
||||||
1. The test spec is stored in CAS automatically by the uwf pipeline (agents do not need to call `ocas put` directly)
|
|
||||||
|
|
||||||
2. Put the hash in frontmatter.plan (required when $status=ready)
|
|
||||||
|
|
||||||
3. Set repoPath to the absolute path of the repository root
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
IMPORTANT: Extract the repo remote (owner/repo) from git:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
|
|
||||||
git remote get-url origin | sed ''s|.*[:/]\([^/]*/[^.]*\).*|\1|''
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
Store the result as repoRemote in your frontmatter output so downstream roles can use it for tea/API calls.'
|
|
||||||
output: Output a brief summary of the test spec. Set $status to ready (with plan hash and repoPath) or insufficient_info.
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: ready
|
|
||||||
plan:
|
|
||||||
type: string
|
|
||||||
repoPath:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- plan
|
|
||||||
- repoPath
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: insufficient_info
|
|
||||||
reason:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- reason
|
|
||||||
developer:
|
|
||||||
description: TDD implementation per test spec
|
|
||||||
goal: You are a developer agent. You implement code changes following TDD — write tests first, then implementation.
|
|
||||||
capabilities:
|
|
||||||
- coding
|
|
||||||
procedure: "IMPORTANT: Always work in a git worktree, NEVER modify the main working directory directly.\nThe repo path and other details are provided in your task prompt.\n\nBefore starting any work,\
|
|
||||||
\ set up an isolated worktree:\n1. cd into the repo path provided in your task prompt\n2. `git fetch origin` to get latest refs\n3. First time (no existing branch):\n - `git worktree add .worktrees/fix/<issue-number>-<short-slug>\
|
|
||||||
\ -b fix/<issue-number>-<short-slug> origin/main`\n - `cd .worktrees/fix/<issue-number>-<short-slug> && bun install`\n4. If bounced back from reviewer or tester (branch already exists):\n - cd\
|
|
||||||
\ into the existing worktree under `.worktrees/fix/<issue-number>-<short-slug>`\n - `git fetch origin && git rebase origin/main`\n5. ALL subsequent work must happen inside the worktree directory.\n\
|
|
||||||
\nThen implement TDD:\n6. Read the test spec from CAS: `ocas get <plan hash>` (find the hash from the planner's output in your task prompt)\n7. If bounced back from reviewer or tester: read the\
|
|
||||||
\ previous role's feedback in your task prompt\n8. Write tests first based on the spec (use vitest)\n9. Implement the code to make tests pass\n10. Ensure `bun run build` passes with no errors\n11.\
|
|
||||||
\ Run `bun test` to verify all tests pass\n\nIf you cannot complete the implementation (e.g. the issue is too complex, blocked by external factors,\nor repeated attempts fail), set $status=failed\
|
|
||||||
\ with a reason.\n"
|
|
||||||
output: List all files changed and provide a summary. Set $status to done (with branch/worktree), or failed (with reason).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: done
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- branch
|
|
||||||
- worktree
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: failed
|
|
||||||
reason:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- reason
|
|
||||||
reviewer:
|
|
||||||
description: Code standards compliance check
|
|
||||||
goal: You are a code reviewer. You verify code standards compliance — NOT functionality (that's the tester's job).
|
|
||||||
capabilities:
|
|
||||||
- code-review
|
|
||||||
- static-analysis
|
|
||||||
procedure: 'The worktree path is provided in your task prompt. cd into it first.
|
|
||||||
|
|
||||||
|
|
||||||
Before reviewing, verify the git branch:
|
|
||||||
|
|
||||||
1. Run `git branch --show-current` — confirm the branch name references the issue number being worked on
|
|
||||||
|
|
||||||
2. If the branch doesn''t correspond to the issue, flag it in your output and reject
|
|
||||||
|
|
||||||
|
|
||||||
Then perform code review:
|
|
||||||
|
|
||||||
Hard checks (must all pass):
|
|
||||||
|
|
||||||
3. `bun run build` — no build errors
|
|
||||||
|
|
||||||
4. `bunx biome check` — no lint violations
|
|
||||||
|
|
||||||
5. TypeScript strict mode — no type errors
|
|
||||||
|
|
||||||
|
|
||||||
Soft checks (review against project conventions from CLAUDE.md):
|
|
||||||
|
|
||||||
- Functional-first: functions + types, no classes (except for errors or third-party requirements)
|
|
||||||
|
|
||||||
- Named exports only, no default exports
|
|
||||||
|
|
||||||
- No optional properties (use `T | null` instead of `?:`)
|
|
||||||
|
|
||||||
- Folder module discipline: index.ts only re-exports, types in types.ts
|
|
||||||
|
|
||||||
- Crockford Base32 log tags (8-char, unique per call site)
|
|
||||||
|
|
||||||
- No `console.log` in production code (use createLogger from @united-workforce/util)
|
|
||||||
|
|
||||||
- No dynamic imports in production code
|
|
||||||
|
|
||||||
|
|
||||||
Only review standards compliance. Do NOT test functionality.
|
|
||||||
|
|
||||||
If rejecting, you MUST explain the specific reason in your output.
|
|
||||||
|
|
||||||
'
|
|
||||||
output: Explain your decision with specific file/line references. Set $status to approved (with branch/worktree) or rejected (with comments).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: approved
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- branch
|
|
||||||
- worktree
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: rejected
|
|
||||||
comments:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- comments
|
|
||||||
- worktree
|
|
||||||
tester:
|
|
||||||
description: Functional correctness verification
|
|
||||||
goal: You are a tester agent. You verify that the implementation correctly satisfies every scenario in the test spec.
|
|
||||||
capabilities:
|
|
||||||
- testing
|
|
||||||
procedure: "The worktree path is provided in your task prompt. cd into it first.\n\n1. Run `bun test` for automated test verification\n2. Read the test spec from CAS: `ocas get <plan hash>` (find\
|
|
||||||
\ the hash from the planner step in the thread history)\n3. Verify each scenario in the spec is covered and passing\n4. Determine outcome:\n - passed: all scenarios verified, tests pass\n - fix_code:\
|
|
||||||
\ tests fail or implementation doesn't match spec → send back to developer\n - fix_spec: the spec itself is wrong or incomplete → send back to planner\n"
|
|
||||||
output: Report test results per scenario. Set $status to passed (with branch/worktree), fix_code (with report), or fix_spec (with report).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: passed
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- branch
|
|
||||||
- worktree
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: fix_code
|
|
||||||
report:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- report
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: fix_spec
|
|
||||||
report:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- report
|
|
||||||
committer:
|
|
||||||
description: Commits and creates PR
|
|
||||||
goal: You are a committer agent. You create a clean commit and push a PR linking the original issue.
|
|
||||||
capabilities: []
|
|
||||||
procedure: "The worktree path, branch name, and repo remote (owner/repo) are provided in your task prompt.\ncd into the worktree first.\n\nNote: You inherit the developer's worktree and branch. Do NOT\
|
|
||||||
\ create a new branch.\n1. Stage all changes: `git add -A`\n2. Commit with a descriptive message referencing the issue: `git commit -m \"type: description\\n\\nFixes #N\"`\n3. Push the branch: `git\
|
|
||||||
\ push -u origin <branch-name>`\n4. **Verify push succeeded** — run `git ls-remote origin <branch-name>` and confirm it prints a commit hash.\n - If no output or push failed: capture the error, mark hook_failed\n\
|
|
||||||
5. Create a PR using the Gitea API (do NOT use `tea pr create` — it fails in worktrees):\n ```bash\n GITEA_TOKEN=$(cfg get GITEA_TOKEN)\n curl -s -X POST -H \"Authorization: token $GITEA_TOKEN\" -H \"Content-Type: application/json\" \\\n\
|
|
||||||
\ \"https://git.shazhou.work/api/v1/repos/<owner>/<repo>/pulls\" \\\n -d '{\"title\":\"...\",\"body\":\"...\",\"head\":\"<branch>\",\"base\":\"main\"}'\n ```\n - The repo remote (owner/repo format, e.g. \"shazhou/united-workforce\") is given in your task prompt — use it directly.\n\
|
|
||||||
\ - PR body must include: What / Why / Changes / Ref sections, with `Fixes #N` in Ref\n6. **Verify PR was created** — parse the curl response JSON: it must contain a `\"number\"` field. Print the PR URL.\n\
|
|
||||||
\ - If curl returns an error or no number field: capture the response, mark hook_failed\n7. After PR creation, clean up the worktree:\n - cd to the repo root (parent of .worktrees)\n - `git worktree remove <worktree-path>`"
|
|
||||||
output: Include PR URL on success or error log on failure. Set $status to committed (with prUrl) or hook_failed (with error).
|
|
||||||
frontmatter:
|
|
||||||
oneOf:
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: committed
|
|
||||||
prUrl:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- prUrl
|
|
||||||
- properties:
|
|
||||||
$status:
|
|
||||||
const: hook_failed
|
|
||||||
error:
|
|
||||||
type: string
|
|
||||||
repoRemote:
|
|
||||||
type: string
|
|
||||||
worktree:
|
|
||||||
type: string
|
|
||||||
branch:
|
|
||||||
type: string
|
|
||||||
required:
|
|
||||||
- $status
|
|
||||||
- error
|
|
||||||
graph:
|
|
||||||
$START:
|
|
||||||
new:
|
|
||||||
role: planner
|
|
||||||
prompt: Analyze the issue and produce an implementation plan.
|
|
||||||
resume:
|
|
||||||
role: planner
|
|
||||||
prompt: Review the previous run output and continue the work.
|
|
||||||
planner:
|
|
||||||
insufficient_info:
|
|
||||||
role: $SUSPEND
|
|
||||||
prompt: "信息不足,需要补充:{{{reason}}}"
|
|
||||||
ready:
|
|
||||||
role: developer
|
|
||||||
prompt: 'Implement the TDD test spec (CAS hash: {{{plan}}}) in repo {{{repoPath}}}. Repo remote: {{{repoRemote}}}.'
|
|
||||||
developer:
|
|
||||||
done:
|
|
||||||
role: reviewer
|
|
||||||
prompt: 'Review branch {{{branch}}} at {{{worktree}}} for code standards compliance. Repo remote: {{{repoRemote}}}.'
|
|
||||||
failed:
|
|
||||||
role: $END
|
|
||||||
prompt: 'Developer failed: {{{reason}}}. Ending workflow.'
|
|
||||||
reviewer:
|
|
||||||
rejected:
|
|
||||||
role: developer
|
|
||||||
prompt: 'Reviewer rejected: {{{comments}}}. Fix the issues in repo {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
|
|
||||||
approved:
|
|
||||||
role: tester
|
|
||||||
prompt: 'Review passed. Run tests on branch {{{branch}}} at {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
|
|
||||||
tester:
|
|
||||||
fix_code:
|
|
||||||
role: developer
|
|
||||||
prompt: 'Tests found code issues: {{{report}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
|
|
||||||
fix_spec:
|
|
||||||
role: planner
|
|
||||||
prompt: 'Tests found spec issues: {{{report}}}. Revise the test spec. Repo remote: {{{repoRemote}}}.'
|
|
||||||
passed:
|
|
||||||
role: committer
|
|
||||||
prompt: 'All tests passed. Commit and push branch {{{branch}}} from {{{worktree}}}. Repo remote (owner/repo): {{{repoRemote}}}.'
|
|
||||||
committer:
|
|
||||||
hook_failed:
|
|
||||||
role: developer
|
|
||||||
prompt: 'Push hook failed: {{{error}}}. Fix and re-submit. Worktree: {{{worktree}}}. Repo remote: {{{repoRemote}}}.'
|
|
||||||
committed:
|
|
||||||
role: $END
|
|
||||||
prompt: 'PR created: {{{prUrl}}}. Workflow complete.'
|
|
||||||
Reference in New Issue
Block a user