RFC: thread suspend/resume for pending information #587


2026-06-02T03:32:29Z

xiaomo commented

2026-06-02 03:32:29 +00:00

Problem

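When a workflow lacks information during execution (for example, the planner finds the issue description insufficient), the only option today is to route to $END and terminate the whole workflow. This means:

1. Once the information is supplied, a new thread has to be started, and the existing context is lost.
2. Execution cannot continue on the original thread.
3. The planner can only choose between "enough, continue" and "not enough, end"; there is no intermediate state.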

Proposal

Introduce a suspend state that lets a thread pause while it waits for external input and then resume.

New state: `suspended`

Add a suspended state alongside the existing idle / running / done states.

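How to suspend

When a role outputs $status: suspended, the thread enters the suspended state and records:

- The current role (the thread returns to this role on resume)
- The suspend reason (the message/reason output by the agent)

How to resume

uwf thread resume <thread-id> -p "additional information"

The supplied information is injected as new prompt context, and the thread continues from the role it was in when it was suspended.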
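To make the proposed lifecycle concrete, here is a minimal TypeScript sketch of a thread that can be suspended and resumed. The `Thread` shape, the `suspend` and `resume` helpers, and the choice to go back to `running` on resume are all assumptions made for illustration; they are not the actual uwf data model.

```typescript
// Hypothetical thread model; field and function names are illustrative only.
type ThreadStatus = "idle" | "running" | "suspended" | "done";

interface Thread {
  id: string;
  status: ThreadStatus;
  currentRole: string;
  // Present only while the thread is suspended.
  suspension?: { role: string; reason: string };
  // Extra context supplied by the user on resume.
  pendingContext: string[];
}

// Park the thread, remembering which role to return to and why it stopped.
function suspend(thread: Thread, reason: string): Thread {
  if (thread.status !== "running") {
    throw new Error(`cannot suspend a thread in status "${thread.status}"`);
  }
  return {
    ...thread,
    status: "suspended",
    suspension: { role: thread.currentRole, reason },
  };
}

// Wake the thread on the role that suspended it and inject the new information.
function resume(thread: Thread, info: string): Thread {
  const { suspension, ...rest } = thread;
  if (rest.status !== "suspended" || suspension === undefined) {
    throw new Error(`thread ${thread.id} is not suspended`);
  }
  return {
    ...rest,
    status: "running",
    currentRole: suspension.role,
    pendingContext: [...rest.pendingContext, info],
  };
}
```

Both helpers return a new object instead of mutating the input, which keeps each transition easy to test in isolation.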

Graph routing

graph:
  planner:
    suspended:
      role: $SUSPEND   # new special target
      prompt: "Waiting for additional information: {{{reason}}}"
    ready:
      role: developer
      prompt: ...

A simpler alternative: treat $SUSPEND as a reserved status that does not need to be declared in the graph, so that any role outputting $status: suspended suspends the thread automatically.

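Open questions

1. Is $SUSPEND a graph target or an implicit status? A graph target is more explicit, but every role that might suspend has to declare it; an implicit status is more concise but is not visible in the graph.
2. Injection on resume: as the prompt of a new step, or as context appended to the original prompt?
3. Multiple suspends: can the same thread be suspended more than once? (It should be possible.)
4. Timeout/expiry: do suspended threads need a TTL?
5. CAS record: should the suspend event be written to the step chain?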

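Use cases

- solve-issue: the planner finds that the issue lacks information → suspend → the user supplies it → resume
- review-code: the reviewer needs the author to explain a design decision → suspend → the author replies → resume
- Any scenario that needs a human in the loop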


scottwei commented

2026-06-02 03:43:36 +00:00

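1. Suspend should be an implicit status. The graph should not contain a $SUSPEND node analogous to $START/$END; instead, any role can enter a suspend state, and resume returns to that same role.
2. It should be a new step, since the agent session has to be invoked again anyway.
3. Multiple suspends are fine.
4. Suspend should not time out. Waiting for a human to respond is normal for asynchronous tasks and may take a long time.
5. It should be recorded in CAS, though the exact approach is worth discussing.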


xiaomo commented

2026-06-02 03:44:16 +00:00

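Got it, direction confirmed:

Design decisions

| # | Decision | Notes |
|---|----------|-------|
| 1 | Implicit status | Any role that outputs `$status: suspended` suspends the thread; resume returns to the same role. The graph does not need to declare a `$SUSPEND` target |
| 2 | New step | The information injected on resume is written to the chain as a new step, and the agent session is invoked again |
| 3 | Repeatable | The same thread can be suspended and resumed repeatedly |
| 4 | No TTL | Suspended threads do not expire; they wait asynchronously for a human response |
| 5 | Recorded in CAS | The suspend event is written to the step chain; the exact structure is still to be designed |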

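To be refined

CAS structure of the suspend step

Option A: suspend is a regular StepNode whose role is the current role and whose output records the suspend reason:

{ role: "planner", $status: "suspended", output: { reason: "more information needed" } }

Option B: suspend is a special step type (new schema), distinct from regular steps:

{ type: "suspend", suspendedRole: "planner", reason: "..." }

Leaning toward A: reuse the existing StepNode, with suspended being just another status value. When the moderator sees suspended, it does not route, and the thread enters the suspended state.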
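As a rough illustration of Option A, the sketch below shows a moderator `evaluate` function that short-circuits on the `suspended` status before it consults any routing table. The `StepNode` fields mirror the example above, but the `Decision` and `RoutingTable` types and the function signature are hypothetical; the real moderator API is not shown in this thread.

```typescript
// Hypothetical shapes; the real StepNode schema is not shown in this thread.
interface StepNode {
  role: string;
  $status: string;
  output: Record<string, unknown>;
}

type Decision =
  | { kind: "route"; nextRole: string }
  | { kind: "suspend"; suspendedRole: string; reason: string };

// Routing table for ordinary statuses: role -> status -> next role.
type RoutingTable = Record<string, Record<string, string>>;

function evaluate(head: StepNode, table: RoutingTable): Decision {
  // Option A: "suspended" is just another status value, so the moderator
  // returns early instead of looking up a route.
  if (head.$status === "suspended") {
    const raw = head.output["reason"];
    const reason = typeof raw === "string" ? raw : "";
    return { kind: "suspend", suspendedRole: head.role, reason };
  }
  const nextRole = table[head.role]?.[head.$status];
  if (nextRole === undefined) {
    throw new Error(`no route for ${head.role}/${head.$status}`);
  }
  return { kind: "route", nextRole };
}
```

Under this assumption the suspend step needs no new schema: the role to return to is simply the `role` of the head step.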

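Resume flow

uwf thread resume <id> -p "additional information"

1. Check that thread status == suspended
2. Read the head step to get suspendedRole
3. Build the new prompt = original role prompt + the information supplied on resume
4. Execute a new step as that role
5. Restore the thread to idle/running

This keeps the moderator change minimal: evaluate only needs to skip routing for the suspended status.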
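The five steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Thread` and `StepNode` shapes, the `rolePrompts` lookup, the synchronous `runRole` callback standing in for an agent session, and the way the extra information is appended to the prompt. The sketch also lets the new step suspend the thread again, in line with the decision that a thread can be suspended more than once.

```typescript
// All types and helpers below are hypothetical stand-ins.
interface StepNode {
  role: string;
  $status: string;
  output: Record<string, unknown>;
}

interface Thread {
  id: string;
  status: "idle" | "running" | "suspended" | "done";
  steps: StepNode[];
}

// Stand-in for invoking an agent session for one role.
type RunRole = (role: string, prompt: string) => StepNode;

function resumeThread(
  thread: Thread,
  rolePrompts: Record<string, string>,
  extraInfo: string,
  runRole: RunRole
): Thread {
  // 1. Only suspended threads can be resumed.
  if (thread.status !== "suspended") {
    throw new Error(`thread ${thread.id} is not suspended`);
  }
  // 2. The head step tells us which role suspended the thread.
  const head = thread.steps[thread.steps.length - 1];
  if (!head) {
    throw new Error(`thread ${thread.id} has no steps`);
  }
  const suspendedRole = head.role;
  // 3. New prompt = original role prompt + the supplied information.
  const basePrompt = rolePrompts[suspendedRole] ?? "";
  const prompt = `${basePrompt}\n\nAdditional information:\n${extraInfo}`;
  // 4. Execute a new step as that role.
  const step = runRole(suspendedRole, prompt);
  // 5. Leave the suspended state, unless the new step suspends again.
  return {
    ...thread,
    status: step.$status === "suspended" ? "suspended" : "running",
    steps: [...thread.steps, step],
  };
}
```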

Does this direction look OK to you? If so, I can split it into implementation issues.


xiaomo commented

2026-06-02 03:49:12 +00:00

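Understood, revising the design:

`$SUSPEND` = pseudo-role target (like `$END`)

$SUSPEND is not a status value but a target role in the graph. It follows the same pattern as $END:

graph:
  planner:
    insufficient_info:
      role: $SUSPEND
      prompt: "Please provide the following information: {{{reason}}}"
    ready:
      role: developer
      prompt: ...

- The agent outputs $status: insufficient_info (a normal status)
- The moderator routes and finds that the target is $SUSPEND
- The thread enters the suspended state and records suspendedRole = planner
- The rendered prompt template becomes the message to the user (stating what information is needed)
- On resume, the thread returns to planner and runs it again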
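A minimal sketch of how a moderator might treat $SUSPEND as a pseudo-role next to $END is shown below. The `Graph` and `Outcome` types and the `route` function are hypothetical, and `render` is only a tiny stand-in for mustache triple-brace interpolation, not the real template engine.

```typescript
// Hypothetical graph shape mirroring the YAML above: role -> status -> edge.
interface Edge {
  role: string;
  prompt: string;
}
type Graph = Record<string, Record<string, Edge>>;

type Outcome =
  | { kind: "next"; role: string; prompt: string }
  | { kind: "end" }
  | { kind: "suspended"; suspendedRole: string; message: string };

// Tiny stand-in for mustache triple-brace interpolation ({{{name}}}).
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\{(\w+)\}\}\}/g, (_match, key: string) => vars[key] ?? "");
}

function route(
  graph: Graph,
  role: string,
  status: string,
  output: Record<string, string>
): Outcome {
  const edge = graph[role]?.[status];
  if (!edge) {
    throw new Error(`no edge for ${role}/${status}`);
  }
  if (edge.role === "$END") {
    return { kind: "end" };
  }
  if (edge.role === "$SUSPEND") {
    // The rendered prompt is shown to the user instead of being sent to a role.
    return { kind: "suspended", suspendedRole: role, message: render(edge.prompt, output) };
  }
  return { kind: "next", role: edge.role, prompt: render(edge.prompt, output) };
}
```

Because the suspend decision lives on the edge, the agent keeps emitting ordinary statuses such as insufficient_info, and the workflow author decides in the graph which of them park the thread.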

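Advantages

- Workflow authors explicitly control which statuses suspend the thread
- The prompt template defines the message shown to the user and supports mustache interpolation
- A single role can route several different statuses to $SUSPEND (different reasons, different messages)
- The moderator change is minimal: $SUSPEND and $END go through similar branches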

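Revised design decisions

| # | Decision |
|---|----------|
| 1 | `$SUSPEND` is a pseudo-role target in the graph, declared explicitly in the workflow YAML |
| 2 | Resume is written to the chain as a new step |
| 3 | Multiple suspend/resume cycles are allowed |
| 4 | No TTL |
| 5 | The suspend event is recorded in CAS (StepNode with suspended status) |

If the direction looks OK, I'll split it into issues.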


xiaomo referenced this issue

2026-06-02 03:58:44 +00:00

feat: moderator recognizes $SUSPEND as pseudo-role target #588

xiaomo referenced this issue

2026-06-02 03:58:45 +00:00

feat: record suspend event as StepNode in CAS chain #589

xiaomo referenced this issue

2026-06-02 03:58:46 +00:00

feat: uwf thread resume command #590

xiaomo referenced this issue

2026-06-02 03:58:46 +00:00

feat: thread list/show displays suspended state and message #591

xiaomo referenced this issue

2026-06-02 03:58:47 +00:00

chore: update solve-issue workflow to use $SUSPEND for insufficient_info #592

xingyue closed this issue

2026-06-02 11:44:59 +00:00


Reference: uncaged/workflow#587
