RFC: Khala — Stateless Agent Pool Cloud Workflow Orchestrator #119

Open
opened 2026-04-25 03:00:20 +00:00 by tuanzi · 2 comments
Owner

acdea25b595a8484d3ae3d2445ea3ec3601ac760

Summary

Khala is a cloud-native (Cloudflare Workers + D1 + Durable Objects) reactive workflow orchestrator for cross-agent coordination. It treats all agents as a stateless, homogeneous worker pool — any agent can execute any role turn at any time.

Supersedes #115 (original RFC with named channels and recruitment phases).

Key Insight

Two properties of our agent fleet make the design radically simple:

  1. Shared skills (shazhou/skills) — all agents have equivalent capabilities
  2. Workflow thread as context — work state lives in the thread, not the agent session

Therefore agents are interchangeable work units. No need for named channels, role binding, recruitment phases, or offline recovery.

Architecture

┌─────────────────────────────────────────────┐
│                 Khala (CF)                  │
│                                             │
│  ┌─────────────┐    ┌────────────────────┐  │
│  │  Task Queue  │    │  Workflow Engine   │  │
│  │  (open only) │    │  (DO per thread)   │  │
│  └──────┬──────┘    │  - moderator       │  │
│         │           │  - thread state    │  │
│         │           │  - message history │  │
│         │           └────────┬───────────┘  │
│         │                    │              │
│         └────────────────────┘              │
│              ▲            │                 │
│              │ response   │ turn event      │
└──────────────┼────────────┼─────────────────┘
               │            ▼
         ┌─────┴──────────────────┐
         │    Any Agent (nerve)   │
         │                        │
         │ Sense: poll task queue │
         │  → Signal → Reflex     │
         │  → Local workflow      │
         │    (execute turn)      │
         │  → POST response back  │
         └────────────────────────┘

Cloud Workflow Definition

workflow: code-review
moderator: ./moderators/code-review.jsonata   # pure state machine

roles:
  author:
    prompt: |
      You are the PR author. Explain your changes, respond to
      reviewer feedback, and apply suggested fixes.
  reviewer:
    prompt: |
      You are the code reviewer. Review code quality, architecture,
      potential bugs. Give specific, actionable suggestions.

Role Definition

  • Local workflow: role = async function (code)
  • Cloud workflow: role = prompt (declarative)

A role is "who you are and what you're responsible for", not "how to do it". Execution details come from the agent's local skills and tools.

Turn Event

When the moderator assigns a turn, the agent receives:

  • role prompt — your role in this workflow
  • turn instruction — what to do this step (generated by moderator)
  • thread id — for querying context
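To make the turn payload concrete, here is a minimal TypeScript sketch of how the three pieces above could be bundled. The field names (`threadId`, `rolePrompt`, `instruction`) and the `buildTurnEvent` helper are assumptions for illustration, not a settled schema.

```typescript
// Hypothetical shapes; the RFC does not fix a wire format.
interface RoleDef {
  prompt: string;
}

interface WorkflowDef {
  workflow: string;
  roles: Record<string, RoleDef>;
}

interface TurnEvent {
  threadId: string;    // used to query thread context
  role: string;        // which role this turn is for
  rolePrompt: string;  // copied from the workflow definition
  instruction: string; // generated by the moderator for this step
}

// Build the payload an agent would receive when the moderator assigns a turn.
function buildTurnEvent(
  def: WorkflowDef,
  threadId: string,
  role: string,
  instruction: string,
): TurnEvent {
  const roleDef = def.roles[role];
  if (roleDef === undefined) {
    throw new Error(`unknown role "${role}" in workflow "${def.workflow}"`);
  }
  return { threadId, role, rolePrompt: roleDef.prompt, instruction };
}

const codeReview: WorkflowDef = {
  workflow: "code-review",
  roles: {
    author: { prompt: "You are the PR author." },
    reviewer: { prompt: "You are the code reviewer." },
  },
};

const reviewerTurn = buildTurnEvent(codeReview, "thread-123", "reviewer", "Review the diff.");
```

Because the event carries only the thread id rather than the history, any agent can pick it up without prior session state.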

Thread Context: Query, Don't Dump

Thread history is NOT bulk-loaded into agent context. Instead, agents get a query interface to pull what they need:

GET /threads/:id/messages?role=author&last=3
GET /threads/:id/messages?since=<timestamp>
GET /threads/:id/messages?step=2

This:

  • Prevents token explosion on long workflows
  • Lets agents decide what context is relevant
  • Removes dependency on agent continuity (any agent can pick up any turn, pull context on demand)

The query capability is injected as a tool/context-provider into the local workflow that executes the turn.
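As an illustration of the query interface, the sketch below builds request paths for the three query forms listed above. The `MessageQuery` type and `messagesPath` helper are hypothetical names; only the path and the `role`, `last`, `since`, and `step` parameters come from this RFC.

```typescript
// Hypothetical filter shape mirroring the query forms in the RFC.
interface MessageQuery {
  role?: string;  // only messages posted under this role
  since?: string; // only messages after this timestamp
  step?: number;  // only messages from this workflow step
  last?: number;  // only the N most recent matches
}

// Build the request path for GET /threads/:id/messages.
function messagesPath(threadId: string, query: MessageQuery = {}): string {
  const pairs: string[] = [];
  if (query.role !== undefined) pairs.push(`role=${encodeURIComponent(query.role)}`);
  if (query.since !== undefined) pairs.push(`since=${encodeURIComponent(query.since)}`);
  if (query.step !== undefined) pairs.push(`step=${query.step}`);
  if (query.last !== undefined) pairs.push(`last=${query.last}`);
  const base = `/threads/${encodeURIComponent(threadId)}/messages`;
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}

const recentAuthorPath = messagesPath("thread-123", { role: "author", last: 3 });
// recentAuthorPath is "/threads/thread-123/messages?role=author&last=3"
```

A tool wrapper around this helper is one way the query capability could be handed to the local workflow.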

Agent-Side Integration

Zero new primitives. Standard nerve sense/signal/reflex:

  1. KhalaSense — polls the task queue periodically
  2. Turn event → local signal
  3. Reflex triggers local workflow to:
    • Read role prompt + turn instruction
    • Query thread context as needed
    • Execute with local LLM + tools
    • POST response event back to khala

Capacity management: a CapacitySense checks local workflow load. Reflex only fires when the agent has bandwidth.
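The reflex described above might look roughly like the following TypeScript sketch. The `TurnDeps` interface, `CapacityGate` class, and `onTurnSignal` function are hypothetical names, not nerve APIs, and the gate is a plain concurrency counter standing in for whatever CapacitySense ends up measuring.

```typescript
interface AssignedTurn {
  threadId: string;
  rolePrompt: string;
  instruction: string;
}

// Hypothetical dependencies the local workflow would be given.
interface TurnDeps {
  queryContext(threadId: string): Promise<string[]>; // pull thread messages on demand
  runLocalLlm(input: string): Promise<string>;       // local LLM + tools
  postResponse(threadId: string, body: string): Promise<void>;
}

// Simple concurrency cap; an assumption standing in for CapacitySense.
class CapacityGate {
  private active = 0;
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  tryAcquire(): boolean {
    if (this.active >= this.maxConcurrent) return false;
    this.active += 1;
    return true;
  }

  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}

// Reflex body: resolves to false without touching the turn when the agent is busy.
async function onTurnSignal(
  turn: AssignedTurn,
  deps: TurnDeps,
  gate: CapacityGate,
): Promise<boolean> {
  if (!gate.tryAcquire()) return false;
  try {
    const context = await deps.queryContext(turn.threadId);
    const input = [turn.rolePrompt, turn.instruction, ...context].join("\n\n");
    const output = await deps.runLocalLlm(input);
    await deps.postResponse(turn.threadId, output);
    return true;
  } finally {
    gate.release(); // free the slot even if a step fails
  }
}
```

Injecting the three dependencies keeps the reflex testable without a live khala endpoint or model.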

Khala Internals

Khala is a nerve subset: only reflex + workflow, no sense.

  • Purely reactive — agents POST events in, khala responds
  • Moderator — JSONata or simple automaton, runs in CF DO
  • Thread state — persisted in D1, keyed by thread id
  • Task queue — D1 table, agents poll for unclaimed turns
  • Turn lifecycle: moderator emits turn → queue → agent claims → executes → posts response → moderator routes next
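To illustrate the "simple automaton" option, here is a TypeScript stand-in for a code-review moderator written as a pure function. It is not JSONata and not the proposed format; the `LGTM` approval convention, the `MAX_STEPS` cap, and all type names are assumptions made for the sketch.

```typescript
type Role = "author" | "reviewer";

interface ThreadState {
  step: number;
  done: boolean;   // true once the thread has finished
  result?: string; // final message once done
}

interface TurnResponse {
  role: Role;
  body: string;
}

interface NextTurn {
  role: Role;
  instruction: string;
}

interface Transition {
  state: ThreadState;
  next: NextTurn | null; // null means nothing further to queue
}

const MAX_STEPS = 6; // assumed cap so a thread cannot loop forever

// Pure transition: (state, latest response) -> (new state, next turn).
function moderate(state: ThreadState, response: TurnResponse | null): Transition {
  if (state.done) return { state, next: null };
  const step = state.step + 1;
  if (response === null) {
    // Thread just created: the author goes first.
    return { state: { step, done: false }, next: { role: "author", instruction: "Explain your changes." } };
  }
  if (response.role === "author") {
    return { state: { step, done: false }, next: { role: "reviewer", instruction: "Review the latest changes." } };
  }
  const approved = response.body.trim().startsWith("LGTM");
  if (approved || step >= MAX_STEPS) {
    return { state: { step, done: true, result: response.body }, next: null };
  }
  return { state: { step, done: false }, next: { role: "author", instruction: "Address the reviewer's feedback." } };
}

const t1 = moderate({ step: 0, done: false }, null);
const t2 = moderate(t1.state, { role: "author", body: "Refactored the parser." });
const t3 = moderate(t2.state, { role: "reviewer", body: "LGTM, ship it." });
```

Keeping the moderator free of I/O means the Durable Object only has to persist `state` and enqueue `next`.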

Transport

  • REST as primary (simple, reliable)
  • CF DO WebSocket as optional upgrade for low-latency push (hibernation API for cost efficiency)

Event Ordering

Causal ordering only. Moderator serializes turns within a thread. No global total order needed.

Auth

Per-agent token. Agents register with khala on nerve init. Orchestrator maintains an agent registry.

Relationship to Pulseflare

Khala replaces pulseflare. The D1 event store from pulseflare becomes the persistence layer for thread messages.

Open Questions

  1. Moderator format: JSONata sufficient, or need a small DSL?
  2. Turn timeout: how long before an unclaimed/unfinished turn is re-queued?
  3. Result aggregation: how does a workflow return a final result to the initiator?
  4. Workflow initiation: REST API? Or also via agent event?
  5. Observability: tracing a multi-agent workflow execution

Next Steps

  • Define khala package structure in nerve monorepo
  • Design D1 schema (threads, messages, task_queue)
  • Prototype moderator engine on CF DO
  • Build KhalaSense for local nerve daemon
  • End-to-end demo: two agents doing a code review
Owner

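Design Review: Additional Discussion

Turn Timeout + Optimistic Locking

Each turn carries a claim_id. When an agent POSTs its response, khala checks whether that claim_id is still the current holder:

  • Match → accept
  • Mismatch (the turn timed out and was handed to another agent) → reject with 409; the agent discards its result

Two timeout tiers:

  • Claim timeout: no response within N minutes of the claim → release the turn back to the queue (re-queue)
  • Idle timeout: nobody claims the turn from the queue → notify the initiator

Timeout values are configurable in the workflow definition.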
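The claim_id check and the claim timeout can be sketched as a small in-memory queue. This is an illustration under assumptions: the real task queue is a D1 table, the `TurnQueue` class and its method names are hypothetical, time is passed in explicitly for testability, and the idle timeout is not modeled.

```typescript
interface QueuedTurn {
  turnId: string;
  claimId: string | null;   // null while unclaimed
  claimedAt: number | null; // ms timestamp of the current claim
  done: boolean;
}

class TurnQueue {
  private readonly turns: QueuedTurn[] = [];
  private nextClaim = 1;
  private readonly claimTimeoutMs: number;

  constructor(claimTimeoutMs: number) {
    this.claimTimeoutMs = claimTimeoutMs;
  }

  enqueue(turnId: string): void {
    this.turns.push({ turnId, claimId: null, claimedAt: null, done: false });
  }

  // Release claims that have exceeded the claim timeout back to the queue.
  private sweep(now: number): void {
    for (const turn of this.turns) {
      if (!turn.done && turn.claimedAt !== null && now - turn.claimedAt >= this.claimTimeoutMs) {
        turn.claimId = null;
        turn.claimedAt = null;
      }
    }
  }

  // Hand the first open turn to an agent, or null if none is open.
  claim(now: number): { turnId: string; claimId: string } | null {
    this.sweep(now);
    for (const turn of this.turns) {
      if (!turn.done && turn.claimId === null) {
        const claimId = `claim-${this.nextClaim++}`;
        turn.claimId = claimId;
        turn.claimedAt = now;
        return { turnId: turn.turnId, claimId };
      }
    }
    return null;
  }

  // Returns the HTTP status khala would answer with.
  respond(turnId: string, claimId: string, now: number): 200 | 404 | 409 {
    this.sweep(now);
    const turn = this.turns.find((t) => t.turnId === turnId);
    if (turn === undefined) return 404;
    if (turn.done || turn.claimId !== claimId) return 409; // stale or duplicate
    turn.done = true;
    return 200;
  }
}
```

In this sketch a response that arrives after its claim has timed out is rejected even if no other agent has re-claimed the turn yet; accepting such late responses would be an equally defensible choice.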

Workflow Initiation: API Only

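A cloud workflow is purely reactive, so there is no need to introduce sense/reflex concepts. POST /workflows to create a thread is enough.

Result Aggregation

The moderator defines a terminal state. On reaching it, the last message (or a moderator-generated summary) becomes the workflow result, which is written back to D1, and the initiator is notified (via webhook or polling).

Observability

thread_id is a natural trace ID. Each turn event carries timestamp + agent_id + role, which forms an audit log in D1. Adding GET /threads/:id/trace is all that is needed.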
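As a rough sketch of what a trace endpoint could return, the function below filters an audit log down to one thread and orders it by time. The `TurnRecord` shape and `buildTrace` name are assumptions, and timestamps are assumed to be ISO 8601 UTC strings in one fixed format so that string order matches time order.

```typescript
interface TurnRecord {
  threadId: string;
  timestamp: string; // assumed ISO 8601 UTC, e.g. "2026-04-25T03:00:20Z"
  agentId: string;
  role: string;
}

// Select one thread's records and return them oldest first.
function buildTrace(log: TurnRecord[], threadId: string): TurnRecord[] {
  return log
    .filter((record) => record.threadId === threadId) // filter copies, so the input is not mutated
    .sort((x, y) => (x.timestamp < y.timestamp ? -1 : x.timestamp > y.timestamp ? 1 : 0));
}
```

In D1 the same result would more likely come from a single query ordered by timestamp; the in-memory version only shows the intended shape of the output.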

Unified Workflow Naming: Git Semantics

Local and cloud workflows are managed together, using git-branch-style names:

$ nerve workflow list
  code-review          # local
  deploy-check         # local
  origin/code-review   # remote
  origin/design-critique

  • No prefix = local; prefixed names are reserved for remotes
  • The default remote is called origin
  • nerve workflow run origin/code-review → POSTs to nerveflare to create a thread
  • nerve workflow logs origin/code-review#thread-123 → shows the thread history
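The naming scheme above implies a small parser on the CLI side. The following TypeScript sketch is one possible reading, assuming the first `/` separates the remote from the workflow name and `#` introduces a thread id; `WorkflowRef` and `parseWorkflowRef` are hypothetical names.

```typescript
interface WorkflowRef {
  remote: string | null; // null means a local workflow
  workflow: string;
  thread: string | null; // the part after "#", if any
}

// Parse references such as "code-review" or "origin/code-review#thread-123".
function parseWorkflowRef(ref: string): WorkflowRef {
  const hashAt = ref.indexOf("#");
  const path = hashAt === -1 ? ref : ref.slice(0, hashAt);
  const thread = hashAt === -1 ? null : ref.slice(hashAt + 1);
  const slashAt = path.indexOf("/");
  const remote = slashAt === -1 ? null : path.slice(0, slashAt);
  const workflow = slashAt === -1 ? path : path.slice(slashAt + 1);
  if (workflow === "" || remote === "" || thread === "") {
    throw new Error(`invalid workflow reference: "${ref}"`);
  }
  return { remote, workflow, thread };
}

const localRef = parseWorkflowRef("code-review");
const remoteRef = parseWorkflowRef("origin/code-review#thread-123");
```

Here `localRef.remote` is `null`, while `remoteRef` resolves to remote `origin`, workflow `code-review`, and thread `thread-123`.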

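The config reserves room for multiple remotes (only origin is implemented at first):

remotes:
  origin: https://nerveflare.shazhou.workers.dev

Workflow definitions are all YAML, distinguished by binding: local or binding: cloud.

Moderator Format

Start with JSONata and revisit only if it proves insufficient. Avoid abstracting a DSL prematurely.

CapacitySense

Recommend skipping this in the first version: hardcode max concurrent turns, then optimize once the system is running and the real bottlenecks are visible.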

tuanzi changed title from RFC: Nerveflare — Stateless Agent Pool Cloud Workflow Orchestrator to RFC: Khala — Stateless Agent Pool Cloud Workflow Orchestrator 2026-04-25 03:57:19 +00:00
Author
Owner

Naming Update

Renamed from Nerveflare to Khala (卡拉).

Inspired by StarCraft's Protoss Khala — a psychic link connecting all individuals as equals, sharing knowledge and consciousness.

  • nerve = individual agent's nervous system (local sensing)
  • khala = the shared consciousness network (cross-agent workflow orchestration)

Package: packages/khala
Deployment: khala.shazhou.workers.dev


Reference: uncaged/nerve#119