7de75b5df7
RoleSpec now has exactly 3 fields: adapter, prompt, meta. Timeout belongs to adapter config — different timeouts = different adapter instances. Refs #245 小橘 🍊(NEKO Team)
319 lines
12 KiB
Markdown
319 lines
12 KiB
Markdown
# RFC-003: Agent Configuration Layer
|
|
|
|
**Author:** 小橘 🍊(NEKO Team)
|
|
**Status:** Draft
|
|
**Created:** 2026-04-29
|
|
|
|
## Summary
|
|
|
|
Define a minimal agent abstraction where **adapter = capability** and **role = scenario**. Workflows directly declare which adapter each role uses — no intermediate registry or `nerve.yaml` agent config. `nerve.yaml` only holds `extract` config and `knowledge` settings.
|
|
|
|
## Motivation
|
|
|
|
The original design introduced a `nerve.yaml` agents registry to map logical names (e.g. `developer`) to adapter implementations. In practice this added an unnecessary layer of indirection:
|
|
|
|
- **Agent names are arbitrary** — `developer` vs `coder` vs `engineer` is a naming exercise, not architecture
|
|
- **One more config to maintain** — adding/changing an adapter requires editing both `nerve.yaml` and the workflow
|
|
- **Same adapter, same config** — in reality, most workflows just need "use cursor" or "use hermes", not a named abstraction on top
|
|
|
|
The simpler model: **workflow roles declare their adapter directly**. The adapter *is* the capability.
|
|
|
|
## Key Concepts
|
|
|
|
### Adapter vs Role
|
|
|
|
| | Adapter | Role |
|
|
|---|---|---|
|
|
| **What** | Capability — what tools are available | Scenario — what to do with those tools |
|
|
| **Granularity** | Few (cursor, hermes, claude, codex) | Many (per workflow step) |
|
|
| **Defines** | How to spawn an agent, tool access | Prompt, schema, timeout |
|
|
| **Layer** | Infrastructure (packages) | Business logic (WorkflowSpec) |
|
|
|
|
A `cursor` adapter becomes an architect, coder, or reviewer depending on the role's prompt. The adapter defines *what it can do*; the role defines *what it does right now*.
|
|
|
|
### Agent Protocol
|
|
|
|
All agent types implement a single unified interface:
|
|
|
|
```ts
|
|
type AgentFn = (prompt: string, context: WorkflowContext) => Promise<string>
|
|
```
|
|
|
|
- **Input**: prompt (assembled by Role) + context (start frame + prior messages + workdir + abort signal)
|
|
- **Output**: raw string — structured data is extracted separately
|
|
- **Internals**: adapter handles tool-specific details (cursor CLI, hermes subagent, codex API, etc.)
|
|
|
|
Workflow runtime never interacts with agent internals.
|
|
|
|
### Extract Layer
|
|
|
|
A separate concern that parses agent output (raw string) into typed meta:
|
|
|
|
```ts
|
|
type ExtractFn<T> = (raw: string, schema: Schema<T>) => Promise<T>
|
|
```
|
|
|
|
Configured globally in `nerve.yaml`, overridable per role (two-level merge: global → role).
|
|
|
|
**Error handling**: retry once (feed raw output + parse error back to LLM for correction), then throw `ExtractError`. The workflow moderator decides the recovery strategy (retry role, skip, or terminate) — extract never makes workflow-level decisions.
|
|
|
|
## Design
|
|
|
|
### Configuration (`nerve.yaml`)
|
|
|
|
`nerve.yaml` holds only extract and knowledge config — no agent registry:
|
|
|
|
```yaml
|
|
extract:
|
|
provider: dashscope
|
|
model: qwen-plus
|
|
```
|
|
|
|
### Workflow Definition (TypeScript)
|
|
|
|
Roles declare their adapter directly — no indirection through named agents:
|
|
|
|
```ts
|
|
import { cursorAdapter, createCursorAdapter } from "@uncaged/nerve-adapter-cursor";
|
|
import { hermesAdapter } from "@uncaged/nerve-adapter-hermes";
|
|
|
|
const workflow: WorkflowSpec<MyMeta> = {
|
|
name: "develop-workflow",
|
|
roles: {
|
|
architect: { adapter: cursorAdapter, prompt: architectPrompt, meta: architectSchema },
|
|
coder: { adapter: createCursorAdapter({ model: "claude-sonnet-4", timeout: 600 }), prompt: coderPrompt, meta: coderSchema },
|
|
reviewer: { adapter: hermesAdapter, prompt: reviewPrompt, meta: reviewSchema },
|
|
deployer: { adapter: hermesAdapter, prompt: deployPrompt, meta: deploySchema },
|
|
},
|
|
moderator,
|
|
};
|
|
```
|
|
|
|
### Runtime Assembly
|
|
|
|
```
|
|
WorkflowSpec → Role(adapter fn + prompt) → adapter(prompt, ctx) → string
|
|
↓
|
|
nerve.yaml#extract → ExtractFn(string, schema) → T (typed meta)
|
|
```
|
|
|
|
Adapter is a direct function reference on each role — no map, no lookup, no registry.
|
|
|
|
### Adapter Packages
|
|
|
|
Each agent adapter lives in its own package to avoid pulling unnecessary dependencies:
|
|
|
|
```
|
|
packages/
|
|
adapter-cursor/ # @uncaged/nerve-adapter-cursor — cursor-agent CLI
|
|
adapter-hermes/ # @uncaged/nerve-adapter-hermes — hermes CLI subagent
|
|
adapter-claude/ # @uncaged/nerve-adapter-claude — claude-code CLI (future)
|
|
adapter-codex/ # @uncaged/nerve-adapter-codex — codex CLI (future)
|
|
```
|
|
|
|
Each adapter exports a **default instance** and a **factory** for customization:
|
|
|
|
```ts
|
|
// @uncaged/nerve-adapter-cursor
|
|
import type { AgentConfig, AgentFn } from "@uncaged/nerve-core";
|
|
|
|
// Factory — custom config
|
|
export function createCursorAdapter(config: AgentConfig): AgentFn;
|
|
|
|
// Default — sensible defaults (model: "auto", timeout: 300)
|
|
export const cursorAdapter: AgentFn;
|
|
```
|
|
|
|
The factory receives adapter config (model, timeout) and returns an `AgentFn` that spawns the CLI tool, passes the prompt, and returns raw output.
|
|
|
|
**Wiring** — workflows import adapters directly, no daemon-level registry:
|
|
|
|
```ts
|
|
import { cursorAdapter } from "@uncaged/nerve-adapter-cursor";
|
|
import { hermesAdapter } from "@uncaged/nerve-adapter-hermes";
|
|
|
|
// Use default instances directly in roles
|
|
{ adapter: cursorAdapter, prompt: "...", meta: schema }
|
|
```
|
|
|
|
Adapters not installed simply can't be imported — TypeScript catches missing dependencies at compile time.
|
|
|
|
**Workspace `package.json`** only lists the adapters it actually uses:
|
|
|
|
```json
|
|
{
|
|
"dependencies": {
|
|
"@uncaged/nerve-adapter-cursor": "workspace:*",
|
|
"@uncaged/nerve-adapter-hermes": "workspace:*"
|
|
}
|
|
}
|
|
```
|
|
|
|
**Migration from `workflow-utils`** — the existing `role-cursor.ts` / `shared/cursor-agent.ts` spawn logic moves to `@uncaged/nerve-adapter-cursor`. `role-hermes.ts` / `shared/hermes-agent.ts` moves to `@uncaged/nerve-adapter-hermes`. `workflow-utils` retains only extract, prompt utilities, and shared spawn infrastructure.
|
|
|
|
### Dynamic Prompts
|
|
|
|
`RoleSpec.prompt` supports both static strings and async functions:
|
|
|
|
```ts
|
|
type PromptInput = string | ((start: StartStep, messages: WorkflowMessage[]) => Promise<string>);
|
|
|
|
type RoleSpec<M> = {
|
|
adapter: AgentFn;
|
|
prompt: PromptInput;
|
|
meta: Schema<M>;
|
|
};
|
|
```
|
|
|
|
Static prompts cover simple cases. Dynamic prompts (functions) are needed when the prompt depends on thread context — e.g. reading issue content, injecting prior step results, or resolving repo paths at runtime.
|
|
|
|
### Timeout Resolution
|
|
|
|
Timeout is an **adapter concern**, not a role concern. Roles define *what to do* (prompt + schema); adapters define *how to do it* (tool, model, timeout).
|
|
|
|
When different roles need different timeouts, create separate adapter instances:
|
|
|
|
```ts
|
|
import { cursorAdapter, createCursorAdapter } from "@uncaged/nerve-adapter-cursor";
|
|
|
|
const fastCursor = createCursorAdapter({ model: "auto", timeout: 60 });
|
|
const slowCursor = createCursorAdapter({ model: "auto", timeout: 600 });
|
|
|
|
roles: {
|
|
reviewer: { adapter: fastCursor, prompt: reviewPrompt, meta: reviewSchema },
|
|
coder: { adapter: slowCursor, prompt: coderPrompt, meta: coderSchema },
|
|
}
|
|
```
|
|
|
|
### No Runtime Fallback
|
|
|
|
- **`nerve init`** — detects agent availability (CLI exists? service reachable?), reports errors immediately
|
|
- **Runtime** — if an agent is unavailable, the workflow fails with a clear error. No silent degradation.
|
|
|
|
Rationale: silent fallback hides quality differences (cursor → hermes subagent produces very different output) and makes debugging harder.
|
|
|
|
### Adapter Hot-Reload
|
|
|
|
Follows the existing `nerve.yaml` hot-reload mechanism. On config change, adapters are rebuilt. Running workflow threads are not affected (they use the `AdapterFn` bound at thread start). New threads automatically use the updated config.
|
|
|
|
### WorkflowContext
|
|
|
|
```ts
|
|
type WorkflowContext = {
|
|
start: StartStep;
|
|
messages: WorkflowMessage[];
|
|
workdir: string; // repo root — coding agent working directory
|
|
signal: AbortSignal; // graceful cancellation
|
|
};
|
|
```
|
|
|
|
`workdir` is required for coding agents. `signal` enables graceful cancellation of long-running agent calls — adapters must respect it (e.g. kill subprocess on abort).
|
|
|
|
### Configuration Validation
|
|
|
|
`nerve validate` checks:
|
|
- All roles have a valid adapter function (not null/undefined)
|
|
- Adapter CLIs are available (binary exists in PATH)
|
|
- Extract provider is configured and reachable
|
|
|
|
## Compatibility with Current Types
|
|
|
|
The existing `Role<Meta>` signature:
|
|
|
|
```ts
|
|
type Role<Meta> = (start: StartStep, messages: WorkflowMessage[]) => Promise<RoleResult<Meta>>
|
|
```
|
|
|
|
remains the runtime interface. The new config layer is syntactic sugar — the runtime assembles `Role<Meta>` functions from `(adapter + prompt + schema)` instead of users writing them by hand. `WorkflowDefinition` stays the same at the engine level; `WorkflowSpec` is the new user-facing authoring format that compiles down to it at daemon startup / hot-reload time (runtime lazy compile, not `nerve init`).
|
|
|
|
Existing hand-written `Role` functions continue to work — `WorkflowSpec` is additive, not a breaking change.
|
|
|
|
## Knowledge Layer
|
|
|
|
Project knowledge is a **built-in nerve feature**. Scope is the **repo** — each repo has its own knowledge base, tracked in git.
|
|
|
|
### Architecture
|
|
|
|
```
|
|
Local (per repo) Remote Service
|
|
┌───────────────────────┐ ┌─────────────────────┐
|
|
│ knowledge.yaml │ │ Embedding API │
|
|
│ ├── include/exclude │ ──→ │ text → vector │
|
|
│ knowledge.db (SQLite) │ ←── │ content-hash cache │
|
|
│ ├── chunk text │ │ (avoid recompute) │
|
|
│ ├── embedding bytes │ └─────────────────────┘
|
|
│ └── cosine search │
|
|
└───────────────────────┘
|
|
```
|
|
|
|
- **Local-first** — `knowledge.db` stores chunks + embeddings, search runs locally (in-memory cosine similarity)
|
|
- **Remote service only computes embeddings** — content-addressable cache keyed by text hash, avoids redundant computation across agents
|
|
- **Branch-aware by design** — different agents on different branches naturally have different `knowledge.db` contents
|
|
|
|
### Configuration (`knowledge.yaml` at repo root)
|
|
|
|
```yaml
|
|
include:
|
|
- "src/**/*.ts"
|
|
- "docs/**/*.md"
|
|
- "*.md"
|
|
|
|
exclude:
|
|
- "node_modules/**"
|
|
- "dist/**"
|
|
- "*.test.ts"
|
|
```
|
|
|
|
`knowledge.yaml` is committed to git. `knowledge.db` is gitignored — it's a local cache rebuilt from source files + remote embedding service.
|
|
|
|
### CLI
|
|
|
|
```bash
|
|
nerve knowledge sync # index/re-index changed files
|
|
nerve knowledge query "how does the signal bus work"
|
|
|
|
# Scope
|
|
nerve knowledge query "..." # default: cwd repo
|
|
nerve knowledge query --repo /path/to/other/repo "..."
|
|
nerve knowledge query -g "..." # global search (all indexed repos)
|
|
# --repo and -g are mutually exclusive
|
|
```
|
|
|
|
### Search Implementation
|
|
|
|
Project-scale knowledge (hundreds to low thousands of chunks) does not need vector indices. Full scan with cosine similarity in memory is sufficient and adds zero native dependencies.
|
|
|
|
```ts
|
|
// Pseudocode
|
|
const chunks = db.all("SELECT slug, chunk, embedding FROM chunks");
|
|
const query_vec = await embed(query);
|
|
const results = chunks
|
|
.map(c => ({ ...c, score: cosine(query_vec, c.embedding) }))
|
|
.sort((a, b) => b.score - a.score)
|
|
.slice(0, limit);
|
|
```
|
|
|
|
### Knowledge Layers
|
|
|
|
```
|
|
Project knowledge (knowledge.yaml) Per repo, git managed, any agent reads
|
|
Agent long-term memory Per agent, domain expertise, cross-run
|
|
Workflow context (start + msgs) Per run, moderator-controlled history
|
|
```
|
|
|
|
## Open Questions
|
|
|
|
1. **Agent long-term memory** — storage format and mechanism for persisting domain expertise across runs
|
|
|
|
### Resolved
|
|
|
|
- **Agent naming / registry** → removed; workflow roles declare adapter directly, no intermediate registry
|
|
- **Extract override granularity** → two-level merge: global → role (agent level removed)
|
|
- **Context threading** → `WorkflowContext` includes `workdir` and `signal` (see design above)
|
|
- **Embedding service** → self-hosted, 1024-dim vectors, content-hash cache
|
|
|
|
## References
|
|
|
|
- [RFC-002: Workflow Engine](./rfc-002-workflow-engine.md)
|
|
- Current `Role` / `Moderator` types: `packages/core/src/workflow.ts`
|