docs: RFC v3 — named payload fields, refs as GC index, merge parent+ancestors

- payload is source of truth with named fields (start, content, ancestors, compact)
- refs[] auto-derived by collectRefs(), pure GC index
- parent merged into ancestors[0]

小橘 <xiaoju@shazhou.work>
@@ -54,31 +54,29 @@ CAS blob:
   payload: {
     role: "coder",
     meta: { ... },
+    start: "<start_hash>",
+    content: "<content_merkle_hash>",
+    ancestors: ["<parent_hash>", "<grandparent_hash>", ...],
+    compact: null,
     timestamp: 1234567890
   },
-  refs: [
-    <start_hash>, // refs[0]: always the StartNode
-    <parent_hash>, // refs[1]: previous StateNode (null for first step)
-    <content_hash>, // refs[2]: content Merkle node (carries role artifact refs)
-    ...ancestors, // refs[3..N]: skip-list of up to 10 ancestor StateNode hashes
-  ]
+  refs: [<start_hash>, <content_hash>, <parent_hash>, ...]
 }
 ```
 
-**Fixed ref positions:**
+**Payload is the source of truth.** Application code reads named fields from payload. `refs[]` is a **GC index** — automatically derived from payload by collecting all CAS hashes. GC only scans `refs[]` without understanding payload structure.
 
-| Index | Meaning | Nullable |
-|-------|---------|----------|
-| 0 | StartNode hash | No |
-| 1 | Parent StateNode hash | Yes (null for first step after start) |
-| 2 | Content Merkle node hash | No |
-| 3+ | Ancestor skip-list (≤ 10 most recent ancestors, newest first) | Optional |
+**Payload fields:**
 
-**Optional payload fields:**
-
 | Field | Type | Meaning |
 |-------|------|---------|
+| `role` | `string` | Role name, or `"__end__"` for completion |
+| `meta` | `object` | Structured metadata extracted from agent output |
+| `start` | `string` | StartNode hash |
+| `content` | `string` | Content Merkle node hash (carries role artifact refs) |
+| `ancestors` | `string[]` | `[parent, grandparent, ...]` — up to 11 entries (1 parent + 10 skip-list). Empty for first step after start. `ancestors[0]` is the direct parent. |
 | `compact` | `string \| null` | CAS hash of a compacted summary of all nodes before this one. When present, LLM context assembly can use this instead of walking the full chain. |
+| `timestamp` | `number` | Unix timestamp in ms |
 
 ### Content Merkle Node
 
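Note on the hunk above: the new payload field table maps fairly directly onto a type. The following is a minimal sketch, not code from this commit. The names `CasHash`, `StateNodePayload`, `parentOf`, and `isEnd` are hypothetical; only the field names and types come from the table.

```typescript
// Hypothetical type names; field names and types follow the "Payload fields" table.
type CasHash = string;

interface StateNodePayload {
  role: string;                  // role name, or "__end__" for completion
  meta: Record<string, unknown>; // structured metadata extracted from agent output
  start: CasHash;                // StartNode hash
  content: CasHash;              // content Merkle node hash
  ancestors: CasHash[];          // [parent, grandparent, ...], at most 11 entries
  compact: CasHash | null;       // hash of a compacted summary, if any
  timestamp: number;             // Unix timestamp in ms
}

// With the parent merged into the ancestor list, the direct parent is
// ancestors[0]; the first step after start has an empty list, so no parent.
function parentOf(payload: StateNodePayload): CasHash | null {
  return payload.ancestors[0] ?? null;
}

// An end node is an ordinary StateNode whose role is "__end__".
function isEnd(payload: StateNodePayload): boolean {
  return payload.role === "__end__";
}
```

Under this sketch, application code never needs a positional lookup such as `refs[1]`; the parent is read from a named field.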
@@ -121,9 +119,13 @@ An end is just a StateNode with `role: "__end__"`:
   payload: {
     role: "__end__",
     meta: { returnCode: 0, summary: "completed successfully" },
+    start: "<start_hash>",
+    content: "<content_hash>",
+    ancestors: ["<parent_hash>", ...],
+    compact: null,
     timestamp: 1234567891
   },
-  refs: [<start_hash>, <parent_hash>, <content_hash>, ...ancestors]
+  refs: [<start_hash>, <content_hash>, <parent_hash>, ...]
 }
 ```
 
@@ -165,11 +167,11 @@ Benefits:
 
 ### Ancestor Skip-List
 
-Each StateNode carries up to 10 ancestor hashes in `refs[3..N]` (newest first):
+Each StateNode carries up to 11 entries in `payload.ancestors` (1 parent + 10 skip-list, newest first):
 
 ```
-Node 15: refs = [start, node14, content, node13, node12, node11, node10, node9, node8, node7, node6, node5, node4]
-                                         ^--- ancestors (10 most recent) ---^
+Node 15: ancestors = [node14, node13, node12, node11, node10, node9, node8, node7, node6, node5, node4]
+                      ^parent ^--- skip-list (10 most recent) ---^
 ```
 
 This enables:
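Note on the hunk above: one way a writer could maintain `payload.ancestors` is to prepend the parent's hash to the parent's own list and truncate to 11 entries. This is a sketch under that assumption; the commit does not show the actual construction, and `MAX_ANCESTORS`, `nextAncestors`, and `ancestorsOfNode` are hypothetical names.

```typescript
// 1 direct parent + 10 skip-list entries, as stated in the RFC text.
const MAX_ANCESTORS = 11;

// Build a child's ancestor list from its parent's hash and the parent's
// own ancestor list (newest first), dropping the oldest entries.
function nextAncestors(parentHash: string, parentAncestors: readonly string[]): string[] {
  return [parentHash, ...parentAncestors].slice(0, MAX_ANCESTORS);
}

// Illustration only: uses "node<i>" labels in place of real CAS hashes.
// Node 1 is the first step after start, so its list is empty.
function ancestorsOfNode(n: number): string[] {
  let ancestors: string[] = [];
  for (let i = 1; i < n; i++) {
    ancestors = nextAncestors(`node${i}`, ancestors);
  }
  return ancestors;
}
```

Calling `ancestorsOfNode(15)` yields the list from `node14` down to `node4`, matching the Node 15 example in the diff.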
@@ -221,6 +223,10 @@ Simple mark-and-sweep:
 
 No per-row format parsing needed. GC only needs to understand `refs[]`.
 
+### refs[] Derivation
+
+`refs[]` is auto-derived from payload at write time via a `collectRefs(payload)` function that extracts all CAS hash strings from named fields (`start`, `content`, `ancestors`, `compact`). Application code never reads `refs[]` — it reads named payload fields. This makes `refs[]` a pure GC optimization with zero semantic coupling.
+
 ### Extract Phase
 
 The Extractor is expanded from the current design. Currently it only extracts `meta` from agent output. In the new design it extracts:
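Note on the hunk above: the commit names `collectRefs(payload)` but does not include its body. A minimal sketch follows, assuming the output order shown in the examples (`start`, `content`, then `ancestors`), with `compact` appended when present and duplicates dropped. The de-duplication, the `RefPayload` and `Blob` types, and the `mark` helper with its in-memory `Map` store are assumptions for illustration only.

```typescript
// Only the hash-bearing fields matter for GC; other payload fields are ignored.
interface RefPayload {
  start: string;
  content: string;
  ancestors: string[];
  compact: string | null;
}

// Derive the GC index from the named fields. Order follows the examples in
// the diff; null `compact` is skipped and repeated hashes are kept once.
function collectRefs(payload: RefPayload): string[] {
  const candidates = [payload.start, payload.content, ...payload.ancestors, payload.compact];
  const seen = new Set<string>();
  for (const hash of candidates) {
    if (hash !== null) seen.add(hash);
  }
  return Array.from(seen);
}

// Hypothetical blob shape: the mark phase looks only at refs[].
interface Blob {
  refs: string[];
}

// Mark phase of a mark-and-sweep pass: follow refs[] from the roots and
// return every reachable hash, without parsing any payload.
function mark(roots: string[], store: Map<string, Blob>): Set<string> {
  const live = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const hash = stack.pop() as string;
    if (live.has(hash)) continue;
    live.add(hash);
    const blob = store.get(hash);
    if (blob) stack.push(...blob.refs);
  }
  return live;
}
```

A sweep would then delete every stored hash that is not in the returned set. How roots are chosen is not covered by this commit, so the `roots` parameter is left to the caller.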