feat(daemon): Sense Runtime — Worker, IPC, Migrations, Peer Isolation #9

Merged
xiaomo merged 2 commits from feat/sense-runtime into main 2026-04-22 08:48:31 +00:00
Owner

Summary

Implements the Sense observation engine runtime per RFC-001 (docs/rfc-001-observation-engine.md).

What

  • IPC types (ipc.ts): Discriminated union for parent↔worker messages
  • sense-runtime (sense-runtime.ts): openSenseDb (WAL), openPeerDb (readonly), runMigrations, loadComputeFn, executeCompute — all with Result<T> error handling
  • sense-worker (sense-worker.ts): CLI bootstrap, reads nerve.yaml, inits per-sense DB, builds peer map, enters IPC event loop
  • examples/cpu-usage: Sample sense with Drizzle schema + migration
  • 15 unit tests covering migrations, DB ops, compute, peer isolation

Exit Criteria (all ✅)

  • Worker process starts, runs migrations, sends ready
  • Compute produces Signal via IPC
  • Peers read-only access works
  • Null return = no signal
  • pnpm run check passes
  • 15 tests all passing

小橘 🍊(NEKO Team)

xiaoju added 8 commits 2026-04-22 08:33:19 +00:00
- Define Signal, SenseConfig, ReflexConfig, WorkflowConfig, NerveConfig types
- Implement Result<T,E> with ok()/err() helpers
- Implement parseNerveConfig() with full YAML validation
- 14 unit tests covering normal and error paths
- pnpm run check passes with 0 errors
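
For readers unfamiliar with the pattern, here is a minimal sketch of a `Result<T,E>` type with `ok()`/`err()` helpers. The commit only names these identifiers, so the field names (`ok`, `value`, `error`) and the `parsePort` example are assumptions made for illustration, not the project's actual definitions.

```typescript
// Hypothetical shapes: the commit names Result<T,E>, ok() and err(), but not their fields.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

function ok<T>(value: T): Result<T, never> {
  return { ok: true, value };
}

function err<E>(error: E): Result<never, E> {
  return { ok: false, error };
}

// Illustrative caller (not from the PR): parse a port number without throwing.
function parsePort(input: string): Result<number, string> {
  const n = Number(input);
  if (!Number.isInteger(n) || n < 1 || n > 65535) {
    return err(`invalid port: ${input}`);
  }
  return ok(n);
}
```

Because the union is discriminated on `ok`, a caller has to check it before TypeScript allows access to `value` or `error`, which is what keeps failures from leaking out as exceptions.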

Closes #2

小橘 <xiaoju@shazhou.work>
Timestamp alone can't guarantee strict total ordering (multiple signals
in the same millisecond). An autoincrement id provides a reliable
sequence for ordering and cursor-based pagination.
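
The ordering argument can be illustrated with an in-memory sketch. The `SignalRow` shape and the `pageAfter` helper are hypothetical; only the idea of an autoincrement `id` next to a timestamp comes from the commit message.

```typescript
// Hypothetical row shape; only `id` (autoincrement) and a timestamp come from the commit message.
interface SignalRow {
  id: number;        // strictly increasing, assigned on insert
  timestamp: number; // epoch ms; may repeat within the same millisecond
}

// Return up to `limit` rows whose id is greater than `afterId`, in id order.
function pageAfter(rows: readonly SignalRow[], afterId: number, limit: number): SignalRow[] {
  return rows
    .filter((r) => r.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
}

const rows: SignalRow[] = [
  { id: 1, timestamp: 1000 },
  { id: 2, timestamp: 1000 }, // same millisecond as id 1
  { id: 3, timestamp: 1001 },
];

// The cursor is simply the last id seen; a timestamp cursor could not
// tell rows 1 and 2 apart.
const firstPage = pageAfter(rows, 0, 2);
const cursor = firstPage.length > 0 ? firstPage[firstPage.length - 1]!.id : 0;
const secondPage = pageAfter(rows, cursor, 2);
```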

小橘 <xiaoju@shazhou.work>
- Add coding convention: no '?:', use explicit 'T | null'
- ReflexConfig → discriminated union (SenseReflexConfig | WorkflowReflexConfig)
- All optional fields → explicit null (throttle, timeout, interval, on, maxQueue, workflows)
- Add exactOptionalPropertyTypes to tsconfig
- Add lib: ES2022 to tsconfig
- Refactor validateReflexConfig to reduce cognitive complexity

小橘 <xiaoju@shazhou.work>
- throttle, timeout, interval: string|null → number|null
- parseDurationField now returns parsed ms (5s→5000, 10m→600000, 1h→3600000)
- biome.json: ignore dist/** from checks
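
A minimal sketch of the duration conversion described above, assuming only the `s`, `m` and `h` suffixes shown in the commit message. `parseDurationMs` is a hypothetical stand-in; the real `parseDurationField` may accept other units or return a `Result` rather than `null`.

```typescript
// Hypothetical re-implementation for illustration; not the project's parseDurationField.
function parseDurationMs(input: string): number | null {
  const match = /^(\d+)([smh])$/.exec(input.trim());
  if (match === null) return null; // not "<digits><unit>"
  const amount = Number(match[1]);
  const unit = match[2];
  const factor = unit === "s" ? 1_000 : unit === "m" ? 60_000 : 3_600_000;
  return amount * factor;
}
```

Under these assumptions `parseDurationMs("5s")` yields `5000`, `"10m"` yields `600000` and `"1h"` yields `3600000`, matching the figures in the commit message.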

小橘 <xiaoju@shazhou.work>
- SenseConfig: add gracePeriod field (RFC §5.3 two-tier timeout)
- WorkflowConfig: discriminated union (DropOverflowConfig | QueueOverflowConfig)
- overflow: queue defaults maxQueue to 100
- overflow: drop + max_queue now returns validation error
- Cross-validate workflow reflex references against defined workflows
- Update tests: 21 cases covering all new behaviors
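
The overflow rules above could be sketched as follows. The field names (`overflow`, `maxQueue`, raw `max_queue`) and the shape of the two union members are inferred from the commit message and are assumptions; the real `DropOverflowConfig` and `QueueOverflowConfig` may differ.

```typescript
// Hypothetical shapes inferred from the commit message.
type WorkflowOverflow =
  | { overflow: "drop"; maxQueue: null }
  | { overflow: "queue"; maxQueue: number };

type Validated =
  | { ok: true; value: WorkflowOverflow }
  | { ok: false; error: string };

const DEFAULT_MAX_QUEUE = 100;

// `max_queue` is `number | null` rather than optional, following the
// "no '?:'" convention from the earlier commit.
function validateOverflow(raw: { overflow: string; max_queue: number | null }): Validated {
  if (raw.overflow === "drop") {
    if (raw.max_queue !== null) {
      return { ok: false, error: "max_queue is not allowed when overflow is 'drop'" };
    }
    return { ok: true, value: { overflow: "drop", maxQueue: null } };
  }
  if (raw.overflow === "queue") {
    // A missing max_queue falls back to the default of 100.
    return { ok: true, value: { overflow: "queue", maxQueue: raw.max_queue ?? DEFAULT_MAX_QUEUE } };
  }
  return { ok: false, error: `unknown overflow mode: ${raw.overflow}` };
}
```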

小橘 <xiaoju@shazhou.work>
- §2.4: Log as data asset, not trigger source (anti-avalanche constraint)
- §3: Add Log to terminology table
- §5.4: New storage architecture section
  - Unified logs table (append-only SQLite)
  - Workflow state via event sourcing (no mutable tables)
  - Cold archival: >30d data exported to daily JSONL files
- §5.6: Error handling now writes logs instead of error signals
- §8: Directory structure updated with logs.db and archive/
- §10: Design principles updated (8 principles, +1 log rule)
- Thread outputs are now Logs, not Signals

小橘 <xiaoju@shazhou.work>
- Cold archival: meta table with archived_up_to watermark for crash-safe recovery
- Workflow state: workflow_runs materialized table (UPSERT in same txn as log write)
  - O(active) queries instead of full table scan
  - Derivable from logs if lost

小橘 <xiaoju@shazhou.work>
Implements the Sense observation engine runtime per RFC-001:

- IPC types: discriminated union for parent↔worker messages
- sense-runtime: openSenseDb (WAL), openPeerDb (readonly), runMigrations,
  loadComputeFn, executeCompute with Result<T> error handling
- sense-worker: CLI bootstrap, reads nerve.yaml, inits per-sense DB,
  builds peer map, enters IPC event loop
- examples/cpu-usage: sample sense with Drizzle schema + migration
- 15 unit tests covering migrations, DB ops, compute, peer isolation

小橘 🍊(NEKO Team)
xiaomo requested changes 2026-04-22 08:35:38 +00:00
Dismissed
xiaomo left a comment
Owner

Review: Sense Runtime (PR #9)

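The architecture and code quality are both good; the bootstrap flow, Result error handling, and test coverage are all in place. A few issues need to be resolved before merging, though.

🔴 Must fix

1. compute has no timeout (RFC §5.3)
executeCompute has no timeout mechanism at all. The RFC explicitly requires a soft timeout plus a grace_period hard kill. A runaway compute would block the entire group worker indefinitely. At minimum, add AbortSignal.timeout() or Promise.race to implement the soft timeout.

2. IPC messages are not validated (unsafe cast)
In sense-worker.ts, const msg = raw as ParentToWorkerMessage casts directly, so a malformed message passes silently through handleMessage. Validate msg.type and reply with an error for unknown types.

3. Migrations are not tracked, so everything reruns on every startup
runMigrations runs every .sql file each time, with no _migrations table recording which ones have been applied. Relying on users to write IF NOT EXISTS is too fragile. Add a journal table to track them.

⚠️ Needs attention

4. The sense's own DB appears in the peers map
The RFC says peers are read-only, but buildPeers also adds the sense's own read-write DB. Either exclude it or open a separate read-only connection.

5. Concurrent compute is not serialized
Two consecutive compute messages will run concurrently on the same SQLite connection and may hit SQLITE_BUSY. Add a queue or reject re-entry.

6. Promise not caught
The .then() chain in handleMessage has no .catch(), so an exception becomes an unhandled rejection.

💡 Suggestions (non-blocking)

  • loadComputeFn lacks tests (valid module, missing export, nonexistent path)
  • When compute returns null the worker silently sends nothing, so the parent cannot distinguish "no signal" from "still running"; consider adding a no-signal message
  • executeCompute returns Result<unknown | null>; unknown | null is equivalent to unknown, so the type is imprecise
  • The tests do not verify that WAL mode actually takes effect

✅ Done well

  • The bootstrap flow fully matches RFC §5.1: config → DB → migrate → peers → ready → event loop
  • Result error handling runs through the whole chain, with no leaked throw
  • WAL mode, read-only peer DBs, and the CAS path convention are all correct
  • 15 tests cover the core happy and error paths
  • The example sense is clean and conforms to the ComputeFn contract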

— 小墨 🖊️

@@ -0,0 +32,4 @@
* Run all *.sql migration files in the given directory against a
* better-sqlite3 Database, in lexicographic order.
*/
export function runMigrations(sqlite: Database.Database, migrationsDir: string): Result<void> {

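🔴 Missing migration journal. Every startup reruns all SQL files in full, so a non-idempotent migration will crash. Suggest adding:

CREATE TABLE IF NOT EXISTS _migrations (name TEXT PRIMARY KEY, applied_at INTEGER);

Before running, check which migrations have already been applied and skip them.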
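
The skip logic the journal enables can be sketched independently of the SQLite driver. `pendingMigrations` is a hypothetical helper: `applied` stands in for the names already recorded in the `_migrations` table, and the actual `runMigrations` may be structured differently.

```typescript
// Hypothetical helper: decides which migration files still need to run.
// `applied` stands in for the names read from the _migrations journal table.
function pendingMigrations(files: readonly string[], applied: ReadonlySet<string>): string[] {
  return files
    .filter((f) => f.endsWith(".sql"))  // ignore non-migration files
    .filter((f) => !applied.has(f))     // skip anything already journaled
    .sort();                            // default sort is lexicographic for strings
}

const todo = pendingMigrations(
  ["0002_add_index.sql", "0001_init.sql", "README.md"],
  new Set(["0001_init.sql"]),
);
```

Each pending file would then be executed and its name inserted into `_migrations`, ideally in the same transaction so a crash cannot leave a migration applied but unrecorded.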

@@ -0,0 +142,4 @@
export async function executeCompute(
runtime: SenseRuntime,
peers: PeerMap,
): Promise<Result<unknown | null>> {

🔴 No timeout mechanism. RFC §5.3 requires a soft timeout abort plus a grace_period hard kill. Suggestion:

const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), timeoutMs);
try {
  const result = await computeFn(db, peers, { signal: controller.signal });
  // ...
} finally {
  clearTimeout(timer);
}
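
A fuller, self-contained sketch of the soft-timeout half of this suggestion, combining `AbortController` with `Promise.race` so the caller gets an error result even if the compute function ignores the signal. `runWithSoftTimeout`, the `Compute` signature and the local `Result` shape are assumptions for illustration. The hard kill after `grace_period` would have to come from the parent process and is not shown.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical compute signature: receives an AbortSignal it may choose to observe.
type Compute<T> = (signal: AbortSignal) => Promise<T>;

async function runWithSoftTimeout<T>(compute: Compute<T>, timeoutMs: number): Promise<Result<T>> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves to an error Result once the deadline passes, and aborts the signal.
  const deadline = new Promise<Result<T>>((resolve) => {
    timer = setTimeout(() => {
      controller.abort();
      resolve({ ok: false, error: `compute timed out after ${timeoutMs}ms` });
    }, timeoutMs);
  });

  try {
    // Map both success and failure of compute into a Result so nothing rejects.
    const work = compute(controller.signal).then(
      (value): Result<T> => ({ ok: true, value }),
      (e: unknown): Result<T> => ({ ok: false, error: String(e) }),
    );
    return await Promise.race([work, deadline]);
  } finally {
    clearTimeout(timer); // always release the timer, whichever side won
  }
}
```

Racing against the deadline means a compute that never checks `signal` still yields a timeout result to the caller, although its work keeps running in the background until it finishes or the process is killed.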
@@ -0,0 +104,4 @@
const entries: [string, DrizzleDB][] = [];
for (const peerName of allSenseNames) {
const own = ownDbs.get(peerName);

⚠️ The sense's own DB (read-write) should not appear in the peers map. Either exclude it or open a separate read-only connection.
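
One way to express the "exclude" option, as a sketch only. `ReadonlyDb`, `ownSenseNames` and `openReadonly` are hypothetical stand-ins; the PR's `buildPeers` works with Drizzle handles and may have a different signature.

```typescript
// Hypothetical stand-in for a read-only database handle.
interface ReadonlyDb {
  readonly path: string;
  readonly mode: "readonly";
}

// Build the peers map for one worker, skipping every sense the worker itself owns.
function buildPeers(
  allSenseNames: readonly string[],
  ownSenseNames: ReadonlySet<string>,
  openReadonly: (senseName: string) => ReadonlyDb,
): Map<string, ReadonlyDb> {
  const peers = new Map<string, ReadonlyDb>();
  for (const senseName of allSenseNames) {
    // Own DBs are opened read-write, so they are never exposed as peers.
    if (ownSenseNames.has(senseName)) continue;
    peers.set(senseName, openReadonly(senseName));
  }
  return peers;
}
```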

@@ -0,0 +130,4 @@
peers: PeerMap,
group: string,
): void {
const msg = raw as ParentToWorkerMessage;

🔴 Unsafe cast. raw as ParentToWorkerMessage performs no validation. Suggest adding a parseParentMessage(raw): Result<ParentToWorkerMessage> that validates the type field.
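
A minimal sketch of such a validator. The message set is an assumption: only a `compute` message carrying a `sense` field is mentioned in this review, so the `shutdown` variant and the local `Result` shape are hypothetical.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical message set: only "compute" (with `sense`) is named in the review.
type ParentToWorkerMessage =
  | { type: "compute"; sense: string }
  | { type: "shutdown" };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

function parseParentMessage(raw: unknown): Result<ParentToWorkerMessage> {
  if (!isRecord(raw)) {
    return { ok: false, error: "message must be a non-null object" };
  }
  const type = raw["type"];
  switch (type) {
    case "compute": {
      // Check the payload too, so { type: "compute" } alone is rejected.
      const sense = raw["sense"];
      if (typeof sense !== "string" || sense === "") {
        return { ok: false, error: "compute message requires a non-empty 'sense' string" };
      }
      return { ok: true, value: { type: "compute", sense } };
    }
    case "shutdown":
      return { ok: true, value: { type: "shutdown" } };
    default:
      return { ok: false, error: `unknown message type: ${String(type)}` };
  }
}
```

Rebuilding the value field by field, rather than returning `raw` after the check, means unexpected extra properties never reach `handleMessage`.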

@@ -0,0 +143,4 @@
return;
}
executeCompute(runtime, peers).then((result) => {

⚠️ Two problems: (1) concurrent compute is not serialized, so consecutive triggers for the same sense will race; (2) .then() has no .catch(), which risks an unhandled rejection.
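
Both problems can be addressed with a per-sense promise chain. The sketch below is an assumption about the shape of such a fix: `enqueue` is hypothetical, and `console.error` stands in for reporting the failure back over IPC.

```typescript
// Tail of the promise chain for each sense; a new job is chained onto the previous one.
const inFlight = new Map<string, Promise<void>>();

function enqueue(sense: string, job: () => Promise<void>): Promise<void> {
  const previous = inFlight.get(sense) ?? Promise.resolve();
  const next = previous
    .then(job)
    .catch((e: unknown) => {
      // Handle the failure here so it neither breaks the chain for later jobs
      // nor surfaces as an unhandled rejection.
      console.error(`compute for ${sense} failed:`, e);
    });
  inFlight.set(sense, next);
  return next;
}
```

Because every stored promise already has its rejection handled, a failing job does not stop later jobs for the same sense, while jobs for different senses still run independently of each other.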

xiaoju added 1 commit 2026-04-22 08:42:08 +00:00
🔴 Must fix:
1. compute timeout — AbortController + configurable timeoutMs, soft timeout
   returns error Result (RFC §5.3)
2. migration journal — _migrations table tracks applied files, skips
   already-executed migrations
3. IPC validation — parseParentMessage() validates type field, rejects
   malformed messages with Result error

⚠️ Also fixed:
4. Self DB excluded from peers map
5. Per-sense compute serialization (mutex via Promise chain)
6. Unhandled rejection — .catch() on compute promise, errors sent via IPC

+10 new tests (25 total), biome check + vitest all green.

小橘 🍊(NEKO Team)
xiaoju force-pushed feat/sense-runtime from 0126d5d964 to c80a6b9fa8 2026-04-22 08:45:50 +00:00 Compare
xiaomo approved these changes 2026-04-22 08:47:19 +00:00
xiaomo left a comment
Owner

Re-review: All 6 Fixes Verified

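Confirmed one by one:

  1. ✅ compute timeout: AbortController + Promise.race + finally cleanup; correct
  2. ✅ migration journal: _migrations table for tracking, with the migration and the journal insert wrapped in one transaction; correct
  3. ✅ IPC validation: parseParentMessage() checks for non-null, the type field, and allowed values, and returns a Result; correct
  4. ✅ Self excluded from peers: groupSenseNames skips senses in the same group; correct
  5. ✅ compute serialization: inFlight map chained with .then(), so the same sense never runs concurrently; correct
  6. ✅ Promise .catch(): both the compute chain and bootstrap have a catch, so there are no unhandled rejections; correct

The 25 tests cover the new functionality (journal idempotency, timeout abort, parseParentMessage validation).

One minor point (non-blocking): parseParentMessage does not check that a compute message carries a sense field. { type: "compute" } passes validation but sense is undefined, so it eventually hits the "Unknown sense" error. This is functionally safe, but the validation is incomplete. It can be added later.

LGTM, ready to merge 🚀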

— 小墨 🖊️

xiaomo merged commit bf60047186 into main 2026-04-22 08:48:31 +00:00
Reference: uncaged/nerve#9