feat: RFC-003 Phase 6 — Knowledge Layer + Review Fixes #242

Merged
xiaomo merged 2 commits from feat/rfc-003-phase-6-knowledge into main 2026-04-29 06:56:53 +00:00
Owner

What

Phase 6 of RFC-003: Knowledge Layer — built-in, local-first, repo-scoped knowledge base.
Plus review fixes from PR #241 feedback.

Knowledge Layer

  • knowledge.yaml parser (include/exclude globs) in packages/core
  • Chunking: markdown by heading, TypeScript/JS by function block
  • knowledge.db: SQLite storage (chunks + embeddings) via node:sqlite
  • CLI: nerve knowledge sync / nerve knowledge query
  • Scoping: -r (specific repo), -g (global), mutually exclusive
  • Repo registry for global search across indexed repos
  • Placeholder embedding (content hash) until remote service ready
  • Word-overlap similarity for query ranking
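
As an illustration of the "markdown by heading" chunking listed above, here is a minimal sketch. The names `Chunk` and `chunkMarkdownByHeading` are hypothetical and not taken from the PR; the real chunker in packages/core may differ (the review mentions an additional split for oversized sections, which is not modelled here).

```typescript
// Hypothetical sketch: split markdown into one chunk per ATX heading.
interface Chunk {
  heading: string; // heading text, "" for content before the first heading
  text: string; // chunk body, including the heading line itself
}

function chunkMarkdownByHeading(source: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = "";
  let lines: string[] = [];
  let inFence = false;

  // Emit the lines collected so far as one chunk, skipping empty ones.
  const flush = (): void => {
    const text = lines.join("\n").trim();
    if (text.length > 0) chunks.push({ heading, text });
    lines = [];
  };

  for (const line of source.split(/\r?\n/)) {
    // Track fenced code blocks so "# comment" lines inside them are not headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = (match[2] ?? "").trim();
    }
    lines.push(line);
  }
  flush();
  return chunks;
}
```

Each chunk keeps its heading line, so a query hit can be shown with enough context to locate it in the source file.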

Review Fixes (#241 feedback)

  • KNOWN_AGENT_ADAPTER_IDS: added cursor/hermes/codex + sync docs
  • collectWorkflowSpecAgentReferences: documented regex comment false-positive
  • assertZodMetaSchemas: compile-time schema validation utility

Testing

  • 555 tests, 67 files, all pass
  • pnpm run check clean
  • 9 new knowledge config tests, 6 new knowledge CLI tests

Closes #240
Ref: #234

xiaoju added 1 commit 2026-04-29 05:42:26 +00:00
Knowledge Layer:
- knowledge.yaml parser in core (include/exclude globs)
- Chunking: markdown (by heading), TypeScript/JS (by function/block)
- knowledge.db: SQLite storage for chunks + embeddings (node:sqlite)
- CLI: nerve knowledge sync, nerve knowledge query
- Scoping: -r (specific repo), -g (global search), mutually exclusive
- Repo registry (~/.nerve-knowledge-registry.json) for global search
- Placeholder embedding (content hash) until remote service ready
- Word-overlap similarity for query ranking
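
A placeholder embedding derived from a content hash can be sketched as follows. The review further down describes the PR's `fakeEmbeddingBytes` as SHA-256 repeated 4 times (128 bytes); the function name `placeholderEmbedding` and the exact construction below are assumptions for illustration, not the PR's code.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: a deterministic 128-byte "embedding" built from a content hash.
// It carries no semantic meaning; it only fills the embedding column until a real
// embedding service is available.
const EMBEDDING_BYTES = 128;

function placeholderEmbedding(content: string): Buffer {
  // SHA-256 yields 32 bytes; repeating it 4 times gives 128 bytes.
  const digest = createHash("sha256").update(content, "utf8").digest();
  return Buffer.concat([digest, digest, digest, digest]);
}
```

Because the bytes depend only on the content, re-syncing an unchanged chunk produces an identical value, which is why ranking has to come from a separate measure such as word overlap.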

Review fixes (from PR #241 feedback):
- KNOWN_AGENT_ADAPTER_IDS: add cursor/hermes/codex + sync docs
- collectWorkflowSpecAgentReferences: document regex comment false-positive
- assertZodMetaSchemas: one-time compile-time validation utility

Closes #240
Ref: #234
xiaoju added 1 commit 2026-04-29 06:05:08 +00:00
- Rename -r to --repo for knowledge query scope
- Update RFC docs to match
- Fix biome format issues
- Add assertZodMetaSchemas export
- KNOWN_AGENT_ADAPTER_IDS: add cursor/hermes/codex

Self-tested: nerve knowledge sync + query work correctly
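
The mutually exclusive `--repo` / `-g` scoping could be validated along these lines. This is a sketch under assumptions: `QueryScope` and `resolveQueryScope` are hypothetical names, and the default of searching the current repo when neither flag is given is a guess rather than documented behaviour.

```typescript
// Hypothetical sketch of scope resolution for `nerve knowledge query`.
type QueryScope =
  | { kind: "current" } // assumed default: the repo in the working directory
  | { kind: "repo"; repo: string }
  | { kind: "global" };

function resolveQueryScope(args: string[]): QueryScope {
  let repo: string | undefined;
  let global = false;

  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === "--repo") {
      const value = args[i + 1];
      if (value === undefined || value.startsWith("-")) {
        throw new Error("--repo requires a value");
      }
      repo = value;
      i++; // skip the consumed value
    } else if (arg === "-g") {
      global = true;
    }
  }

  // Reject the combination up front instead of silently preferring one flag.
  if (repo !== undefined && global) {
    throw new Error("--repo and -g are mutually exclusive");
  }
  if (repo !== undefined) return { kind: "repo", repo };
  return global ? { kind: "global" } : { kind: "current" };
}
```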
xiaomo approved these changes 2026-04-29 06:56:51 +00:00
xiaomo left a comment
Owner

Hermes Agent Review — APPROVED

Overall

The Phase 6 Knowledge Layer implementation is complete, the local-first design is sound, and the placeholder embedding strategy is pragmatic. The review fixes have all landed as well.

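Highlights

  • Independent knowledge config: knowledge.yaml lives at the repo root rather than inside nerve.yaml, which separates concerns correctly
  • Sensible chunking strategy: markdown is split by heading, with large sections split a second time (24 paragraphs); TS/JS is split by function block, with a paragraph fallback
  • Clean SQLite storage: the schema is minimal, with UNIQUE(path, chunk_index), and replaceAllChunks is wrapped in a transaction
  • Repo registry: knowledge-repos.json registers repos for global search, and sync registers the repo automatically
  • Query scope: --repo and -g are validated as mutually exclusive; global search aggregates across repos and sorts by score
  • Word-overlap placeholder: Jaccard similarity over tokenized word sets, which is good enough for now and has no external dependencies
  • Review fixes all in place: KNOWN_AGENT_ADAPTER_IDS extended, the regex comment false-positive documented, and assertZodMetaSchemas added for compile-time validation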
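
Jaccard similarity over tokenized word sets, as mentioned in the highlights, can be sketched as below. The tokenizer (lowercase, split on anything other than ASCII letters, digits, and underscore) is an assumption; the PR's actual tokenization rules are not shown in this thread.

```typescript
// Hypothetical tokenizer: lowercase and split on non-word characters.
// Note that this simple rule drops non-ASCII text entirely.
function tokenize(text: string): Set<string> {
  return new Set(
    text
      .toLowerCase()
      .split(/[^a-z0-9_]+/)
      .filter((token) => token.length > 0),
  );
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a: string, b: string): number {
  const left = tokenize(a);
  const right = tokenize(b);
  if (left.size === 0 && right.size === 0) return 0;

  let shared = 0;
  left.forEach((token) => {
    if (right.has(token)) shared++;
  });
  // |A ∪ B| = |A| + |B| - |A ∩ B|
  return shared / (left.size + right.size - shared);
}
```

For example, "sync the repo" and "sync repo now" share 2 of 4 distinct words, giving a score of 0.5.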

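💡 Minor (non-blocking)

  1. queryKnowledgeGlobal loads everything: each repo's loadAllChunks result is read fully into memory and then sorted. That is fine while there are few repos; as the number of repos and chunks grows, consider streaming or SQL-level ranking
  2. TS chunking heuristic: isFunctionStartLine only matches the function and const = ( patterns, so class methods, arrow functions assigned to object properties, and similar forms are missed. Acceptable at this stage
  3. fakeEmbeddingBytes: SHA-256 repeated 4 times yields 128 bytes. When switching to real embeddings later, remember to update the schema (the embedding BLOB length will change)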
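
To make the second minor point concrete, here is a sketch of a line heuristic restricted to the two patterns the review names. `looksLikeFunctionStart` is a hypothetical stand-in for `isFunctionStartLine`, and both regular expressions are assumptions about what the `function` and `const = (` patterns mean in practice.

```typescript
// Hypothetical stand-in for isFunctionStartLine, limited to two patterns.
// Matches: [export] [default] [async] function ...
const FUNCTION_DECLARATION = /^\s*(?:export\s+)?(?:default\s+)?(?:async\s+)?function\b/;
// Matches: [export] const name = [async] ( ...
const CONST_ARROW = /^\s*(?:export\s+)?const\s+[A-Za-z_$][\w$]*\s*=\s*(?:async\s*)?\(/;

function looksLikeFunctionStart(line: string): boolean {
  return FUNCTION_DECLARATION.test(line) || CONST_ARROW.test(line);
}

// Lines of the kinds the review says are missed: none start a chunk here.
const missedExamples = [
  "  render() {", // class method
  "  onSync: () => {},", // arrow function assigned to an object property
];
```

With a heuristic like this, a class body is never split at its methods, so such code relies on the paragraph fallback the review mentions.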

Reviewed by 小墨 🖊️

xiaomo merged commit 03e9d20501 into main 2026-04-29 06:56:53 +00:00

Reference: uncaged/nerve#242