perf: implement lazy loading in FsStore (#85) #89

Merged
xiaomo merged 1 commit from fix/85-fsstore-lazy-loading into main 2026-06-07 00:31:31 +00:00
Owner

What

Refactor FsStore to lazy-load CAS nodes on demand instead of CBOR-decoding the entire store at startup.

Why

createFsStore() previously called loadDir() at init, which read and CBOR-decoded every .bin file under nodes/ into an in-memory Map<Hash, CasNode>. This made cold-open O(n) in both time and memory, regardless of how many nodes the caller actually needed. For large stores, the startup cost dominated.

Changes

  • packages/fs/src/store.ts — refactor createFsStore():
    • At init, scan filenames in nodes/ into a Set<Hash> (no CBOR decoding)
    • get(hash) — check cache → on miss, read + decode .bin file → cache → return
    • has(hash) — O(1) Set membership check, no disk I/O on the node payload
    • listAll() — return Set keys (filename-based)
    • put() — write-through cache (immediately available without re-reading disk)
    • delete() — clear cache + Set + disk file
    • Index/meta migration paths preserved: when _index/ is missing, perform a one-time scan + decode to rebuild
  • packages/fs/src/store.test.ts — add 12 new tests (L1-L12) covering startup-no-decode, lazy get, has/listAll without decode, write-through put, delete cleanup, list operations, migration, and bootstrap regression
  • .changeset/fsstore-lazy-loading.md: @ocas/fs patch changeset

All 643 existing tests continue to pass.
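
As a rough illustration of the changes listed above, here is a minimal sketch of a lazily loading store. It is not the code from packages/fs/src/store.ts: `createLazyStore`, the `CasNode` shape, and the explicit `hash` argument to `put()` are assumptions made for the example, and JSON stands in for CBOR so the sketch needs no dependencies.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type Hash = string;
// Hypothetical node shape; the real CasNode type is not shown in this PR.
interface CasNode { type: string; payload: unknown }

// Stand-in codec (assumption): the real store uses CBOR, JSON keeps this runnable as-is.
const encode = (node: CasNode): Buffer => Buffer.from(JSON.stringify(node));
const decode = (bytes: Buffer): CasNode => JSON.parse(bytes.toString("utf8")) as CasNode;

function createLazyStore(root: string) {
  const nodesDir = path.join(root, "nodes");
  fs.mkdirSync(nodesDir, { recursive: true });

  // Init: collect hashes from filenames only; no file content is read or decoded.
  const hashSet = new Set<Hash>(
    fs.readdirSync(nodesDir)
      .filter((name) => name.endsWith(".bin"))
      .map((name) => name.slice(0, -".bin".length)),
  );
  const cache = new Map<Hash, CasNode>();
  const fileOf = (hash: Hash): string => path.join(nodesDir, `${hash}.bin`);

  return {
    // Set membership only, no payload I/O.
    has: (hash: Hash): boolean => hashSet.has(hash),
    listAll: (): Hash[] => [...hashSet],
    get(hash: Hash): CasNode | undefined {
      const cached = cache.get(hash);
      if (cached) return cached;                 // cache hit
      if (!hashSet.has(hash)) return undefined;  // unknown hash, skip the disk
      const node = decode(fs.readFileSync(fileOf(hash))); // first access: read + decode
      cache.set(hash, node);
      return node;
    },
    // Simplification: the caller supplies the hash instead of the store deriving it.
    put(hash: Hash, node: CasNode): void {
      fs.writeFileSync(fileOf(hash), encode(node));
      hashSet.add(hash);
      cache.set(hash, node); // write-through: no re-read needed later
    },
    delete(hash: Hash): void {
      cache.delete(hash);
      hashSet.delete(hash);
      fs.rmSync(fileOf(hash), { force: true });
    },
  };
}

// Usage: a second store over the same directory finds the node without decoding at open.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "lazy-store-"));
const writer = createLazyStore(root);
writer.put("abc123", { type: "blob", payload: [1, 2, 3] });
const reader = createLazyStore(root); // filename scan only
console.log(reader.has("abc123"), reader.get("abc123")?.type);
```

The sketch leaves out the `_index/` and `_meta` handling, which in the PR still triggers a one-time scan and decode when those directories are missing.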

Ref

Fixes #85

xiaoju added 1 commit 2026-06-07 00:28:18 +00:00
perf: implement lazy loading in FsStore
CI / check (pull_request) Successful in 1m40s
48c099ba03
FsStore previously CBOR-decoded all .bin nodes into memory at startup,
making cold-open O(n) in time and memory. Now it scans only filenames
into a Set<Hash> at init and reads/decodes nodes from disk on first
get(). has() and listAll() use the filename set; put() writes through
to the cache; delete() clears cache and disk. Index/meta migration still
performs a one-time scan when _index/ is missing.

Adds 12 new tests (L1-L12) covering startup-no-decode, lazy get,
filename-based has/listAll, write-through put, delete cleanup, list
operations, and migration/bootstrap regression. All existing tests
pass unchanged.

Fixes #85
xiaomo approved these changes 2026-06-07 00:30:21 +00:00
xiaomo left a comment
Owner

LGTM

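The lazy-loading design is sound:

  • Startup drops from O(decoded-bytes) to O(filenames): it only scans filenames to build hashSet and never touches CBOR
  • has() / listAll() are pure Set operations with zero I/O
  • get() reads from disk on demand and caches the result; the write-through put() places the node in the cache right after writing, with no read-back
  • loadNode() serves as the single entry point: cache hit → hashSet check → disk read. The logic is clear

Migration paths are fully preserved:

  • When _index/ is missing, buildTypeIndexFromDisk() does a one-time full decode to rebuild it, which is correct
  • When _meta is missing, it is likewise rebuilt by scanning the disk
  • Neither migration path affects the lazy behavior of a normal startup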

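delete() is handled well:

  • It calls loadNode first to get the type and then cleans up the index. Even if the node file is corrupted (decode fails), the node is still removed from hashSet/cache/disk and only the index cleanup is skipped, which is a reasonable degradation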
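
The delete behavior described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `disk` and `typeIndex` are in-memory stand-ins for nodes/*.bin and _index/, `deleteNode` is a hypothetical name, and JSON replaces CBOR.

```typescript
type Hash = string;
interface CasNode { type: string } // hypothetical minimal node shape

// In-memory stand-ins (assumptions): `disk` for nodes/*.bin, `typeIndex` for _index/.
const disk = new Map<Hash, string>();
const hashSet = new Set<Hash>();
const cache = new Map<Hash, CasNode>();
const typeIndex = new Map<string, Set<Hash>>();

// Lookup order from the review: cache hit, then hashSet check, then disk read.
function loadNode(hash: Hash): CasNode | undefined {
  const cached = cache.get(hash);
  if (cached) return cached;
  if (!hashSet.has(hash)) return undefined;
  const node = JSON.parse(disk.get(hash) ?? "") as CasNode; // may throw on corrupt data
  cache.set(hash, node);
  return node;
}

function deleteNode(hash: Hash): void {
  let type: string | undefined;
  try {
    type = loadNode(hash)?.type;
  } catch {
    // Decode failed: the type is unknown, so index cleanup is skipped.
  }
  if (type !== undefined) typeIndex.get(type)?.delete(hash);
  // These three removals happen whether or not the node could be decoded.
  cache.delete(hash);
  hashSet.delete(hash);
  disk.delete(hash);
}

// Usage: one healthy node and one corrupted node, both indexed under "blob".
disk.set("good", JSON.stringify({ type: "blob" }));
disk.set("bad", "\u0000not-json");
hashSet.add("good").add("bad");
typeIndex.set("blob", new Set(["good", "bad"]));

deleteNode("good"); // removed everywhere, including the type index
deleteNode("bad");  // removed from hashSet/cache/disk; its index entry stays behind
```

In this sketch the corrupted node leaves a stale entry in the type index, which is the trade-off the comment calls a reasonable degradation.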

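Test L1 is clever: it uses a corrupted .bin to prove that startup really does not decode. The 12 tests cover normal reads and writes, corruption tolerance, the various list operations, migration, and the bootstrap regression.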
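
The idea behind that test can be sketched like this. It is not the actual test from store.test.ts: `openStore` is a hypothetical minimal store, the decoder is JSON rather than CBOR, and a call counter is added purely to make the "no decode at startup" claim observable.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Stand-in decoder (JSON instead of CBOR) that counts how often it is called.
let decodeCalls = 0;
function decode(bytes: Buffer): unknown {
  decodeCalls += 1;
  return JSON.parse(bytes.toString("utf8")); // throws on undecodable input
}

// Hypothetical minimal store: filename scan at open, decode only inside get().
function openStore(root: string) {
  const nodesDir = path.join(root, "nodes");
  const hashSet = new Set<string>(
    fs.readdirSync(nodesDir)
      .filter((name) => name.endsWith(".bin"))
      .map((name) => name.slice(0, -".bin".length)),
  );
  return {
    has: (hash: string): boolean => hashSet.has(hash),
    get: (hash: string): unknown =>
      hashSet.has(hash)
        ? decode(fs.readFileSync(path.join(nodesDir, `${hash}.bin`)))
        : undefined,
  };
}

// Arrange: a single .bin file whose bytes cannot be decoded.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "lazy-test-"));
fs.mkdirSync(path.join(root, "nodes"));
fs.writeFileSync(path.join(root, "nodes", "deadbeef.bin"), Buffer.from([0xff, 0x00, 0xff]));

// Act: opening the store must neither throw nor decode anything.
const store = openStore(root);
const decodeCallsAfterOpen = decodeCalls;
const seenByHas = store.has("deadbeef");

// The corruption only surfaces once the payload is actually requested.
let getThrew = false;
try {
  store.get("deadbeef");
} catch {
  getThrew = true;
}
console.log({ decodeCallsAfterOpen, seenByHas, getThrew });
```

An eager loader would have failed while opening the store; here the open succeeds with zero decode calls, and the failure is deferred to the first get().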

The changeset's patch level is correct: behavior is unchanged, and this is purely an internal performance optimization.

Good to merge.

xiaomo merged commit dd5cb49168 into main 2026-06-07 00:31:31 +00:00
xiaomo deleted branch fix/85-fsstore-lazy-loading 2026-06-07 00:31:31 +00:00

Reference: shazhou/ocas#89