diff --git a/.cursor/rules/sync-readme.mdc b/.cursor/rules/sync-readme.mdc
index b8a6ab2..96bd435 100644
--- a/.cursor/rules/sync-readme.mdc
+++ b/.cursor/rules/sync-readme.mdc
@@ -1,68 +1,3 @@
 # Sync README
 
-When updating README.md files in this monorepo, follow these conventions.
-
-## Scope
-
-- Root `README.md` — project overview and navigation hub
-- Per-package `packages/*/README.md` — each package self-contained
-
-## Root README Structure
-
-The root README should have these sections in order:
-
-1. **Title and one-liner** — content-addressed storage for JSON with schema validation
-2. **Overview** — 2-3 paragraphs explaining what it does and key concepts
-3. **Architecture** — dependency layer diagram (text-based)
-4. **Packages** — table with ALL packages from packages/ directory, columns: Package, Description, Type (cli/lib)
-5. **Quick Start** — install, build, basic usage
-6. **CLI Reference** — brief command list, detailed usage in cli-json-cas README
-7. **Development** — bun install / build / check / test
-8. **Publishing** — changeset workflow (bun run release)
-
-## Per-Package README Structure
-
-Each package README should have:
-
-1. **Title** — package name
-2. **One-line description** — matching package.json
-3. **Overview** — what it does, where it sits in the architecture, dependencies
-4. **Installation** — bun add (for libs) or "included as binary" (for cli)
-5. **API** (lib packages) — all exports from src/index.ts with type signatures, grouped by category, minimal usage examples
-6. **CLI Usage** (cli packages) — command reference with examples
-7. **Internal Structure** — brief src/ file organization
-8. **Configuration** (if applicable)
-
-## Execution Steps
-
-### Step 1: Gather current state
-For each package read:
-- package.json (name, version, description, dependencies, bin)
-- src/index.ts (public API exports)
-- Existing README.md (preserve hand-written content worth keeping)
-
-### Step 2: Update root README
-- Ensure ALL packages in packages/ directory are listed in the table
-- Update CLI command reference from actual --help output
-- Keep Quick Start examples valid
-
-### Step 3: Write/update each package README
-- Follow the per-package structure
-- API section MUST match actual src/index.ts exports — never invent
-- For cli packages: document CLI binary name, how it is invoked
-- For lib packages: document exported types and functions
-- Internal structure: list actual files in src/
-
-### Step 4: Verify
-- All relative links work
-- Package names match package.json
-- No references to removed/renamed packages
-- bun run build still passes
-
-## Guidelines
-
-- Only document what src/index.ts actually exports
-- Root README summarizes, package READMEs go into detail
-- Verify CLI examples against actual commands
-- Preserve existing good prose when updating
-- English for all README content
+See [docs/sync-readme.md](../../docs/sync-readme.md) for full rules.
diff --git a/CLAUDE.md b/CLAUDE.md
index 9fe4b3b..a54962f 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -67,6 +67,10 @@ bun run format # Biome format (auto-fix)
 - Reference issues: `Fixes #N` / `Closes #N`
 - Author: `小橘 `
 
+## Project Rules
+
+- [docs/sync-readme.md](docs/sync-readme.md) — README sync conventions
+
 ## Before Submitting
 
 1. `bun test` — all tests pass
diff --git a/README.md b/README.md
index 4a60ac6..1a0586d 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,139 @@
 # json-cas
 
-Self-describing content-addressable storage with JSON Schema typed nodes
\ No newline at end of file
+Self-describing content-addressable storage with JSON Schema typed nodes.
+
+## Overview
+
+json-cas is a monorepo for storing and validating JSON data in a content-addressable store (CAS). Each node has a typed payload: its `type` field is the hash of a JSON Schema node that describes the payload shape. Hashes are 13-character Crockford Base32 strings derived from XXH64 over deterministic CBOR encoding.
+
+A bootstrap meta-schema is stored as a self-referencing seed node (`type === hash`). All other schemas are registered as nodes typed by that meta-schema. Payloads can reference other nodes via `format: "cas_ref"` fields; the library provides traversal, reference extraction, and integrity verification.
+
+Use the in-memory store for tests and embedded apps, the filesystem store for persistence, the workflow package for pre-built agent/workflow schemas, and the CLI for local store management.
+
+## Architecture
+
+```
+                 ┌─────────────────┐
+                 │  cli-json-cas   │
+                 └────────┬────────┘
+                          │
+           ┌──────────────┴──────────────┐
+           ▼                             ▼
+  ┌─────────────────┐          ┌──────────────────────┐
+  │   json-cas-fs   │          │  json-cas-workflow   │
+  └────────┬────────┘          └──────────┬───────────┘
+           │                              │
+           └──────────────┬───────────────┘
+                          ▼
+                 ┌─────────────────┐
+                 │    json-cas     │  (core)
+                 └─────────────────┘
+```
+
+| Layer | Package | Role |
+|-------|---------|------|
+| Core | `@uncaged/json-cas` | Hashing, schemas, stores, verify, bootstrap |
+| Storage | `@uncaged/json-cas-fs` | Filesystem-backed `Store` |
+| Domain | `@uncaged/json-cas-workflow` | Workflow/agent JSON Schemas and payload types |
+| CLI | `@uncaged/cli-json-cas` | `json-cas` command-line tool |
+
+## Packages
+
+| Package | Description | Type |
+|---------|-------------|------|
+| [`@uncaged/json-cas`](packages/json-cas/README.md) | Core CAS engine — hashing, schema, store, verify, bootstrap | lib |
+| [`@uncaged/json-cas-fs`](packages/json-cas-fs/README.md) | Filesystem-backed CAS store | lib |
+| [`@uncaged/json-cas-workflow`](packages/json-cas-workflow/README.md) | Workflow integration layer (schemas + types) | lib |
+| [`@uncaged/cli-json-cas`](packages/cli-json-cas/README.md) | CLI tool (`json-cas` binary) | cli |
+
+## Quick Start
+
+```bash
+git clone
+cd json-cas
+bun install --no-cache
+bun run build
+```
+
+```typescript
+import {
+  bootstrap,
+  createMemoryStore,
+  putSchema,
+  validate,
+} from "@uncaged/json-cas";
+
+const store = createMemoryStore();
+await bootstrap(store);
+
+const typeHash = await putSchema(store, {
+  type: "object",
+  properties: { message: { type: "string" } },
+  required: ["message"],
+  additionalProperties: false,
+});
+
+const hash = await store.put(typeHash, { message: "hello" });
+const node = store.get(hash);
+console.log(validate(store, node!)); // true
+```
+
+For a persistent store:
+
+```typescript
+import { createFsStore } from "@uncaged/json-cas-fs";
+import { bootstrap } from "@uncaged/json-cas";
+
+const store = createFsStore("/path/to/store");
+await bootstrap(store);
+```
+
+Or use the CLI (see [CLI Reference](#cli-reference) and [`packages/cli-json-cas/README.md`](packages/cli-json-cas/README.md)).
+
+## CLI Reference
+
+Binary: `json-cas` (from `@uncaged/cli-json-cas`). Default store: `~/.uncaged/json-cas`.
+
+```
+Usage: json-cas [--store ] [--json] [args]
+
+Commands:
+  init                    Create store dir and write bootstrap seed
+  bootstrap               Write meta-schema seed, print hash
+  schema put              Register schema, print type hash
+  schema get              Print schema JSON
+  schema list             List all schemas (name + hash)
+  schema validate         Validate node against its schema
+  put                     Store node, print hash
+  get                     Print node as JSON
+  has                     Print true/false
+  verify                  Verify integrity, print ok/corrupted
+  refs                    List direct cas_ref edges
+  walk [--format tree]    Recursive traversal
+  hash                    Compute hash without storing (dry run)
+  cat [--payload]         Output node (--payload for payload only)
+
+Flags:
+  --store                 Store directory (default: ~/.uncaged/json-cas)
+  --json                  Compact JSON output
+```
+
+## Development
+
+```bash
+bun install --no-cache # install workspace dependencies
+bun run build # tsc --build (libs)
+bun run check # biome check
+bun run format # biome format --write
+bun test # run all package tests
+```
+
+## Publishing
+
+Releases use [Changesets](https://github.com/changesets/changesets). From the repo root:
+
+```bash
+bun run release # changeset version → build → publish to npm (@uncaged/*)
+```
+
+Individual packages block `prepublishOnly` and expect releases via the workspace `release` script.
diff --git a/docs/sync-readme.md b/docs/sync-readme.md
new file mode 100644
index 0000000..b8a6ab2
--- /dev/null
+++ b/docs/sync-readme.md
@@ -0,0 +1,68 @@
+# Sync README
+
+When updating README.md files in this monorepo, follow these conventions.
+
+## Scope
+
+- Root `README.md` — project overview and navigation hub
+- Per-package `packages/*/README.md` — each package self-contained
+
+## Root README Structure
+
+The root README should have these sections in order:
+
+1. **Title and one-liner** — content-addressed storage for JSON with schema validation
+2. **Overview** — 2-3 paragraphs explaining what it does and key concepts
+3. **Architecture** — dependency layer diagram (text-based)
+4. **Packages** — table with ALL packages from packages/ directory, columns: Package, Description, Type (cli/lib)
+5. **Quick Start** — install, build, basic usage
+6. **CLI Reference** — brief command list, detailed usage in cli-json-cas README
+7. **Development** — bun install / build / check / test
+8. **Publishing** — changeset workflow (bun run release)
+
+## Per-Package README Structure
+
+Each package README should have:
+
+1. **Title** — package name
+2. **One-line description** — matching package.json
+3. **Overview** — what it does, where it sits in the architecture, dependencies
+4. **Installation** — bun add (for libs) or "included as binary" (for cli)
+5. **API** (lib packages) — all exports from src/index.ts with type signatures, grouped by category, minimal usage examples
+6. **CLI Usage** (cli packages) — command reference with examples
+7. **Internal Structure** — brief src/ file organization
+8. **Configuration** (if applicable)
+
+## Execution Steps
+
+### Step 1: Gather current state
+For each package read:
+- package.json (name, version, description, dependencies, bin)
+- src/index.ts (public API exports)
+- Existing README.md (preserve hand-written content worth keeping)
+
+### Step 2: Update root README
+- Ensure ALL packages in packages/ directory are listed in the table
+- Update CLI command reference from actual --help output
+- Keep Quick Start examples valid
+
+### Step 3: Write/update each package README
+- Follow the per-package structure
+- API section MUST match actual src/index.ts exports — never invent
+- For cli packages: document CLI binary name, how it is invoked
+- For lib packages: document exported types and functions
+- Internal structure: list actual files in src/
+
+### Step 4: Verify
+- All relative links work
+- Package names match package.json
+- No references to removed/renamed packages
+- bun run build still passes
+
+## Guidelines
+
+- Only document what src/index.ts actually exports
+- Root README summarizes, package READMEs go into detail
+- Verify CLI examples against actual commands
+- Preserve existing good prose when updating
+- English for all README content
diff --git a/packages/cli-json-cas/README.md b/packages/cli-json-cas/README.md
new file mode 100644
index 0000000..f8bd5fd
--- /dev/null
+++ b/packages/cli-json-cas/README.md
@@ -0,0 +1,98 @@
+# @uncaged/cli-json-cas
+
+CLI tool for json-cas stores.
+
+## Overview
+
+`@uncaged/cli-json-cas` provides the `json-cas` command for managing a filesystem-backed store: bootstrap, schema registration, node CRUD, integrity checks, reference listing, and graph walks. It uses `@uncaged/json-cas-fs` for persistence and `@uncaged/json-cas` for core operations.
+
+**Dependencies:** `@uncaged/json-cas`, `@uncaged/json-cas-fs`
+
+## Installation
+
+Published as an npm package with a binary entry:
+
+```bash
+bun add -g @uncaged/cli-json-cas
+# or from the monorepo workspace:
+bun link
+```
+
+**Binary name:** `json-cas` (points to `src/index.ts`, run with Bun).
+
+In development:
+
+```bash
+bun packages/cli-json-cas/src/index.ts [args]
+```
+
+## CLI Usage
+
+```
+Usage: json-cas [--store ] [--json] [args]
+```
+
+### Global flags
+
+| Flag | Description |
+|------|-------------|
+| `--store ` | Store directory (default: `~/.uncaged/json-cas`) |
+| `--json` | Compact JSON output for commands that print JSON |
+
+### Commands
+
+| Command | Description |
+|---------|-------------|
+| `init` | Create store directory and write bootstrap seed; prints meta hash |
+| `bootstrap` | Write meta-schema seed into existing store; prints hash |
+| `schema put ` | Register schema from file; prints type hash |
+| `schema get ` | Print schema JSON |
+| `schema list` | List all schemas (`hash name`) |
+| `schema validate ` | Validate node against its schema; prints `valid` / `invalid` |
+| `put ` | Store node; prints content hash |
+| `get ` | Print full node as JSON |
+| `has ` | Print `true` or `false` |
+| `verify ` | Verify integrity; prints `ok` or `corrupted` |
+| `refs ` | Print direct `cas_ref` targets (one per line) |
+| `walk ` | BFS traversal; one hash per line |
+| `walk --format tree` | Tree-formatted traversal |
+| `hash ` | Compute hash without storing |
+| `cat ` | Print node JSON |
+| `cat --payload` | Print payload only |
+
+### Examples
+
+```bash
+# Initialize default store at ~/.uncaged/json-cas
+json-cas init
+
+# Use a custom store path
+json-cas --store ./data/cas bootstrap
+
+# Register a schema and store a payload
+json-cas schema put ./schemas/item.json
+# → prints type hash, e.g. 0123456789ABCD
+
+json-cas put 0123456789ABCD ./payloads/item.json
+# → prints content hash
+
+json-cas get --json
+json-cas verify
+json-cas walk --format tree
+```
+
+## Internal Structure
+
+| File | Purpose |
+|------|---------|
+| `index.ts` | Argument parsing, command dispatch, and all CLI logic |
+
+There is no separate `src/` module tree; the CLI is a single entry file. Tests (if present) are co-located under the package.
+
+## Configuration
+
+| Setting | Default | Override |
+|---------|---------|----------|
+| Store directory | `~/.uncaged/json-cas` | `--store ` |
+
+No config file is read; all behavior is controlled via flags and command arguments.
diff --git a/packages/json-cas-fs/README.md b/packages/json-cas-fs/README.md
new file mode 100644
index 0000000..fdf4082
--- /dev/null
+++ b/packages/json-cas-fs/README.md
@@ -0,0 +1,67 @@
+# @uncaged/json-cas-fs
+
+Filesystem-backed CAS store.
+
+## Overview
+
+`@uncaged/json-cas-fs` implements a persistent `Store` on disk. Each node is stored as `.bin` (CBOR-encoded `CasNode`). A `_index/` directory maps type hashes to content hashes for `listByType`. Stores support bootstrap via the same `BOOTSTRAP_STORE` symbol as the in-memory implementation.
+
+Depends on `@uncaged/json-cas` for hashing, CBOR encoding, and types.
+
+**Dependencies:** `@uncaged/json-cas`, `cborg`
+
+## Installation
+
+```bash
+bun add @uncaged/json-cas-fs
+```
+
+## API
+
+Exported from `src/index.ts`:
+
+```typescript
+function createFsStore(dir: string): BootstrapCapableStore;
+```
+
+`BootstrapCapableStore` is re-exported from `@uncaged/json-cas` (via the return type). The store loads existing `.bin` files on open and migrates or builds the type index on first use.
+
+### Example
+
+```typescript
+import { bootstrap, putSchema } from "@uncaged/json-cas";
+import { createFsStore } from "@uncaged/json-cas-fs";
+
+const store = createFsStore("./my-cas-store");
+await bootstrap(store);
+
+const typeHash = await putSchema(store, {
+  type: "object",
+  properties: { id: { type: "string" } },
+  required: ["id"],
+  additionalProperties: false,
+});
+
+const hash = await store.put(typeHash, { id: "item-1" });
+console.log(store.has(hash)); // true after restart if same dir
+```
+
+### On-disk layout
+
+```
+my-cas-store/
+├── .bin          # CBOR CasNode
+├── _index/
+│   └──           # newline-separated content hashes
+└── ...
+```
+
+Writes use atomic rename (`.tmp` → `.bin`).
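
The atomic-rename note above can be illustrated with a short sketch. This is not the package's actual code: the helper `writeNodeAtomically`, the file name `EXAMPLEHASH00`, and the temporary demo directory are hypothetical. It assumes the `.tmp` and `.bin` files sit in the same directory, so the rename stays on one filesystem, which is the case where POSIX systems replace the target atomically.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write the bytes to `<name>.tmp`, then rename to `<name>.bin`.
// Readers see either the old file or the complete new one, never a partial write.
function writeNodeAtomically(dir: string, name: string, bytes: Uint8Array): string {
  const tmpPath = join(dir, `${name}.tmp`);
  const finalPath = join(dir, `${name}.bin`);
  writeFileSync(tmpPath, bytes); // may be interrupted, but only the .tmp file is affected
  renameSync(tmpPath, finalPath); // single step that publishes the finished file
  return finalPath;
}

// Demo in a throwaway directory; the four bytes are the CBOR encoding of {"a": 1}.
const demoDir = mkdtempSync(join(tmpdir(), "cas-demo-"));
const written = writeNodeAtomically(demoDir, "EXAMPLEHASH00", new Uint8Array([0xa1, 0x61, 0x61, 0x01]));
console.log(readFileSync(written).length); // 4
console.log(existsSync(join(demoDir, "EXAMPLEHASH00.tmp"))); // false
```

The point of the pattern is that a crash during the write leaves at most a stray `.tmp` file behind, which a loader that only reads `.bin` files would ignore.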
+
+## Internal Structure
+
+| File | Purpose |
+|------|---------|
+| `store.ts` | `createFsStore`, load/save nodes and type index |
+| `index.ts` | Public export |
+| `store.test.ts` | Filesystem store tests |
diff --git a/packages/json-cas-workflow/README.md b/packages/json-cas-workflow/README.md
new file mode 100644
index 0000000..6ec9721
--- /dev/null
+++ b/packages/json-cas-workflow/README.md
@@ -0,0 +1,177 @@
+# @uncaged/json-cas-workflow
+
+Workflow integration layer (schemas + types).
+
+## Overview
+
+`@uncaged/json-cas-workflow` registers eleven JSON Schemas for agent/workflow execution graphs (definitions, thread lifecycle, content, and React-style tool/session nodes) and exports matching TypeScript payload types. Call `registerWorkflowSchemas(store)` once per store to obtain type hashes for each schema.
+
+Sits above `@uncaged/json-cas`; typically used with `json-cas-fs` or `createMemoryStore` when building workflow-aware applications.
+
+**Dependencies:** `@uncaged/json-cas`
+
+## Installation
+
+```bash
+bun add @uncaged/json-cas-workflow
+```
+
+## API
+
+Exported from `src/index.ts`.
+
+### Schema registry
+
+```typescript
+type WorkflowSchemaHashes = {
+  agent: Hash;
+  roleSchema: Hash;
+  role: Hash;
+  workflow: Hash;
+  threadStart: Hash;
+  threadStep: Hash;
+  threadEnd: Hash;
+  content: Hash;
+  reactSession: Hash;
+  reactTurn: Hash;
+  reactToolCall: Hash;
+};
+
+async function registerWorkflowSchemas(store: Store): Promise;
+```
+
+Idempotent: safe to call multiple times on the same store (duplicate puts return the same hashes).
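
Why duplicate puts return the same hashes follows from content addressing: the key depends only on the type hash and the payload. The sketch below is a toy model, not the library's implementation. `toyPut` and `ToyStore` are hypothetical names, and SHA-256 over `JSON.stringify` is a stand-in (an assumption made for a dependency-free example) for the XXH64 over deterministic CBOR that the README describes.

```typescript
import { createHash } from "node:crypto";

// Toy content-addressed store keyed by a digest of (typeHash, payload).
type ToyStore = Map<string, unknown>;

function toyPut(store: ToyStore, typeHash: string, payload: unknown): string {
  // Stand-in hashing: SHA-256 over the type hash followed by the JSON text of the payload.
  const key = createHash("sha256")
    .update(typeHash)
    .update(JSON.stringify(payload))
    .digest("hex")
    .slice(0, 16);
  if (!store.has(key)) store.set(key, payload); // a duplicate put changes nothing
  return key;
}

const toyStore: ToyStore = new Map();
const first = toyPut(toyStore, "meta", { type: "object" });
const second = toyPut(toyStore, "meta", { type: "object" });
console.log(first === second, toyStore.size); // true 1
```

Registering the same set of schemas twice therefore yields the same set of keys and leaves the store unchanged, which is the behavior the idempotency note relies on.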
+
+### Payload types
+
+Definition layer:
+
+```typescript
+type AgentPayload = {
+  package: string;
+  version: string;
+  config: Record;
+};
+
+type RoleSchemaPayload = Record;
+
+type RolePayload = {
+  name: string;
+  description: string;
+  systemPrompt: string;
+  extractPrompt: string;
+  schema: Hash; // cas_ref → role-schema
+};
+
+type WorkflowTransition = {
+  from: string;
+  to: string;
+  when: string | null;
+};
+
+type WorkflowPayload = {
+  name: string;
+  description: string;
+  roles: Record; // cas_ref → role
+  moderator: WorkflowTransition[];
+};
+```
+
+Execution layer:
+
+```typescript
+type ThreadStartPayload = {
+  workflow: Hash;
+  input: string;
+  depth: number;
+  parentThread: Hash | null;
+  agents: Record;
+};
+
+type ThreadStepPayload = {
+  role: string;
+  meta: Record;
+  content: Hash;
+  react: Hash;
+  start: Hash;
+  previous: Hash | null;
+};
+
+type ThreadEndPayload = {
+  returnCode: number;
+  summary: string;
+  start: Hash;
+  lastStep: Hash;
+};
+
+type ContentPayload = {
+  text: string;
+};
+```
+
+React layer:
+
+```typescript
+type ReactTurnTokens = {
+  input: number;
+  output: number;
+};
+
+type ReactSessionPayload = {
+  agent: Hash;
+  role: string;
+  turns: Hash[];
+  totalTokens: number;
+  durationMs: number;
+};
+
+type ReactTurnPayload = {
+  input: Hash;
+  output: Hash;
+  toolCalls: Hash[];
+  tokens: ReactTurnTokens;
+  latencyMs: number;
+};
+
+type ReactToolCallPayload = {
+  name: string;
+  arguments: Hash;
+  result: Hash;
+  durationMs: number;
+};
+```
+
+(`Hash` is imported from `@uncaged/json-cas` in source; consumers should import `Hash` from `@uncaged/json-cas` when typing their own code.)
+
+### Example
+
+```typescript
+import { bootstrap, createMemoryStore } from "@uncaged/json-cas";
+import {
+  registerWorkflowSchemas,
+  type WorkflowPayload,
+} from "@uncaged/json-cas-workflow";
+
+const store = createMemoryStore();
+await bootstrap(store);
+
+const schemas = await registerWorkflowSchemas(store);
+
+const workflowHash = await store.put(schemas.workflow, {
+  name: "demo",
+  description: "Example workflow",
+  roles: {},
+  moderator: [],
+} satisfies WorkflowPayload);
+
+console.log(workflowHash);
+```
+
+## Internal Structure
+
+| File | Purpose |
+|------|---------|
+| `schemas.ts` | JSON Schema definitions and `registerWorkflowSchemas` |
+| `types.ts` | Payload TypeScript types |
+| `index.ts` | Public exports |
+| `index.test.ts` | Registry and schema tests |
diff --git a/packages/json-cas/README.md b/packages/json-cas/README.md
new file mode 100644
index 0000000..c25358f
--- /dev/null
+++ b/packages/json-cas/README.md
@@ -0,0 +1,159 @@
+# @uncaged/json-cas
+
+Core CAS engine — hashing, schema, store, verify, bootstrap.
+
+## Overview
+
+`@uncaged/json-cas` is the foundation of the json-cas monorepo. It defines content-addressed nodes (`CasNode`), the `Store` interface, XXH64-based hashing with deterministic CBOR, JSON Schema registration and validation (including `cas_ref` links between nodes), bootstrap seeding, and integrity verification.
+
+Other packages build on this layer: `json-cas-fs` provides persistence, `json-cas-workflow` registers domain schemas, and `cli-json-cas` exposes store operations on the command line.
+
+**Dependencies:** `ajv`, `cborg`, `xxhash-wasm`
+
+## Installation
+
+```bash
+bun add @uncaged/json-cas
+```
+
+## API
+
+All symbols below are exported from `src/index.ts`.
+
+### Types
+
+```typescript
+/** 13-character uppercase Crockford Base32 (XXH64) */
+type Hash = string;
+
+type CasNode = {
+  type: Hash;
+  payload: T;
+  timestamp: number; // Unix epoch ms
+};
+
+type Store = {
+  put(typeHash: Hash, payload: unknown): Promise;
+  get(hash: Hash): CasNode | null;
+  has(hash: Hash): boolean;
+  listByType(typeHash: Hash): Hash[];
+};
+
+type JSONSchema = Record;
+
+type BootstrapCapableStore = Store & {
+  [BOOTSTRAP_STORE](payload: unknown): Promise;
+};
+```
+
+### Hashing
+
+```typescript
+function computeHash(typeHash: Hash, payload: unknown): Promise;
+function computeSelfHash(payload: unknown): Promise;
+function cborEncode(value: unknown): Uint8Array;
+```
+
+`computeHash` — `XXH64(utf8(typeHash) ++ CBOR(payload))` for normal nodes.
+
+`computeSelfHash` — `XXH64(CBOR(payload))` for bootstrap nodes where `type === hash`.
+
+### Bootstrap
+
+```typescript
+const BOOTSTRAP_STORE: unique symbol;
+
+async function bootstrap(store: Store): Promise;
+```
+
+Writes the meta-schema seed node (idempotent). Requires a `BootstrapCapableStore` (e.g. from `createMemoryStore()`).
+
+### Schema
+
+```typescript
+class SchemaValidationError extends Error;
+
+async function putSchema(store: Store, jsonSchema: JSONSchema): Promise;
+function getSchema(store: Store, typeHash: Hash): JSONSchema | null;
+function validate(store: Store, node: CasNode): boolean;
+function refs(store: Store, node: CasNode): Hash[];
+function walk(
+  store: Store,
+  rootHash: Hash,
+  visitor: (hash: Hash, node: CasNode) => void,
+): void;
+```
+
+- `putSchema` — stores a schema typed by the meta-schema; returned hash is the `typeHash` for conforming payloads.
+- `refs` — collects all `format: "cas_ref"` values in the payload per schema shape.
+- `walk` — BFS from `rootHash`, following `cas_ref` edges; cycles are visited once.
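
The traversal order described for `walk` (breadth-first, each node visited once even when references form a cycle) can be sketched on a plain adjacency map. This is an illustration only, not the library's code: `walkBfs` and the `edges` map are hypothetical stand-ins for a store combined with `refs`, and the single-letter keys stand in for real hashes.

```typescript
type Hash = string;

// Hypothetical BFS over an adjacency map; `edges.get(h)` plays the role of refs() for node h.
function walkBfs(edges: Map<Hash, Hash[]>, root: Hash, visit: (hash: Hash) => void): void {
  const seen = new Set<Hash>([root]); // marks nodes already queued, so cycles end
  const queue: Hash[] = [root];
  while (queue.length > 0) {
    const current = queue.shift()!;
    visit(current);
    for (const next of edges.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
}

const edges = new Map<Hash, Hash[]>([
  ["A", ["B", "C"]],
  ["B", ["C"]],
  ["C", ["A"]], // cycle back to the root
]);
const order: Hash[] = [];
walkBfs(edges, "A", (h) => order.push(h));
console.log(order.join(" ")); // A B C
```

Marking a node as seen when it is queued, rather than when it is visited, keeps a node that is referenced from several parents from being queued more than once.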
+
+### Store
+
+```typescript
+function createMemoryStore(): BootstrapCapableStore;
+```
+
+In-memory `Store` with type indexing, suitable for tests and ephemeral use.
+
+### Verify
+
+```typescript
+async function verify(hash: Hash, node: CasNode): Promise;
+```
+
+Recomputes hash from `node` and compares to `hash` (self-referencing vs normal rules).
+
+### Example
+
+```typescript
+import {
+  bootstrap,
+  createMemoryStore,
+  putSchema,
+  refs,
+  validate,
+  walk,
+} from "@uncaged/json-cas";
+
+const store = createMemoryStore();
+const metaHash = await bootstrap(store);
+
+const personType = await putSchema(store, {
+  type: "object",
+  properties: {
+    name: { type: "string" },
+    friend: { type: "string", format: "cas_ref" },
+  },
+  required: ["name"],
+  additionalProperties: false,
+});
+
+const aliceHash = await store.put(personType, { name: "Alice" });
+const bobHash = await store.put(personType, {
+  name: "Bob",
+  friend: aliceHash,
+});
+
+const bob = store.get(bobHash)!;
+console.log(validate(store, bob)); // true
+console.log(refs(store, bob)); // [aliceHash]
+walk(store, bobHash, (h) => console.log(h)); // bobHash, aliceHash
+```
+
+## Internal Structure
+
+| File | Purpose |
+|------|---------|
+| `types.ts` | `Hash`, `CasNode`, `Store` |
+| `hash.ts` | `computeHash`, `computeSelfHash` |
+| `cbor.ts` | Deterministic CBOR encoding |
+| `bootstrap-capable.ts` | `BOOTSTRAP_STORE` symbol and capability check |
+| `bootstrap.ts` | Meta-schema seed and `bootstrap()` |
+| `store.ts` | `createMemoryStore()` |
+| `mem-store.ts` | Alternate in-memory store (tests only; not exported) |
+| `schema.ts` | Schema put/get/validate, `refs`, `walk` |
+| `verify.ts` | Node integrity verification |
+| `index.ts` | Public exports |
+
+Tests live in `src/*.test.ts` and `tests/`.