Merge pull request 'docs: knowledge cards — 8 core concept cards' (#15) from docs/knowledge-cards into main

docs: knowledge cards — 8 core concept cards (#15)
This commit was merged in pull request #15.
2026-06-01 08:52:32 +00:00
8 changed files with 452 additions and 0 deletions
+41
@@ -0,0 +1,41 @@
---
title: Bootstrap
aliases: [Meta-Schema, Self-Reference, 引导]
tags: [concept, architecture]
related: [Schema, Store, Content Addressing]
---
# Bootstrap
Bootstrap solves the chicken-and-egg problem of a typed content-addressed store: every node needs a schema, but schemas are themselves nodes — so what is the schema of the first schema?
## The Self-Referencing Meta-Schema
The answer is a **self-referencing node** — a node whose `type` field points to its own hash:
```
meta-schema node:
hash: X
type: X ← points to itself
payload: { ... } ← defines what a valid schema looks like
```
This is the only node in the entire store where `type === hash`. It is the root of the type hierarchy — every schema's type chain eventually leads back to it.
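The type chain described above can be sketched as a loop that follows `type` links until it reaches the self-referencing node. This is a minimal illustration, not the OCAS implementation: `typeChain` is a hypothetical helper, the store is modeled as a plain map from node hash to type hash, and the hash strings are placeholders rather than real 13-character hashes.

```typescript
type Hash = string;

// Follow type links until reaching the self-referencing node (type === hash).
// `types` is a hypothetical in-memory model: node hash -> hash of its type.
function typeChain(types: Map<Hash, Hash>, start: Hash): Hash[] {
  const chain: Hash[] = [start];
  let current = start;
  while (true) {
    const next = types.get(current);
    if (next === undefined) throw new Error(`unknown node: ${current}`);
    if (next === current) return chain; // reached the meta-schema
    if (chain.indexOf(next) !== -1) throw new Error("cycle without a self-reference");
    chain.push(next);
    current = next;
  }
}

// Placeholder hashes, for illustration only.
const types = new Map<Hash, Hash>([
  ["META", "META"],       // meta-schema: type points to itself
  ["SCHEMA_A", "META"],   // a user schema, typed by the meta-schema
  ["NODE_1", "SCHEMA_A"], // a data node, typed by the user schema
]);

const chain = typeChain(types, "NODE_1");
console.log(chain);
```

Every chain ends at the same node, which is why the meta-schema can serve as the single root of the type hierarchy.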
## What Bootstrap Registers
`bootstrap(store)` is called each time a [[Store]] is opened and is idempotent. It registers:
1. **The meta-schema** — defines the structure of all schemas (allowed JSON Schema keywords)
2. **Primitive type schemas** — `@ocas/string`, `@ocas/number`, `@ocas/object`, `@ocas/array`, `@ocas/bool`
3. **Output schemas** — 18 `@ocas/output/*` schemas for [[Render System|CLI envelope]] types
All of these are stored as regular CAS nodes, addressable by hash. The `@ocas/*` aliases are convenience names resolved at runtime.
## Idempotency
Calling `bootstrap()` multiple times returns the same hashes — because the payloads are identical, [[Content Addressing]] guarantees the same hashes. This means `openStore()` can safely bootstrap on every open without worrying about duplicates.
## Why It Matters
Bootstrap is what makes OCAS fully self-describing. There is no external type registry, no hardcoded special cases (except the single self-reference). A store file contains everything needed to understand its own data — schemas, types, and the meta-schema that defines them all.
+83
@@ -0,0 +1,83 @@
---
title: CLI
aliases: [ocas command, 命令行]
tags: [api]
related: [Store, Render System, Variable, Schema]
---
# CLI
The `ocas` CLI is the primary interface for interacting with an OCAS [[Store]]. All commands output JSON in the [[Render System|envelope]] format (`{ type, value }`), making them composable via pipes.
## Configuration
| Priority | Source | Example |
|----------|--------|---------|
| 1 | `--home <path>` flag | `ocas --home /tmp/mystore put ...` |
| 2 | `OCAS_HOME` env var | `export OCAS_HOME=/data/ocas` |
| 3 | Default | `~/.ocas` |
The variable database lives at `<home>/variables.db` by default, overridable with `--var-db <path>`.
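The precedence rules above can be sketched as two small functions. This is an illustration under assumptions, not the CLI's actual code: `resolveHome` and `resolveVarDb` are hypothetical names, the inputs are passed in explicitly instead of being read from the process, and an empty string is treated the same as an unset value.

```typescript
// Hypothetical input shape for the three sources in the table above.
type HomeSources = {
  flag?: string;    // value of --home, if given
  envHome?: string; // value of OCAS_HOME, if set
  userHome: string; // the user's home directory, used to expand ~/.ocas
};

// Highest priority first: flag, then environment variable, then default.
function resolveHome(src: HomeSources): string {
  if (src.flag) return src.flag;
  if (src.envHome) return src.envHome;
  return `${src.userHome}/.ocas`;
}

// --var-db overrides the default <home>/variables.db.
function resolveVarDb(home: string, varDbFlag?: string): string {
  return varDbFlag ? varDbFlag : `${home}/variables.db`;
}

const home = resolveHome({ envHome: "/data/ocas", userHome: "/home/alice" });
console.log(home, resolveVarDb(home));
```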
## Commands
### CAS Operations
```bash
ocas put <type> <file|--pipe> # store a node, returns its hash
ocas get <hash> # retrieve a node
ocas has <hash> # check existence
ocas hash <type> <file|--pipe> # compute hash without storing
ocas verify <hash> # check integrity + schema validity
ocas refs <hash> # list direct ocas_ref edges
ocas walk <hash> # recursive DAG traversal
ocas list --type <hash|alias> # list nodes by type
ocas list-schema # list all schema hashes
ocas list-meta # list meta-schema hashes
ocas gc # garbage collection
```
### [[Variable]] Management
```bash
ocas var set <name> <hash> [--tag key:value] [--tag label]
ocas var get <name> --schema <hash>
ocas var delete <name> [--schema <hash>]
ocas var list [prefix] [--schema <hash>] [--tag ...]
ocas var tag <name> --schema <hash> <operations...>
```
### [[Render System|Template & Render]]
```bash
ocas template set <schema-hash> <file|--inline text>
ocas template get <schema-hash>
ocas template list
ocas template delete <schema-hash>
ocas render <hash> [--resolution n] [--decay n] [--epsilon n]
ocas render --pipe/-p [options]
```
## Flags
| Flag | Description |
|------|-------------|
| `--home <path>` | Store directory |
| `--var-db <path>` | Variable database path |
| `--json` | Compact JSON output (no pretty-printing) |
| `--pipe`, `-p` | Read from stdin (`put`/`hash`: raw JSON; `render`: envelope) |
| `--schema <hash>` | Schema filter for var commands |
| `--tag <expr>` | Tag/label operations (repeatable) |
| `--render`, `-r` | Render output inline (equivalent to piping to `ocas render -p`) |
| `--inline <text>` | Inline text content for `template set` |
| `--format tree` | Tree display for `walk` |
## Type Aliases
The CLI resolves `@ocas/*` aliases to hashes automatically:
```bash
ocas put @ocas/object data.json # resolves @ocas/object → hash
ocas put @ocas/schema schema.json # auto-routes to putSchema()
ocas list --type @ocas/schema # list all schemas
```
+58
@@ -0,0 +1,58 @@
---
title: Content Addressing
aliases: [CAS, 内容寻址]
tags: [concept]
related: [Schema, Store]
---
# Content Addressing
Content addressing is the foundational principle of OCAS: a piece of data is identified by the hash of its content, not by a location or a user-chosen key.
## How It Works
Data goes in → deterministic hash comes out → that hash **is** the address.
```
payload: { "name": "Alice" }
↓ CBOR encode → XXH64 → Crockford Base32
hash: A0QKG4ERMXSFG (13 characters, always)
```
Same content always produces the same hash. Different content produces a different hash, except in the unlikely event of a collision in the 64-bit digest.
## Hash
OCAS hashes are 13-character uppercase strings using **Crockford Base32** encoding over a **XXH64** digest.
- Input: CBOR-encoded `{ type, payload }` (timestamp excluded — same logical data = same hash)
- Output: `[0-9A-HJKMNP-TV-Z]{13}` (e.g. `A0QKG4ERMXSFG`)
- Crockford Base32 excludes ambiguous characters (I, L, O, U), making hashes safe for copy-paste and verbal communication
The `Hash` type is defined as a plain string — no wrapper object, no overhead.
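Because a hash is a plain string, a format check is the only cheap way to reject malformed input before a lookup. The sketch below uses the pattern given above; `isHash` is a hypothetical helper name, not a confirmed part of the OCAS API.

```typescript
type Hash = string;

// Pattern from the Output rule above: 13 Crockford Base32 characters, uppercase.
const HASH_PATTERN = /^[0-9A-HJKMNP-TV-Z]{13}$/;

// Checks the shape only; it does not prove that a node with this hash exists.
function isHash(value: string): boolean {
  return HASH_PATTERN.test(value);
}

console.log(isHash("A0QKG4ERMXSFG")); // well-formed
console.log(isHash("A0QKG4ERMXSFI")); // contains the excluded letter I
```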
## Node
Every piece of data stored in OCAS is a **Node** — a three-field record:
```typescript
type CasNode = {
type: Hash; // hash of the node's schema (points to a Schema node)
payload: unknown; // the actual data
timestamp: number; // Unix epoch ms, set on first store
};
```
- `type` links the node to its [[Schema]], forming a typed DAG
- `payload` is the user's data, validated against the schema on write
- `timestamp` is metadata only — it is **not** included in hash computation, so storing the same logical data twice returns the same hash
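The effect of leaving `timestamp` out of the hash can be shown with a toy store. This is a sketch under loud assumptions: `JSON.stringify` stands in for CBOR encoding and a 32-bit FNV-1a-style function stands in for XXH64, so the resulting strings are not real OCAS hashes. `ToyStore` and `toyHash` are hypothetical names.

```typescript
type Hash = string;
type CasNode = { type: Hash; payload: unknown; timestamp: number };

// Stand-in hash (NOT XXH64): FNV-1a-style mixing over UTF-16 code units.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

// Only { type, payload } is hashed; the timestamp is deliberately left out.
// JSON.stringify stands in for CBOR and is only stable for a fixed key order.
function toyHash(type: Hash, payload: unknown): Hash {
  return fnv1a(JSON.stringify({ type, payload }));
}

class ToyStore {
  private nodes = new Map<Hash, CasNode>();

  put(type: Hash, payload: unknown, now: number): Hash {
    const hash = toyHash(type, payload);
    // Existing node: no-op, so the first timestamp is preserved.
    if (!this.nodes.has(hash)) this.nodes.set(hash, { type, payload, timestamp: now });
    return hash;
  }

  get(hash: Hash): CasNode | null {
    return this.nodes.get(hash) ?? null;
  }

  get size(): number {
    return this.nodes.size;
  }
}

const store = new ToyStore();
const first = store.put("T", { name: "Alice" }, 1000);
const second = store.put("T", { name: "Alice" }, 2000);
console.log(first === second, store.size);
```

Storing the same data at two different times yields one node, and that node keeps the timestamp of the first write.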
## Immutability
Once stored, a node can never change — modifying the payload would change the hash, creating a different node. This gives OCAS several properties for free:
- **Deduplication** — identical data is stored exactly once
- **Integrity verification** — recompute the hash to check for corruption
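- **Safe sharing** — a hash is a self-verifying reference: if you have the hash, you can recompute it to confirm you received exactly the data it names. Because XXH64 is a non-cryptographic hash, this guards against accidental corruption, not deliberate forgery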
Mutability is handled at a higher layer by [[Variable]] — mutable pointers to immutable nodes.
+39
@@ -0,0 +1,39 @@
---
title: Garbage Collection
aliases: [GC, 垃圾回收]
tags: [concept, api]
related: [Variable, Schema, Store]
---
# Garbage Collection
OCAS uses **mark-and-sweep** garbage collection to reclaim storage from nodes that are no longer referenced.
## Algorithm
1. **Roots** — all [[Variable]] values are roots
2. **Mark** — from each root, recursively `walk()` the DAG via [[Schema|ocas_ref]] edges, marking every reachable node and its schema
3. **Schema chain preservation** — walk each reachable node's type chain upward to preserve the meta-schema hierarchy
4. **Bootstrap preservation** — self-referencing nodes (where `type === hash`) are always kept alive
5. **Sweep** — delete all unmarked nodes
```bash
ocas gc
# → { total, reachable, collected, scanned }
```
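The five steps of the algorithm can be sketched on an in-memory graph. This is an illustration, not the OCAS implementation: `collectGarbage` and `ToyNode` are hypothetical names, the `refs` array stands in for the `ocas_ref` edges extracted from a payload, and the hash strings are placeholders. Following each node's `type` link during marking covers both schema marking and schema chain preservation.

```typescript
type Hash = string;
type ToyNode = { type: Hash; refs: Hash[] };

// Marks everything reachable from the roots, then deletes the rest.
// Returns the collected hashes in sorted order.
function collectGarbage(nodes: Map<Hash, ToyNode>, roots: Hash[]): Hash[] {
  const marked = new Set<Hash>();
  const pending: Hash[] = roots.slice();

  // Bootstrap preservation: self-referencing nodes are always kept.
  nodes.forEach((node, hash) => {
    if (node.type === hash) pending.push(hash);
  });

  // Mark: follow reference edges and type links.
  while (pending.length > 0) {
    const hash = pending.pop() as Hash;
    if (marked.has(hash)) continue;
    const node = nodes.get(hash);
    if (node === undefined) continue; // dangling reference, nothing to mark
    marked.add(hash);
    pending.push(node.type);
    for (const ref of node.refs) pending.push(ref);
  }

  // Sweep: delete every unmarked node.
  const collected: Hash[] = [];
  nodes.forEach((_node, hash) => {
    if (!marked.has(hash)) collected.push(hash);
  });
  for (const hash of collected) nodes.delete(hash);
  return collected.sort();
}

const nodes = new Map<Hash, ToyNode>([
  ["META", { type: "META", refs: [] }],
  ["S", { type: "META", refs: [] }],
  ["ORPHAN_S", { type: "META", refs: [] }],
  ["A", { type: "S", refs: ["B"] }],
  ["B", { type: "S", refs: [] }],
  ["C", { type: "ORPHAN_S", refs: [] }],
]);

// A single variable points at A.
const collected = collectGarbage(nodes, ["A"]);
console.log(collected);
```

Here `C` is unreachable, and `ORPHAN_S` is collected with it because no surviving node uses it as a type.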
## Key Properties
- **Global** — GC operates on the entire [[Store]], not scoped to any variable prefix. Cross-references between variables are valid and expected
- **Schema-aware** — the type (schema) of every reachable node is also marked as reachable, preventing orphaned schemas
- **Safe** — bootstrap nodes (the self-referencing meta-schema) are unconditionally preserved, so GC can never break the store's foundation
- **Deterministic** — same store state always produces the same GC result
## What Stays Alive
A node survives GC if any of the following is true:
- A [[Variable]] points to it directly
- It is reachable via `ocas_ref` edges from a variable's value
- It is the schema (type) of any reachable node
- It is a self-referencing [[Bootstrap]] node
+67
@@ -0,0 +1,67 @@
---
title: Render System
aliases: [Envelope, Template, 渲染系统]
tags: [concept, api]
related: [Schema, CLI]
---
# Render System
The render system turns typed data into human-readable output. It has three layers: the envelope format for machine consumption, templates for human formatting, and resolution decay for hierarchical rendering.
## Envelope
Every [[CLI]] command outputs a `{ type, value }` JSON envelope:
```json
{ "type": "B4K9...", "value": "A0QKG4ERMXSFG" }
```
- `type` is the hash of an `@ocas/output/*` [[Schema]] describing the value's shape
- `value` is the command's result — a hash, a node, a boolean, an array, etc.
This uniform shape means any command's output can be piped into another:
```bash
ocas put @ocas/object data.json | ocas render -p
ocas gc | ocas render -p
ocas list --type @ocas/schema | ocas render -p
```
`wrapEnvelope(store, alias, value)` creates an envelope by resolving the alias to its schema hash.
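A consumer reading piped output only needs to check for the two envelope fields. The sketch below shows one way to do that; `parseEnvelope` is a hypothetical helper rather than part of the documented API, and the type string in the example is a placeholder, not a real schema hash.

```typescript
type Envelope = { type: string; value: unknown };

// Parses one JSON document and checks that it has the { type, value } shape.
function parseEnvelope(text: string): Envelope {
  const data: unknown = JSON.parse(text);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("envelope must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  if (typeof record.type !== "string") throw new Error("envelope.type must be a string");
  if (!("value" in record)) throw new Error("envelope.value is missing");
  return { type: record.type, value: record.value };
}

// "TYPE_HASH_PLACEHOLDER" stands in for the hash of an @ocas/output/* schema.
const envelope = parseEnvelope('{ "type": "TYPE_HASH_PLACEHOLDER", "value": "A0QKG4ERMXSFG" }');
console.log(envelope.value);
```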
## Templates
Templates control how a schema's data renders as text. They use LiquidJS syntax with the node's payload available under `payload`:
```liquid
Name: {{ payload.name }}, Age: {{ payload.age }}
```
Templates are managed as [[Variable]] bindings in the `@ocas/template/text/<schema-hash>` namespace, with the template content stored as a CAS node typed `@ocas/string`.
```bash
ocas template set <schema-hash> template.liquid
ocas template get <schema-hash>
ocas template list
ocas template delete <schema-hash>
```
[[Bootstrap]] registers default templates for all `@ocas/output/*` schemas, so CLI output is always renderable out of the box.
## Resolution Decay
When rendering nodes that reference other nodes (via [[Schema|ocas_ref]]), the render system uses **resolution decay** to control recursion depth:
- `--resolution <n>` — initial detail level (default: 1.0)
- `--decay <n>` — multiplier per level (default: 0.5)
- `--epsilon <n>` — cutoff threshold (default: 0.01)
At each level, `resolution *= decay`. When resolution drops below epsilon, referenced nodes render as raw `ocas:<hash>` strings instead of being expanded. This prevents infinite expansion while giving progressively less detail at deeper levels.
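The arithmetic can be checked with a short loop. This sketch assumes the root is rendered at the initial resolution and that a level is expanded while its resolution is at least epsilon; the source does not say whether the comparison happens before or after the multiplication, so treat that ordering as an assumption. `expandedLevels` is a hypothetical name.

```typescript
type DecayOptions = { resolution: number; decay: number; epsilon: number };

// Defaults taken from the flag descriptions above.
const DEFAULTS: DecayOptions = { resolution: 1.0, decay: 0.5, epsilon: 0.01 };

// Returns the resolution at each depth that would still be expanded.
function expandedLevels(opts: DecayOptions = DEFAULTS): number[] {
  if (!(opts.decay >= 0 && opts.decay < 1)) throw new Error("decay must be in [0, 1)");
  if (!(opts.epsilon > 0)) throw new Error("epsilon must be positive");
  const levels: number[] = [];
  let resolution = opts.resolution;
  while (resolution >= opts.epsilon) {
    levels.push(resolution);
    resolution *= opts.decay; // one step deeper in the reference graph
  }
  return levels;
}

console.log(expandedLevels());
```

Under these assumptions the defaults expand seven levels (1.0 down to 0.015625); the next value, 0.0078125, falls below 0.01, so references at that depth would render as raw `ocas:<hash>` strings.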
## Render Modes
- **`ocas render <hash>`** — render a stored node by hash, expanding ocas_ref references
- **`ocas render -p`** — read a `{ type, value }` envelope from stdin and render it
- **`ocas <any-command> -r`** — inline render shortcut; any command with `-r` / `--render` automatically pipes its envelope output through the renderer, equivalent to `| ocas render -p`
- **`renderDirect(type, value, store, options)`** — in-memory render without requiring the data to be stored (used internally for CLI output)
+60
@@ -0,0 +1,60 @@
---
title: Schema
aliases: [JSON Schema, Type System, 类型系统]
tags: [concept, api]
related: [Content Addressing, Bootstrap, Store]
---
# Schema
Every node in OCAS has a type. Schemas define what shapes of data are valid, and they are themselves stored as nodes in the CAS — making the type system self-describing.
## JSON Schema Subset
OCAS schemas use a subset of JSON Schema. Supported keywords:
| Category | Keywords |
|----------|----------|
| Core | `type`, `properties`, `required`, `additionalProperties`, `items` |
| Combinators | `anyOf`, `oneOf`, `allOf`, `not` |
| Conditionals | `if`, `then`, `else` |
| String | `minLength`, `maxLength`, `pattern`, `format`, `enum`, `const` |
| Number | `minimum`, `maximum`, `exclusiveMinimum`, `exclusiveMaximum`, `multipleOf` |
| Array | `minItems`, `maxItems`, `uniqueItems`, `prefixItems`, `contains` |
| Object | `patternProperties`, `propertyNames`, `minProperties`, `maxProperties` |
| Metadata | `title`, `description`, `default`, `examples`, `deprecated`, `readOnly`, `writeOnly`, `$comment` |
Schemas are validated recursively by `isValidSchema()` before being stored, and payloads are validated against their schema by ajv on every `put()`.
## References — `ocas_ref`
Nodes can reference other nodes. A property with `format: "ocas_ref"` declares that its string value is a [[Content Addressing|Hash]] pointing to another node:
```json
{
"type": "object",
"properties": {
"author": { "type": "string", "format": "ocas_ref" }
}
}
```
When the store sees `ocas_ref`, it knows this field is a graph edge — not just a string. This enables:
- **`refs(node)`** — extract direct references from a node
- **`walk(hash, callback)`** — recursively traverse the DAG
- **[[Garbage Collection]]** — mark reachable nodes from roots
### collectRefs
`collectRefs(schema, value)` recursively walks the schema to find all `ocas_ref` values in a payload. It handles all schema structures: `properties`, `items`, `additionalProperties`, `anyOf`/`oneOf`/`allOf`, `if`/`then`/`else`, `prefixItems`, `patternProperties`, `contains`, and `not`.
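The core idea is a schema-directed walk over the payload. The sketch below handles only `properties` and `items`, a small subset of what the real function covers; `collectRefsSketch` and the `JsonSchema` type are hypothetical, and the hash strings are placeholders.

```typescript
// Only the keywords this sketch understands.
type JsonSchema = {
  type?: string;
  format?: string;
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
};

// Walks schema and value together, collecting strings marked as ocas_ref.
function collectRefsSketch(schema: JsonSchema, value: unknown, out: string[] = []): string[] {
  if (schema.format === "ocas_ref" && typeof value === "string") {
    out.push(value);
    return out;
  }
  if (schema.properties && typeof value === "object" && value !== null && !Array.isArray(value)) {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(schema.properties)) {
      if (key in record) collectRefsSketch(schema.properties[key], record[key], out);
    }
  }
  if (schema.items && Array.isArray(value)) {
    for (const item of value) collectRefsSketch(schema.items, item, out);
  }
  return out;
}

const articleSchema: JsonSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    author: { type: "string", format: "ocas_ref" },
    reviewers: { type: "array", items: { type: "string", format: "ocas_ref" } },
  },
};

const found = collectRefsSketch(articleSchema, {
  title: "Hello",
  author: "HASH_AUTHOR",
  reviewers: ["HASH_R1", "HASH_R2"],
});
console.log(found);
```

The `title` string is ignored because its schema does not declare `format: "ocas_ref"`; the walk is driven by the schema, not by what the value looks like.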
## Namespace
Alias names starting with `@ocas/` are reserved for the system. Built-in aliases:
- `@ocas/schema` — the meta-schema (self-referencing)
- `@ocas/string`, `@ocas/number`, `@ocas/object`, `@ocas/array`, `@ocas/bool` — primitive types
- `@ocas/output/*` — [[Render System|envelope]] schemas for CLI output
Users define their own schemas freely — the only restriction is the `@ocas/` prefix.
+47
@@ -0,0 +1,47 @@
---
title: Store
aliases: [存储接口]
tags: [concept, api]
related: [Content Addressing, Bootstrap]
---
# Store
The Store is the abstract storage interface at the heart of OCAS. All operations — put, get, verify, gc — go through this interface.
## Interface
```typescript
type Store = {
put(typeHash: Hash, payload: unknown): Promise<Hash>;
get(hash: Hash): CasNode | null;
has(hash: Hash): boolean;
delete(hash: Hash): void;
listAll(): Hash[];
listByType(typeHash: Hash): Hash[];
listMeta(): Hash[];
listSchemas(): Hash[];
};
```
`put()` computes the [[Content Addressing|hash]] from `{ type, payload }`, validates the payload against its [[Schema]], and stores the [[Content Addressing|node]]. If a node with the same hash already exists, it's a no-op — content addressing gives deduplication for free.
## Implementations
### MemoryStore
In-memory `Map<Hash, CasNode>`. Used in tests and for ephemeral computation (e.g. computing a hash without persisting). Created via `createMemoryStore()`.
### FsStore
Filesystem-backed store (`@ocas/fs`). Nodes are serialized as CBOR files under a content-addressed directory tree. Created via `openStore(path)`, which:
1. Creates the directory if it doesn't exist
2. Runs [[Bootstrap]] automatically
3. Returns a ready-to-use Store
The default location is `~/.ocas`, configurable via `OCAS_HOME` environment variable or `--home` [[CLI]] flag.
## VariableStore
A companion store for [[Variable]] bindings, backed by SQLite. Created via `createVariableStore(dbPath, store)`. It sits alongside the CAS store — variables point into the CAS but live in their own database.
+57
@@ -0,0 +1,57 @@
---
title: Variable
aliases: [Mutable Binding, 变量系统]
tags: [concept, api]
related: [Content Addressing, Garbage Collection, Store]
---
# Variable
Variables are **mutable pointers to immutable data** — the same pattern as git branches pointing to commits. They bridge the gap between OCAS's immutable [[Content Addressing|content-addressed]] storage and the need for mutable state.
## Data Model
A variable is a named binding:
```typescript
type Variable = {
name: string; // e.g. "myapp/config"
schema: Hash; // type of the pointed-to node
value: Hash; // hash of the current node
created: number; // Unix epoch ms
updated: number; // Unix epoch ms
tags: Record<string, string>;
labels: string[];
};
```
The primary key is `(name, schema)` — the same name can point to nodes of different types. When using `var set`, the schema is automatically inferred from the node the hash points to, so you don't need to specify it explicitly.
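The composite key and upsert behavior can be sketched with an in-memory map. This is an illustration, not the SQLite-backed implementation: `ToyVariableStore` is a hypothetical name, the schema is passed explicitly instead of being inferred from the node, and the hash strings are placeholders.

```typescript
type Hash = string;
type Variable = {
  name: string;
  schema: Hash;
  value: Hash;
  created: number;
  updated: number;
  tags: Record<string, string>;
  labels: string[];
};

class ToyVariableStore {
  private rows = new Map<string, Variable>();

  // The primary key is the pair (name, schema).
  private key(name: string, schema: Hash): string {
    return JSON.stringify([name, schema]);
  }

  // Upsert: create the row, or update value and `updated` on an existing one.
  set(name: string, schema: Hash, value: Hash, now: number): Variable {
    const k = this.key(name, schema);
    const existing = this.rows.get(k);
    const row: Variable = existing
      ? { ...existing, value, updated: now }
      : { name, schema, value, created: now, updated: now, tags: {}, labels: [] };
    this.rows.set(k, row);
    return row;
  }

  get(name: string, schema: Hash): Variable | null {
    return this.rows.get(this.key(name, schema)) ?? null;
  }

  // Name prefix filtering, as in `var list myapp/`.
  list(prefix: string = ""): Variable[] {
    const result: Variable[] = [];
    this.rows.forEach((row) => {
      if (row.name.startsWith(prefix)) result.push(row);
    });
    return result;
  }
}

const vars = new ToyVariableStore();
vars.set("myapp/config", "SCHEMA_1", "HASH_1", 1);
vars.set("myapp/config", "SCHEMA_2", "HASH_2", 2); // same name, different type
vars.set("myapp/config", "SCHEMA_1", "HASH_3", 3); // updates the first binding
console.log(vars.list("myapp/").length);
```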
## Operations
```bash
ocas var set <name> <hash> [--tag env:prod] [--tag pinned]
ocas var get <name> --schema <hash>
ocas var delete <name> [--schema <hash>]
ocas var list [prefix] [--schema <hash>] [--tag env:prod]
ocas var tag <name> --schema <hash> status:active pinned :archived
```
`var set` is an upsert — creates or updates. Name prefix filtering replaces the old scope concept: `var list myapp/` returns all variables under `myapp/`.
## Tags and Labels
Tags and labels share a unified command (`var tag`):
- **Tags** are `key:value` pairs — setting a tag replaces any existing tag with the same key (e.g. `env:prod` replaces `env:dev`)
- **Labels** are bare strings (e.g. `pinned`, `important`)
- **Deletion** uses a `:` prefix (e.g. `:status` deletes the `status` tag, `:pinned` deletes the `pinned` label)
- Tag keys and label names share a namespace — you can't have both `status:active` and a `status` label
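The rules above can be sketched as a function that applies a list of `var tag` operations to the current tags and labels. `applyTagOps` is a hypothetical name, and rejecting a name collision with an error is an assumption; the source only says that a tag key and a label cannot share a name.

```typescript
type TagState = { tags: Record<string, string>; labels: string[] };

function hasKey(tags: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(tags, key);
}

// Applies operations in order: ":name" deletes, "key:value" sets a tag,
// and a bare string adds a label.
function applyTagOps(state: TagState, ops: string[]): TagState {
  const tags: Record<string, string> = { ...state.tags };
  let labels = state.labels.slice();
  for (const op of ops) {
    if (op.startsWith(":")) {
      const name = op.slice(1);
      delete tags[name];
      labels = labels.filter((label) => label !== name);
      continue;
    }
    const sep = op.indexOf(":");
    if (sep === -1) {
      // Assumption: a label that collides with a tag key is an error.
      if (hasKey(tags, op)) throw new Error(`"${op}" is already a tag key`);
      if (labels.indexOf(op) === -1) labels.push(op);
    } else {
      const key = op.slice(0, sep);
      if (labels.indexOf(key) !== -1) throw new Error(`"${key}" is already a label`);
      tags[key] = op.slice(sep + 1); // same key: the old value is replaced
    }
  }
  return { tags, labels };
}

const before: TagState = { tags: { env: "dev" }, labels: ["archived"] };
const after = applyTagOps(before, ["status:active", "pinned", ":archived"]);
console.log(after);
```

The operation list mirrors the `ocas var tag` example above: it sets `status:active`, adds the `pinned` label, and removes `archived`.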
## Namespace Protection
Variable names starting with `@ocas/` are reserved for internal use (e.g. `@ocas/template/text/<hash>` for [[Render System|template]] storage). The CLI rejects user attempts to write to this namespace.
## Role in Garbage Collection
Variables are the **roots** of [[Garbage Collection]]. Any node reachable from a variable's value hash (via [[Schema|ocas_ref]] edges) is kept alive; unreachable nodes are swept.