Product · SynOI Vault
Testing
The store underneath SynOI.
Content-addressed. Encrypted. Signed.
The Vault is the byte-store every other SynOI product writes to. The Gateway's LLM response cache lives here. Decision Receipts are filed here. Retrieval-augmented generation runs against it. Two retention tiers (working memory and authoritative canonical) share one substrate, one identity model, one cryptographic boundary. Built on an open, content-addressed protocol.
Architecture
Two retention tiers. One substrate.
Memory tier
Working set
- Bounded size, LRU + adaptive eviction
- Semantic reuse via signature similarity
- Sub-millisecond hot reads
- Backs: Gateway cache, KB lookups, conversation context
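A bounded working set with LRU eviction can be sketched in a few lines. This is a minimal illustration only: the `LruCache` class and its method names are hypothetical, and it covers plain LRU, not the adaptive eviction or signature-similarity reuse listed above.

```typescript
// Hypothetical bounded cache with plain LRU eviction (not the Vault's actual code).
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Relying on `Map` insertion order keeps the sketch short; a production cache would more likely track size in bytes than in entry count.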
Canonical tier
Authoritative truth
- Append-only, never evicted
- Full version history + supersession chains
- Point-in-time queries on every record
- Backs: Decision Receipts, audit evidence, policy packs
CDROs flow between tiers by policy. A working-tier object endorsed by a HITL approval can be promoted to canonical and gain a version chain. A canonical object that's been superseded keeps its history; lookups by content hash still find it.
What you get
Substrate that does its job.
Content-addressed identity (OID)
Every record has a deterministic identity derived from its contents. Same content → same ID. If two writers compute the same OID, they hold the same record.
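As a rough sketch of content-derived identity: hash a canonical serialization of the record. The hash function (SHA-256), the sorted-key JSON canonicalization, and the names `canonicalize` and `computeOid` are all assumptions made for illustration; the page does not specify how OIDs are actually derived.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so logically equal records yield identical bytes.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const keys = Object.keys(value).sort();
  const fields = keys.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k] as Json)}`);
  return `{${fields.join(",")}}`;
}

// Assumption: the OID is the hex SHA-256 digest of the canonical form.
function computeOid(record: Json): string {
  return createHash("sha256").update(canonicalize(record), "utf8").digest("hex");
}
```

Under these assumptions, `computeOid({ a: 1, b: 2 })` and `computeOid({ b: 2, a: 1 })` return the same string, while any change to a value yields a different one.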
Encryption at rest (AES-256-GCM)
Per-record IV, 16-byte auth tag, AAD-bound to OID + tenant. A row spliced into the wrong place fails decryption on read. The master key is stored in the OS keychain (production) or a restricted local file (dev).
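The AAD binding can be illustrated with Node's built-in `crypto` module. This is a sketch under assumptions: the `SealedRecord` shape, the AAD byte layout, and the function names `seal` and `unseal` are hypothetical, and key management is left out entirely.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface SealedRecord {
  iv: Buffer;
  tag: Buffer;
  ciphertext: Buffer;
}

// Hypothetical AAD layout. The tenant is length-prefixed so that
// different (tenant, oid) pairs cannot produce the same bytes.
function aadFor(oid: string, tenant: string): Buffer {
  return Buffer.from(`${tenant.length}:${tenant}|${oid}`, "utf8");
}

function seal(key: Buffer, oid: string, tenant: string, plaintext: Buffer): SealedRecord {
  const iv = randomBytes(12); // fresh 96-bit IV for every record
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  cipher.setAAD(aadFor(oid, tenant));
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  // getAuthTag() returns a 16-byte tag by default.
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

function unseal(key: Buffer, oid: string, tenant: string, sealed: SealedRecord): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAAD(aadFor(oid, tenant));
  decipher.setAuthTag(sealed.tag);
  // final() throws if the tag does not match the ciphertext and AAD.
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]);
}
```

Because the OID and tenant are authenticated but not stored in the ciphertext, a row read back under a different OID or tenant fails tag verification in `unseal` instead of returning plaintext.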
Compression (zstd)
Native zstd at level 3 shrinks the on-disk cache 3–5× on LLM responses. Planned: a dictionary trained on your corpus, targeting 8–12×.
Semantic retrieval (cosine similarity)
Vector similarity search built in. Find related records by meaning, not just exact key. Powers retrieval-augmented generation without bolting a separate vector DB onto your stack. Current implementation is linear cosine scan; ANN indexing (HNSW) is on the roadmap.
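A linear cosine scan of the kind described is straightforward to sketch. The `VectorRecord` shape and the `topK` function are hypothetical names for illustration; how the Vault stores embeddings is not stated here.

```typescript
interface VectorRecord {
  oid: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] as number;
    const y = b[i] as number;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every record against the query, then keep the k best matches.
function topK(query: number[], records: VectorRecord[], k: number): Array<{ oid: string; score: number }> {
  return records
    .map((r) => ({ oid: r.oid, score: cosineSimilarity(query, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The scan is O(n) per query in the number of records, which is the cost an ANN index such as HNSW is meant to reduce.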
Multi-tenant scoping
Per-tenant namespace isolation at the storage layer. One tenant's data is never visible to another, even when the content is identical.
Point-in-time queries
Time-travel built into the canonical tier. Ask "what did the system know on March 12?" as a first-class query, not an archaeology exercise.
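One way to picture a point-in-time query over an append-only version chain is shown below. The `Version` shape, the `validFrom` field, and the `asOf` function are assumptions for illustration, not the Vault's actual API.

```typescript
interface Version<T> {
  oid: string;
  validFrom: Date;
  value: T;
}

// Assumes the chain is append-only and ordered by validFrom, oldest first.
// Returns the version that was current at the given instant, if any.
function asOf<T>(chain: readonly Version<T>[], at: Date): Version<T> | undefined {
  let found: Version<T> | undefined;
  for (const version of chain) {
    if (version.validFrom.getTime() <= at.getTime()) {
      found = version;
    } else {
      break; // every later version starts after the requested instant
    }
  }
  return found;
}
```

Since superseded versions are never deleted, asking for March 12 simply returns the last version written on or before that date.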
Lineage + provenance
Every record carries who wrote it, when, by which authority, and what it's derived from. Decision Receipts get richer for free.
Local-first, optional cloud sync
Self-host everything. Customers on a paid plan can opt into cloud replication of canonical records (encrypted at edge, never plaintext over the wire). Free tier is local-only forever.
Substrate
Built on an open, content-addressed protocol.
An open, content-addressed protocol sits underneath: content-derived identity, multi-signature retrieval, filter-first lookup. The Vault is SynOI's reference implementation, written in TypeScript, backed by SQLite + zstd + AES-GCM today, designed to migrate cleanly onto content-addressable storage hardware when it becomes practical. The API surface survives the substrate.
Where Vault is used