
Wax is a single‑file memory engine for AI agents.

Documents, embeddings, full‑text indexes and a structured knowledge graph — all bundled into one .wax binary that lives on your device. No server, no cloud, no Docker. You move it like any other file.

Written in Swift 6, tuned for Apple Silicon · Hybrid search: BM25 text + vector · Crash‑safe via WAL + dual headers · Talks to agents over MCP · Apache‑2.0

Everything below is drawn from the source tree at /Users/tmk/dev/playground/Wax. Claims carry file:line citations so you can verify them. Where the README and the code disagree, this page follows the code and says so.

01 What is this thing, really?

Strip away the marketing and Wax is three databases hiding inside one file, plus a coordinator that keeps them in sync.

Most retrieval‑augmented (RAG) setups bolt together a document store, a vector database, and a text search server — three services to deploy and keep consistent. Wax collapses all of that into a single self‑contained binary file with a .wax extension README.md:29. Opening that file gives you four capabilities at once:

📄 A document store

Raw text/blobs are written as compressed frames (LZ4 / LZFSE / deflate / plain). Each frame carries metadata, checksums, and parent‑child links so a document and its chunks stay related. FileFormat.md:88–110

🔤 A full‑text index

An embedded SQLite FTS5 virtual table (frames_fts, unicode61 tokenizer) provides BM25 lexical search. FTS5Schema.swift:8

🧭 A vector index

Embeddings (default MiniLM, 384‑dim MiniLMEmbedder.swift:28) are searched on the GPU for semantic recall. (See the honesty note about "HNSW" below.)

🕸 A knowledge graph

A built‑in EAV / entity‑fact store with bitemporal facts lives in the same SQLite blob as FTS5 — durable, queryable knowledge, not just blobs of text. StructuredMemory.md:7

How a query works

A single search fans out to both the BM25 text index and the vector index, then merges the two ranked lists with Reciprocal Rank Fusion (RRF) — a rank‑based merge that ignores the incompatible raw score scales of BM25 vs. cosine distance HybridSearch.swift:1–30. The weighting (alpha: 0 = all vector, 1 = all text) can even adapt to the query type — factual queries lean on text (0.7/0.3), semantic ones lean on vectors (0.3/0.7), temporal ones pull in a recency signal AdaptiveFusionConfig.swift:22–25.
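The fusion step can be sketched in a few lines. This is a minimal illustration, assuming the common RRF formula (each list contributes weight / (k + rank)) with the conventional constant k = 60; the function name, the value of k, and the way alpha weights each list are assumptions for illustration, not Wax's actual code.

```python
def rrf_fuse(text_ranked, vector_ranked, alpha=0.5, k=60):
    """Fuse two ranked ID lists with weighted Reciprocal Rank Fusion.

    alpha follows the convention in the text: 0 = all vector, 1 = all text.
    k = 60 is the conventional RRF damping constant (an assumption here).
    """
    scores = {}
    for weight, ranked in ((alpha, text_ranked), (1.0 - alpha, vector_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            # Only the rank matters; raw BM25 or cosine scores are never used.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]      # best lexical match first
vector_hits = ["c", "a", "d"]    # best semantic match first
fused = rrf_fuse(bm25_hits, vector_hits, alpha=0.7)  # "factual" weighting
print(fused)
```

Because only ranks enter the formula, a document that appears near the top of both lists outranks one that is first in a single list but absent from the other.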


Honesty note — "HNSW" vs the code

The README's architecture diagram labels the vector index a "Metal HNSW Index" README.md:317,337. The shipping engine is not HNSW. MetalVectorEngine describes itself as "Metal‑backed cosine similarity search" and runs SIMD cosine‑distance kernels (cosineDistanceKernelSIMD4/8) followed by a GPU topKReduceDistances pass MetalVectorEngine.swift:6,164,189,206. That is an exhaustive (brute‑force) exact search accelerated on the GPU, not an approximate HNSW graph. A tree‑wide search for "HNSW" finds zero matches in Sources/. For the scale Wax targets (thousands → low millions of vectors on an M‑series chip) brute‑force GPU search is genuinely fast — but it is exact, not approximate, and the README mislabels it.
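To make "exhaustive exact search" concrete, here is a CPU-only sketch of the same idea: compute the cosine distance from the query to every stored vector, then keep the k smallest. It is an illustration in plain Python, not the Metal kernels; the function names are hypothetical and vectors are assumed to be non-zero.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Scan every vector (no graph, no approximation) and return the k nearest."""
    scored = ((cosine_distance(query, vec), vec_id) for vec_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


index = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
nearest = exact_top_k([1.0, 0.1], index, k=2)
print([vec_id for _, vec_id in nearest])
```

The cost is linear in the number of vectors, which is why the approach depends on GPU parallelism at larger scales; in exchange, the result is always the true nearest set.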

02 The confusing words, decoded

Wax's docs assume you already know a dozen storage‑engine terms.

03 Anatomy of a .wax file

Offsets and sizes are from the format spec. FileFormat.md

[Diagram: the regions of a .wax file. Top‑to‑bottom = low‑to‑high file offset. Amber = crash‑safety metadata · blue = search indexes.]

Why this shape? Crash‑safety by construction

The two design tricks that make a single mutable file safe to crash on:

  1. Dual header pages (A/B). Two 4 KiB headers sit at offset 0 and 4 KiB. Writes alternate between them; on open, Wax reads both and picks the one with the highest generation that still passes its SHA‑256 checksum. A crash mid‑header‑write can never corrupt the store — worst case you fall back to the previous generation. FileFormat.md:52–56
  2. Write‑ahead log (WAL) ring buffer. Every mutation — putFrame, deleteFrame, supersedeFrame, putEmbedding — is appended to a fixed‑size circular log (default 256 MiB) before it touches the main structure. On the next open, pending records are replayed; corrupt tail records are tolerated. WALAndCrashRecovery.md:24–48,62–72
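The header-selection rule in item 1 can be sketched as follows. The page layout used here (an 8-byte little-endian generation at the start, a SHA‑256 digest in the last 32 bytes) is a hypothetical stand-in chosen for the example; the real layout is defined in FileFormat.md.

```python
import hashlib
import struct

PAGE_SIZE = 4096
DIGEST_SIZE = 32  # SHA-256


def make_page(generation, payload=b""):
    """Build a hypothetical header page: generation + payload + trailing digest."""
    body = (struct.pack("<Q", generation) + payload).ljust(PAGE_SIZE - DIGEST_SIZE, b"\x00")
    return body + hashlib.sha256(body).digest()


def parse_generation(page):
    """Return the generation if the page's checksum verifies, else None."""
    if len(page) != PAGE_SIZE:
        return None
    body, digest = page[:-DIGEST_SIZE], page[-DIGEST_SIZE:]
    if hashlib.sha256(body).digest() != digest:
        return None
    return struct.unpack_from("<Q", body)[0]


def select_header(page_a, page_b):
    """Pick the valid page with the highest generation."""
    candidates = []
    for name, page in (("A", page_a), ("B", page_b)):
        generation = parse_generation(page)
        if generation is not None:
            candidates.append((generation, name))
    if not candidates:
        raise ValueError("both header pages failed their checksum")
    return max(candidates)


older, newer = make_page(7), make_page(8)
intact = select_header(older, newer)

torn = bytearray(newer)
torn[100] ^= 0xFF  # simulate a crash that damaged the page being written
after_crash = select_header(older, bytes(torn))
print(intact, after_crash)
```

The damaged page fails its checksum and is ignored, so the store opens at the previous generation instead of failing.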

Every layer — WAL records, header pages, TOC, footer — carries a SHA‑256 checksum, and the TOC additionally keeps a 32‑byte Merkle root. Corruption is detectable at every level. WALAndCrashRecovery.md:99–108 FileFormat.md:71
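Per-record checksums are what make a damaged WAL tail tolerable: replay can stop at the first record that fails verification. The sketch below uses a simplified linear log with a hypothetical record layout (4-byte length, payload, SHA‑256 of the payload); it does not model the ring buffer, sequence numbers, or checkpoints described in WALAndCrashRecovery.md.

```python
import hashlib
import struct

DIGEST_SIZE = 32


def encode_record(payload):
    """Hypothetical record layout: length prefix, payload, SHA-256 of the payload."""
    return struct.pack("<I", len(payload)) + payload + hashlib.sha256(payload).digest()


def replay(log):
    """Yield payloads in order, stopping at the first torn or corrupt record."""
    offset = 0
    while offset + 4 <= len(log):
        (length,) = struct.unpack_from("<I", log, offset)
        end = offset + 4 + length + DIGEST_SIZE
        if end > len(log):
            break  # record was only partially written
        payload = log[offset + 4 : offset + 4 + length]
        if hashlib.sha256(payload).digest() != log[end - DIGEST_SIZE : end]:
            break  # checksum mismatch: treat the rest of the log as garbage
        yield payload
        offset = end


log = encode_record(b"putFrame 1") + encode_record(b"putEmbedding 1")
torn = log + encode_record(b"deleteFrame 1")[:-5]  # crash during the third append
recovered = list(replay(torn))
print(recovered)
```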

04 How Markdown (and any text) gets ingested

The honest, slightly surprising truth: a .md file is treated as plain UTF‑8 text. There is no Markdown parser in the ingest path.

The pipeline, step by step

1 · Read
remember(fileAt:) loads the file as Data and decodes it as UTF‑8. Non‑UTF‑8 → error. It attaches provenance metadata: source_kind=file, source_uri, source_filename, source_extension. MemoryOrchestrator+File.swift:5–35
2 · Chunk
The text is split by the token‑count chunker: default 400 target tokens with 40 tokens of overlap between consecutive chunks. OrchestratorConfig.swift:12 Tokens are real WordPiece tokens counted by a BERT tokenizer (the same family the MiniLM embedder uses), so chunk boundaries align with what the model sees. TextChunker.swift:21–58 WaxBertTokenizer/BertTokenizer.swift
3 · Embed
Each chunk is embedded into a 384‑dim vector by MiniLM (CoreML, runs on ANE/GPU). An alternative Snowflake Arctic embedder also ships. MiniLMEmbedder.swift:28,32
4 · Store
The document and its chunks become frames (parent = document, children = chunks via parentId), which are compressed and appended through the WAL. FileFormat.md:96–98
5 · Index
Chunk text → FTS5 (BM25). Chunk vector → the Metal vector index. Both become searchable after a flush. FTS5Schema.swift:8

What ingestion does NOT do

There is exactly one chunking strategy — tokenCount(target, overlap) ChunkingStrategy.swift:3–5. The ingest path has no Markdown‑aware behaviour: it does not split on headings, does not parse YAML frontmatter, does not extract [[wiki‑links]] or fenced code blocks (a tree‑wide search for "frontmatter" / "wikilink" finds nothing). Headings, links, and frontmatter are simply tokens in the text. PDFs are the one special case — they get a dedicated text extractor first. Ingest/PDFTextExtractor.swift Structured facts are not auto‑extracted from prose either; you assert them explicitly (see §08).

Try the chunker

The 400/40 token window slices text into chunks of up to 400 tokens, each overlapping the previous one by 40 tokens. The real chunker counts WordPiece tokens; for a quick estimate, assume ~1.3 tokens per word.
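A sliding token window of this kind can be sketched as below. The sketch operates on an already-tokenized list, so it sidesteps WordPiece entirely; the function name and the handling of the final short chunk are assumptions, not a description of TextChunker.swift.

```python
def chunk_tokens(tokens, target=400, overlap=40):
    """Split tokens into windows of `target` tokens that overlap by `overlap`."""
    if not 0 <= overlap < target:
        raise ValueError("overlap must be non-negative and smaller than target")
    step = target - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target])
        if start + target >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


tokens = [f"t{i}" for i in range(1000)]  # stand-in for real WordPiece tokens
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

With the defaults, the window advances 360 tokens at a time, so the last 40 tokens of one chunk are the first 40 of the next.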

05 The other Markdown story: managed projections

Separate from ingestion, the MCP broker keeps a small set of human‑editable Markdown files in two‑way sync with memory. This is the feature that actually matters for sharing knowledge.

The broker can project memory out to Markdown and re‑absorb edits back in. It manages three kinds of file under a project root AgentBrokerService+Markdown.swift:9–38:

| File | Holds | Durability |
|---|---|---|
| MEMORY.md | Durable long‑term memories, grouped by ## section | durable |
| memory/<date>.md | Daily notes / working memory | working |
| memory/DREAMS.md | A review queue of promotion candidates (checkboxes) | pending approval |

How a managed line is encoded

Each memory becomes a Markdown list item with an invisible HTML‑comment marker that base64‑encodes a JSON record (frame id, content hash, memory type, durability, confidence, session id…). The text is human‑readable; the marker is machine‑readable. BrokerMarkdownSync.swift:81–163
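The encoding idea can be illustrated with a round trip. Everything specific here is an assumption: the `wax:` prefix inside the comment, the JSON key names, and the list-item shape are placeholders, since the real marker syntax lives in BrokerMarkdownSync.swift.

```python
import base64
import json
import re

MARKER_PREFIX = "wax:"  # hypothetical prefix; the real marker syntax may differ
MARKER_RE = re.compile(r"<!--\s*" + re.escape(MARKER_PREFIX) + r"([A-Za-z0-9+/=]+)\s*-->")


def encode_line(text, record):
    """Render a list item with the record hidden in an HTML comment."""
    raw = json.dumps(record, sort_keys=True).encode("utf-8")
    payload = base64.b64encode(raw).decode("ascii")
    return f"- {text} <!-- {MARKER_PREFIX}{payload} -->"


def decode_line(line):
    """Return the hidden record, or None if the line carries no marker."""
    match = MARKER_RE.search(line)
    if match is None:
        return None
    return json.loads(base64.b64decode(match.group(1)))


# Key names below are illustrative, loosely following the fields listed above.
record = {"frame_id": 42, "content_hash": "ab12", "durability": "durable", "confidence": 0.9}
line = encode_line("Prefers tabs over spaces", record)
print(line)
print(decode_line(line))
```

A Markdown renderer hides the comment, so a reader sees only the list item, while a sync tool can recover the full record from the same line.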

The marker's fields are defined by MarkdownProjectionMarker. BrokerMarkdownSync.swift:10–48

What markdown_sync actually does

Sync is a reconciliation, not a one‑way import. For each managed file it matches Markdown lines to stored memories, first by trusted marker (frame id + content hash + memory id + source kind/path must all agree), then by content hash, and classifies every line as created / updated / unchanged / deleted. AgentBrokerService+Markdown.swift:276–370
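The two-stage matching can be sketched as a small classifier. This is a simplified model under stated assumptions: a marker is reduced to (frame id, content hash), "trusted" means the marker's hash equals the stored hash, and the hashing and data shapes are hypothetical rather than taken from the broker.

```python
import hashlib


def content_hash(text):
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def reconcile(lines, stored):
    """Classify Markdown lines against stored memories.

    lines:  list of (text, marker) where marker is (frame_id, content_hash) or None
    stored: dict mapping frame_id -> content hash of the stored memory
    """
    result = {"created": [], "updated": [], "unchanged": [], "deleted": []}
    by_hash = {digest: frame_id for frame_id, digest in stored.items()}
    matched = set()
    for text, marker in lines:
        current = content_hash(text)
        frame_id = None
        if marker is not None and stored.get(marker[0]) == marker[1]:
            frame_id = marker[0]         # stage 1: trusted marker
        elif current in by_hash:
            frame_id = by_hash[current]  # stage 2: fall back to content hash
        if frame_id is None:
            result["created"].append(text)
            continue
        matched.add(frame_id)
        kind = "unchanged" if stored[frame_id] == current else "updated"
        result[kind].append(frame_id)
    # Anything stored but no longer present in the file counts as deleted.
    result["deleted"] = sorted(set(stored) - matched)
    return result


stored = {1: content_hash("likes tea"), 2: content_hash("uses vim"), 3: content_hash("old note")}
lines = [
    ("likes tea", (1, stored[1])),    # untouched
    ("uses neovim", (2, stored[2])),  # edited by hand; marker still trusted
    ("brand new", None),              # added by hand
]
outcome = reconcile(lines, stored)
print(outcome)
```

Memory 3 has no line in the file, so it lands in `deleted`, which is the behavior the next section warns about.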

Sharp edge to remember

Because the Markdown file is treated as authoritative for the managed set, deleting a line and syncing deletes the memory. That's powerful for hand‑editing, but it means a careless three‑way git merge of MEMORY.md (one side drops lines) can delete knowledge on the next sync. --dry_run reports the created/updated/deleted counts without writing, so you can preview before committing. AgentBrokerService+Markdown.swift:316–333

DREAMS.md — promotion review

"Dreams" are candidate long‑term memories that the broker harvested from session stores and thinks are worth keeping. They're rendered as unchecked checkboxes. You tick [x] the ones to keep; on the next sync, each checked candidate runs through a promotion judge (proposePromotion) and the approved ones are written to durable memory. Unchecked = ignored. AgentBrokerService+Markdown.swift:168–248 A human stays in the loop for what becomes permanent.

06 Sharing knowledge between a work Mac and a personal Mac

The headline question. Short answer: there is no networked replication or store‑merge in the code — but there are two viable patterns, and one is much better than the other for two‑way sharing.

What the code actually provides

A tree‑wide search for store‑level merge / import / replicate / CRDT turns up nothing — the only merge() in the broker just adds up sync counts AgentBrokerService+Markdown.swift:615. There is no "combine two .wax files" command, no peer sync, no vector clocks. Concurrency control is single‑writer: a writer lease inside the process ConcurrencyModel.md:28–38, plus an advisory flock for multiple processes on the same machine ConcurrencyModel.md:92–120. flock does not coordinate across machines.
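The same-machine half of that story is easy to demonstrate with the operating system's advisory lock. The sketch below (Unix-only) takes an exclusive, non-blocking `flock` on a scratch file and shows a second opener being refused; which file Wax actually locks, and with which flags, is not shown here and the helper name is hypothetical.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Return an fd holding an exclusive advisory lock, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "demo.wax")  # scratch file, not a real store
    first = try_exclusive_lock(store)
    second = try_exclusive_lock(store)     # separate open() of the same file
    first_ok, second_ok = first is not None, second is not None
    os.close(first)                        # closing the fd releases the lock
    third = try_exclusive_lock(store)
    third_ok = third is not None
    os.close(third)

print(first_ok, second_ok, third_ok)
```

The lock lives in one kernel's tables. A copy of the file on another machine has its own, unrelated lock state, which is why this mechanism cannot protect a cloud-synced store.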

Option A — sync the binary .wax

Drop the file in iCloud Drive / Dropbox / Syncthing, or AirDrop it. The README endorses this. README.md:280,394

Good for: backup; one machine active at a time; read‑only fan‑out to other devices.

Bad for: two machines writing concurrently. The advisory lock doesn't cross machines, and the file mutates in place (WAL + headers), so a cloud syncer that copies a half‑written file — or merges two diverged copies — can produce a conflicted copy or a corrupt store. There is no merge to reconcile them.

Option B — sync the Markdown projection ✦

Export memory → MEMORY.md (+ daily notes), sync the text via git, run markdown_sync on the other Mac. WaxCLICommand.swift:34–35

Good for: genuine two‑way knowledge sharing. Markdown is diff‑friendly and merge‑friendly; git handles concurrent edits with real three‑way merges; each memory carries a stable content hash + frame id in its marker, so reconciliation is deterministic. You can review diffs before they touch either store.

Watch out: the deletion semantics from §05 — resolve git conflicts so you don't silently drop lines, and prefer --dry_run first.

Decision helper

Recommended workflow for two Macs
  1. Keep one canonical git repo of the Markdown projection (MEMORY.md + memory/*.md) — not the binary.
  2. On each Mac, point the Wax broker at a local .wax store (never shared live).
  3. End of session: markdown_export → commit & push the Markdown.
  4. Start of session on the other Mac: pull → markdown_sync --dry_run to preview → markdown_sync to absorb.
  5. Treat the binary .wax as a fast local cache/index that's always rebuildable from the Markdown of record.

This is essentially "plain‑text knowledge base in git, with Wax as the local search engine over it" — which also sidesteps the binary's single‑writer limitation entirely.

Aside — corpus_search is not cross‑machine. It searches across the broker's local per‑session stores with provenance metadata (which session a hit came from) README.md:250–256. Useful for cross‑session recall on one machine; it is not a replication or multi‑device feature.

07 Exploring what's inside a .wax file

There is no single "open this file and show me everything" inspector. But several focused tools together let you see the contents and verify integrity.

| How | Surface | What it reveals |
|---|---|---|
| wax-cli stats | CLI · MCP | Frame count + pending, generation counter, disk bytes, WAL state (write/checkpoint pos, committed/last seq, wrap & checkpoint counts), embedder identity (provider/model/dims), and feature flags (vector search, structured memory). StatsCommand.swift:46–116 |
| recall / search | CLI · MCP | Walk stored documents/chunks by querying them (text / vector / hybrid). The practical way to "list what's in here." |
| facts_query | CLI · MCP | Dump structured facts, filtered by subject and/or predicate, as of a point in time. FactsCommand.swift:165–242 |
| entity_resolve | CLI · MCP | Look up entities by key or fuzzy alias. |
| vector-health · memory_health | CLI | Vector index health / store health diagnostics. VectorHealthCommand.swift |
| markdown_export | CLI · MCP | The most human‑readable "view": project all durable memory out to MEMORY.md and read it as text. |
| Open on next launch | automatic | Integrity is verified structurally: header A/B checksum selection, footer↔TOC hash match, per‑record SHA‑256, WAL replay. A corrupt file fails loudly on open. WALAndCrashRecovery.md:62–108 |

The closest thing to a raw dump

The text + EAV indexes are a real SQLite database (FTS5 + the sm_* tables), but it lives embedded inside a frame of the .wax container — it is not a loose .sqlite file you can open with the sqlite3 CLI. The frame layout, TOC, and table schemas are fully documented in the in‑repo DocC articles (FileFormat.md, StructuredMemory.md, WALAndCrashRecovery.md, ConcurrencyModel.md) — the authoritative spec. There is no built‑in fsck/dump subcommand beyond stats and the query tools above.

08 How the EAV triples are indexed — vs Datomic / DataScript / Datalevin

The deep one. Wax models knowledge as RDF‑like (subject, predicate, object) triples — but it is not a Datalog database. It's a fixed‑schema, bitemporal triple store on SQLite B‑trees.

The actual schema

Entities and predicates are interned into dictionary tables (integer surrogate keys), and every fact is one row in sm_fact with a tagged‑union object: an object_kind discriminator (1=string … 7=entity‑ref) plus seven typed columns, with a CHECK enforcing exactly one is populated. StructuredMemorySchema.swift:7–67

CREATE TABLE sm_entity   (entity_id PK, key UNIQUE, kind, created_at_ms)
CREATE TABLE sm_predicate(predicate_id PK, key UNIQUE, created_at_ms)

CREATE TABLE sm_fact (
  fact_id PK,
  subject_entity_id  → sm_entity,
  predicate_id       → sm_predicate,
  object_kind INTEGER,            -- 1=str 2=int 3=real 4=bool 5=blob 6=time 7=entity-ref
  object_text/_int/_real/_bool/_blob/_time_ms/_entity_id,   -- exactly one, by CHECK
  version_relation,                -- sets / updates / extends / retracts
  fact_hash BLOB UNIQUE,          -- SHA-256(S,P,O) → idempotent assert
  qualifiers_hash )

CREATE TABLE sm_fact_span (    -- BITEMPORAL: two independent time axes
  span_id PK, fact_id → sm_fact,
  valid_from_ms,  valid_to_ms,     -- when the fact is TRUE in the world
  system_from_ms, system_to_ms,    -- when it was RECORDED (NULL = still asserted)
  span_key_hash UNIQUE )

CREATE TABLE sm_evidence (     -- provenance back to the source text
  source_frame_id, chunk_index, span_start_utf8, span_end_utf8,
  extractor_id, extractor_version, confidence, asserted_at_ms )
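Two properties of this schema are worth seeing in action: the unique fact hash makes assertion idempotent, and the span table lets you ask "what was true at valid time V, as we believed at system time S". The sketch below uses a deliberately reduced schema (text columns instead of interned ids and typed objects, no span_key_hash); the table names, the hash input, and the helper functions are all hypothetical.

```python
import hashlib
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE fact (
  fact_id INTEGER PRIMARY KEY,
  subject TEXT, predicate TEXT, object TEXT,
  fact_hash BLOB UNIQUE
);
CREATE TABLE fact_span (
  fact_id INTEGER REFERENCES fact(fact_id),
  valid_from_ms INTEGER, valid_to_ms INTEGER,
  system_from_ms INTEGER, system_to_ms INTEGER
);
""")


def assert_fact(s, p, o):
    """Insert the triple once; re-asserting returns the existing fact_id."""
    digest = hashlib.sha256("\x1f".join((s, p, o)).encode("utf-8")).digest()
    db.execute("INSERT OR IGNORE INTO fact (subject, predicate, object, fact_hash) "
               "VALUES (?, ?, ?, ?)", (s, p, o, digest))
    return db.execute("SELECT fact_id FROM fact WHERE fact_hash = ?", (digest,)).fetchone()[0]


def open_span(fact_id, valid_from, valid_to, system_now):
    db.execute("INSERT INTO fact_span VALUES (?, ?, ?, ?, NULL)",
               (fact_id, valid_from, valid_to, system_now))


def close_spans(fact_id, system_now):
    """Soft retraction: stop asserting the fact's live spans as of system_now."""
    db.execute("UPDATE fact_span SET system_to_ms = ? "
               "WHERE fact_id = ? AND system_to_ms IS NULL", (system_now, fact_id))


def objects_as_of(s, p, valid_at, system_at):
    rows = db.execute("""
        SELECT f.object FROM fact f JOIN fact_span sp USING (fact_id)
        WHERE f.subject = ? AND f.predicate = ?
          AND sp.valid_from_ms <= ? AND (sp.valid_to_ms IS NULL OR ? < sp.valid_to_ms)
          AND sp.system_from_ms <= ? AND (sp.system_to_ms IS NULL OR ? < sp.system_to_ms)
        ORDER BY f.object""", (s, p, valid_at, valid_at, system_at, system_at))
    return [row[0] for row in rows]


acme = assert_fact("alice", "works_at", "Acme")
open_span(acme, 0, None, 100)        # recorded at system time 100
# At system time 200 we learn that Alice moved at valid time 150.
close_spans(acme, 200)
open_span(acme, 0, 150, 200)
globex = assert_fact("alice", "works_at", "Globex")
open_span(globex, 150, None, 200)

print(objects_as_of("alice", "works_at", valid_at=160, system_at=150))
print(objects_as_of("alice", "works_at", valid_at=160, system_at=250))
```

The first query answers with what the store believed before the correction; the second answers with the corrected history. No row was deleted to get there.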

The indexes Wax keeps

Only these B‑tree indexes exist on the fact tables StructuredMemorySchema.swift:109–130:

| Index | Columns | Answers fast |
|---|---|---|
| sm_fact_subject_pred_idx | (subject_entity_id, predicate_id) | "all facts about subject S" and "S + predicate P" |
| sm_fact_edge_out_idx | (subject, predicate, object_entity) WHERE kind=7 | outbound graph edges (S –P→ ?) |
| sm_fact_edge_in_idx | (object_entity, predicate, subject) WHERE kind=7 | inbound edges (? –P→ O), i.e. reverse refs |
| sm_span_current_fact_idx | (fact_id, system_from, valid_from, valid_to) WHERE system_to IS NULL | "currently asserted" facts; a partial index covering only live spans |
| UNIQUE(fact_hash) | SHA‑256(S,P,O) | exact‑triple existence (dedup) |

Plus key/alias indexes for resolution: sm_entity(key), sm_entity_alias(alias_norm) (NFKC‑normalized, case‑folded for fuzzy match), sm_predicate(key).
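The normalization behind alias_norm can be approximated with the standard library: apply Unicode NFKC, then case-fold. This is a sketch of the general technique; whether Wax applies exactly these two steps in this order, or adds others such as whitespace trimming, is an assumption.

```python
import unicodedata


def alias_norm(alias):
    """Normalize an alias so that visually equivalent spellings compare equal."""
    return unicodedata.normalize("NFKC", alias).casefold()


print(alias_norm("ＡＣＭＥ"))              # full-width letters
print(alias_norm("ﬁle") == alias_norm("FILE"))  # "fi" ligature vs plain ASCII
```

Storing the normalized form in an indexed column turns a fuzzy-looking lookup into an ordinary equality match on the B‑tree.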

Query‑pattern explorer

Pick what you want to bind, and see whether Wax has a covering index — versus which of Datomic's four index orders a Datalog DB would use.

Head‑to‑head with the Datalog databases

| Dimension | Wax structured memory | Datomic | DataScript | Datalevin |
|---|---|---|---|---|
| Storage | SQLite B‑trees (GRDB), embedded in the .wax frame blob | Pluggable (SQL/Dynamo/Cassandra) + index segment trees | In‑memory persistent sorted sets | LMDB B+‑trees on disk |
| Covering indexes | Partial set: S+P, ref‑out, ref‑in, current‑span. No attribute‑only (AEVT) or value (AVET) index. | EAVT, AEVT, AVET, VAET (4 sort orders) | EAVT, AEVT, AVET (no separate VAET) | EAV, AVE, VAE + value/giant tables |
| Value lookup ("who has status=X") | No value index → table scan (object_* columns are unindexed) | AVET for indexed/unique attrs | AVET | AVE |
| Reverse references | sm_fact_edge_in_idx, but only for entity‑ref objects (mirrors VAET being ref‑only) | VAET (ref attrs only) | AVET (ref attrs are indexed there) | VAE |
| Query language | No Datalog. Fixed API: facts_query (S/P + as‑of), single‑hop edges, entity_resolve | Full Datalog + rules + pull | Datalog + pull | Datalog + pull |
| Recursion / joins | None in the query layer (compose in app code) | Recursive rules | Recursive rules | Recursive rules |
| Time model | Bitemporal: valid time + system time (like XTDB) | Uni‑temporal (transaction time; as-of/history) | None (immutable snapshot value) | Limited / opt‑in |
| Dedup model | Content‑addressed: UNIQUE(SHA‑256(S,P,O)) | Datom identity in the index | Datom identity in the set | Datom identity in LMDB |
| Retraction | Close system_to_ms on the span (soft, bitemporal) | Retraction datom (op=false), new tx | Retract in a tx → new value | Retraction datom |
| Bundled extras | Entity aliases + fuzzy resolve; span‑level evidence/provenance & confidence | Model as data | Model as data | Model as data; built‑in full‑text |

The one‑sentence summary

Datomic/DataScript/Datalevin are general Datalog engines — they pay for several covering index orders so that any query shape and recursive rules run efficiently. Wax is a purpose‑built bitemporal triple store that indexes only the handful of access patterns an agent‑memory layer needs (entity‑centric lookup, single‑hop graph edges, dedup, "what's true now/then") and exposes them through a small fixed API — trading Datalog's generality for a simpler engine that rides along inside the same SQLite blob as the text search.

09 Quick answers

Is a .wax file safe to keep in iCloud/Dropbox?
As a backup or for one‑machine‑at‑a‑time use, yes. For two machines writing it at once, no — the lock is advisory and same‑machine only, and there's no merge. Sync the Markdown projection instead (§06).
Can I just sqlite3 the file to browse it?
No — the SQLite DB is embedded inside a frame of the container, not a standalone .sqlite. Use wax-cli stats + search/facts_query, or markdown_export for a readable view (§07).
Does Wax extract facts from my notes automatically?
No. Ingestion stores + indexes text. Structured facts are asserted explicitly via fact_assert / entity_upsert (or the DREAMS.md promotion flow). Evidence rows can link a fact back to a source frame/chunk/UTF‑8 span when you do.
Why does the README say HNSW if it's brute force?
Best guess: aspirational/legacy docs. The shipping engine is an exact Metal cosine search (§01). Worth flagging upstream — it's a real doc/code mismatch.
What's the default chunk size?
400 target tokens, 40 overlap, counted with a BERT WordPiece tokenizer OrchestratorConfig.swift:12.