Part 1 of 3

Wax, explained

A single-file memory engine for AI agents

Documents, embeddings, full-text indexes and a structured knowledge graph, all bundled into one .wax binary that lives on your device. No server, no cloud, no Docker. You move it like any other file.

Everything below is drawn from the source tree at /Users/tmk/dev/playground/Wax. Claims carry file:line citations so you can verify them; where the README and the code disagree, this page follows the code and says so.

Written in Swift 6, tuned for Apple Silicon · Hybrid search: BM25 text + vector · Crash-safe via WAL + dual headers · Talks to agents over MCP · Apache-2.0

What is this thing, really?

Strip away the marketing and Wax is three databases hiding inside one file, plus a coordinator that keeps them in sync.

Most retrieval-augmented (RAG) setups bolt together a document store, a vector database, and a text search server — three services to deploy and keep consistent. Wax collapses all of that into a single self-contained binary file with a .wax extension README.md:29. Opening that file gives you four capabilities at once:

▤ A document store: Raw text/blobs are written as compressed frames (LZ4 / LZFSE / deflate / plain). Each frame carries metadata, checksums, and parent-child links so a document and its chunks stay related. FileFormat.md:88–110
▦ A full-text index: An embedded SQLite FTS5 virtual table (frames_fts, unicode61 tokenizer) provides BM25 lexical search. FTS5Schema.swift:8
◈ A vector index: Embeddings (default MiniLM, 384-dim MiniLMEmbedder.swift:28) searched on the GPU for semantic recall. (See the honesty note about “HNSW” below.)
❖ A knowledge graph: A built-in EAV / entity-fact store with bitemporal facts lives in the same SQLite blob as FTS5 — durable, queryable knowledge, not just blobs of text. StructuredMemory.md:7

How a query works

A single search fans out to both the BM25 text index and the vector index, then merges the two ranked lists with Reciprocal Rank Fusion (RRF) — a rank-based merge that ignores the incompatible raw score scales of BM25 vs. cosine distance HybridSearch.swift:1–30. The weighting (alpha: 0 = all vector, 1 = all text) can even adapt to the query type — factual queries lean on text (0.7/0.3), semantic ones lean on vectors (0.3/0.7), temporal ones pull in a recency signal AdaptiveFusionConfig.swift:22–25.
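As a rough illustration of the fusion step, here is a minimal weighted-RRF sketch. It is not Wax's implementation: the function name, the constant k = 60 (a value commonly used for RRF), and the way alpha is applied as a per-list weight are all assumptions made for the example.

```python
from collections import defaultdict


def weighted_rrf(text_ranked, vector_ranked, alpha=0.5, k=60):
    """Fuse two ranked id lists; alpha weights text, (1 - alpha) weights vectors.

    Only ranks are used, so the raw BM25 and cosine scores never need to be
    put on a common scale. k = 60 is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for weight, ranked in ((alpha, text_ranked), (1.0 - alpha, vector_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


text_hits = ["a", "b", "c"]      # hypothetical BM25 ranking
vector_hits = ["c", "a", "d"]    # hypothetical vector ranking

factual = weighted_rrf(text_hits, vector_hits, alpha=0.7)   # leans on text
semantic = weighted_rrf(text_hits, vector_hits, alpha=0.3)  # leans on vectors
```

With these toy lists, the factual weighting ranks "a" first while the semantic weighting ranks "c" first, which is the behaviour the adaptive weighting is meant to produce.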

Who uses it, and how

Swift API: Embed the Wax module in an iOS/macOS app: Memory(at:) → save / search.
CLI: wax-cli remember "…" / wax-cli search "…" --mode hybrid, or a long-running daemon.
MCP: An MCP server (wax-mcp) exposes memory tools to Claude Code / Cursor so the agent remembers context across sessions. A broker process owns long-term memory plus per-session “scratch” stores.

Honesty note — “HNSW” vs the code

The README’s architecture diagram labels the vector index a “Metal HNSW Index” README.md:317,337. The shipping engine is not HNSW. MetalVectorEngine describes itself as “Metal-backed cosine similarity search” and runs SIMD cosine-distance kernels (cosineDistanceKernelSIMD4/8) followed by a GPU topKReduceDistances pass MetalVectorEngine.swift:6,164,189,206. That is an exhaustive (brute-force) exact search accelerated on the GPU, not an approximate HNSW graph. A tree-wide search for “HNSW” finds zero matches in Sources/. For the scale Wax targets (thousands → low millions of vectors on an M-series chip) brute-force GPU search is genuinely fast — but it is exact, not approximate, and the README mislabels it.
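To make the distinction concrete, this is what exhaustive exact search means, sketched on the CPU in plain Python. It is only an illustration of the technique (score every vector, keep the k best); the function names are hypothetical and it says nothing about how the Metal kernels are organised. It assumes non-zero vectors.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def exact_top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k nearest.

    vectors maps an id to its embedding. Every entry is visited, so the
    result is exact; an HNSW graph would visit only a subset of entries.
    """
    distances = ((cosine_distance(query, vec), vec_id) for vec_id, vec in vectors.items())
    return heapq.nsmallest(k, distances)


store = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}  # toy 2-dim embeddings
nearest = exact_top_k([1.0, 0.0], store, k=2)
```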

Anatomy of a `.wax` file

[Interactive diagram: regions of a .wax file, top-to-bottom from low to high file offset, with crash-safety metadata in amber and search indexes in blue. Offsets and sizes are from the format spec. FileFormat.md]

Why this shape? Crash-safety by construction

The two design tricks that make a single mutable file safe to crash on:

Dual header pages (A/B). Two 4 KiB headers sit at offset 0 and 4 KiB. Writes alternate between them; on open, Wax reads both and picks the one with the highest generation that still passes its SHA-256 checksum. A crash mid-header-write can never corrupt the store — worst case you fall back to the previous generation. FileFormat.md:52–56
Write-ahead log (WAL) ring buffer. Every mutation — putFrame, deleteFrame, supersedeFrame, putEmbedding — is appended to a fixed-size circular log (default 256 MiB) before it touches the main structure. On the next open, pending records are replayed; corrupt tail records are tolerated. WALAndCrashRecovery.md:24–48,62–72

Every layer — WAL records, header pages, TOC, footer — carries a SHA-256 checksum, and the TOC additionally keeps a 32-byte Merkle root. Corruption is detectable at every level. WALAndCrashRecovery.md:99–108 FileFormat.md:71
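The dual-header rule above can be sketched in a few lines. The page layout used here (an 8-byte little-endian generation at offset 0, a SHA-256 digest in the last 32 bytes) is a hypothetical stand-in chosen for the example; the real field offsets are in FileFormat.md.

```python
import hashlib
import struct

PAGE_SIZE = 4096  # 4 KiB header page, as in the text


def make_page(generation, payload):
    """Build a header page in the hypothetical layout: body + SHA-256(body)."""
    body = (struct.pack("<Q", generation) + payload).ljust(PAGE_SIZE - 32, b"\0")
    return body + hashlib.sha256(body).digest()


def parse_generation(page):
    """Return the generation if the checksum verifies, else None."""
    if len(page) != PAGE_SIZE:
        return None
    body, digest = page[:-32], page[-32:]
    if hashlib.sha256(body).digest() != digest:
        return None
    return struct.unpack("<Q", body[:8])[0]


def pick_header(page_a, page_b):
    """Choose the valid page with the highest generation."""
    valid = [
        (gen, name)
        for name, gen in (("A", parse_generation(page_a)), ("B", parse_generation(page_b)))
        if gen is not None
    ]
    if not valid:
        raise ValueError("both header pages are corrupt")
    return max(valid)[1]


page_a = make_page(7, b"old")
page_b = make_page(8, b"new")

torn = bytearray(page_b)
torn[10] ^= 0xFF          # simulate a crash that tore the newer page
torn_b = bytes(torn)
```

With both pages intact the newer generation wins; with the newer page torn, the checksum fails and selection falls back to the previous generation.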

How Markdown (and any text) gets ingested

The honest, slightly surprising truth: a .md file is treated as plain UTF-8 text. There is no Markdown parser in the ingest path.

The pipeline, step by step

1 · Read

remember(fileAt:) loads the file as Data and decodes it as UTF-8. Non-UTF-8 → error. It attaches provenance metadata: source_kind=file, source_uri, source_filename, source_extension. MemoryOrchestrator+File.swift:5–35

2 · Chunk

The text is split by the token-count chunker: default 400 target tokens with 40 tokens of overlap between consecutive chunks. OrchestratorConfig.swift:12 Tokens are real WordPiece tokens counted by a BERT tokenizer, so chunk boundaries align with what the model sees. TextChunker.swift:21–58

3 · Embed

Each chunk is embedded into a 384-dim vector by MiniLM (CoreML, runs on ANE/GPU). An alternative Snowflake Arctic embedder also ships. MiniLMEmbedder.swift:28,32

4 · Store

The document and its chunks become frames (parent = document, children = chunks via parentId), compressed, and appended through the WAL. FileFormat.md:96–98

5 · Index

Chunk text → FTS5 (BM25). Chunk vector → the Metal vector index. The new frames become searchable after a flush. FTS5Schema.swift:8
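Step 2 is the easiest part of the pipeline to picture in code. The sketch below slides a fixed window over an already-tokenized list with the 400/40 defaults. It is an assumption-laden illustration: the real chunker counts WordPiece tokens, the whitespace split here is only a stand-in, and how Wax treats the final short chunk is not something this example claims to reproduce.

```python
def chunk_tokens(tokens, target=400, overlap=40):
    """Split a token list into windows of `target` tokens.

    Consecutive windows share `overlap` tokens, so each new window starts
    (target - overlap) tokens after the previous one.
    """
    if not 0 <= overlap < target:
        raise ValueError("overlap must be non-negative and smaller than target")
    step = target - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + target])
        if start + target >= len(tokens):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Stand-in tokenizer: whitespace split instead of WordPiece.
words = ("word " * 1000).split()
chunks = chunk_tokens(words)
sizes = [len(c) for c in chunks]
```

For 1,000 tokens this yields windows starting at token 0, 360 and 720, with the last window holding the remaining 280 tokens.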

What ingestion does NOT do

There is exactly one chunking strategy: tokenCount(target, overlap) ChunkingStrategy.swift:3–5. The ingest path has no Markdown-aware behaviour: it does not split on headings, parse YAML frontmatter, or extract [[wiki-links]] or fenced code blocks. Headings, links, and frontmatter are simply tokens in the text. PDFs are the one special case: they get a dedicated text extractor first. Ingest/PDFTextExtractor.swift Structured facts are not auto-extracted from prose either; you assert them explicitly (see the EAV section below).

The other Markdown story: managed projections

Separate from ingestion, the MCP broker keeps a small set of human-editable Markdown files in two-way sync with memory. This is the feature that actually matters for sharing knowledge.

The broker can project memory out to Markdown and re-absorb edits back in. It manages three kinds of file under a project root AgentBrokerService+Markdown.swift:9–38:

File	Holds	Durability
`MEMORY.md`	Durable long-term memories, grouped by `## section`	durable
`memory/<date>.md`	Daily notes / working memory	working
`memory/DREAMS.md`	A review queue of promotion candidates (checkboxes)	pending approval

How a managed line is encoded

Each memory becomes a Markdown list item with an invisible HTML-comment marker that base64-encodes a JSON record (frame id, content hash, memory type, durability, confidence, session id…). The text is human-readable; the marker is machine-readable. BrokerMarkdownSync.swift:81–163
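A minimal sketch of that encoding idea follows. The marker prefix ("wax:"), the JSON field names, and the exact line layout are hypothetical; only the general scheme (list item plus an HTML comment carrying base64-encoded JSON) comes from the description above.

```python
import base64
import json
import re

# Hypothetical marker shape: <!-- wax:BASE64 -->
MARKER_RE = re.compile(r"<!--\s*wax:([A-Za-z0-9+/=]+)\s*-->")


def encode_line(text, record):
    """Render a memory as a list item with a hidden base64 JSON marker."""
    raw = json.dumps(record, sort_keys=True).encode("utf-8")
    payload = base64.b64encode(raw).decode("ascii")
    return f"- {text} <!-- wax:{payload} -->"


def decode_line(line):
    """Return (visible_text, record), or None if the line has no marker."""
    match = MARKER_RE.search(line)
    if match is None:
        return None
    visible = line[:match.start()].strip()
    if visible.startswith("- "):
        visible = visible[2:]
    record = json.loads(base64.b64decode(match.group(1)))
    return visible, record


record = {"frame_id": 42, "content_hash": "abc123", "durability": "durable"}  # made-up fields
line = encode_line("Prefers tabs over spaces", record)
decoded = decode_line(line)
```

Because the marker is an HTML comment, Markdown renderers hide it, while a hand edit to the visible text leaves the stored content hash stale, which is what lets a sync detect the change.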

[Interactive widget: marker decoder / encoder for managed Markdown lines. BrokerMarkdownSync.swift:10–48]

What `markdown_sync` actually does

Sync is a reconciliation, not a one-way import. For each managed file it matches Markdown lines to stored memories — first by trusted marker (frame id + content hash + memory id + source kind/path all must agree), then by content hash — and classifies every line as created / updated / unchanged / deleted AgentBrokerService+Markdown.swift:276–370:

In Markdown, new → written to memory (created).
In both, changed → memory updated and the old frame tree deleted (updated).
In both, identical → no-op (unchanged).
In memory, absent from Markdown → deleted from memory… unless the memory is locked. AgentBrokerService+Markdown.swift:358–367
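The four outcomes above can be sketched as a small classifier. This is a simplification under stated assumptions: the trusted-marker check is reduced to "the frame id is known", memories are plain strings, and all names are hypothetical.

```python
import hashlib


def content_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify(markdown_lines, stored, locked=frozenset()):
    """Classify Markdown lines against stored memories.

    markdown_lines: list of (frame_id or None, text) pairs.
    stored: dict mapping frame_id -> text currently in memory.
    locked: frame ids that must never be deleted by a sync.
    """
    by_hash = {content_hash(text): fid for fid, text in stored.items()}
    result = {"created": [], "updated": [], "unchanged": [], "deleted": []}
    matched = set()
    for frame_id, text in markdown_lines:
        if frame_id in stored:                    # match by marker first
            kind = "unchanged" if stored[frame_id] == text else "updated"
            matched.add(frame_id)
        elif content_hash(text) in by_hash:       # then by content hash
            kind = "unchanged"
            matched.add(by_hash[content_hash(text)])
        else:
            kind = "created"
        result[kind].append(text)
    for frame_id, text in stored.items():
        if frame_id not in matched and frame_id not in locked:
            result["deleted"].append(text)        # absent from Markdown
    return result


stored = {1: "likes tea", 2: "uses vim", 3: "old note", 4: "pinned"}
lines = [(1, "likes tea"), (2, "uses neovim"), (None, "new fact")]
outcome = classify(lines, stored, locked={4})
```

Here the unlisted "old note" is classified as deleted, while the equally unlisted "pinned" memory survives because it is locked.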

Sharp edge to remember

Because the Markdown file is treated as authoritative for the managed set, deleting a line and syncing deletes the memory. That’s powerful for hand-editing, but it means a careless three-way git merge of MEMORY.md (one side drops lines) can delete knowledge on the next sync. --dry_run reports the created/updated/deleted counts without writing, so you can preview before committing. AgentBrokerService+Markdown.swift:316–333

`DREAMS.md` — promotion review

“Dreams” are candidate long-term memories that the broker harvested from session stores and thinks are worth keeping. They’re rendered as unchecked checkboxes. You tick [x] the ones to keep; on the next sync, each checked candidate runs through a promotion judge (proposePromotion) and the approved ones are written to durable memory. Unchecked = ignored. AgentBrokerService+Markdown.swift:168–248 A human stays in the loop for what becomes permanent.
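Picking out the ticked candidates is simple to sketch. The example assumes GitHub-style task-list syntax ("- [ ]" / "- [x]") and ignores the hidden markers; the actual line format in DREAMS.md may differ, and the promotion judge itself is not modelled.

```python
import re

# Assumed task-list syntax: "- [ ] text" or "- [x] text".
CHECKBOX_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def checked_candidates(markdown):
    """Return the text of every ticked checkbox, in document order."""
    picked = []
    for line in markdown.splitlines():
        match = CHECKBOX_RE.match(line)
        if match and match.group(1).lower() == "x":
            picked.append(match.group(2).strip())
    return picked


dreams = """# Dreams
- [ ] maybe keep this
- [x] prefers dark mode
- [X] uses zsh
"""
approved_for_review = checked_candidates(dreams)
```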

Sharing knowledge between a work Mac and a personal Mac

The headline question. Short answer: there is no networked replication or store-merge in the code — but there are two viable patterns, and one is much better than the other for two-way sharing.

What the code actually provides

A tree-wide search for store-level merge / import / replicate / CRDT turns up nothing — the only merge() in the broker just adds up sync counts AgentBrokerService+Markdown.swift:615. There is no “combine two .wax files” command, no peer sync, no vector clocks. Concurrency control is single-writer: a writer lease inside the process ConcurrencyModel.md:28–38, plus an advisory flock for multiple processes on the same machine ConcurrencyModel.md:92–120. flock does not coordinate across machines.
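For readers unfamiliar with advisory locks, this Unix-only sketch shows the general flock pattern: the first opener gets an exclusive lock, a second opener is refused until the first releases it. The lock-file name and function are hypothetical and do not describe which file Wax actually locks. The lock lives in the local kernel, which is why it cannot protect a file that two different machines write through a cloud syncer.

```python
import fcntl
import os
import tempfile


def try_writer_lock(path):
    """Try to take an exclusive, non-blocking advisory lock.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


with tempfile.TemporaryDirectory() as tmp:
    lock_path = os.path.join(tmp, "store.lock")   # hypothetical lock file
    first = try_writer_lock(lock_path)            # acquires the lock
    second = try_writer_lock(lock_path)           # refused while first is held
    os.close(first)                               # closing releases the lock
    third = try_writer_lock(lock_path)            # succeeds again
    os.close(third)

got_first = first is not None
got_second = second is not None
got_third = third is not None
```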

Option A — sync the binary .wax

Drop the file in iCloud Drive / Dropbox / Syncthing, or AirDrop it. The README endorses this. README.md:280,394

Good for: backup; one machine active at a time; read-only fan-out to other devices.

Bad for: two machines writing concurrently. The advisory lock doesn’t cross machines, and the file mutates in place (WAL + headers), so a cloud syncer that copies a half-written file — or merges two diverged copies — can produce a conflicted copy or a corrupt store. There is no merge to reconcile them.

Option B — sync the Markdown projection (recommended)

Export memory → MEMORY.md (+ daily notes), sync the text via git, run markdown_sync on the other Mac. WaxCLICommand.swift:34–35

Good for: genuine two-way knowledge sharing. Markdown is diff-friendly and merge-friendly; git handles concurrent edits with real three-way merges; each memory carries a stable content hash + frame id in its marker, so reconciliation is deterministic. You can review diffs before they touch either store.

Watch out: the deletion semantics described under “Sharp edge to remember” still apply. Resolve git conflicts so you don’t silently drop lines, and prefer --dry_run first.

Recommended workflow for two Macs

Keep one canonical git repo of the Markdown projection (MEMORY.md + memory/*.md) — not the binary.
On each Mac, point the Wax broker at a local .wax store (never shared live).
End of session: markdown_export → commit & push the Markdown.
Start of session on the other Mac: pull → markdown_sync --dry_run to preview → markdown_sync to absorb.
Treat the binary .wax as a fast local cache/index that’s always rebuildable from the Markdown of record.

This is essentially “plain-text knowledge base in git, with Wax as the local search engine over it” — which also sidesteps the binary’s single-writer limitation entirely.

Aside — corpus_search is not cross-machine. It searches across the broker’s local per-session stores with provenance metadata (which session a hit came from) README.md:250–256. Useful for cross-session recall on one machine; it is not a replication or multi-device feature.

Exploring what’s inside a `.wax` file

There is no single “open this file and show me everything” inspector. But several focused tools together let you see the contents and verify integrity.

How	Surface	What it reveals
`wax-cli stats`	CLI, MCP	Frame count + pending, generation counter, disk bytes, WAL state (write/checkpoint pos, committed/last seq, wrap & checkpoint counts), embedder identity (provider/model/dims), and feature flags. StatsCommand.swift:46–116
`recall` / `search`	CLI, MCP	Walk stored documents/chunks by querying them (text / vector / hybrid). The practical way to “list what’s in here.”
`facts_query`	CLI, MCP	Dump structured facts, filtered by subject and/or predicate, as of a point in time. FactsCommand.swift:165–242
`entity_resolve`	CLI, MCP	Look up entities by key or fuzzy alias.
`vector-health` · `memory_health`	CLI	Vector index health / store health diagnostics. VectorHealthCommand.swift
`markdown_export`	CLI, MCP	The most human-readable “view”: project all durable memory out to `MEMORY.md` and read it as text.
Open on next launch	automatic	Integrity is verified structurally: header A/B checksum selection, footer↔TOC hash match, per-record SHA-256, WAL replay. A corrupt file fails loudly on open. WALAndCrashRecovery.md:62–108

The closest thing to a raw dump

The text + EAV indexes are a real SQLite database (FTS5 + the sm_* tables), but it lives embedded inside a frame of the .wax container — it is not a loose .sqlite file you can open with the sqlite3 CLI. The frame layout, TOC, and table schemas are documented in the in-repo DocC articles — the authoritative spec. There is no built-in fsck/dump subcommand beyond stats and the query tools above.

How the EAV triples are indexed — vs Datomic / DataScript / Datalevin

The deep one. Wax models knowledge as RDF-like (subject, predicate, object) triples — but it is not a Datalog database. It’s a fixed-schema, bitemporal triple store on SQLite B-trees.

The actual schema

Entities and predicates are interned into dictionary tables (integer surrogate keys), and every fact is one row in sm_fact with a tagged-union object: an object_kind discriminator (1=string … 7=entity-ref) plus seven typed columns, with a CHECK enforcing exactly one is populated. StructuredMemorySchema.swift:7–67

CREATE TABLE sm_entity   (entity_id PK, key UNIQUE, kind, created_at_ms)
CREATE TABLE sm_predicate(predicate_id PK, key UNIQUE, created_at_ms)

CREATE TABLE sm_fact (
  fact_id PK,
  subject_entity_id  → sm_entity,
  predicate_id       → sm_predicate,
  object_kind INTEGER,            -- 1=str 2=int 3=real 4=bool 5=blob 6=time 7=entity-ref
  object_text/_int/_real/_bool/_blob/_time_ms/_entity_id,   -- exactly one, by CHECK
  version_relation,                -- sets / updates / extends / retracts
  fact_hash BLOB UNIQUE,          -- SHA-256(S,P,O) → idempotent assert
  qualifiers_hash )

CREATE TABLE sm_fact_span (    -- BITEMPORAL: two independent time axes
  span_id PK, fact_id → sm_fact,
  valid_from_ms,  valid_to_ms,     -- when the fact is TRUE in the world
  system_from_ms, system_to_ms,    -- when it was RECORDED (NULL = still asserted)
  span_key_hash UNIQUE )

CREATE TABLE sm_evidence (     -- provenance back to the source text
  source_frame_id, chunk_index, span_start_utf8, span_end_utf8,
  extractor_id, extractor_version, confidence, asserted_at_ms )
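To see how the content-addressed fact row and the system-time span work together, here is a runnable SQLite sketch. It is a cut-down approximation, not the real schema: column types are guesses, only string objects (kind 1) are handled, sm_evidence and valid-time queries are left out, and the way the triple is serialized before hashing is an assumption.

```python
import hashlib
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE sm_entity    (entity_id INTEGER PRIMARY KEY, key TEXT UNIQUE NOT NULL);
CREATE TABLE sm_predicate (predicate_id INTEGER PRIMARY KEY, key TEXT UNIQUE NOT NULL);
CREATE TABLE sm_fact (
  fact_id INTEGER PRIMARY KEY,
  subject_entity_id INTEGER NOT NULL REFERENCES sm_entity,
  predicate_id INTEGER NOT NULL REFERENCES sm_predicate,
  object_kind INTEGER NOT NULL,
  object_text TEXT,
  fact_hash BLOB UNIQUE NOT NULL
);
CREATE TABLE sm_fact_span (
  span_id INTEGER PRIMARY KEY,
  fact_id INTEGER NOT NULL REFERENCES sm_fact,
  valid_from_ms INTEGER NOT NULL, valid_to_ms INTEGER,
  system_from_ms INTEGER NOT NULL, system_to_ms INTEGER
);
""")


def intern(table, id_col, key):
    """Dictionary-encode a key and return its integer surrogate id."""
    db.execute(f"INSERT OR IGNORE INTO {table} (key) VALUES (?)", (key,))
    return db.execute(f"SELECT {id_col} FROM {table} WHERE key = ?", (key,)).fetchone()[0]


def assert_fact(subject, predicate, text, now_ms):
    """Idempotent assert: the same triple always maps to the same fact row."""
    s = intern("sm_entity", "entity_id", subject)
    p = intern("sm_predicate", "predicate_id", predicate)
    digest = hashlib.sha256(f"{subject}\x00{predicate}\x00{text}".encode()).digest()
    db.execute(
        "INSERT OR IGNORE INTO sm_fact (subject_entity_id, predicate_id, object_kind,"
        " object_text, fact_hash) VALUES (?, ?, 1, ?, ?)", (s, p, text, digest))
    fact_id = db.execute("SELECT fact_id FROM sm_fact WHERE fact_hash = ?", (digest,)).fetchone()[0]
    live = db.execute(
        "SELECT 1 FROM sm_fact_span WHERE fact_id = ? AND system_to_ms IS NULL", (fact_id,)).fetchone()
    if live is None:  # open a new span only if none is currently asserted
        db.execute(
            "INSERT INTO sm_fact_span (fact_id, valid_from_ms, system_from_ms) VALUES (?, ?, ?)",
            (fact_id, now_ms, now_ms))
    return fact_id


def retract_fact(fact_id, now_ms):
    """Soft retraction: close the live span instead of deleting the fact."""
    db.execute(
        "UPDATE sm_fact_span SET system_to_ms = ? WHERE fact_id = ? AND system_to_ms IS NULL",
        (now_ms, fact_id))


def facts_as_of(subject, system_ms):
    """Facts about `subject` that were asserted at system time `system_ms`."""
    return db.execute("""
        SELECT p.key, f.object_text
        FROM sm_fact f
        JOIN sm_entity e ON e.entity_id = f.subject_entity_id
        JOIN sm_predicate p ON p.predicate_id = f.predicate_id
        JOIN sm_fact_span s ON s.fact_id = f.fact_id
        WHERE e.key = ? AND s.system_from_ms <= ?
          AND (s.system_to_ms IS NULL OR s.system_to_ms > ?)
        ORDER BY p.key, f.object_text""", (subject, system_ms, system_ms)).fetchall()


first = assert_fact("user:alice", "editor", "vim", now_ms=1000)
again = assert_fact("user:alice", "editor", "vim", now_ms=2000)   # deduplicated
retract_fact(first, now_ms=3000)
assert_fact("user:alice", "editor", "neovim", now_ms=3000)
```

Asking "what was asserted at t=2500" returns the vim fact, while t=3500 returns only neovim; the retracted row is still in the table, which is what makes the as-of query possible.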

The indexes Wax keeps

Only these B-tree indexes exist on the fact tables StructuredMemorySchema.swift:109–130:

Index	Columns	Answers fast
`sm_fact_subject_pred_idx`	`(subject_entity_id, predicate_id)`	“all facts about subject S” and “S + predicate P”
`sm_fact_edge_out_idx`	`(subject, predicate, object_entity)` WHERE kind=7	outbound graph edges (S –P→ ?)
`sm_fact_edge_in_idx`	`(object_entity, predicate, subject)` WHERE kind=7	inbound edges (? –P→ O) — reverse refs
`sm_span_current_fact_idx`	`(fact_id, system_from, valid_from, valid_to)` WHERE system_to IS NULL	“currently asserted” facts — a partial index covering only live spans
`UNIQUE(fact_hash)`	SHA-256(S,P,O)	exact-triple existence (dedup)

Plus key/alias indexes for resolution: sm_entity(key), sm_entity_alias(alias_norm) (NFKC-normalized, case-folded for fuzzy match), sm_predicate(key).
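One way to check which access patterns these indexes cover is to rebuild them in a scratch SQLite database and ask the planner. The sketch below reconstructs three of the indexes from the table above on a reduced sm_fact table; the full column names in the edge indexes and the `object_kind = 7` predicate are inferred from the abbreviated listing, so treat the DDL as an approximation.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE sm_fact (
  fact_id INTEGER PRIMARY KEY,
  subject_entity_id INTEGER NOT NULL,
  predicate_id INTEGER NOT NULL,
  object_kind INTEGER NOT NULL,
  object_text TEXT,
  object_entity_id INTEGER
);
CREATE INDEX sm_fact_subject_pred_idx
  ON sm_fact (subject_entity_id, predicate_id);
CREATE INDEX sm_fact_edge_out_idx
  ON sm_fact (subject_entity_id, predicate_id, object_entity_id) WHERE object_kind = 7;
CREATE INDEX sm_fact_edge_in_idx
  ON sm_fact (object_entity_id, predicate_id, subject_entity_id) WHERE object_kind = 7;
""")


def plan(sql):
    """Return SQLite's query plan as one string (detail is the last column)."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# "All facts about subject S": covered by the subject/predicate index.
by_subject = plan("SELECT * FROM sm_fact WHERE subject_entity_id = 1")

# "Who points at entity O": covered by the partial inbound-edge index.
inbound = plan(
    "SELECT subject_entity_id FROM sm_fact WHERE object_kind = 7 AND object_entity_id = 1")

# "Who has status = X": no value index exists, so the table is scanned.
by_value = plan("SELECT * FROM sm_fact WHERE object_text = 'active'")
```

The exact plan wording varies between SQLite versions, but the first two plans name an index and the third is a plain scan, which matches the "no value index" row in the comparison that follows.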

Head-to-head with the Datalog databases

Dimension	Wax structured memory	Datomic	DataScript	Datalevin
Storage	SQLite B-trees (GRDB), embedded in the `.wax` frame blob	Pluggable (SQL/Dynamo/Cassandra) + index segment trees	In-memory persistent sorted sets	LMDB B+-trees on disk
Covering indexes	Partial set: S+P, ref-out, ref-in, current-span. No attribute-only (AEVT) or value (AVET) index.	EAVT, AEVT, AVET, VAET (4 sort orders)	EAVT, AEVT, AVET (ref attributes are indexed in AVET; no separate VAET)	EAV, AVE, VAE + value/giant tables
Value lookup (“who has status=X”)	No value index → table scan	AVET for indexed/unique attrs	AVET	AVE
Reverse references	sm_fact_edge_in_idx, entity-ref objects only (mirrors VAET being ref-only)	VAET (ref attrs only)	AVET (ref attrs)	VAE
Query language	No Datalog. Fixed API: `facts_query` (S/P + as-of), single-hop edges, `entity_resolve`	Full Datalog + rules + pull	Datalog + pull	Datalog + pull
Recursion / joins	None in the query layer (compose in app code)	Recursive rules	Recursive rules	Recursive rules
Time model	Bitemporal — valid + system time (like XTDB)	Uni-temporal (transaction time)	None (immutable snapshot)	Limited / opt-in
Dedup model	Content-addressed: `UNIQUE(SHA-256(S,P,O))`	Datom identity in the index	Datom identity in the set	Datom identity in LMDB
Retraction	Close `system_to_ms` on the span (soft, bitemporal)	Retraction datom (op=false)	Retract in a tx → new value	Retraction datom
Bundled extras	Entity aliases + fuzzy resolve; span-level evidence/provenance & confidence	Model as data	Model as data	Model as data; built-in full-text

The one-sentence summary

Datomic/DataScript/Datalevin are general Datalog engines — they pay for several covering index orders so that any query shape and recursive rules run efficiently. Wax is a purpose-built bitemporal triple store that indexes only the handful of access patterns an agent-memory layer needs (entity-centric lookup, single-hop graph edges, dedup, “what’s true now/then”) and exposes them through a small fixed API — trading Datalog’s generality for a simpler engine that rides along inside the same SQLite blob as the text search.

Quick answers

Is a .wax file safe to keep in iCloud/Dropbox?

As a backup or for one-machine-at-a-time use, yes. For two machines writing it at once, no: the lock is advisory and same-machine only, and there’s no merge. Sync the Markdown projection instead (see “Sharing knowledge between a work Mac and a personal Mac”).

Can I just sqlite3 the file to browse it?

No. The SQLite DB is embedded inside a frame of the container, not a standalone .sqlite. Use wax-cli stats + search/facts_query, or markdown_export for a readable view (see “Exploring what’s inside a `.wax` file”).

Does Wax extract facts from my notes automatically?

No. Ingestion stores + indexes text. Structured facts are asserted explicitly via fact_assert / entity_upsert (or the DREAMS.md promotion flow). Evidence rows can link a fact back to a source frame/chunk/UTF-8 span when you do.

Why does the README say HNSW if it’s brute force?

Best guess: aspirational/legacy docs. The shipping engine is an exact Metal cosine search (see the honesty note above). Worth flagging upstream: it’s a real doc/code mismatch.

What’s the default chunk size?

400 target tokens, 40 overlap, counted with a BERT WordPiece tokenizer OrchestratorConfig.swift:12.
