Part 2 of 3

Wax vs qmd

and the case for combining them

qmd (“Query Markup Documents”) and Wax look like twins: local-first, SQLite-backed, BM25+vector hybrid, MCP servers. They’re not. They solve adjacent but different problems, and that difference is exactly why one can sit on top of the other.

qmd · TypeScript    Wax · Swift
qmd claims carry file:line citations into its source; Wax claims keep the amber cites from Part 1.

Two axioms that decide everything

Almost every trade-off below follows from two structural facts.

① Index-as-cache vs store-of-record. qmd’s SQLite (~/.cache/qmd/index.sqlite) is a disposable index over files-on-disk: collections are dirs+globs; update --pull does a git pull then re-indexes; content dedups by hash. The files are the truth, the DB is rebuildable. Wax’s .wax is the truth — a transactional, crash-safe store (WAL, dual headers, Merkle).

② TypeScript + llama.cpp vs Swift + CoreML. qmd’s value-add is ~2 GB of GGUF models driven from TS via node-llama-cpp: embeddinggemma-300M, qwen3-reranker-0.6B, and a finetuned 1.7B query expander (llm.ts:252–255). Wax stays CoreML-light (MiniLM-384) with no LLM in the loop.
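To make axiom ① concrete, here is a minimal sketch of an index that is only a cache: it can be thrown away and rebuilt from the files, and identical bodies are stored once under their content hash. Every name here (`rebuildIndex`, `CacheIndex`, `fnv1a`) is hypothetical and not taken from qmd's source, and the 32-bit FNV-1a hash is only a dependency-free stand-in for a real digest.

```typescript
// Hypothetical sketch of "index-as-cache": none of these names come from qmd.
type Files = Record<string, string>; // path -> file content (the source of truth)

interface CacheIndex {
  contentByHash: Record<string, string>; // each distinct body stored once
  hashByPath: Record<string, string>;    // every path points at its body's hash
}

// 32-bit FNV-1a, a stand-in for a real content digest such as SHA-256.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to 32 bits.
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return h.toString(16);
}

// Rebuilding from scratch is always safe: the index holds nothing
// that cannot be derived again from the files.
function rebuildIndex(files: Files): CacheIndex {
  const index: CacheIndex = { contentByHash: {}, hashByPath: {} };
  for (const path of Object.keys(files)) {
    const content = files[path] ?? "";
    const hash = fnv1a(content);
    if (index.contentByHash[hash] === undefined) {
      index.contentByHash[hash] = content; // first time this body is seen
    }
    index.hashByPath[path] = hash;
  }
  return index;
}

const files: Files = {
  "notes/a.md": "# Hello\nSame body.",
  "notes/copy-of-a.md": "# Hello\nSame body.",
  "notes/b.md": "# Other\nDifferent body.",
};
const index = rebuildIndex(files);
console.log(Object.keys(index.hashByPath).length, "paths,",
            Object.keys(index.contentByHash).length, "stored bodies");
```

A store-of-record like `.wax` cannot take this shortcut: if the container is the truth, losing it loses data, which is why it needs the WAL, dual headers and Merkle checks that a cache can skip.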

So “qmd on Wax” really means…

keep qmd’s brain (chunking, expansion, rerank) and swap its storage/retrieval for Wax. Whether that’s wise depends entirely on which feature and which build strategy — the rest of this page.

Side by side

Grounded in qmd’s source, not its README.

Dimension	qmd	Wax
Language / runtime	TypeScript · Node ≥22 / Bun	Swift 6 · Apple-Silicon-native
Storage	Plain SQLite + FTS5 + `sqlite-vec` (db.ts)	Custom `.wax` container; SQLite embedded in a frame
Source of truth	The files on disk (index is a cache)	The `.wax` file itself
Text search	FTS5 / BM25 with CJK normalization (store.ts:763)	FTS5 / BM25
Vector search	`vec0` cosine, embeddinggemma-300M GGUF (store.ts:1169)	Metal brute-force cosine, MiniLM-384
Fusion	RRF k=60 + 2× original + rank bonuses (store.ts:3807)	RRF k=60 + query-type adaptive weights
LLM in loop	Yes: query expansion + reranker	No
Structured / EAV	None: flat documents + chunks	Bitemporal EAV graph
Sync model	Sync the files, rebuild cache (git-native)	Sync binary (risky) or the Markdown projection
Transparency	`sqlite3`-openable; `--explain` traces	Opaque (DB inside a frame); `stats` + query tools

Shared DNA: local-first, single-SQLite-backed, BM25+vector hybrid with the same RRF constant (k=60), and an MCP server with stdio+HTTP. The divergence is everything in the bottom four rows.
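Since both tools fuse with the same constant, a small sketch of weighted Reciprocal Rank Fusion may help. Each ranked list contributes `weight / (k + rank)` per document, with k = 60 as in the table. The 2× weight on original-query lists is one reading of the "2× original" entry; qmd's rank bonuses and Wax's adaptive weights are not modelled, and the function and field names are illustrative.

```typescript
// Weighted Reciprocal Rank Fusion. k = 60 is the constant named in the table;
// the weights and example lists below are illustrative assumptions.
interface RankedList {
  ids: string[];  // document ids, best first
  weight: number; // e.g. 2 for the original query, 1 for an expanded query
}

function rrf(lists: RankedList[], k: number = 60): Array<{ id: string; score: number }> {
  const scores: Record<string, number> = {};
  for (const list of lists) {
    list.ids.forEach((id, position) => {
      const rank = position + 1; // RRF ranks are 1-based
      scores[id] = (scores[id] ?? 0) + list.weight / (k + rank);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] ?? 0 }))
    .sort((a, b) => b.score - a.score);
}

const fused = rrf([
  { ids: ["a", "b", "c"], weight: 2 }, // BM25 hits for the original query
  { ids: ["b", "d", "a"], weight: 2 }, // vector hits for the original query
  { ids: ["d", "b"], weight: 1 },      // vector hits for an expanded query
]);
console.log(fused.map((hit) => hit.id).join(", ")); // prints "b, a, d, c"
```

Document `b` wins because it ranks well in all three lists, even though it tops only one of them. That is the property that makes RRF a good default when BM25 and cosine scores are not on comparable scales.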

qmd’s retrieval pipeline

The real hybridQuery (store.ts:4496) is the part worth borrowing.

Wax natively does only stages 3–4 (hybrid + RRF). Stages 2 (expand) and 5–7 (rerank+blend) are pure pre/post-retrieval — which is precisely why they port onto any backend.
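The portability claim is easier to see in code. The sketch below wraps an arbitrary retrieval function with an expansion step before it and a rerank-and-blend step after it, so the backend in the middle never needs to know. All types and names (`Hit`, `wrapBackend`, the toy corpus) are hypothetical; the real pipeline would call LLMs asynchronously and fuse with RRF rather than keeping the best score per document.

```typescript
// Hypothetical interfaces: none of these names are qmd's or Wax's API.
interface Hit { id: string; text: string; score: number }
type Retrieve = (query: string) => Hit[];          // stages 3-4: whatever the backend does
type Expand = (query: string) => string[];         // stage 2: extra query variants
type Rerank = (query: string, hit: Hit) => number; // stages 5-7: relevance in [0, 1]

function wrapBackend(retrieve: Retrieve, expand: Expand, rerank: Rerank, blend = 0.5): Retrieve {
  return (query) => {
    // Pre-retrieval: fan the query out, always keeping the original first.
    const queries = [query].concat(expand(query));
    // Retrieval: delegate to the backend, keeping the best score per document.
    const best: Record<string, Hit> = {};
    for (const q of queries) {
      for (const hit of retrieve(q)) {
        const seen = best[hit.id];
        if (seen === undefined || hit.score > seen.score) best[hit.id] = hit;
      }
    }
    // Post-retrieval: rerank against the original query and blend the scores.
    return Object.keys(best)
      .map((id) => best[id] as Hit)
      .map((hit) => ({ ...hit, score: blend * hit.score + (1 - blend) * rerank(query, hit) }))
      .sort((a, b) => b.score - a.score);
  };
}

// Toy backend: scores a document by the fraction of query words it contains.
const corpus = [
  { id: "d1", text: "swift actors and concurrency" },
  { id: "d2", text: "typescript async iterators" },
  { id: "d3", text: "structured concurrency in swift" },
];
const retrieve: Retrieve = (query) => {
  const words = query.toLowerCase().split(" ");
  return corpus
    .map((doc) => {
      const matched = words.filter((w) => doc.text.indexOf(w) >= 0).length;
      return { id: doc.id, text: doc.text, score: matched / words.length };
    })
    .filter((hit) => hit.score > 0);
};
// Toy stand-ins for the expansion and reranker models.
const expand: Expand = (q) => (q === "parallelism" ? ["concurrency"] : []);
const rerank: Rerank = (_q, hit) => (hit.text.indexOf("structured") >= 0 ? 1 : 0.2);

const search = wrapBackend(retrieve, expand, rerank);
const results = search("parallelism");
console.log(results.map((hit) => hit.id).join(", ")); // prints "d3, d1"
```

The bare backend returns nothing for "parallelism"; the wrapped one finds both concurrency documents and orders them by the reranker. Swapping `retrieve` for a call into Wax would leave the wrapper untouched.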

Which qmd features port onto Wax?

Each feature is rated by how cleanly it sits on Wax-as-storage, and by what you inherit for free vs. build yourself.

✓ clean (no Wax change) · ⚒ workable (you own glue) · △ redundant/partial · ✗ fights Wax

Two build strategies

The embedder swap is the fork in the road.

The deciding constraint

Wax’s daemon/MCP search uses Wax’s own MiniLM embedder — there’s no exposed “search with my precomputed vector.” So if you want qmd’s multilingual embeddinggemma, you must go native Swift — which means rewriting qmd’s ~15k-line pipeline. Stay in TS to reuse qmd’s code and you inherit MiniLM-384-en. There’s no cheap middle.

The verdict

Real benefits

Inherit a native Metal hybrid index + crash-safe single file.
Sidestep qmd’s biggest wart: on macOS qmd must setCustomSQLite() to Homebrew’s libsqlite3, because Apple’s is built OMIT_LOAD_EXTENSION (db.ts:25–100). Wax bundles its engines, so that failure class vanishes.
The prize: Wax’s superset — turn retrievals into durable, bitemporal, citable EAV facts + sessions. qmd can’t.

Real costs

You re-add the ~2 GB of models + llama.cpp weight that Wax was designed to avoid.
You lose qmd’s transparency: arbitrary SQL, --explain raw scores, llm_cache as a table.
You lose qmd’s elegant “sync files, rebuild cache” model — or you fight Wax’s store-of-record nature.
Strategy B (go native Swift) is a big rewrite; Strategy A (stay in TS) is an embedder/IPC compromise.

My take

“qmd, but native” is a poor trade — qmd’s architecture already fits its job. The one combination worth building is the inverse: not “qmd on Wax storage” but qmd’s retrieval feeding Wax’s structured memory. Use qmd-style smart chunking + rerank to find the right passages, then distill them into Wax EAV facts (with sm_evidence pointing back at the source span). The backend-agnostic pieces — rerank and query expansion — drop in cleanly and are the layer to build first.
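As a rough sketch of that inverse combination: take reranked passages, keep the confident ones, and turn each into a fact that carries its source span as evidence. Wax's real EAV schema and the layout of sm_evidence are not described here, so every type and field below (`Fact`, `SourceSpan`, `distill`, the two timestamps) is an assumption, and the regex extractor is a toy stand-in for whatever model does the distilling.

```typescript
// Hypothetical shapes: Wax's actual EAV schema and sm_evidence layout are
// not specified in this article, so every field below is an assumption.
interface SourceSpan { path: string; startLine: number; endLine: number }
interface Passage { text: string; span: SourceSpan; rerankScore: number }

interface Fact {
  entity: string;
  attribute: string;
  value: string;
  validFrom: string;    // when the fact holds in the world (ISO date)
  recordedAt: string;   // when it was written down; the pair makes it bitemporal
  evidence: SourceSpan; // stands in for an sm_evidence pointer to the source
}

type Triple = { entity: string; attribute: string; value: string };
type Extract = (text: string) => Triple[];

// Keep only passages the reranker trusts, then attach the span to each fact.
// Using `now` for validFrom is a simplification; a real system would infer it.
function distill(passages: Passage[], extract: Extract, minScore: number, now: string): Fact[] {
  const facts: Fact[] = [];
  for (const p of passages) {
    if (p.rerankScore < minScore) continue;
    for (const t of extract(p.text)) {
      facts.push({ ...t, validFrom: now, recordedAt: now, evidence: p.span });
    }
  }
  return facts;
}

// Toy extractor: matches sentences of the form "<entity> uses <value>".
const extract: Extract = (text) => {
  const m = /^(\S+) uses (\S+)$/.exec(text);
  return m ? [{ entity: m[1] as string, attribute: "uses", value: m[2] as string }] : [];
};

const facts = distill(
  [
    { text: "Wax uses MiniLM-384", span: { path: "notes/wax.md", startLine: 10, endLine: 12 }, rerankScore: 0.9 },
    { text: "qmd uses llama.cpp", span: { path: "notes/qmd.md", startLine: 3, endLine: 3 }, rerankScore: 0.2 },
  ],
  extract,
  0.5,
  "2025-01-01",
);
console.log(facts.length, facts[0]?.entity, facts[0]?.evidence.path);
```

The rerank threshold is doing real work here: a fact store is durable, so a weak passage that would merely be a bad search result becomes a bad fact unless it is filtered before distillation.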
