qmd's retrieval feeding Wax's structured memory. A tool that reads an org‑mode corpus — devlogs, notes, session journals — chunks it along its own structure, and distills durable, bitemporal, citable EAV facts back into Wax. Grounded in a real tree: ~/dev/org/{devlogs,notes} and ~/dev/my/claude-journal.
All org. Markdown is vestigial here. Three shapes, each already semantically segmented.
| Sub‑corpus | Path shape | Structure the bridge exploits |
|---|---|---|
| Devlogs (481) | org/devlogs/<project>/<yyyy>/<mm>/<date>_<project>.org | #+SOURCE-FINGERPRINT: (a SHA — free content hash), #+TITLE, #+FILETAGS: :daily:summary:<proj>:, then * Summary → * Journal → ** HH:MM — → **** Key Decisions with dense [[orgit-rev:…::<sha>]] commit links |
| Notes (93) | org/notes/<yyyy>/<slug>.org | :PROPERTIES: drawer (:ID:,:CREATED:), *Affects:*/*Severity:*, * Root cause → ** Layer N, #+begin_src/#+begin_example blocks |
| claude‑journal (249) | my/claude-journal/<project>/<yyyy>/<mm>/<ts>_<project>.org | #+PROJECT:, * Context / * User Context / * Reflections / * Observations → ** <obs> |
The author already split this into Decisions / Observations / Root‑cause / Reflections — high signal‑to‑noise. And project + date + kind live in the path and headers: free entity and time metadata, no inference needed.
qmd guesses structure from # density. Your org tree hands it over.
qmd scores break points by regex (H1=100, code‑fence=80, HR=60, paragraph=20…) and searches a ~200‑token window for the best cut with squared‑distance decay (store.ts:106–242). It is reconstructing structure the author left implicit.
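A rough Python sketch of that scoring idea, not qmd's actual code. The four base scores come from the line above; the exact decay formula, the 0.7 decay factor, the regexes, and the use of lines instead of tokens for the window are all assumptions made for illustration.

```python
import re

# Base scores for H1, code fence, HR and paragraph are the ones quoted above.
# The regexes themselves are assumptions, not copied from store.ts.
BREAK_PATTERNS = [
    (re.compile(r"^# "), 100),                   # H1
    (re.compile(r"^```"), 80),                   # code fence
    (re.compile(r"^(-{3,}|\*{3,})\s*$"), 60),    # horizontal rule
    (re.compile(r"^\s*$"), 20),                  # blank line = paragraph break
]


def score_line(line):
    """Base break score of a line, 0 if it is not a break point."""
    for pattern, score in BREAK_PATTERNS:
        if pattern.search(line):
            return score
    return 0


def best_cut(lines, target, window, decay=0.7):
    """Pick the line index to cut before, looking back `window` lines from `target`.

    Assumed decay: final = base * (1 - decay * (distance / window) ** 2),
    so a strong break far away can still beat a weak break nearby.
    """
    best_index, best_score = target, 0.0
    for i in range(max(0, target - window), target + 1):
        base = score_line(lines[i])
        if base == 0:
            continue
        distance = (target - i) / window
        final = base * (1.0 - decay * distance * distance)
        if final > best_score:
            best_index, best_score = i, final
    return best_index
```

With lines `["intro", "", "text", "# Heading", "more", "", "tail"]`, a target of 6 and a window of 6, the H1 three lines back outscores the blank line one line back, so the cut lands before the heading.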
The bridge prefers whole heading subtrees: a **** Key Decisions block is a chunk. It falls back to qmd‑style windowed cutoff only inside an oversized subtree, and never splits inside #+begin_…/#+end_… or :PROPERTIES:…:END: (the org analogue of qmd's code‑fence guard).
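A minimal sketch of subtree‑first, block‑aware chunking, in Python for brevity. The `Node` type, the `max_chars` budget, and the chunk dict shape are hypothetical. The windowed fallback is left out: an oversized leaf is only flagged. Property drawers sit directly under their heading, so they are never split by a subtree cut.

```python
import re
from dataclasses import dataclass, field

HEADING = re.compile(r"^(\*+) (.*)$")
BLOCK_BEGIN = re.compile(r"^\s*#\+begin_", re.IGNORECASE)
BLOCK_END = re.compile(r"^\s*#\+end_", re.IGNORECASE)


@dataclass
class Node:
    level: int                 # 0 for the file root, N for a heading with N stars
    path: tuple                # heading titles from the top level down to this node
    start: int                 # line index of the heading (0 for the root)
    end: int = 0               # exclusive line index where the subtree ends
    children: list = field(default_factory=list)


def parse(lines):
    """Build the heading tree, ignoring star lines inside #+begin_/#+end_ blocks."""
    root = Node(level=0, path=(), start=0)
    stack = [root]
    in_block = False
    for i, line in enumerate(lines):
        if in_block:
            in_block = not BLOCK_END.match(line)
            continue
        if BLOCK_BEGIN.match(line):
            in_block = True
            continue
        m = HEADING.match(line)
        if not m:
            continue
        level = len(m.group(1))
        while stack[-1].level >= level:
            stack.pop().end = i          # close siblings and deeper subtrees
        node = Node(level, stack[-1].path + (m.group(2).strip(),), i)
        stack[-1].children.append(node)
        stack.append(node)
    for node in stack:
        node.end = len(lines)
    return root


def chunk(node, lines, max_chars):
    """Yield whole subtrees that fit; otherwise the node's own body, then its children."""
    text = "\n".join(lines[node.start:node.end])
    if node.level > 0 and len(text) <= max_chars:
        yield {"path": node.path, "text": text, "oversize": False}
        return
    first_child = node.children[0].start if node.children else node.end
    own = "\n".join(lines[node.start:first_child])
    body_start = node.start + (1 if node.level else 0)
    if "".join(lines[body_start:first_child]).strip():   # skip heading-only remnants
        yield {"path": node.path, "text": own, "oversize": len(own) > max_chars}
    for child in node.children:
        yield from chunk(child, lines, max_chars)
```

Usage: `list(chunk(parse(lines), lines, max_chars=2000))`. Raising `max_chars` keeps larger subtrees whole; lowering it pushes the cut down toward leaf headings.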
Two consequences worth their own line: the heading path tells you which section a chunk is in — so you distill only the high‑signal ones and skip Conversation Excerpts for free; and the directory names are your entity table (every project is a top‑level dir).
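Both consequences can be sketched in a few lines of Python. The regexes follow the path shapes in the table above; the group name `stamp`, the `entity_key` field, the example file name, and the exact heading titles in `DISTILL` are assumptions, as is the choice to skip any section not explicitly listed.

```python
import re

# One pattern per sub-corpus, matching the path shapes listed in the table.
PATTERNS = [
    ("devlog", re.compile(
        r"org/devlogs/(?P<project>[^/]+)/(?P<year>\d{4})/(?P<month>\d{2})/"
        r"(?P<stamp>[^/]+)_(?P=project)\.org$")),
    ("journal", re.compile(
        r"claude-journal/(?P<project>[^/]+)/(?P<year>\d{4})/(?P<month>\d{2})/"
        r"(?P<stamp>[^/]+)_(?P=project)\.org$")),
    ("note", re.compile(r"org/notes/(?P<year>\d{4})/(?P<slug>[^/]+)\.org$")),
]

# High-signal sections named in the text; the exact titles are assumptions.
DISTILL = {"Key Decisions", "Observations", "Root cause", "Reflections"}


def describe(path):
    """Derive kind, project and date metadata from a file path, or None if unknown."""
    for kind, pattern in PATTERNS:
        m = pattern.search(path)
        if m:
            meta = {"kind": kind, **m.groupdict()}
            if "project" in meta:
                # Hypothetical convention: the directory name is the entity key.
                meta["entity_key"] = "project:" + meta["project"]
            return meta
    return None


def route(heading_path):
    """Distill a chunk only if some heading on its path is a high-signal section."""
    return "distill" if any(h in DISTILL for h in heading_path) else "skip"
```

For example, `describe("org/devlogs/invoicekit/2026/05/2026-05-01_invoicekit.org")` yields project `invoicekit` and entity key `project:invoicekit`, and `route(("Conversation Excerpts",))` returns `"skip"` without any model call.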
Subtree‑first, block‑aware.
Chunk labels: distill = high‑signal section · skip = routed out by heading · block = contains a no‑split region · oversize = would get a windowed sub‑cut. Each chunk's provenance (file, project, date, heading path, char span) rides into Wax as metadata.
Each extracted triple lands as a fact_assert with an sm_evidence row pointing back at the exact source span.
| Extraction output | → Wax | Source field |
|---|---|---|
| subject (canonical key) | fact_assert.subject | resolved, see §06 |
| predicate (controlled vocab) | fact_assert.predicate | — |
| object (typed) | fact_assert.object | int / string / entity‑ref |
| entry date | valid_from_ms | #+DATE / path |
| extraction time | system_from_ms | now |
| chunk frame id | sm_evidence.source_frame_id | StructuredMemorySchema.swift:94 |
| char span in chunk | span_start_utf8/_end_utf8 | …:95–97 |
| "org-bridge" + model id | extractor_id/_version | …:99–100 |
| LLM confidence | confidence | …:102 |
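The mapping in the table can be sketched as one function that turns an extracted triple into a payload. Field names come from the table; the nested `evidence` shape, the function names, and the example values are hypothetical. Two further assumptions: `_utf8` spans are byte offsets into the UTF‑8 encoded chunk, and an entry date maps to midnight UTC.

```python
import time
from datetime import datetime, timezone


def utf8_span(chunk_text, char_start, char_end):
    """Convert character offsets to UTF-8 byte offsets (assumed meaning of *_utf8)."""
    start = len(chunk_text[:char_start].encode("utf-8"))
    end = len(chunk_text[:char_end].encode("utf-8"))
    return start, end


def date_to_ms(entry_date):
    """Entry date (YYYY-MM-DD) to epoch milliseconds at midnight UTC (an assumption)."""
    dt = datetime.strptime(entry_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def to_fact(triple, chunk_text, char_span, frame_id, entry_date,
            model_id, confidence, now_ms=None):
    """Build a fact_assert payload plus its sm_evidence fields (shape is hypothetical)."""
    subject, predicate, obj = triple
    span_start, span_end = utf8_span(chunk_text, *char_span)
    return {
        "subject": subject,
        "predicate": predicate,
        "object": obj,
        "valid_from_ms": date_to_ms(entry_date),             # from #+DATE or the path
        "system_from_ms": now_ms if now_ms is not None else int(time.time() * 1000),
        "evidence": {
            "source_frame_id": frame_id,
            "span_start_utf8": span_start,
            "span_end_utf8": span_end,
            "extractor_id": "org-bridge",
            "extractor_version": model_id,
            "confidence": confidence,
        },
    }
```

The byte‑offset conversion matters as soon as a chunk contains non‑ASCII text: in `"Décision: use SQLite"` the phrase `use SQLite` starts at character 10 but at byte 11.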
For a fact extracted from a Key Decisions chunk: valid_time = the day you wrote the devlog ("what was decided on 2026‑05‑01"); system_time = when the bridge learned it. So facts(about: project:invoicekit, asOf: <date>) reconstructs what you knew then, and re‑running with a better extractor opens new system‑time spans without losing history.
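A toy in‑memory model of the two time axes, to make the as‑of behaviour concrete. This is not Wax's storage or query API: the `Row` type, the `*_to_ms` columns, and both function names are hypothetical; only `valid_from_ms` and `system_from_ms` appear in the mapping table.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Row:
    subject: str
    predicate: str
    obj: str
    valid_from_ms: int                     # when the fact became true in the world
    system_from_ms: int                    # when the bridge recorded it
    valid_to_ms: Optional[int] = None      # None = still valid
    system_to_ms: Optional[int] = None     # None = current belief


def _covers(start, end, t):
    """True if t falls in the half-open interval [start, end)."""
    return start <= t and (end is None or t < end)


def facts(rows, about, valid_at, known_at):
    """Facts about a subject that held at valid_at, as believed at known_at."""
    return [r for r in rows
            if r.subject == about
            and _covers(r.valid_from_ms, r.valid_to_ms, valid_at)
            and _covers(r.system_from_ms, r.system_to_ms, known_at)]


def supersede(rows, old, new_obj, now_ms):
    """Close the old system-time span and open a new one; nothing is deleted."""
    closed = replace(old, system_to_ms=now_ms)
    opened = replace(old, obj=new_obj, system_from_ms=now_ms, system_to_ms=None)
    return [closed if r == old else r for r in rows] + [opened]
```

After a `supersede`, querying with an earlier `known_at` still returns the original object, while a later `known_at` returns the re‑extracted one. That is the "opens new system‑time spans without losing history" property in miniature.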
Wax does zero co‑reference; assertFact binds subjects by exact key. So the bridge owns identity — but your corpus makes it easy.
Seed one entity per project directory: entity_upsert(key:"project:invoicekit", kind:"Project", aliases:["invoicekit","ik"]). The dir tree is your entity table.
For every other surface form: entity_resolve(surface) → reuse the rank‑0 match, else entity_upsert. Commits become entities keyed by short SHA (commit:876d114), so a decision can point at the commit that implemented it, which is exactly what your orgit-rev links already encode.
Wax interns predicates by exact key, so decided and made_decision would split into two. Lock a small set tuned to these logs; the extraction prompt maps prose onto it or drops the candidate.
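The identity rules above can be sketched against an in‑memory stand‑in. The `Entities` class only mimics the resolve‑then‑upsert flow and is not Wax's API; the vocabulary entries other than `decided` and `made_decision`, the orgit-rev regex, and the 7‑character short SHA are assumptions.

```python
import re


class Entities:
    """Hypothetical stand-in for entity_upsert / entity_resolve."""

    def __init__(self):
        self.kinds = {}      # key -> kind
        self.aliases = {}    # lowercased surface form -> key

    def upsert(self, key, kind, aliases=()):
        self.kinds[key] = kind
        for surface in (key, *aliases):
            self.aliases[surface.lower()] = key
        return key

    def resolve(self, surface):
        """Return ranked candidate keys; here at most one exact alias hit."""
        hit = self.aliases.get(surface.lower())
        return [hit] if hit else []


def resolve_or_upsert(entities, surface, kind, prefix):
    """Reuse the rank-0 match if there is one, else create a new entity."""
    hits = entities.resolve(surface)
    if hits:
        return hits[0]
    return entities.upsert(f"{prefix}:{surface.lower()}", kind, [surface])


# Controlled vocabulary: prose variants map onto one canonical predicate.
# Only "decided" and "made_decision" come from the text; the rest is illustrative.
VOCAB = {
    "decided": "decided",
    "made_decision": "decided",
    "observed": "observed",
    "implemented_by": "implemented_by",
}


def normalize_predicate(raw):
    """Map a candidate predicate onto the vocab, or None to drop the candidate."""
    return VOCAB.get(raw.strip().lower().replace(" ", "_"))


ORGIT_REV = re.compile(r"\[\[orgit-rev:[^\]]*?::([0-9a-fA-F]{7,40})\]")


def commit_keys(text):
    """Extract commit entity keys (short SHA) from orgit-rev links in a chunk."""
    return ["commit:" + sha[:7].lower() for sha in ORGIT_REV.findall(text)]
```

With the project seeded once, `resolve_or_upsert(ents, "ik", "Project", "project")` returns `project:invoicekit` rather than minting a second entity, and `normalize_predicate("made decision")` collapses to `decided`.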
fact_hash is SHA‑256(S,P,O) (UNIQUE(fact_hash)), so re‑extracting an unchanged chunk is a no‑op.
… #+SOURCE-FINGERPRINT / content hash.
… version_relation:updates → the old span closes, history preserved.
… DREAMS.md promotion flow is the model (AgentBrokerService+Markdown.swift:168).
… (subject, predicate, source‑span) instead of object text.
… gptel setup) prompted with heading context + the vocab is the realistic path.
… org/.fact-review.org checklist.
Given org‑first + your Emacs/Babashka stack, the realistic shape is a Babashka/Clojure (or TS) orchestrator: org‑parse → subtree chunk → extract → drive Wax via its CLI/daemon (fact_assert already takes an evidence arg; AgentBrokerService.swift:1187). No Swift needed unless you want the custom‑embedder path from Part 2. First step: a 50‑file dry run over org/devlogs/invoicekit. The extracted facts will tell you fast whether they're worth keeping.
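The two idempotency layers, a per‑file fingerprint and a per‑fact hash, can be sketched as follows. The canonical serialization fed to SHA‑256 is an assumption (how Wax actually encodes S, P, O before hashing is not shown here), and `FactLog` and `needs_extraction` are hypothetical helpers for a dry run, written in Python rather than Babashka or TS.

```python
import hashlib
import json


def fact_hash(subject, predicate, obj):
    """SHA-256 over a canonical (S, P, O) encoding; the JSON encoding is an assumption."""
    canonical = json.dumps([subject, predicate, obj],
                           ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FactLog:
    """In-memory stand-in for a table with UNIQUE(fact_hash)."""

    def __init__(self):
        self.rows = {}   # fact_hash -> (subject, predicate, object)

    def assert_fact(self, subject, predicate, obj):
        """Insert the fact; return False if an identical fact already exists."""
        h = fact_hash(subject, predicate, obj)
        if h in self.rows:
            return False
        self.rows[h] = (subject, predicate, obj)
        return True


def needs_extraction(path, fingerprint, seen):
    """Skip files whose #+SOURCE-FINGERPRINT (or content hash) has not changed."""
    if seen.get(path) == fingerprint:
        return False
    seen[path] = fingerprint
    return True
```

In a dry run, `needs_extraction` keeps unchanged files away from the model entirely, and `assert_fact` makes a repeated run over a changed file insert only the facts that are new.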