The open legal-AI stack, post-Munir.
In May 2026, the open-source legal-AI stack stopped being a list of products and started being a layered architecture. Mike for documents. vLLM for inference. CloseVector for retrieval. DONNA for delegation. happi.md for the protocol. Five layers, one architectural question: who decided what, on what evidence, from what corpus. The convergence is happening in real time, and most of the legal-tech press has not yet noticed.
The pattern is easier to see now than it was six months ago. Twelve months ago there was Harvey, there was Legora, and there was a long tail of chatbots wrapped in legal-themed prompts. The open-source response was a single category — "open-source AI for legal work" — that read as one project pitched against two commercial incumbents. That framing has aged badly. What's actually emerging is a stack: distinct projects, each owning one layer, each readable through the same architectural lens.
The layers, named
The stack, as it stood at the start of this week:
Mike — Will Chen
AGPL-3.0. Draft, summarise, redline. Built by a former Latham & Watkins lawyer; positioned, in Will's Artificial Lawyer interview last week, as "an open-source alternative to Harvey and Legora with feature parity, zero cost, self-hostable." 2,481 stars, 702 forks at the time of writing. The drafting layer of the stack.
vLLM integration in Mike — Joseph Breda
The pull request is small. The architectural consequence is large. A drafting layer that can target vLLM on the firm's own GPUs is a drafting layer that satisfies Munir by construction — no public egress, no third-party API, no privilege waiver. Joseph's PR turns Mike from "open-source code that calls a hosted model" into "open-source code that runs end-to-end inside the firm's perimeter."
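What "inside the perimeter" looks like in practice, as a sketch: vLLM exposes an OpenAI-compatible HTTP endpoint, so any client that speaks that API can be pointed at the firm's own hardware instead of a hosted service. The model name, port, and prompt below are illustrative placeholders; Mike's actual configuration surface is not shown in this article.

```shell
# Serve an open-weights model on the firm's own GPUs (model name illustrative).
vllm serve mistralai/Mistral-7B-Instruct-v0.3 --port 8000

# Any OpenAI-compatible client can then target the local endpoint.
# Nothing in this request leaves the local network.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistralai/Mistral-7B-Instruct-v0.3",
       "messages": [{"role": "user", "content": "Summarise clause 4.2."}]}'
```

The point of the sketch is the base URL: once the inference endpoint is `localhost` rather than a vendor domain, "no public egress, no third-party API" is a property of the network diagram, not of a contract clause.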
CloseVector — Dean Hoffman
Not open-source — patent pending, proprietary — but architecturally part of the same shape. CloseVector applies a cryptographic audit chain at the retrieval layer: every search produces both results and a defensible record of methodology. It answers the regulator's "on what evidence?" question in the same way DONNA answers the "who decided what?" question. Different surfaces, same primitive.
DONNA — chiefofstaff-legal
AGPL-3.0. Voice-first, self-hosted. The IDR primitive — Intent Decision Record — produces a signed, chained record per delegated decision. The chain is replayable by an independent verifier. Designed for the post-Munir, pre-Article-13 environment, where the regulator's question is no longer "do you have an AI policy?" but "can you produce evidence the policy was followed?"
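The signed-and-chained shape can be sketched in a few lines. This is not DONNA's implementation — the field names, the demo key, and the use of HMAC are all assumptions for illustration — but it shows the two properties the IDR primitive claims: each record is signed, and each record commits to the hash of the one before it, so a verifier can replay the whole chain without trusting the writer.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # placeholder; real key handling is not public


def _digest(obj) -> str:
    """Hash the canonical JSON form of a record."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def append_record(chain, intent, decision, reviewer):
    """Append one hypothetical IDR: link to the previous record, then sign."""
    prev = _digest(chain[-1]) if chain else "genesis"
    body = {"prev": prev, "intent": intent,
            "decision": decision, "reviewer": reviewer}
    body["sig"] = hmac.new(SIGNING_KEY, _digest(body).encode(),
                           hashlib.sha256).hexdigest()
    chain.append(body)
    return body


def verify_chain(chain) -> bool:
    """Replay the chain as an independent verifier would: check every
    signature and every hash link back to genesis."""
    prev = "genesis"
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "sig"}
        expected = hmac.new(SIGNING_KEY, _digest(body).encode(),
                            hashlib.sha256).hexdigest()
        if rec["sig"] != expected or body["prev"] != prev:
            return False
        prev = _digest(rec)
    return True
```

Tamper with any record — a decision flipped from "approved" to "rejected", say — and the replay fails at that link, which is the whole point: the evidence is in the chain, not in the operator's say-so.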
happi.md v1.1
The format spec for the audit-chain record. Public, versioned, parser-validated in four languages. The point of the protocol is that an audit chain written by DONNA can be read by any tool that implements happi.md — including, in principle, CloseVector's retrieval records, a future Mike audit log, or a regulator's verifier. The protocol is the layer that turns five projects into one stack.
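The interoperability claim can be sketched too. The actual happi.md field names are not given in this article, so the `prev` and `writer` fields below are placeholders; what the sketch shows is the structural idea: a reader that knows only the format can verify a chain whose records were written by different tools, without knowing or trusting any of the writers.

```python
import hashlib
import json


def link_hash(record: dict) -> str:
    """Hash the canonical JSON form; stands in for whatever
    canonicalisation the real spec defines."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def replay(records: list[dict]) -> bool:
    """A format-only reader: each record's (assumed) 'prev' field must
    match the hash of the record before it. The reader never needs to
    know which tool wrote which record."""
    prev = "genesis"
    for rec in records:
        if rec.get("prev") != prev:
            return False
        prev = link_hash(rec)
    return True


# Records from different (hypothetical) writers interleave in one chain.
mixed = []
prev = "genesis"
for writer, payload in [("mike", {"clause": "4.2", "action": "drafted"}),
                        ("closevector", {"query": "asylum precedent"}),
                        ("donna", {"intent": "file", "reviewer": "A. Counsel"})]:
    rec = {"prev": prev, "writer": writer, **payload}
    mixed.append(rec)
    prev = link_hash(rec)
```

That is what "portable across vendors, replayable across years" cashes out to: the verifier is a function of the format, not of any supplier.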
The architectural question the stack answers
Take a single matter — a contested asylum appeal, a contract negotiation, a discovery exercise. Read it through the stack:
- Mike drafted the witness statement. Audit log: which clauses came from the model, which were edited, when, by whom.
- vLLM served the inference. Audit log: which model weights, which parameters, on which hardware, no egress.
- CloseVector retrieved the supporting authorities. Audit log: which corpus, which query, which results were returned, which were used.
- DONNA brokered the delegations. Audit log: which voice instruction triggered which action, which human reviewed the output, when the IDR was signed.
- happi.md is the format every layer writes in. Audit log: portable across vendors, replayable across years, verifiable without trusting any single supplier.
Read together, the five layers answer one question: who decided what, on what evidence, from what corpus, with which model, and under what oversight. That is the question Munir asked the bar to be ready to answer. It is also the question Article 13 asks every high-risk AI provider in the EU to be ready to answer from 2 August 2026. The two regulatory instruments converge on the same architectural object.
What this is not
This is not a roundup, and the stack is not a product partnership. None of the five projects has a commercial relationship with any other. None has integrated with any other (yet). What they share is an architectural insight that has become harder to ignore over the last six months: regulated AI use is record-keeping AI use, and the records have to chain together across layers if they are going to satisfy a regulator who is reading the file end-to-end.
The ecosystem has other projects — Omnilex (Zurich, Swiss-law focus, USD 4.5M raised November 2025), OpenContracts (annotation + MCP), and a long tail of contract-review tooling — and over the next twelve months at least one or two of them will fit into the same five-layer reading. The stack is not closed. It is converging.
What we'd suggest
For a firm reading Munir, EU AI Act Article 13, and the wave of state-level US transparency rules at once: the right architectural question is no longer "which AI vendor do we pick?" but "which audit-chain stack do we adopt?" The shape of the answer is converging. The choice the firm has to make in 2026 is not between products. It is between picking up a converging architecture early or being told what the architecture is in late 2027 by the first regulator to write a fine cheque.
We'd argue the open layers (Mike, vLLM, DONNA, happi.md) and the proprietary layers that share the architecture (CloseVector at the retrieval layer) belong in the same procurement file. They answer adjacent questions in the same regulator's reading order. The "DONNA versus X" framing that the legal-tech press has reached for, repeatedly, since launch is the one piece of the conversation that does not survive contact with the architecture.
If you're building one of the layers — or you've spotted a project that fits into the stack as we've described it — write to notes@donnaoss.com. We'll publish a follow-up that names the layer and credits the project. The point of an open architecture is that it is read, extended, and corrected in public.