MirrorOS Core: Formally Verified Governance for AI Agents

Terminal output of an MRS tamper-proof seal.

Inspiration

We didn't start from a research paper. We started from years inside environments where ungoverned access gets contracts terminated.

Kevin spent a decade architecting enterprise NetSuite environments; SuiteScript, SDF/SuiteCloud, large-scale ERP integrations. He's seen what happens when automation drifts out of compliance from the inside. Brandon came up through full-stack engineering across Python, Go, Laravel, and React; but the formative experience was years of federal contracting work across DOD, ITAR, DEA, and DLA environments, where explicit permission boundaries aren't optional and audit trails aren't log files.

Different domains; same lesson: when authorization is implicit and governance is an afterthought, the question isn't whether something goes wrong. It's whether you can prove what happened when it does. When AI agents started getting MCP access to enterprise systems; making hundreds or thousands of micro-decisions a day across financial records, procurement workflows, and sensitive data; the implications were immediate.

"We prompted it carefully" is not a governance story. Not in an ERP environment. Not in a federal system. Not anywhere the stakes are real. We built the layer we needed.

What It Is

MirrorOS Core is an open-source governance substrate for agentic AI systems. Before any agent touches a system, MirrorOS runs a dual-gate verification:

Gate 1: Prolog behavioral verification. The agent's intended action is evaluated against the Codex; formal governance laws written in SWI-Prolog. These encode business policy directly. Not prompts; logic. It does not vary with model temperature or phrasing.
Gate 2: Z3 structural verification. The action is verified for mathematical consistency using Z3, a formal theorem prover from Microsoft Research. Where Prolog evaluates behavioral compliance, Z3 proves structural validity.

Neither gate can override the other. Both must pass. Every decision; whether permitted or rejected; is sealed in a tamper-proof, Merkle-verified audit ledger powered by immudb. Once written, that record cannot be altered.

The authority hierarchy is explicit and never inverted:

Prolog (Law) → Z3 (Verification) → Python (Bridge) → Agents (Action)

The Key Architectural Insight

In a standard Nova Act integration, the LLM decides whether an action is permitted; governance lives in the prompt and is probabilistic. In MirrorOS, Nova Act receives mechanical instructions only. The policy decision happened upstream in formal logic before the browser was ever touched. Nova Act is a cursor, not a decision-maker.

This is the difference between governance that hopes and governance that proves.

How We Built It

Phase 1: The Formal Foundation

We started with the verification engine. The Codex laws were written in SWI-Prolog; encoding policies like "an agent may not invoke financial write operations without an established approval context" and "access to sensitive records is restricted to authorized roles." Z3 predicates were built to mirror these laws structurally.

A critical early decision was the concordance check at boot time. Every Z3 predicate must have a corresponding Prolog mapping; if any predicate lacks one, the system raises a ConcordanceError and refuses to start. Drift between Z3 and Prolog is structurally impossible after this check passes.

The Python MRS bridge connects the two gates and exposes the verification engine to downstream agents. We scaffolded the public repo and established the authority hierarchy.

Phase 2: LedgerLark and Nova Act Integration

To prove the architecture in practice, we built LedgerLark; an accounts payable orchestration agent. LedgerLark processes a bill queue, routing each item through two MRS gates before any downstream action is taken:

Routing gate: Which agent role should handle this item?
Approval gate: Is that agent authorized for this transaction?

Vendor	Amount	Result	Route
Office Supplies Co	$450	PERMITTED	→ clerk
Cloud Infra Ltd	$8,500	PERMITTED	→ auditor
Unknown Vendor Co	$300	REJECTED	blocked
Strikaris Dev Services	$15,000	PERMITTED	→ auditor

Approved bills are recorded in Zoho Books via Nova Act. Rejected bills never reach the browser. Every decision is sealed in immudb with a Merkle proof on the same round-trip; running python -m ledger.verify <action_id> independently confirms the record has not been modified.

We also built mock adapters, an accounting demo with a live invoice UI, and a graceful fallback path so the MRS runs normally even if immudb is offline.

Challenges We Faced

Zoho Books free plan has no Bills module. Our AP demo was designed around bill processing, but the free tier only supports Expenses. We had to pivot the entire downstream flow.

Zoho dropdown fields don't respond to Nova Act clicks. Standard click-and-select patterns failed. We discovered that typing the agent type value and pressing Enter was the only reliable input method; a workaround that required rethinking our Nova Act instruction sequences.

Nova Act blocked entirely on rejected bills. This was actually the design working correctly; but it forced us to architect the gate to fire before Nova Act is ever invoked, not inside it. The browser is untouched on a rejection. That constraint sharpened the entire architecture.

Long Nova Act instructions caused context issues. Complex multi-step browser instructions needed to be split across two sequential Nova Act calls to avoid degraded performance.

immudb Merkle proof handling. verifiedSet returns a Merkle proof on the same write round-trip; getting verified: True is the key demo moment. We needed careful error handling for the offline fallback path so the system degrades gracefully rather than failing hard.

Concordance drift. The class of bug where a Z3 predicate exists without a corresponding Prolog atom is subtle and dangerous. Our boot-time concordance check catches it before the system ever accepts a request.

What We Learned

The deepest lesson is architectural, not technical: you don't need to make the model smarter to make it trustworthy. You need to build the system around it so trust is structural, not probabilistic.

The gate being upstream of the LLM; not inside it; is the entire point. MirrorOS doesn't constrain what Nova can do by prompting it harder. It constrains what Nova is asked to do by proving the request is valid before the browser is ever opened.

Compliance verification in MirrorOS consumes zero LLM tokens. It runs entirely in Prolog and Z3. The verdicts are deterministic. The same inputs always produce the same outputs. The audit trail is not a log file; it is a cryptographic proof.

Trust in AI agents is not built by making better models. It is built by making better systems around them.

MirrorOS is that system.

Built With

SWI-Prolog · Z3 Theorem Prover · Python · Amazon Nova Act · Amazon Web Services · immudb · FastAPI · Docker

Open Source

Apache 2.0 · github.com/Strikaris-Tech/mirroros-core

Built With

amazon
amazon-web-services
bash
fastapi
immudb
nova
nova-act
prolog
python
swi-prolog
z3

Submitted to

Amazon Nova AI Hackathon

Created by

Every governance system eventually faces the same question: who watches the agent? My contribution to MirrorOS Core was designing the architecture that answers it; formally, deterministically, and before the agent ever acts.

I built the Mirror Reasoning Stack from the ground up; the dual-gate verification engine where SWI-Prolog evaluates behavioral compliance against the Codex Laws and Z3 proves structural validity against system axioms. Neither gate can override the other. Both must pass. That architecture is the core of everything MirrorOS does.

I also built the Python bridge layer that connects the formal verification engine to Amazon Nova, translating agent intent into proof obligations and returning deterministic verdicts with zero LLM token cost for compliance checks.

The part that matters most to me is what this proves about AI governance itself: you don't need to make the model smarter to make it trustworthy. You need to build the system around it so trust is structural, not probabilistic. MirrorOS Core is that system; and the formal methods foundation we've built is what makes the proof real.

Brandon Lee Zeek
Full-stack engineer, co-founder of Strikaris. Building the MirrorOS governance substrate: Prolog/Z3 bridge, immudb ledger, LedgerLark demo.
My contributions focused on validation, quality assurance, and go-to-market work for MirrorOS Core. I identified and resolved a Windows-specific path compatibility bug in the MRS bridge layer that prevented the Prolog gate from loading on non-Unix systems. I conducted code reviews of the ledger and bridge components, validated the LedgerLark demo end-to-end on Windows, and reviewed the project's open-source licensing. I also authored the project's public-facing blog post for builder.aws.com, translating the technical architecture into a governance narrative for enterprise AI and MCP audiences.

Kevin Khan

Updates

Brandon Lee Zeek started this project — Mar 16, 2026 04:14 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.

Brandon Lee Zeek posted an update — Mar 16, 2026 02:09 PM EDT

One of the last pieces we locked in before submission was the immudb ledger integration. Every MRS verdict: permitted or rejected, is now sealed in a cryptographically verified ledger entry backed by a Merkle tree. That means any decision the system makes can be independently verified after the fact with a single CLI call.

The design goal was simple: the audit trail has to be as trustworthy as the governance engine itself. Prolog and Z3 give you a deterministic verdict. immudb gives you proof that verdict was never touched after it was written. Those two properties together are what make MirrorOS useful in a real compliance context, not just a demo.

The integration is fully graceful; MirrorOS starts and runs normally if immudb isn't available, which keeps the barrier to entry low for anyone forking the repo. But when it is running, every action leaves a receipt.

Log in or sign up for Devpost to join the conversation.