Inspiration

We started thinking about what actually goes wrong when companies deploy AI agents over their internal knowledge bases. The failure mode isn't usually the model. It's the data. Outdated policies that never got removed, documents with PII that slipped through, two versions of the same guide that say completely different things. The AI reads all of it and has no way to know what to trust.

We wanted to build the layer that sits in front of that.

What We Built

The Safety Diver has three parts:

Purified Ingestion scans every document for PII and secrets before it enters the knowledge base. SSNs, credit cards, API keys, AWS credentials. Anything toxic gets blocked. Everything else gets a pollution report attached so you know what made it through.
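The core of that scanning step is pattern matching. A minimal sketch, assuming regex-based detection (the patterns and the block/report split below are illustrative, not our production rules; real scanners also validate matches, e.g. Luhn checks for card numbers):

```python
import re

# Illustrative patterns only; production rules need stricter validation.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_secret": re.compile(r"\b(?:api[_-]?key|secret)\s*[:=]\s*\S+", re.I),
}

# Categories that block a document outright; everything else is only reported.
TOXIC = {"ssn", "credit_card", "aws_access_key"}

def scan(text: str) -> dict:
    """Return a pollution report: category -> list of matched strings."""
    return {name: pat.findall(text)
            for name, pat in PATTERNS.items() if pat.search(text)}

def ingest(text: str):
    """Return (document, report). Document is None if anything toxic was found."""
    report = scan(text)
    if TOXIC & report.keys():
        return None, report   # blocked before it enters the knowledge base
    return text, report       # passed through, with its pollution report attached
```

The key design point is that the report travels with the document either way, so a clean-looking pass still records what the scanner saw.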

The Auditor lets you query the corpus and surface contradictions between sources. We detect two types: documents that are completely different on the same topic (low Jaccard similarity), and documents that are nearly identical but disagree on specific numbers (same structure, different facts). The second type is the dangerous one.
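A rough sketch of that two-way classification, assuming the pair of documents has already been matched on topic by vector search (the thresholds and the token/number regexes here are illustrative choices, not our exact implementation):

```python
import re

def tokens(text: str) -> set:
    """Word tokens only; numbers are deliberately excluded from similarity."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def numbers(text: str) -> list:
    return re.findall(r"\d+(?:\.\d+)?", text)

def classify(a: str, b: str, hi: float = 0.8, lo: float = 0.3) -> str:
    sim = jaccard(a, b)
    if sim >= hi and numbers(a) != numbers(b):
        return "numeric-disagreement"  # same structure, different facts
    if sim <= lo:
        return "divergent"             # same topic, completely different content
    return "consistent"
```

Excluding numbers from the similarity computation is what makes the dangerous case detectable: two copies of the same policy with different figures still score as near-identical text, and the disagreement shows up only in the number comparison.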

When a contradiction is found, you can declare which source is canonical. That decision gets written to Human Delta's agent filesystem and applies immediately to all future search results.
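The governance record itself can be very simple. A sketch of the append-only, plain-text format we ended up with (the path and field layout are hypothetical; the real store is Human Delta's agent filesystem, written through its /v1/fs API rather than the local file used here):

```python
import time

GOV_PATH = "governance/canonical.txt"  # hypothetical local stand-in

def declare_canonical(topic: str, doc_id: str, path: str = GOV_PATH) -> None:
    """Append one record: topic <TAB> doc_id <TAB> unix timestamp."""
    with open(path, "a") as f:
        f.write(f"{topic}\t{doc_id}\t{int(time.time())}\n")

def load_canonical(path: str = GOV_PATH) -> dict:
    """Replay the log; the most recent declaration per topic wins."""
    decisions = {}
    try:
        with open(path) as f:
            for line in f:
                topic, doc_id, _ = line.rstrip("\n").split("\t")
                decisions[topic] = doc_id
    except FileNotFoundError:
        pass
    return decisions
```

Because the log is replayed on read, declaring a new canonical source never requires editing old records, and the decision takes effect on the next search.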

The Clarity Dashboard shows search results with trust scores and governance badges, so you can see exactly which sources an AI agent would prioritize.
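Applying a governance decision to ranking can be sketched like this (the result shape, the flat score boost, and the badge name are all illustrative assumptions, not the dashboard's actual scoring):

```python
def rerank(results, decisions):
    """results: list of (doc_id, topic, score) from vector search.
    decisions: topic -> canonical doc_id. Canonical docs get a badge and a boost."""
    out = []
    for doc_id, topic, score in results:
        canonical = decisions.get(topic) == doc_id
        out.append({
            "doc_id": doc_id,
            "score": score + (1.0 if canonical else 0.0),  # arbitrary boost
            "badge": "canonical" if canonical else None,
        })
    return sorted(out, key=lambda r: r["score"], reverse=True)
```

The badge is what the dashboard surfaces: even when a deprecated document scores higher on raw similarity, the canonical one is visibly marked and ranked first.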

Challenges

The biggest technical challenge was figuring out Human Delta's filesystem API. The /v1/fs endpoint has two different protocols depending on the operation, and using the wrong one fails silently. Our governance system appeared to be working for hours before we realized none of the writes were actually persisting. We debugged it by hitting the API directly with curl until we found the right request format.

We also ran into issues with HD's markdown processing modifying our stored data on every read/write cycle, which forced us to switch to a plain text format for storing governance records.

What We Learned

Building on top of a vector search API taught us a lot about the gap between "semantically similar" and "factually consistent." High similarity scores don't mean two documents agree. That gap is exactly the problem this project is trying to solve.
