What inspired us

Insurance decisions feel like a black box. You send in a claim, you get back a yes or a no, and if you ask why, you get a confident-sounding paragraph that proves nothing. We wanted to flip that. If a machine is going to decide whether your claim gets paid, it should be able to show its working in a way a normal person, or a regulator, can actually follow.

The real question was what the AI should and should not be allowed to do. Language models are brilliant at reading messy human text. They are not something you want quietly deciding who gets paid, because you cannot audit a vibe. So we drew a hard line. The AI reads the claim and does nothing else. The decision itself is made by a logic engine running rules we wrote ahead of time.

What it does

You paste a real, rambling car insurance claim. Claude reads it and pulls out plain facts, like "this was a collision", "the driver was sober", "they were doing paid deliveries", "the damage was 4,100 pounds". Those facts go into a Vadalog ontology that encodes a full motor policy. The engine derives a decision, covered or not covered, and hands back the exact chain of clauses that led there. When a claim is covered it also works out the payout, capped at the car's market value and minus the excess.
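The payout arithmetic can be sketched in a few lines of Python. This is an illustration only, not the project's Vadalog rules: it assumes the cap is applied first and the excess is deducted afterwards, that the result never goes below zero, and the market value and excess figures in the usage note are made up.

```python
from decimal import Decimal


def payout(damage: Decimal, market_value: Decimal, excess: Decimal) -> Decimal:
    """Cap the assessed damage at market value, then deduct the excess.

    Assumed order of operations (cap first, excess second); the floor at
    zero is also an assumption for claims smaller than the excess.
    """
    capped = min(damage, market_value)
    return max(capped - excess, Decimal("0"))
```

For the 4,100 pound claim above, with a hypothetical market value of 9,000 pounds and a hypothetical excess of 250 pounds, `payout(Decimal("4100"), Decimal("9000"), Decimal("250"))` gives 3,850 pounds. If the car were only worth 3,500 pounds, the cap would apply first and the result would be 3,250 pounds.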

The case we love is the one a flat rules checker gets wrong. Someone crashes while doing deliveries, which is normally excluded. But they bought the Business and Delivery Use add-on. A naive system sees the exclusion and stops. Our engine reasons about the rules themselves, sees that the endorsement overrides the exclusion, and pays the claim. That override step shows up right there in the proof, so you can see exactly why the answer flipped.
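Here is a rough Python sketch of that override pattern. It is not the Vadalog program itself: the fact names, exclusion names and proof wording are all hypothetical, and the three "strata" are just ordinary steps run in order so that "not overridden" is only evaluated once the set of overrides is complete.

```python
# Hypothetical rule tables; the real policy encodes these as Vadalog rules.
EXCLUSIONS = {  # exclusion -> fact that triggers it
    "delivery_use_exclusion": "paid_delivery_use",
    "drink_driving_exclusion": "driver_over_limit",
}
OVERRIDES = {  # endorsement fact -> exclusion it overrides
    "business_delivery_addon": "delivery_use_exclusion",
}


def decide(facts: set[str]) -> tuple[str, list[str]]:
    """Return a decision plus the ordered list of steps that produced it."""
    proof: list[str] = []

    # Stratum 1: which exclusions do the facts trigger?
    triggered = [e for e, fact in EXCLUSIONS.items() if fact in facts]
    for e in triggered:
        proof.append(f"{EXCLUSIONS[e]} triggers {e}")

    # Stratum 2: which triggered exclusions does an endorsement override?
    overridden = set()
    for addon, e in OVERRIDES.items():
        if addon in facts and e in triggered:
            overridden.add(e)
            proof.append(f"{addon} overrides {e}")

    # Stratum 3: negation. An exclusion stands only if it was NOT overridden.
    standing = [e for e in triggered if e not in overridden]
    for e in standing:
        proof.append(f"{e} stands")

    if "collision" not in facts:
        proof.append("no insured event found")
        return "not covered", proof
    proof.append("collision is an insured event")
    if standing:
        return "not covered", proof
    proof.append("no exclusion stands, so the claim is covered")
    return "covered", proof
```

Calling `decide({"collision", "paid_delivery_use", "business_delivery_addon"})` returns "covered", and the override step appears in the returned proof list. Drop the add-on from the facts and the same call returns "not covered", with the exclusion recorded as standing.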

How we built it

The policy is modelled on a real one. We took a UK comprehensive motor policy as a template and encoded it as 60 clauses across 10 sections, with the dozen clauses that actually decide claims written as executable Vadalog rules. The rest are there as standard terms, exactly like a real policy booklet, which is why most clauses never fire on any given claim.

The pieces:

  • Next.js, React and Tailwind for the web app.
  • Claude (Anthropic) for the single job of reading a claim into structured facts.
  • Vadalog, a Datalog-style reasoning language, for the decision and the line-by-line derivation.
  • Prometheux as the hosted reasoning platform, where the same policy lives as an ontology of connected concepts with full lineage you can browse.
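To make the "single job" for the model concrete, here is a small Python sketch of a gate that could sit between the model and the reasoner. It assumes the model is asked to reply with a JSON object of facts; the field names and types are hypothetical, and the actual call to Claude is left out. Anything outside the allowed fields, such as a verdict, is rejected before it can reach the engine.

```python
import json

# Hypothetical fact schema: field name -> accepted Python type(s).
ALLOWED_FACTS = {
    "incident_type": str,
    "driver_sober": bool,
    "paid_delivery_use": bool,
    "damage_gbp": (int, float),
}


def parse_facts(raw: str) -> dict:
    """Parse model output and accept only known, correctly typed facts."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of facts")

    # Reject opinions such as {"covered": true}: only listed facts pass.
    unknown = set(data) - set(ALLOWED_FACTS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")

    for name, value in data.items():
        expected = ALLOWED_FACTS[name]
        # bool is a subclass of int in Python, so guard numeric fields.
        if isinstance(value, bool) and expected is not bool:
            raise ValueError(f"{name} must not be a boolean")
        if not isinstance(value, expected):
            raise ValueError(f"{name} has the wrong type")
    return data
```

A reply like `{"incident_type": "collision", "damage_gbp": 4100}` passes through unchanged, while a reply that also includes `"covered": true` raises a `ValueError`.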

Everything runs live. There are no canned answers. The facts are extracted in real time, the engine runs in real time, and we cross-check the decision against the same program on the real Prometheux platform.

What we learned

The biggest lesson was about where to put the intelligence. The instinct is to let the model do everything. The better design was to give the model the smallest possible job and let a transparent system carry the weight. The proof is not a story the model tells after the fact. The proof is the mechanism that produced the answer. That single distinction is the whole product.

We also got a real feel for stratified negation. The override logic, "an exclusion stands unless something overrides it", is exactly the kind of thing that trips up naive rule engines and falls out naturally in a proper Datalog-style language.

Challenges we ran into

  • Keeping the AI in its lane. It really wants to tell you whether the claim is covered. We had to constrain it hard so it only ever emits facts and never an opinion.
  • Making a 60-clause policy feel real without it denying everything. Most clauses should never fire, which is how real policies behave, but it took care to make sure the demo claims still resolved cleanly.
  • Latency. Our first version called the hosted reasoner inline and the page took more than 20 seconds. We moved the cross-check to a background step, so the local engine answers instantly and the platform confirmation arrives a moment later.
  • Two surfaces on the platform. Concepts power the lineage graph, and that graph is the ontology, but there is also a separate blank ontology canvas. Working out which view actually showed our work took some digging.
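The latency fix described in the list above follows a common pattern: answer from the fast path straight away and let the slow confirmation finish in the background. The Python sketch below shows the shape of it with `asyncio`. Both engines are hypothetical stand-ins (the hosted call is simulated with a short sleep), and the real app is a Next.js project, so this illustrates the idea rather than the code we shipped.

```python
import asyncio


def local_engine(facts: set[str]) -> str:
    """Hypothetical stand-in for the fast local reasoner."""
    return "covered" if "collision" in facts else "not covered"


async def hosted_engine(facts: set[str]) -> str:
    """Hypothetical stand-in for the slow hosted call (delay is simulated)."""
    await asyncio.sleep(0.05)
    return local_engine(facts)


async def cross_check(facts: set[str], local: str) -> dict:
    """Run the hosted reasoner and compare it with the local answer."""
    remote = await hosted_engine(facts)
    return {"local": local, "platform": remote, "agrees": local == remote}


async def handle_claim(facts: set[str]):
    """Return the local decision at once; the cross-check runs as a task."""
    local = local_engine(facts)
    confirmation = asyncio.create_task(cross_check(facts, local))
    return local, confirmation


async def demo():
    decision, confirmation = await handle_claim({"collision"})
    # The decision is already available here; the confirmation arrives later.
    return decision, await confirmation
```

Running `asyncio.run(demo())` returns the local decision together with the confirmation record once the simulated hosted call completes.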
