Inspiration

In consumer software, a 99% success rate is a triumph. In the physical world, it is a catastrophe.

In aviation, with roughly 100,000 commercial flights taking off every day, a 99% success rate would mean about 1,000 planes crashing every single day. The same math applies to medical devices and high-voltage engineering.

We are rushing to integrate Large Language Models (LLMs) into critical workflows, but standard LLMs are probabilistic engines—they are designed to predict the next token, not to verify truth. When an engineer asks for the "maximum charge voltage" of a battery, a hallucination isn't just a bug; it is a fire, a recall, or a lawsuit.

We built Akili because we believe the future of AI in engineering isn't about making chatbots that chat better. It's about building control planes that reason better. We wanted to prove that you can build a system where "I don't know" is a feature, not a failure.

What it does

Akili is a Reasoning Control Plane for mission-critical engineering.

It is an AI assistant that is only allowed to answer when it can deterministically verify the answer against the source document.

  • Ingests Technical Truth: Users upload dense technical PDFs (datasheets, schematics, pinout tables).
  • Creates a Map: Instead of summarizing text, Akili extracts governing laws (Units, Bijections, Grids) and locks them to precise $(x,y)$ pixel coordinates.
  • Verifies, Then Answers: When a user asks a question, Akili checks its deterministic database. If the answer exists, it returns the value and a green bounding box on the original PDF proving exactly where the data came from.
  • Refuses to Guess: If the answer cannot be derived from the extracted structure, Akili explicitly refuses to answer.
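The lookup-or-refuse behavior above can be sketched as a deterministic query against a coordinate-keyed fact store. This is an illustrative sketch, not Akili's actual code: the table layout and the names `lookup_fact` and `max_charge_voltage` are assumptions.

```python
import sqlite3

# Illustrative in-memory "Truth Store": facts keyed to pixel coordinates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (key TEXT PRIMARY KEY, value TEXT, unit TEXT, "
    "page INTEGER, x REAL, y REAL, w REAL, h REAL)"
)
conn.execute(
    "INSERT INTO facts VALUES "
    "('max_charge_voltage', '4.2', 'V', 3, 112.0, 488.5, 41.0, 12.0)"
)

def lookup_fact(key: str) -> str:
    """Answer only from the store; refuse when the fact is absent."""
    row = conn.execute(
        "SELECT value, unit, page, x, y, w, h FROM facts WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return "REFUSED: Fact not found"
    value, unit, page, x, y, w, h = row
    # The stored coordinates drive the green bounding-box overlay on the PDF.
    return f"{value} {unit} (page {page}, bbox=({x}, {y}, {w}, {h}))"
```

A known key returns the grounded value with its provenance; any key missing from the store yields the refusal string instead of a guess.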

How we built it

We architected Akili around a strict Separation of Concerns between Perception (Probabilistic) and Logic (Deterministic).

  1. Perception (Gemini 3): We use Google Gemini 3's advanced multimodal vision capabilities for the ingestion layer. We prompt Gemini to "read" the visual layout of tables and schematics and output structured JSON containing:
    • Units: Measurements with values and physical units (e.g., $4.2V$).
    • Grids: Tabular data mapped to row/column indices.
    • Bijections: 1-to-1 mappings (e.g., Pin Name $\leftrightarrow$ Pin Number).
  2. Validation (Pydantic v2): We use Pydantic to enforce strict schema validation. If Gemini outputs data that violates the document's schema (e.g., a voltage without a unit), it is rejected before it ever enters our database.
  3. The Truth Store (SQLite): Validated facts are stored in a relational database, keyed by their document coordinates $(x,y)$.
  4. Verification Layer (Python): User queries are processed by deterministic Python functions that query the Truth Store. We do not ask the LLM to "think" about the answer; we calculate it.
  5. Frontend: A React + TypeScript interface visualizes the "Green Box" proof overlay on the PDF.

Challenges we faced

  • The "Helpful Assistant" Problem: LLMs are trained to be helpful, which means they want to guess an answer even when they shouldn't. We had to spend significant time tuning the "Refusal" logic to ensure Akili would rather stay silent than hallucinate.
  • Coordinate Grounding: Extracting precise pixel coordinates from a raw PDF image is difficult. We had to iterate heavily on our Gemini 3 system instructions to ensure it returned bounding boxes that perfectly aligned with the visual tables.
  • Normalization: Teaching the system that $4.2V$, $4200mV$, and "4.2 Volts" are the same physical quantity required implementing a robust normalization layer using a physics-aware library.
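The normalization challenge comes down to converting every quantity to a canonical base unit before comparison. The production system delegates this to a physics-aware library (pint is one such library); the stdlib sketch below, with an assumed `to_base_volts` helper and a deliberately tiny unit table, shows the core idea.

```python
import re

# Scale factors to the canonical base unit (volts). Illustrative only.
SCALE = {"V": 1.0, "mV": 1e-3, "kV": 1e3, "Volts": 1.0}

def to_base_volts(text: str) -> float:
    """Parse '4.2V', '4200mV', or '4.2 Volts' into a canonical float."""
    m = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if m is None:
        raise ValueError(f"unparseable quantity: {text!r}")
    value, unit = float(m.group(1)), m.group(2)
    if unit not in SCALE:
        raise ValueError(f"unknown unit: {unit!r}")
    return value * SCALE[unit]
```

All three spellings from the example normalize to the same canonical value, so equality checks in the Truth Store compare physics, not strings.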

Accomplishments that we're proud of

  • The "Money Shot": Seeing the green bounding box snap perfectly onto a tiny cell in a 50-row table for the first time was magical. It turned the AI from a "black box" into a transparent tool.
  • Deterministic Refusal: We successfully built a system that consistently returns REFUSED: Fact not found for missing data, rather than making up a plausible-sounding lie.
  • Neuro-symbolic Architecture: We successfully bridged the gap between the messy, probabilistic world of vision (reading a PDF) and the strict, binary world of engineering constraints.

What we learned

  • Gemini 3 is a Sensor: We stopped treating the model as a "writer" and started treating it as a "sensor." Its ability to interpret visual spatial relationships (like which label belongs to which pin on a schematic) is far superior to text-only parsing.
  • Structure is Safety: The key to safe AI isn't better prompting; it's better architecture. By constraining the AI's output to strict schemas (Grids/Units), we eliminated entire classes of hallucination.

What's next for Akili

  • CAD Integration: Expanding ingestion to support .DWG and .DXF files for mechanical engineering.
  • Semantic Search: Using embeddings to allow users to search for "similar components" across thousands of datasheets.
  • Enterprise Connectors: Integrating directly with JIRA and Confluence so engineers can verify requirements against datasheets automatically.

Built With

  • Python
  • Gemini 3 (multimodal ingestion)
  • Pydantic v2
  • SQLite
  • React + TypeScript