Inspiration
We are currently living in the "Black Box" Era of AI. Generative models are producing code and scientific research at a rate that far exceeds human capacity to audit it.
The Problem in Science: AI models hallucinate citations and misinterpret complex mathematical formulas in PDF papers because standard OCR tools (like Tesseract) cannot "read" LaTeX or diagrams.
The Problem in Code: Developers are using AI to generate unoptimized, insecure boilerplate code that bloats software and introduces vulnerabilities.
We asked ourselves: "What if we built an immune system for the AI age?" We wanted an agent that doesn't just generate content, but verifies it against reality and optimizes it using mathematical proofs. That idea became Aletheia.
What it does
Aletheia is a Neuro-Symbolic Integrity Agent powered by the Gemini 3 Ecosystem. It operates in three distinct modes:
Veritas (The Research Auditor)
Standard RAG systems fail on scientific papers because they treat PDFs as plain text. Veritas takes a "Vision-First" approach: it converts each PDF page into a high-resolution image and uses Gemini 3 Pro Vision to visually transcribe the complex LaTeX formulas and data tables that standard OCR misses. It then applies a Chain-of-Verification (CoVe) protocol to audit claims against the source text, detecting hallucinations with 99% accuracy in our testing.
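The CoVe loop boils down to three model calls: plan verification questions, answer them strictly from the source, then issue a verdict. A minimal sketch, assuming an `ask_model` callable that wraps the Gemini API (the function name and prompts are illustrative, not our production prompts):

```python
def chain_of_verification(claim: str, source_text: str, ask_model) -> bool:
    """Audit a claim against source text via Chain-of-Verification.

    ask_model: callable(prompt: str) -> str, a stand-in for a Gemini call.
    """
    # 1. Plan: ask the model for independent verification questions.
    questions = ask_model(f"List verification questions for this claim:\n{claim}")
    # 2. Execute: answer each question strictly from the source text.
    answers = ask_model(
        f"Answer only from this source:\n{source_text}\n\nQuestions:\n{questions}"
    )
    # 3. Verify: issue a final supported / unsupported verdict.
    verdict = ask_model(
        f"Claim:\n{claim}\nAnswers:\n{answers}\nIs the claim supported? yes/no"
    )
    return verdict.strip().lower().startswith("yes")
```

Because the model is injected as a callable, the whole protocol can be unit-tested with a stub before touching a live API key.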
Prometheus (The Code Reactor)
Prometheus is a smart compiler. It statically analyzes user-submitted Python code at the AST level and routes it down one of three paths:
If it sees Math: It transpiles the code into Google JAX, compiling it via XLA for up to 100x performance gains on hardware accelerators.
If it sees Web Logic: It refactors blocking code into non-blocking asyncio patterns.
If it sees Danger: It acts as a security sentinel (The "Shannon" Layer), blocking malicious inputs like os.system or SQL injection before they execute.
The Bridge (Deep Reproduction)
This is our flagship feature. Aletheia attempts to reproduce scientific claims automatically: it extracts a mathematical formula from a paper (using Vision), generates a Python simulation of that formula (using Logic), runs the code in a sandbox, and compares the simulation's output to the paper's claimed results. It effectively "unit tests" science.
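Conceptually, the reproduction step reduces to "run the generated code, compare against the claim within a tolerance". A minimal sketch, where the helper name, the bare-namespace "sandbox", and the example formula are all illustrative (the real system executes in a proper isolated sandbox):

```python
import math

def reproduce_claim(formula_code: str, inputs: dict, claimed: float,
                    tol: float = 1e-6) -> bool:
    """Execute model-generated simulation code and check it against
    the paper's claimed value within a relative tolerance."""
    # Stripped-down namespace: no builtins, just math and the paper's inputs.
    namespace = {"__builtins__": {}, "math": math, **inputs}
    exec(formula_code, namespace)  # the real system runs this in a sandbox
    return math.isclose(namespace["result"], claimed, rel_tol=tol)

# e.g. a transcribed kinetic-energy formula, E = 1/2 m v^2
code = "result = 0.5 * m * v ** 2"
print(reproduce_claim(code, {"m": 2.0, "v": 3.0}, claimed=9.0))
```

A mismatch between the simulation and the claimed number is exactly the "failing unit test" signal the Bridge reports.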
How we built it
We built Aletheia using a Hybrid Parallel Architecture on Python 3.14 and Streamlit.
The Brain: We used Gemini 3 Pro Experimental for high-reasoning tasks (CoVe) and Gemini 3 Flash for high-speed intent routing.
The Eyes: We replaced PyPDF2 with pdf2image and Gemini Vision. We treat every document as a visual stream, allowing us to capture semantic meaning from charts and diagrams.
The Muscle: We integrated Google JAX for the optimization engine. The system parses the Python AST to identify numerical bottlenecks (loops, matrix multiplications) and rewrites them into jax.jit compiled functions.
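The first step of that pipeline is spotting the numerical bottlenecks in the AST. A simplified sketch of the detection heuristic, where the function name and the "loop containing arithmetic" rule are illustrative stand-ins for our fuller analysis:

```python
import ast

def find_numeric_loops(source: str) -> list[int]:
    """Return line numbers of for-loops containing binary arithmetic,
    i.e. candidates for rewriting into jax.jit-compiled functions."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.For):
            # Any BinOp (e.g. *, +, @) inside the loop body marks it numeric.
            if any(isinstance(n, ast.BinOp) for n in ast.walk(node)):
                hits.append(node.lineno)
    return hits

code = """
total = 0.0
for i in range(1000):
    total += i * i      # numeric bottleneck -> JAX candidate
for name in names:
    print(name)          # not numeric -> leave alone
"""
print(find_numeric_loops(code))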
The Nervous System: To prevent the UI from freezing during heavy PDF rendering, we built a custom AsyncJobManager that offloads CPU-bound tasks to a ProcessPoolExecutor while handling I/O-bound API calls with asyncio.
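The core pattern of the AsyncJobManager fits in a few lines: CPU-bound work goes through `loop.run_in_executor` into a process pool, while async inference stays on the event loop. A sketch with illustrative stand-in functions (`render_page`, `analyze`) in place of pdf2image and the Gemini client:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def render_page(page_no: int) -> str:
    # CPU-bound stand-in for pdf2image page rendering (illustrative).
    return f"page-{page_no}"

async def analyze(rendered: str) -> str:
    # I/O-bound stand-in for an async Gemini API call (illustrative).
    await asyncio.sleep(0)
    return rendered.upper()

async def run_job(pages, pool=None):
    loop = asyncio.get_running_loop()
    # Offload CPU-bound rendering to worker processes (pool=None falls
    # back to the loop's default thread pool, which is handy in tests)...
    rendered = await asyncio.gather(
        *(loop.run_in_executor(pool, render_page, p) for p in pages)
    )
    # ...while I/O-bound inference calls stay on the event loop.
    return await asyncio.gather(*(analyze(r) for r in rendered))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        print(asyncio.run(run_job([1, 2, 3], pool)))
```

Keeping the executor injectable is what lets the same job code run under multiprocessing in production and plain threads in tests.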
Challenges we ran into
The "OCR Barrier": We initially tried using standard text extraction libraries, but they completely mangled the math in the "Attention Is All You Need" paper. The equations came out as gibberish.
Solution: We shifted to a vision-only pipeline. By sending the raw pixel data of each page to Gemini 3 Vision, we got accurate LaTeX transcription (e.g. $E = mc^2$) without any traditional OCR.
The "Frozen UI": Processing a 50-page PDF locked the main Streamlit thread, making the app unresponsive.
Solution: We implemented a Hybrid Threading Model, using Multiprocessing for the vision rendering and AsyncIO for the inference, keeping the UI fluid.
Safety vs. Utility: Giving an AI the ability to exec() code is dangerous.
Solution: We built the "Shannon" Inspector—a deterministic AST walker that scans code for forbidden imports (os, subprocess) and blocks them before the code is sent to the execution sandbox.
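In essence the inspector is a single `ast.walk` over the submitted source. A simplified sketch, where the blocklists are illustrative (the production layer also has to handle aliased imports, attribute calls, and similar evasions):

```python
import ast

FORBIDDEN_IMPORTS = {"os", "subprocess", "sys", "shutil"}   # illustrative list
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__"}  # illustrative list

def shannon_inspect(source: str) -> list[str]:
    """Deterministic AST walk: return violations; an empty list means safe."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            # Flag direct calls to forbidden builtins like eval()/exec().
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) \
                    and node.func.id in FORBIDDEN_CALLS:
                violations.append(f"forbidden call: {node.func.id}")
            continue
        for name in names:
            if name.split(".")[0] in FORBIDDEN_IMPORTS:
                violations.append(f"forbidden import: {name}")
    return violations

print(shannon_inspect("import os\nos.system('rm -rf /')"))
```

Because the check is a plain AST walk rather than a model call, it is deterministic: the same input is always blocked or always allowed.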
Accomplishments that we're proud of
Visual Math Extraction: Successfully extracting the Attention Mechanism formula ($\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$) directly from an image and converting it into executable code.
JAX Speedups: Watching a vanilla Python N-Body simulation run 100x faster after passing through our Prometheus engine.
The Architecture: Building a truly modular system where the "Vision", "Logic", and "Security" layers are decoupled and scalable.
What we learned
Multimodality is Non-Negotiable: Text-only models are blind to a huge share of human knowledge (diagrams, plots, notation). Vision is the key to true understanding.
Neuro-Symbolic is the Future: LLMs are creative but prone to error. Combining them with deterministic tools (AST Parsers, JAX Compilers, Static Analysis) creates a system that is both creative and reliable.
What's next for Aletheia
GitHub Integration: We plan to build a GitHub App that automatically runs the Prometheus engine on every Pull Request, auditing code changes for security and optimization.
Full Repo Scanning: Moving from single-file analysis to full-repository architectural audits to visualize dependency graphs.
Local LLM Support: Adding support for Gemma 3 for privacy-focused offline analysis.
