The Architecture

CLARITY

The IDE for Your Brain — Formally Prove Contradictions in Your Reasoning

Inspiration

Every year, billion-dollar companies collapse, peer-reviewed research gets retracted, and public discourse dissolves — not for a lack of data, but because of The Logic Gap. Humans are biologically wired to be "eloquently confused." We use sophisticated language to mask contradictions that mathematics would never permit.

We've all seen it: a strategy meeting where everyone nods at a "bulletproof" pitch, only for the project to fail six months later because of a fundamental logical inconsistency no one caught. A researcher spends months building on a circular premise. An engineering manager commits to a timeline that is mathematically impossible. The reasoning sounded airtight. It wasn't.

We have IDEs to debug our code. Spreadsheets to debug our math. We have nothing to debug our thinking.

Current AI makes this worse, not better. ChatGPT can make your flawed reasoning sound more polished. It summarizes confusion eloquently — but it doesn't prove where your logic breaks.

We built CLARITY to bridge this gap: a Neural-Symbolic engine that translates messy, ambiguous natural language into formal logic and uses deterministic SAT solvers to prove whether your argument is sound. Not guess. Prove.

What it does

CLARITY is a real-time reasoning analysis system that maps your arguments into formal logic, verifies contradictions with mathematical proofs, and surfaces hidden assumptions, logical fallacies, and practical tensions — all visualized as an interactive argument graph.

Semantic Mapping — As you speak or type, CLARITY uses Gemini 3.0 Flash to parse your input into atomic propositions (claims, evidence, assumptions) with formal logic expressions ($\land$, $\lor$, $\to$, $\lnot$) and maps the relationships between them (supports, contradicts, depends_on, assumes).

Formal Verification — Propositions are converted into Conjunctive Normal Form (CNF) and run through a Glucose3 SAT Solver. If your logic contains a contradiction, CLARITY doesn't guess — it provides a Mathematical Proof of Unsatisfiability (UNSAT) and extracts the Minimal Unsatisfiable Core: the smallest set of statements that cannot all be true simultaneously.

Fallacy Detection — Graph algorithms (NetworkX) surface structural flaws: circular reasoning via DFS cycle detection, hasty generalization via degree analysis, and load-bearing assumptions via betweenness centrality — the specific premises that, if wrong, collapse the entire argument.

Multi-Modal Voice Mode — Using Gemini Live, users engage in a bi-directional reasoning session, watching their thoughts materialize as a live, interactive argument graph in real-time. After analysis, click "Ask CLARITY" and Gemini's natural voice walks you through the findings — leading with the most critical contradiction and pointing to the formal proof on the graph.

How we built it

The Architecture: Neural-Symbolic Reasoning

CLARITY's competitive advantage lies in a "Logic Sandwich" that combines Gemini's semantic nuance with the mathematical rigor of symbolic solvers.

INPUT (Voice or Text)
  │
  ├─ Web Speech API → clean transcription for analysis
  └─ Gemini Live → voice acknowledgment
        │
        ▼
LAYER 1: PROPOSITION PARSER (Gemini 3.0 Flash)
  ├─ Extracts claims, evidence, assumptions, implicit premises
  ├─ Generates formal logic expressions
  ├─ Maps relationships as a directed graph
  └─ Output: JSON-structured logical representation
        │
        ▼
LAYER 2: PARALLEL VERIFICATION ENGINES (6 engines, asyncio.gather)
  ├─ SAT Solver (Glucose3 via PySAT)
  │    ├─ Converts to CNF: Φ = ⋀ᵢ Cᵢ
  │    ├─ Proves UNSAT → formal contradiction
  │    └─ Extracts Minimal Unsatisfiable Core
  ├─ Graph Analyzer (NetworkX)
  │    ├─ DFS cycle detection → circular reasoning
  │    ├─ Degree analysis → hasty generalization
  │    └─ Betweenness centrality → load-bearing assumptions
  ├─ Validity Checker → argument soundness
  ├─ Ambiguity Detector → equivocation risks
  ├─ Tension Detector → practical conflicts
  └─ Temporal Tracker → drift across conversation rounds
        │
        ▼
LAYER 3: VISUALIZATION + VOICE
  ├─ React Flow: interactive argument graph (Dagre + radial layouts)
  ├─ Framer Motion: animated contradiction edges, node transitions
  ├─ Sidebar: formal proofs, fallacy cards, insights
  └─ Gemini Live: on-demand voice explanation ("Ask CLARITY")

Gemini 3.0 Flash: The Logical Compiler

Gemini 3.0 Flash serves as the Semantic Parsing Layer, using constrained JSON output to transform messy human arguments into structured logical units. Unlike free-form text, JSON mode enforces a strict schema, ensuring every proposition is machine-readable and SAT-solver compatible. It maps natural language into formal expressions such as $(P \land Q) \to R$ and tracks temporal drift by comparing new statements against a running logical graph — catching mid-conversation contradictions that stateless LLMs miss.

The Neural-Symbolic Bridge

When the SAT solver returns UNSAT, Gemini 3.0 Flash is re-prompted with the conflicting clause set to identify exactly which premise caused the failure. This creates a debugging loop for human reasoning — the user doesn't just learn that their logic is inconsistent, but precisely where it collapsed.

Gemini Live: Multimodal Narrative Layer

Gemini Live's native audio capabilities provide a seamless interface through two distinct phases. First, responsive acknowledgment — providing immediate spoken cues while the solver runs asynchronously, eliminating perceived latency. Second, narrative synthesis — converting verified symbolic proofs into human-centric explanations (e.g., "Your argument is invalid because Premise A contradicts Premise B.").

The result: Gemini provides the semantic understanding, while the SAT solver provides the proof. Together, they achieve what neither could alone — mathematical certainty with a human voice.

The Logic Engine

Python-SAT (Glucose3) serves as the primary solver. Arguments are represented as complex Boolean formulas in CNF where each clause $C_i$ represents a logical constraint derived from user speech. Relationships between propositions generate additional clauses:

Supports: $(\neg \text{from} \lor \text{to})$
Contradicts: $(\neg \text{from} \lor \neg \text{to})$

When the combined formula $\Phi = \bigwedge_{i=1}^{n} C_i$ is UNSAT, the contradiction is mathematically proven — same input, same proof, every time.

The Frontend

A high-performance Next.js 16 dashboard using React Flow (@xyflow/react) for real-time graph rendering with Dagre hierarchical layout, and Framer Motion for smooth, polished animations. Results stream as Server-Sent Events — propositions appear first, then contradiction edges flash red as the SAT solver finishes, then fallacy badges animate in from the graph analyzer.

Challenges we ran into

The Dual-Pipeline Voice Problem — Our biggest engineering challenge. The original design had Gemini Live handling both conversation and analysis delivery in one continuous session. This caused cascading failures: 1008 policy violations when injecting analysis text into an active audio session, sentence fragments that looped 5-6 times, and WebSocket crashes that wiped the graph on Round 2. We spent 14+ hours debugging the audio pipeline before arriving at the correct architecture: complete separation of concerns. Web Speech API handles transcription. Analysis runs through SSE streaming. Gemini Live handles voice only — with a fresh, one-shot session for each "Ask CLARITY" request. The audio playback itself required an accumulate-then-play approach: collect all PCM chunks, concatenate into a single Float32 buffer, play as one continuous AudioBuffer. Streaming chunk-by-chunk produced gaps. One buffer = zero gaps. Sometimes the simplest solution wins.

Rate Limiting Under Parallel Load — Running Gemini Live (continuous WebSocket) simultaneously with 6-7 analysis API calls produces 429 RESOURCE_EXHAUSTED errors. We separated models: Gemini 3.0 Flash for all analysis calls (separate rate limit pool), Gemini Live exclusively for voice. Analysis calls are staggered by 500ms with exponential backoff retry. If the enhanced parser gets rate-limited, it degrades gracefully to a basic parser while preserving the existing graph.

The Parsing Gap — If Gemini misinterprets a sentence, the SAT solver will "perfectly" prove something that was never actually said. We addressed this with proposition consolidation (merging duplicates with 85% word overlap), confidence scoring on every proposition, and a fallback parser for when the primary parser fails. The graph itself serves as the human-in-the-loop verification — users see exactly what was parsed and can immediately spot misinterpretations before trusting the proof.

Accomplishments we're proud of

A working formal verification engine in 48 hours. CLARITY isn't a prompt wrapper. Natural language enters one end; mathematical proofs come out the other. When the SAT solver says UNSAT, the contradiction is proven in the same sense that a compiler error is proven. Deterministic. Reproducible. Verifiable.

Six parallel analysis engines streaming in real-time. Contradiction detection, fallacy analysis, validity checking, ambiguity detection, tension analysis, and temporal tracking all run concurrently via asyncio.gather. The graph builds itself in front of your eyes — propositions first, then red contradiction edges, then fallacy highlights.

Voice mode that feels like a product. Speaking your reasoning and watching an argument graph materialize in real-time, then clicking "Ask CLARITY" to hear Gemini explain the formal contradictions in natural voice — this is the moment that makes CLARITY feel like more than a hackathon project.

Multi-round conversation tracking. CLARITY maintains a persistent logical graph across conversation rounds. Contradict yourself three turns later — the temporal tracker catches it. This is something stateless LLM conversations fundamentally cannot do.

What we learned

We learned that Gemini's reasoning capability is a superpower for compilation, not conversation. By treating Gemini not as a chatbot but as a Natural Language to Formal Logic Compiler, we unlocked a level of structured output quality that makes deterministic verification possible. The combination of AI parsing + SAT solving is strictly more powerful than either alone.

We also learned that users don't just want answers — they want to see the topography of their thoughts. The interactive graph is 90% of the value. When a user speaks a contradiction and sees a red dashed edge flash between two nodes with a formal proof attached, that's the moment they understand their own reasoning for the first time. Visualization isn't a feature. It is the product.

And we learned that audio engineering is its own discipline. Network jitter, AudioWorklet initialization, WebSocket lifecycle management, PCM buffer alignment — each is a rabbit hole. The lesson: when streaming audio breaks in mysterious ways, stop streaming. Accumulate and play.

What's next for CLARITY

CLARITY SDK — A library to add "Logic Unit Testing" to any document editor or communication tool. Verify a Notion page, a Slack thread, or an architectural decision record with a single API call.

Conflict Resolution Mode — A dual-user mode where two parties enter their arguments, and CLARITY identifies the exact, fundamental premise where their logic diverges. Not "who's right" — "where exactly do you disagree."

Decision Templates — Pre-built logical frameworks for high-stakes scenarios: Startup Pivots, Clinical Research Validation, Legal Argumentation, Engineering Architecture Reviews. Lower the barrier from "type your argument" to "fill in the blanks and get formal verification."

Deeper Formal Methods — Temporal logic (LTL/CTL) for reasoning about event sequences, modal logic for possibility and necessity, probabilistic SAT for handling uncertainty. The mathematical foundation is built — now we extend the toolkit.

CLARITY is a new category of tool: The IDE for Thinking. It showcases the best of the Gemini ecosystem: Gemini 3.0 Flash for high-speed logical parsing, Gemini Live for natural voice interaction, and structured output that bridges the gap between neural AI and symbolic logic. We've proven that when you combine Neural AI with Symbolic Logic, you don't just get a better chatbot — you get Mathematical Certainty.