Inspiration

Every year, millions of traffic accidents result in lengthy insurance disputes, clogged court systems, and delayed justice. The traditional claims process is manual, subjective, and prone to "he said, she said" bias. We asked ourselves: what if we could put a forensic engineer in the cloud? We were inspired by the gap between raw video evidence, now ubiquitous thanks to dashcams, and the ability to interpret it scientifically. With the release of Gemini 3, we realized that an AI model finally possessed the multimodal reasoning capabilities (vision, physics intuition, and logical deduction) needed to act as an impartial expert witness.

What it does

Incident Lens AI is a professional-grade forensic video analysis suite. It transforms raw crash footage into a defensible legal case file in seconds.

  • Forensic Physics Engine: Estimates vehicle speeds using frame-by-frame photogrammetry and motion-blur analysis.
  • Predictive Damage Modeling: Validates speed estimates by correlating kinetic-energy calculations with observed vehicle deformation (crumple zones).
  • Legal Narrative Generator: Synthesizes findings into a formal "Chain of Events" document suitable for court proceedings, citing specific timestamps for every claim.
  • Liability Determination: Allocates fault percentages based on traffic laws, identifying specific rule violations (e.g., right of way).
  • Signal State Inference: Deduces the color of off-camera traffic lights by analyzing cross-traffic flow and pedestrian behavior.
  • Confidence Auditing: A self-reflection engine that assigns a confidence score (High/Moderate/Low) to every finding, flagging uncertainties transparently.

How we built it

We built Incident Lens AI as a high-performance React application powered by the Google GenAI SDK.

  • Video Processing Pipeline: When a user uploads footage, we extract keyframes and audio tracks in the browser using the HTML5 Canvas and Web Audio APIs. This keeps footage on-device for privacy and reduces latency.
  • Gemini 3 Integration: We use gemini-3-pro-preview with a massive, multi-phase system instruction. We stream the response to create a "Glass Box" AI, where the user watches the AI "think" through the evidence (identifying weather, tracking vehicles, calculating physics) before receiving the final structured JSON data.
  • Visualizations: We used Recharts for liability pie charts and driver-risk radar plots, and Lucide React for the UI.
  • Report Generation: The app generates downloadable PDFs (Executive Summary, Legal Brief) on the fly using jsPDF.
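The "Glass Box" flow above streams narrated reasoning first and a structured JSON payload last. A minimal sketch of how such a mixed-content response might be split, assuming the system prompt asks the model to print a sentinel line before the JSON (the `---JSON---` marker and field names are illustrative choices, not the app's actual protocol):

```typescript
// Sketch: separate free-text reasoning from the final JSON payload in a
// mixed-content model response. Assumes the prompt instructs the model to
// emit a sentinel line before the JSON (illustrative, not the real protocol).
const SENTINEL = "---JSON---";

interface ParsedStream {
  reasoning: string;    // narrated "thinking" shown live in the UI
  data: unknown | null; // structured payload, or null if absent/invalid
}

function parseMixedStream(fullText: string): ParsedStream {
  const idx = fullText.indexOf(SENTINEL);
  if (idx === -1) return { reasoning: fullText.trim(), data: null };
  const reasoning = fullText.slice(0, idx).trim();
  const tail = fullText.slice(idx + SENTINEL.length).trim();
  try {
    return { reasoning, data: JSON.parse(tail) };
  } catch {
    return { reasoning, data: null }; // stream may have ended mid-object
  }
}
```

Re-running a parser like this on each streamed chunk lets the UI render the reasoning live while the JSON stays pending until it parses cleanly.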

Challenges we ran into

  • Hallucination Control: Early versions would guess speeds. To fix this, we implemented a "Physics Sandbox" and "Predictive Damage" layer in which the AI must mathematically reconcile its speed estimates with the physical damage seen in the video. If the math doesn't match the visual damage, the confidence score drops.
  • Context Window Management: High-resolution video consumes many tokens. We optimized our frame-sampling algorithm to prioritize impact moments while maintaining enough temporal resolution for speed calculations.
  • Structured Output Stability: Getting the model to output a stream of reasoning text followed by a strictly valid JSON block was tricky. We engineered a robust parsing system that handles mixed-content streams gracefully.
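The impact-prioritized sampling described above can be sketched as a pure function. The 60/40 dense-vs-sparse split, the 2-second window, and the idea of a pre-detected impact timestamp are illustrative assumptions, not the app's tuned values:

```typescript
// Sketch: pick frame timestamps under a fixed budget, dense around the
// suspected impact and sparse across the rest of the clip.
function sampleTimestamps(
  durationSec: number,
  impactSec: number,
  budget: number,
  windowSec = 2, // half-width of the dense window around impact (assumed)
): number[] {
  const dense = Math.floor(budget * 0.6); // ~60% of frames near impact
  const sparse = budget - dense;
  const ts: number[] = [];
  const start = Math.max(0, impactSec - windowSec);
  const end = Math.min(durationSec, impactSec + windowSec);
  for (let i = 0; i < dense; i++) {
    ts.push(start + ((end - start) * i) / Math.max(1, dense - 1));
  }
  for (let i = 0; i < sparse; i++) {
    ts.push((durationSec * i) / Math.max(1, sparse - 1));
  }
  // De-duplicate (rounded to 10 ms) and sort chronologically.
  return Array.from(new Set(ts.map((t) => +t.toFixed(2)))).sort((a, b) => a - b);
}
```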

Accomplishments that we're proud of

  • The "Damage Validation" Tab: It's incredibly satisfying to watch the AI calculate kinetic energy in joules and correctly predict that a car should be "Totaled" based on the math, matching the visual reality.
  • Legal Writing Style: The AI generates narratives that read as if written by a seasoned traffic attorney, complete with third-person phrasing and evidence citations.
  • Real-time Reasoning: The "Live Reasoning Stream" UI makes the AI feel alive and trustworthy; you can see it noticing details like "sun glare on the traffic light" or "driver glancing at phone."
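As a back-of-the-envelope illustration of the Damage Validation math: kinetic energy is KE = ½mv², and the result can be compared against rough damage bands. The mass, speed, and joule thresholds below are hypothetical, not the app's calibrated values:

```typescript
// Illustrative check of the kind the Damage Validation tab performs:
// compute impact kinetic energy and map it to a coarse damage class.
type DamageClass = "Minor" | "Moderate" | "Severe" | "Totaled";

function kineticEnergyJ(massKg: number, speedMs: number): number {
  return 0.5 * massKg * speedMs * speedMs; // KE = 1/2 * m * v^2
}

function classifyDamage(energyJ: number): DamageClass {
  // Thresholds are invented for illustration only.
  if (energyJ < 20_000) return "Minor";    // low-speed parking-lot tap
  if (energyJ < 100_000) return "Moderate";
  if (energyJ < 300_000) return "Severe";
  return "Totaled";                        // energy exceeds crumple capacity
}

// A 1,500 kg sedan at 25 m/s (~90 km/h) carries 468,750 J.
const ke = kineticEnergyJ(1500, 25);
const verdict = classifyDamage(ke); // "Totaled"
```

If the class predicted from the speed estimate disagrees with the deformation visible in the frames, that mismatch is exactly what drives the confidence score down.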

What we learned

We learned that Gemini 3 has a strong intuitive grasp of physics. It didn't just see "a car moving fast"; it reasoned that because the car covered X distance in Y frames, and the road was wet, the stopping distance would be insufficient. We also learned that transparency is the key to AI adoption in high-stakes fields like insurance and law: users need to see the why, not just the what.
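The distance-over-frames reasoning above reduces to two textbook formulas: speed v = d / t (with t derived from the frame count and frame rate) and braking distance d = v² / (2μg). The pixel scale, 30 fps, and friction coefficients below are illustrative assumptions:

```typescript
// Sketch of frame-based speed estimation and a wet-road stopping check.
const G = 9.81; // gravitational acceleration, m/s^2

function speedFromFrames(
  pixelDisplacement: number,
  metersPerPixel: number, // photogrammetric scale (assumed known)
  frames: number,
  fps = 30,               // assumed frame rate
): number {
  // distance travelled / elapsed time => m/s
  return (pixelDisplacement * metersPerPixel * fps) / frames;
}

function stoppingDistanceM(speedMs: number, mu: number): number {
  return (speedMs * speedMs) / (2 * mu * G); // d = v^2 / (2 * mu * g)
}

// 300 px over 15 frames at 0.05 m/px and 30 fps => 30 m/s (~108 km/h).
const v = speedFromFrames(300, 0.05, 15);
const wet = stoppingDistanceM(v, 0.4); // mu ~0.4 on wet asphalt (rough figure)
const dry = stoppingDistanceM(v, 0.7); // mu ~0.7 on dry asphalt (rough figure)
```

At that speed the wet-road stopping distance is nearly double the dry one, which is the kind of gap the model flags as "stopping distance would be insufficient."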

What's next for Incident Lens AI

  • 3D Scene Reconstruction: Using NeRFs to generate a 3D model of the accident scene from the 2D video.
  • Live Traffic API: Connecting directly to municipal traffic-camera feeds for real-time intersection monitoring.
  • Multi-Vehicle Mesh: Allowing fleets to upload footage from both vehicles involved to automatically generate a unified truth consensus.

Gemini Integration Overview

Incident Lens AI is powered by Gemini 3 Pro, leveraging its native multimodal architecture to transform unstructured video evidence into professional forensic reconstructions. The application does not merely "watch" video; it fuses visual data (sampled frames) with audio waveforms to perform a holistic analysis, correlating visual impact dynamics with acoustic signatures like tire squeals or horns. Key Gemini 3 features central to the architecture include:

  • Native Multimodality: The system ingests interleaved video frames and raw audio data in a single request context, enabling the model to detect events that are visually obstructed but audibly distinct.
  • Streaming Reasoning Chains: Using generateContentStream, the application exposes the model's thought process in real time, giving users a transparent "Reasoning Trace" that visualizes the AI's step-by-step logic, from physics calculations to fault determination, before the final output is generated.
  • JSON Mode & Structured Output: The model is instructed to synthesize complex, type-safe JSON data structures, which directly power the application's interactive dashboards, timeline visualizations, and liability charts.
  • Google Search Grounding: The integration employs the Google Search tool to cross-reference observed traffic behaviors with specific legal statutes and case law, ensuring that liability findings are legally defensible and cited.
  • Region of Interest (ROI) Analysis: Leveraging fine-grained visual understanding, the "Deep Scan" feature lets users crop specific image areas (such as license plates or debris) for targeted, high-resolution forensic queries.
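As an illustration of the type-safe structures that structured output can feed into the dashboards, a single liability finding might be typed and validated like this (field names and the shape are hypothetical, not the app's actual schema):

```typescript
// Hypothetical shape of one structured finding; a runtime guard keeps
// malformed model output away from the charts and timeline.
interface LiabilityFinding {
  party: string;                              // e.g. "Vehicle A"
  faultPercent: number;                       // 0-100, drives the pie chart
  confidence: "High" | "Moderate" | "Low";    // from the confidence audit
  citedTimestamps: string[];                  // e.g. ["00:04.2"]
}

function isLiabilityFinding(v: unknown): v is LiabilityFinding {
  const o = v as Record<string, unknown>;
  return (
    !!o &&
    typeof o.party === "string" &&
    typeof o.faultPercent === "number" &&
    o.faultPercent >= 0 &&
    o.faultPercent <= 100 &&
    ["High", "Moderate", "Low"].includes(o.confidence as string) &&
    Array.isArray(o.citedTimestamps)
  );
}
```

Guarding parsed JSON this way means a truncated or hallucinated field fails fast instead of rendering a misleading chart.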

Built With

  • Canvas API
  • Google Gemini API
  • Google Search
  • HTML5
  • jsPDF
  • Lucide React
  • React
  • Recharts
  • Tailwind CSS
  • TypeScript
  • Web Audio API