Inspiration
Supply chain shrinkage costs the global logistics industry over $100 billion annually. Theft rings operate by exploiting the disconnect between data sources: a driver's verbal log says one thing, the CCTV shows another, and the manifest tells a third story. Human investigators cannot realistically cross-reference 100 hours of footage against thousands of documents.
We asked: What if AI could reason across modalities like a forensic investigator?
The Gemini 3 hackathon prompt challenged us to build beyond chatbots and simple analyzers. We wanted to create an active investigator: an AI that doesn't just describe what it sees, but correlates evidence across video, audio, and documents to catch what humans miss.
What it does
The Forensic Supply Chain Auditor is an autonomous cross-modal investigation system. It:
- Ingests multimodal evidence: Warehouse CCTV footage, driver voice logs, and shipping manifests
- Reasons across modalities: Correlates timestamps, quantities, and verbal claims against visual evidence
- Detects discrepancies: Flags temporal mismatches, quantity variances, verbal contradictions, and behavioral anomalies
- Generates forensic reports: Produces structured investigation reports with confidence scores, evidence citations, and recommended actions
Example detection: A driver's voice log claims "All 47 units loaded at 22:40." The manifest shows 47 units. But the CCTV shows only 42 packages loaded at 22:23. The agent flags this 5-unit variance with HIGH confidence, cross-referencing 3 modalities.
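The example above can be sketched as a small comparison routine. This is only an illustration of the kind of check involved, not the project's code: in the actual system Gemini performs the correlation, and the `LoadClaim` and `Finding` shapes, the `compareEvidence` function, and the 10-minute tolerance are all hypothetical.

```typescript
// Hypothetical shape for a fact already extracted from one modality.
interface LoadClaim {
  units: number;
  time?: string; // "HH:MM" on a 24-hour clock
}

interface Finding {
  type: "QUANTITY_VARIANCE" | "TEMPORAL_MISMATCH";
  delta: number; // units or minutes, depending on the type
  detail: string;
}

function minutesOfDay(hhmm: string): number {
  const parts = hhmm.split(":");
  return Number(parts[0]) * 60 + Number(parts[1]);
}

// Compares the manifest quantity and the voice-log time against what the
// CCTV shows. toleranceMin is an assumed threshold; events that span
// midnight are not handled in this sketch.
function compareEvidence(
  voice: LoadClaim,
  manifest: LoadClaim,
  cctv: LoadClaim,
  toleranceMin = 10
): Finding[] {
  const findings: Finding[] = [];
  if (manifest.units !== cctv.units) {
    findings.push({
      type: "QUANTITY_VARIANCE",
      delta: manifest.units - cctv.units,
      detail: `manifest lists ${manifest.units} units, CCTV shows ${cctv.units}`,
    });
  }
  if (voice.time !== undefined && cctv.time !== undefined) {
    const gap = Math.abs(minutesOfDay(voice.time) - minutesOfDay(cctv.time));
    if (gap > toleranceMin) {
      findings.push({
        type: "TEMPORAL_MISMATCH",
        delta: gap,
        detail: `voice log says ${voice.time}, CCTV shows ${cctv.time}`,
      });
    }
  }
  return findings;
}

// The scenario from the example detection.
const findings = compareEvidence(
  { units: 47, time: "22:40" },
  { units: 47 },
  { units: 42, time: "22:23" }
);
console.log(findings);
```

With the assumed tolerance, the same inputs also surface the 17-minute gap between the voice log and the footage as a separate finding.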
How we built it
- Frontend: React + TypeScript with Vite for fast development
- UI Framework: Tailwind CSS for the dark forensic dashboard aesthetic
- Visualizations: Recharts for timeline/risk charts, D3.js for network relationship graphs
- AI Core: Google Gemini 3 API (gemini-3-pro-preview) with structured JSON output schema
- Multimodal Processing: Leveraged Gemini's native video, audio, and document understanding capabilities
- Architecture: Client-side file encoding → Gemini multimodal API → Structured forensic report
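A minimal sketch of the client-side encoding step, assuming evidence files are sent inline as base64. The `toInlinePart` helper and the `maxBytes` limit are hypothetical, and the `inlineData` object shape is our assumption about how the request part is laid out. The encoder is hand-rolled only to keep the sketch dependency-free; in a browser, `FileReader.readAsDataURL` would be the more typical route.

```typescript
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding of raw bytes, with "=" padding.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    out += B64.charAt(b0 >> 2);
    out += B64.charAt(((b0 & 3) << 4) | (b1 >> 4));
    out += has1 ? B64.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += has2 ? B64.charAt(b2 & 63) : "=";
  }
  return out;
}

// Assumed request-part shape for inline evidence.
interface InlinePart {
  inlineData: { mimeType: string; data: string };
}

// Wraps one evidence file as an inline part, rejecting files over an
// assumed size limit (maxBytes is a placeholder, not a documented quota).
function toInlinePart(bytes: Uint8Array, mimeType: string, maxBytes: number): InlinePart {
  if (bytes.length > maxBytes) {
    throw new Error(`file is ${bytes.length} bytes; limit is ${maxBytes}`);
  }
  return { inlineData: { mimeType, data: toBase64(bytes) } };
}

// Usage: three placeholder bytes standing in for a manifest file.
const part = toInlinePart(new Uint8Array([77, 97, 110]), "application/pdf", 1024);
console.log(part.inlineData.data);
```

Checking the size before encoding keeps oversized footage from being base64-expanded in the browser only to be rejected afterwards.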
The key innovation was designing a response schema that enforces structured forensic output—investigation IDs, discrepancy types, cross-modal evidence citations, confidence levels, and risk scores.
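One way such a schema might look on the client side is sketched below, with a runtime guard so that a malformed model response is rejected before it reaches the dashboard. All field names, the enum values, and the 0 to 100 risk range are illustrative assumptions; the write-up only names the categories of information the report carries.

```typescript
const DISCREPANCY_TYPES = [
  "TEMPORAL_MISMATCH", "QUANTITY_VARIANCE", "VERBAL_CONTRADICTION", "BEHAVIORAL_ANOMALY",
] as const;
const CONFIDENCE_LEVELS = ["LOW", "MEDIUM", "HIGH"] as const;
const MODALITIES = ["video", "audio", "document"] as const;

interface EvidenceCitation {
  modality: (typeof MODALITIES)[number];
  reference: string; // e.g. a timestamp or a document field
}
interface Discrepancy {
  type: (typeof DISCREPANCY_TYPES)[number];
  confidence: (typeof CONFIDENCE_LEVELS)[number];
  evidence: EvidenceCitation[];
}
interface ForensicReport {
  investigationId: string;
  riskScore: number; // assumed range: 0 to 100
  discrepancies: Discrepancy[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}
function oneOf(allowed: readonly string[], v: unknown): boolean {
  return typeof v === "string" && allowed.indexOf(v) !== -1;
}
function isCitation(v: unknown): v is EvidenceCitation {
  return isRecord(v) && oneOf(MODALITIES, v["modality"]) && typeof v["reference"] === "string";
}
function isDiscrepancy(v: unknown): v is Discrepancy {
  if (!isRecord(v)) return false;
  const evidence = v["evidence"];
  return oneOf(DISCREPANCY_TYPES, v["type"])
    && oneOf(CONFIDENCE_LEVELS, v["confidence"])
    && Array.isArray(evidence) && evidence.length > 0 && evidence.every(isCitation);
}

// Parses the model's JSON text and throws if it does not match the shape.
function parseReport(raw: string): ForensicReport {
  const v: unknown = JSON.parse(raw);
  if (!isRecord(v)) throw new Error("report is not an object");
  const risk = v["riskScore"];
  const items = v["discrepancies"];
  const ok = typeof v["investigationId"] === "string"
    && typeof risk === "number" && risk >= 0 && risk <= 100
    && Array.isArray(items) && items.every(isDiscrepancy);
  if (!ok) throw new Error("report does not match the expected schema");
  return v as unknown as ForensicReport;
}
```

Requiring at least one citation per discrepancy is a design choice in this sketch: a finding with no evidence attached is treated as invalid rather than shown to an investigator.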
Challenges we ran into
- Cross-modal correlation prompting: Getting Gemini to truly reason across modalities rather than analyze each in isolation required careful system prompt engineering
- Evidence citation accuracy: Ensuring the AI cites specific timestamps and document fields, not vague references
- File size limits: Balancing demo usability with realistic evidence file sizes
- Structured output schema: Designing a JSON schema expressive enough for forensic reports yet simple enough for the API to accept and follow reliably
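The citation-accuracy challenge can be illustrated with a post-hoc check that rejects vague references. This is a sketch under our own assumptions rather than the project's approach: the `CitedEvidence` shape and both heuristics (a clock time for video or audio, a `name = value` or `name: value` pair for documents) are hypothetical.

```typescript
// Hypothetical citation shape for this sketch.
interface CitedEvidence {
  modality: "video" | "audio" | "document";
  reference: string;
}

// Matches a 24-hour clock time such as 22:23 or 22:23:05.
const CLOCK_TIME = /\b([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?\b/;

// Matches a named field with a value, such as "units_loaded = 47".
const FIELD_VALUE = /\b[\w.]+\s*[:=]\s*\S+/;

// Returns true when the citation points at something an investigator
// could look up directly: a timestamp for footage or audio, or a
// specific field for a document.
function isSpecificCitation(c: CitedEvidence): boolean {
  if (c.modality === "document") {
    return FIELD_VALUE.test(c.reference);
  }
  return CLOCK_TIME.test(c.reference);
}

// Returns the citations that fail the check so they can be reported
// or sent back to the model for revision.
function vagueCitations(citations: CitedEvidence[]): CitedEvidence[] {
  return citations.filter((c) => !isSpecificCitation(c));
}
```

For instance, "CCTV at 22:23 shows 42 packages" passes, while "later that evening" is returned as vague.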
Accomplishments that we're proud of
- Built a true cross-modal reasoning system, not just a multi-input analyzer
- Created a production-quality forensic dashboard with D3 network graphs and timeline visualizations
- Achieved structured, actionable output that a real investigator could use
- Demonstrated Gemini 3's ability to correlate evidence across video, audio, and documents in a single context window
What we learned
- Gemini 3's 1M+ token context window opens entirely new application categories—forensic investigation, legal discovery, research synthesis
- Structured output schemas are critical for building AI applications that integrate into real workflows
- The "Action Era" is real: AI can now perform multi-step reasoning tasks, not just respond to prompts
What's next for Forensic Supply Chain Auditor
- Real-time streaming analysis: Process live CCTV feeds using the Gemini Live API
- Pattern database: Build a historical database of flagged incidents to detect repeat offenders
- GPS integration: Cross-reference video timestamps with vehicle GPS coordinates
- Alert automation: Trigger real-time alerts when high-confidence discrepancies are detected
- Enterprise pilot: Partner with logistics companies to test on real shrinkage data
Built With
- d3.js
- google-gemini-3-api
- react
- recharts
- tailwind-css
- typescript
- vite