we wanted to test the limits of current frontier ai, and we ended up building sherlock: an open-source palantir for crime.
modern investigations drown in unstructured data. a single case can involve millions of images and logs gathered through forensics. core evidence is sometimes overlooked, and the industry lacks a unified form of visualization.
sherlock is a full-suite solution that ingests multi-modal case files and generates a world model from spatial and temporal datapoints we call anchors. gemini then analyzes these anchors dialectically for logical conflicts and ambiguity, using existing cues to deduce possible scenarios. this pushes gemini's reasoning capabilities into one of the most complex domains possible: forensics.
sherlock is architected around the core of criminology. in professional forensics, the process typically moves from preservation and evidence collection to analysis and final reconstruction. we have mapped this lifecycle into three distinct operational phases:
evidence: spatial and temporal grounding
the evidence module transforms raw data into a digital world replica.
- environment mapping: we use gaussian splatting to map image and video evidence into 3d space. we currently use the Hunyuan Mirror Model for this process but can easily switch to d4rt (if it is open-sourced).
- data synthesis: cctv feeds and sensor logs are mapped to events, and we extract motion paths from witness statements. every piece of evidence is grounded in a specific time and place.
- high-fidelity reconstruction: we refine suspect portraits with nano banana pro to improve clarity in obscured evidence.
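to make the idea of grounding evidence in time and place concrete, here is a minimal sketch of what an anchor record could look like. this is not sherlock's actual schema: the `Anchor` class, its field names, and the helper functions are hypothetical and only illustrate one way to store a spatial and temporal datapoint.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Anchor:
    """One piece of evidence grounded in time and place (hypothetical schema)."""
    source: str                            # illustrative id, e.g. "cctv-03"
    start: datetime                        # start of the observed time window
    end: datetime                          # end of the observed time window
    position: tuple[float, float, float]   # metres in the scene's coordinate frame
    description: str


def overlapping(a: Anchor, b: Anchor) -> bool:
    """Return True when the two anchors' time windows intersect."""
    return a.start <= b.end and b.start <= a.end


def timeline(anchors: list[Anchor]) -> list[Anchor]:
    """Return the anchors ordered by the start of their time window."""
    return sorted(anchors, key=lambda anchor: anchor.start)
```

sorting anchors into a timeline and checking which time windows overlap are the two basic queries any later comparison of evidence would likely build on.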
reasoning: physical verification and logic
sherlock uses physics to verify testimony. gemini 3 pro analyzes reasoning limits.
- pov simulation: we simulate witness viewpoints in proxy geometry and test whether their observations were physically possible given walls and other obstructions.
- conflict detection: evidence exists in four tiers: environment, ground truth, logs, and witness statements. tiers 0-2 are immutable sensor data and environmental variables, which we call hard evidence. tier 3 is testimony, which we call soft evidence. the system flags spatio-temporal contradictions, and these flags identify illogical nodes.
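the two checks above can be sketched in a few lines. this is a simplified illustration, not sherlock's implementation: it assumes a 2d floor plan where walls are line segments, treats tiers 0 through 2 as hard evidence and tier 3 as soft evidence, and uses a hypothetical `max_speed` (metres per second) to decide when two placements of the same person cannot both be true. all names here (`Claim`, `has_line_of_sight`, `flag_contradictions`) are made up for the example.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float]
HARD_TIERS = {0, 1, 2}  # assumption: tiers 0-2 are hard evidence, tier 3 is soft


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: positive if c lies to the left of the line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True when segment p1-p2 properly crosses q1-q2 (touching ends are ignored)."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def has_line_of_sight(viewer: Point, target: Point, walls) -> bool:
    """True when no wall segment blocks the straight line from viewer to target."""
    return not any(segments_cross(viewer, target, a, b) for a, b in walls)


@dataclass(frozen=True)
class Claim:
    tier: int        # 0-3, see HARD_TIERS above
    subject: str     # who the claim is about
    time: float      # seconds since the start of the case timeline
    position: Point  # metres on the floor plan


def flag_contradictions(claims, max_speed: float = 10.0):
    """Return (soft, hard) pairs that place one subject impossibly far apart."""
    hard = [c for c in claims if c.tier in HARD_TIERS]
    soft = [c for c in claims if c.tier == 3]
    flags = []
    for s in soft:
        for h in hard:
            if s.subject != h.subject:
                continue
            distance = math.dist(s.position, h.position)
            elapsed = abs(s.time - h.time)
            # The subject would need to move faster than max_speed.
            if distance > max_speed * elapsed:
                flags.append((s, h))
    return flags
```

hard evidence is never flagged against other hard evidence in this sketch; only soft claims are tested, which mirrors the idea that testimony is the tier that gets verified.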
simulation: visualizing the truth
sherlock simulates the case.
- motion generation: we visualize the sequence of events. we currently use Hunyuan Motion Model to generate real-time interactions with the world model, but can easily switch to genie later.
- dynamic reconstruction: the system creates an interactive visualization that turns abstract evidence into a world model, revealing the physics of the event.
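as a rough illustration of how a sequence of events could be played back, the sketch below linearly interpolates a subject's position between time-stamped waypoints. this is a deliberately simple stand-in and not what a learned motion model does; the `position_at` function and the waypoint format are assumptions made for the example.

```python
from bisect import bisect_right


def position_at(waypoints, t):
    """Estimate a position at time t from (time, (x, y, z)) pairs sorted by time.

    Times before the first waypoint or after the last one clamp to the
    nearest known position; times in between are linearly interpolated.
    """
    times = [time for time, _ in waypoints]
    if t <= times[0]:
        return waypoints[0][1]
    if t >= times[-1]:
        return waypoints[-1][1]
    i = bisect_right(times, t)  # first waypoint strictly after t
    (t0, p0), (t1, p1) = waypoints[i - 1], waypoints[i]
    alpha = (t - t0) / (t1 - t0)  # fraction of the way from p0 to p1
    return tuple(a + alpha * (b - a) for a, b in zip(p0, p1))
```

sampling `position_at` at a fixed frame rate yields a path a viewer could scrub through, with each frame traceable back to the two waypoints it was derived from.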
Built With
- backend: go
- frontend: node.js
- llm: gemini-3-pro
- modal
- nano-banana
- supabase
- vercel
- world-model: hunyuan-mirror