Inspiration
Every year, brands spend billions on video ads — and most decisions about which creative to run come down to gut feeling, focus groups, or waiting weeks for campaign data to roll in. We asked: what if you could predict how the human brain responds to your ad before you spend a single dollar?
Neuroscience research has quietly been building brain encoding models that predict fMRI responses to video stimuli with surprising accuracy. Meta's TribeV2 model — trained on thousands of hours of brain imaging data — cracked the door open. We kicked it down and turned it into something marketers can actually use.
What it does
hooked.ai lets you upload two video ads and, within minutes, tells you which one will hook your audience, backed by predicted neural activity at 20,484 points (vertices) on the cortical surface.
- Upload two ad variants (drag and drop, no setup)
- Get a winner with a plain-English verdict: "Variant A drives 23% stronger emotional impact and wins the critical first 3 seconds"
- Four business-ready scores: Visual Engagement, Emotional Impact, Message Complexity, and Viewer Retention, translated from real neuroscience ROI mappings
- Interactive 3D brain viewer: explore cortical activation in real time, hover to see which brain regions light up and what that means for your ad
- Second-by-second timeline: see exactly where your ad loses attention or spikes emotion
- Segment breakdown: Hook, Product Reveal, and CTA analyzed independently so you know exactly which part of your creative is working
No neuroscience PhD required.
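As a rough illustration of how a verdict like the one quoted above could be derived, the sketch below compares the two variants' scores and reports the winner's largest relative lead. The `verdict` helper, the example numbers, and the "higher total wins" rule are assumptions made for illustration, not the project's actual scoring logic.

```python
def verdict(a: dict[str, float], b: dict[str, float]) -> str:
    """Return a one-line verdict comparing variant A against variant B.

    Assumptions: both dicts share the same keys, every score is positive,
    and the variant with the higher total is the winner.
    """
    a_wins = sum(a.values()) >= sum(b.values())
    win, lose, name = (a, b, "A") if a_wins else (b, a, "B")
    # Relative lead of the winner on each score, in percent.
    leads = {k: (win[k] - lose[k]) / lose[k] * 100 for k in win}
    top = max(leads, key=leads.get)
    return f"Variant {name} drives {leads[top]:.0f}% stronger {top.lower()}"


# Made-up example scores on an arbitrary 0-100 scale.
variant_a = {"Visual Engagement": 60.0, "Emotional Impact": 61.5,
             "Message Complexity": 40.0, "Viewer Retention": 55.0}
variant_b = {"Visual Engagement": 58.0, "Emotional Impact": 50.0,
             "Message Complexity": 42.0, "Viewer Retention": 54.0}
print(verdict(variant_a, variant_b))
```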
How we built it
- Brain encoding: Meta's TribeV2 model (facebook/tribev2) predicts vertex-level fMRI responses across the fsaverage5 cortical surface: 20,484 vertices in total (10,242 per hemisphere), for each second of video
- ROI mapping: We mapped predictions to functional brain regions using the Destrieux atlas and collapsed them into four actionable business scores
- 3D visualization: Three.js renders a real brain mesh (GLB model) with per-vertex activation coloring — smooth interpolation from deep blue (suppressed) through white to red (activated), plus wireframe overlay and interactive raycasting tooltips
- Backend: FastAPI serving both the API and frontend from a single port — upload → temp file → TribeV2 inference → structured JSON response
- Audio transcription: WhisperX (via uvx) handles speech-to-text for the model's multimodal input pipeline
- Frontend: Vanilla HTML/CSS/JS with Plotly.js for timeline charts, a glassmorphism design system, and collapsible result sections. No framework overhead, just fast
The entire stack runs locally on a laptop. No cloud GPU required.
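The ROI-mapping step above can be pictured as a two-stage average: vertices are grouped by atlas label, then labels are grouped into scores. This is only a minimal sketch. The label names, the `SCORE_REGIONS` grouping, and the plain-mean aggregation are placeholders chosen for illustration; they are not real Destrieux labels or the project's tuned mapping.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical grouping of atlas labels into business scores.
# Placeholder names only, not actual Destrieux labels.
SCORE_REGIONS = {
    "Visual Engagement": ["visual_a", "visual_b"],
    "Emotional Impact": ["limbic_a"],
}


def region_means(activations, labels):
    """Average per-vertex activations within each atlas label."""
    if len(activations) != len(labels):
        raise ValueError("need exactly one label per vertex")
    buckets = defaultdict(list)
    for value, label in zip(activations, labels):
        buckets[label].append(value)
    return {label: fmean(values) for label, values in buckets.items()}


def business_scores(activations, labels):
    """Collapse region means into one number per business score."""
    means = region_means(activations, labels)
    return {
        score: fmean([means[r] for r in regions])
        for score, regions in SCORE_REGIONS.items()
    }


# Four toy vertices standing in for the full cortical surface.
scores = business_scores(
    [1.0, 3.0, 2.0, 4.0],
    ["visual_a", "visual_a", "visual_b", "limbic_a"],
)
```

In a real pipeline the same grouping would run once per second of video, which is what makes a second-by-second timeline possible.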
Challenges we ran into
- TribeV2 was built for Linux/CUDA — we had to patch Windows path handling (backslashes corrupting HuggingFace repo IDs), fix PosixPath serialization in YAML configs, and resolve a Pydantic typing bug in the exca dependency on Python 3.13
- WhisperX float16 on CPU: ctranslate2 doesn't support float16 compute on CPU, so we patched the pipeline to fall back to int8 quantization
- Brain region mapping is hard (it's an art, honestly). Translating raw vertex activations into meaningful marketing metrics required careful threshold tuning and region-to-function mapping using the Destrieux atlas
- Making neuroscience approachable — the hardest challenge wasn't technical, it was translating "default mode network suppression" into "Viewer Retention" without losing scientific validity
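Two of the portability fixes above are small enough to sketch. The helper names `normalize_repo_id` and `pick_compute_type` are hypothetical, and this is not the actual patch applied to TribeV2 or WhisperX; it only illustrates the underlying idea. `PureWindowsPath` is used so the backslash problem reproduces on any operating system.

```python
from pathlib import PureWindowsPath


def normalize_repo_id(repo_id) -> str:
    """Return a repo ID with forward slashes, whatever the host OS.

    Passing an ID like "facebook/tribev2" through a Windows path object
    turns the slash into a backslash when it is converted back to a string.
    """
    return str(repo_id).replace("\\", "/")


def pick_compute_type(device: str) -> str:
    """Choose a compute type: float16 on CUDA, int8 everywhere else."""
    return "float16" if device.startswith("cuda") else "int8"


# Reproduce the corruption, then undo it.
corrupted = str(PureWindowsPath("facebook/tribev2"))
print(repr(corrupted))
print(normalize_repo_id(corrupted))
print(pick_compute_type("cpu"))
```

Keeping repo IDs as plain strings, rather than path objects, avoids the problem at the source.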
Accomplishments that we're proud of
- End-to-end working MVP — upload two mp4s, get a neuroscience-backed winner with interactive 3D brain visualization. It actually works
- Real inference on a laptop CPU — no cloud, no GPU, no API keys. TribeV2 runs locally in ~5 minutes per analysis
- The 3D brain viewer — hover over the temporal lobe and see "Audio processing engaged — effective for story-driven messaging." That moment when the tooltip just makes sense to a non-scientist
- Bridging two worlds — neuroscience researchers publish papers; marketers run A/B tests. We connected them
What we learned
- Brain encoding models are way more accessible than we expected — the hard part isn't the model, it's making the output meaningful to real users
- Windows compatibility is still the final boss of ML tooling
- Simple UIs that surface insight > complex dashboards that surface data
- Three.js is incredibly powerful for scientific visualization when you skip the abstractions and go straight to vertex colors
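Going straight to vertex colors amounts to computing one RGB triple per vertex. The viewer does this in Three.js; the sketch below shows the same blue-through-white-to-red interpolation in Python for brevity. The exact color stops, the `limit` clamping range, and the function names are assumptions, not the project's values.

```python
BLUE = (0.0, 0.0, 0.5)   # suppressed
WHITE = (1.0, 1.0, 1.0)  # neutral
RED = (1.0, 0.0, 0.0)    # activated


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t=0) to b (t=1)."""
    return a + (b - a) * t


def activation_to_rgb(value: float, limit: float = 1.0) -> tuple:
    """Map an activation in [-limit, limit] to an RGB triple in [0, 1].

    Values outside the range are clamped to the end colors.
    """
    t = max(-1.0, min(1.0, value / limit))
    target = RED if t >= 0 else BLUE
    return tuple(lerp(w, c, abs(t)) for w, c in zip(WHITE, target))


# One color per vertex, ready to copy into a flat color buffer.
colors = [activation_to_rgb(v) for v in (-2.0, 0.0, 0.5, 1.0)]
```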
What's next for hooked.ai
- GPU acceleration — move inference to cloud GPU for sub-minute analysis
- Gemini-powered insight generation — feed brain activation patterns + campaign context into an LLM for personalized creative recommendations
- Batch testing — test 10 variants at once, auto-rank, and surface the top performer
- Platform-specific optimization — different platforms (TikTok vs. YouTube pre-roll vs. CTV) activate different attention patterns. Tailor recommendations per channel
- Export & share: one-click reports for stakeholders who were not in the room
- Timestamp export: send timestamps directly into a video editor and apply changes at light speed, as soon as the analysis is done