Inspiration
We've all been there. You're watching a political debate, and a candidate boldly claims something that directly contradicts what they said five minutes ago. Your family Thanksgiving dinner turns into a heated argument where Uncle Bob insists he never said that thing he definitely just said. You're listening to a podcast where the guest's story keeps changing, but no one calls it out.
The problem isn't just that people contradict themselves or make false claims. It's that these moments fly by so fast in heated discussions that they're impossible to catch in real time. Traditional fact-checkers are boring walls of text that no one reads during live debates. We needed something different: an AI-powered referee that interrupts with a natural human voice, making it impossible to ignore when someone's caught in their own BS.
That's why we built NoBS (pronounced "knobs")—a debate analysis platform that listens, remembers, and calls out contradictions with the one thing people can't ignore: a human voice saying "Hold on..."
What it does
NoBS is an AI-powered debate analysis platform that detects contradictions and false claims, then announces them with natural voice feedback. It operates in two powerful modes:
Upload & Analyze Mode
Upload any audio or video file (debates, podcasts, speeches, recorded arguments) and NoBS will:
- Transcribe with speaker diarization using ElevenLabs Scribe—identifying up to 32 different speakers automatically
- Analyze every statement using Google Gemini 2.5 Flash-Lite to detect contradictions, fact-check verifiable claims, and catch logical fallacies
- Create an interactive timeline showing exactly when and where BS was detected, complete with playable voice alerts
- Calculate BS scores for each participant and generate an audio summary of the entire debate
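The writeup does not say how a BS score is computed, so the following is only a minimal sketch under an assumed formula: sum the confidence of each flag raised against a speaker, divide by that speaker's statement count, and scale to 0-100. The `Flag` shape, the `bsScores` name, and the formula itself are hypothetical.

```typescript
// Hypothetical flag shape; the project's real models are not shown in the writeup.
interface Flag {
  speaker: string;
  kind: "contradiction" | "false_claim" | "fallacy";
  confidence: number; // 0..1, as reported by the analysis model
}

// Assumed formula: (sum of flag confidences / statements spoken) scaled to 0-100.
function bsScores(
  flags: Flag[],
  statementCounts: Record<string, number>
): Record<string, number> {
  const scores: Record<string, number> = {};
  for (const speaker of Object.keys(statementCounts)) {
    const total = statementCounts[speaker] ?? 0;
    const weight = flags
      .filter((f) => f.speaker === speaker)
      .reduce((sum, f) => sum + f.confidence, 0);
    // Speakers with no statements get 0 instead of a division by zero.
    scores[speaker] =
      total > 0 ? Math.min(100, Math.round((weight / total) * 100)) : 0;
  }
  return scores;
}
```

Normalizing by statement count keeps a talkative speaker from being penalized simply for saying more; a different weighting per flag kind would be an equally reasonable choice.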
Live Real-Time Mode
For live debates happening right now, NoBS captures your microphone and:
- Streams real-time transcription using Speechmatics with speaker diarization
- Analyzes statements as they're spoken using Gemini's streaming capabilities
- Interrupts with voice alerts via ElevenLabs TTS within 2-3 seconds of detecting a contradiction. Choose how your fact-checker sounds with four different voice modes!
- Displays live transcripts with real-time flagging
How we built it
- Frontend: Next.js 15 with React 19, TypeScript, and Tailwind CSS 4
- AI Services:
  - Google Gemini 2.5 Flash-Lite for multi-step analysis and creative script generation
  - ElevenLabs Scribe v1 for high-accuracy transcription with speaker diarization
  - ElevenLabs TTS for natural voice generation across four personality modes
  - Speechmatics for WebSocket-based live transcription with speaker diarization
- Database: MongoDB (Mongoose ODM) for storing debates, statements, and flags
- Deployment: Vercel with Turbopack for optimized builds, and a GoDaddy domain
Upload Mode Pipeline:
File Upload → ElevenLabs Scribe → Statement Extraction →
Gemini Batch Analysis → ElevenLabs TTS → Interactive Results
Live Mode Pipeline:
Microphone Capture → Speechmatics WebSocket → Statement Buffering →
Gemini Streaming Analysis → ElevenLabs TTS → Real-Time Interruption
Challenges we ran into
1. Real-Time Processing Latency
Achieving true "interruption" required optimizing the entire pipeline from speech→transcription→analysis→TTS→playback. We had to:
- Buffer statements intelligently (complete sentences vs. word-by-word)
- Choose a Gemini model and experiment with streaming
- Select the ElevenLabs Flash v2 model for the fastest TTS generation and stream the audio response
Target: <5 seconds total latency. We achieved 2-3 seconds through careful optimization.
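Buffering complete sentences rather than individual words could look roughly like the sketch below. It is not the project's actual buffer: the `SentenceBuffer` class and its punctuation-based boundary rule are assumptions made for illustration.

```typescript
// Accumulates transcript fragments and emits only complete sentences.
class SentenceBuffer {
  private pending = "";

  // Append a fragment and return any sentences it completed.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // Assumed rule: a sentence ends at ., ! or ? followed by whitespace or end of buffer.
    const boundary = /[.!?]+(?=\s|$)/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      const sentence = this.pending.slice(start, end).trim();
      if (sentence.length > 0) sentences.push(sentence);
      start = end;
    }
    // Keep the unfinished tail for the next fragment.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever is left, for example when the speaker changes or the stream ends.
  flush(): string {
    const rest = this.pending.trim();
    this.pending = "";
    return rest;
  }
}
```

Treating end-of-buffer as a boundary keeps latency low but can split a fragment that ends mid-number (such as "3." followed by "5 percent"). Requiring trailing whitespace instead trades a little latency for fewer false splits.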
2. Speaker Diarization Complexity
- We initially settled on Deepgram, which one of our members had used before, but its live speaker diarization proved disappointing and inaccurate
- We switched to Speechmatics, which meant adopting a new and more complex framework
- The Speechmatics SDK was difficult to use and required custom adaptation for the web
- We eventually got it to work!
3. Gemini Prompt Engineering
Getting consistent, high-quality contradiction detection required extensive prompt iteration. We had to:
- Weigh quality against speed when choosing a model
- Have the model return structured output defined by a JSON Schema
- Generate creative, natural-sounding alert scripts that match each voice personality
- Include confidence scoring to filter false positives
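As a rough illustration of structured output plus confidence filtering, the sketch below defines a JSON Schema for one analysis result, parses the model's reply defensively, and only raises an alert above a threshold. The field names, the `parseAnalysis` and `shouldAlert` helpers, and the 0.8 threshold are all assumptions. How the schema is handed to Gemini depends on the SDK and is deliberately left out.

```typescript
// Hypothetical schema for one analysis result; field names are assumptions.
const analysisSchema = {
  type: "object",
  properties: {
    flagged: { type: "boolean" },
    kind: { type: "string", enum: ["contradiction", "false_claim", "fallacy"] },
    confidence: { type: "number" },
    alertScript: { type: "string" },
  },
  required: ["flagged", "confidence"],
} as const;

interface Analysis {
  flagged: boolean;
  kind?: string;
  confidence: number;
  alertScript?: string;
}

// Parse the model's JSON text defensively; return null on malformed output.
function parseAnalysis(raw: string): Analysis | null {
  try {
    const data = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    if (typeof data.flagged !== "boolean") return null;
    if (typeof data.confidence !== "number") return null;
    return data as Analysis;
  } catch {
    return null;
  }
}

// Only interrupt when the statement is flagged with enough confidence.
// The 0.8 default is an assumed value, not one reported in the writeup.
function shouldAlert(analysis: Analysis | null, threshold = 0.8): boolean {
  return analysis !== null && analysis.flagged && analysis.confidence >= threshold;
}
```

Validating the reply even when a schema is requested guards against truncated or malformed responses, and the threshold is the single knob for trading missed contradictions against false alarms.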
Accomplishments that we're proud of
We're particularly proud of showcasing both input AND output from ElevenLabs. While many projects use just TTS, we integrated Scribe for transcription AND TTS for voice generation—a complete audio-first experience.
Building TWO complete systems (batch processing and real-time streaming) with different technical architectures shows versatility and provides demo reliability. If live mode has issues, upload mode is a solid fallback—but both modes work.
This isn't spaghetti code. We built:
- Type-safe TypeScript throughout
- Mongoose schemas with validation
- MongoDB connection caching
- Clean separation of concerns (lib/ for services, models/ for schemas, api/ for routes)
What we learned
1. AI API Orchestration is an Art Form
Coordinating three different AI services (Gemini, ElevenLabs, Speechmatics) taught us that the magic isn't in individual APIs; it's in how you chain them together. Timing, error handling, streaming, and data transformation between services are critical.
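One way to picture the timing and error-handling side of chaining services is the sketch below. It wraps each stage in a timeout and skips the alert on any failure. The `Stages` interface, the `alertFor` function, and the 2000 ms limits are hypothetical; real stages would wrap the Gemini and ElevenLabs clients.

```typescript
// Reject if a stage takes longer than `ms`; the timer is cleared either way.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// Hypothetical stage signatures standing in for the real service clients.
interface Stages {
  analyze: (statement: string) => Promise<string | null>; // alert script, or null if nothing to flag
  speak: (script: string) => Promise<Uint8Array>; // synthesized audio bytes
}

// Chain analysis into TTS. On any failure, skip the alert rather than stall the debate.
async function alertFor(statement: string, stages: Stages): Promise<Uint8Array | null> {
  try {
    const script = await withTimeout(stages.analyze(statement), 2000, "analysis");
    if (script === null) return null;
    return await withTimeout(stages.speak(script), 2000, "tts");
  } catch {
    return null;
  }
}
```

Dropping a late alert is a design choice suited to live interruption, where a stale callout is worse than none; batch analysis would more likely retry.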
2. Real-Time Systems Require Different Mental Models
Building live mode forced us to think in streams, buffers, and latency budgets. Every millisecond matters when you're trying to interrupt someone mid-sentence. We learned to optimize for perceived performance, not just raw speed.
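A latency budget can be made concrete with something like the sketch below, which compares measured stage durations against per-stage allowances. The split across stages is an assumption chosen to add up to 3 seconds; only the overall 2-3 second figure comes from our measurements.

```typescript
// Hypothetical per-stage budget in milliseconds (assumed split, 3000 ms total).
const budgetMs = { transcription: 1000, analysis: 1200, tts: 800 };

type Stage = keyof typeof budgetMs;

// Given measured durations, report the total and which stages overran their share.
function checkBudget(
  measuredMs: Record<Stage, number>
): { totalMs: number; over: Stage[] } {
  const stages = Object.keys(budgetMs) as Stage[];
  const totalMs = stages.reduce((sum, s) => sum + measuredMs[s], 0);
  const over = stages.filter((s) => measuredMs[s] > budgetMs[s]);
  return { totalMs, over };
}
```

Tracking each stage separately shows where to optimize: a pipeline can hit its total target while one stage quietly consumes slack that another stage will need on a bad network.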
3. Demo Architecture Matters
The dual-mode system isn't just technically interesting—it's a strategic decision. Live mode impresses, upload mode provides safety. Having both means we're prepared for any demo environment (noisy venue, networking issues, etc.).
4. Speaker Diarization is Harder Than It Looks
Distinguishing between multiple speakers, maintaining consistent IDs across sessions, and visualizing conversations with proper attribution required careful data modeling. We gained deep respect for transcription services that make this look easy.
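The attribution part of that data modeling can be sketched as merging diarized words into speaker turns. The `Word` and `Turn` shapes are hypothetical, since real diarization payloads differ by provider, and the sketch does not attempt to keep speaker IDs consistent across sessions.

```typescript
// Hypothetical word shape; provider payloads differ.
interface Word {
  text: string;
  speaker: string;
  startMs: number;
  endMs: number;
}

interface Turn {
  speaker: string;
  text: string;
  startMs: number;
  endMs: number;
}

// Merge consecutive words from the same speaker into a single turn.
function toTurns(words: Word[]): Turn[] {
  const turns: Turn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === w.speaker) {
      // Same speaker keeps talking: extend the current turn.
      last.text += " " + w.text;
      last.endMs = w.endMs;
    } else {
      // Speaker changed (or first word): start a new turn.
      turns.push({ speaker: w.speaker, text: w.text, startMs: w.startMs, endMs: w.endMs });
    }
  }
  return turns;
}
```

Keeping start and end times on each turn is what lets a timeline view jump to the exact moment a flagged statement was made.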
What's next for NoBS
Fact-Checking with Real-Time Sources
Integrate web search and knowledge bases to verify claims against current data:
- "That statistic is incorrect—the actual number is..."
- "This claim was debunked by [credible source]"
- Show citations and confidence levels
Video Analysis
Use Gemini's multimodal capabilities to analyze:
- Facial expressions and body language
- On-screen graphics and quotes
- Visual evidence contradictions
Mobile App
Record debates on your phone and get instant analysis. Perfect for capturing arguments that happen anywhere. We think this is feasible because we can build it with React Native.
