Inspiration
Every quarter, billions of dollars move on the words of a handful of executives. We kept watching markets crater after earnings calls (TSLA down 31%, Wirecard collapsing entirely) and asking the same question: were the signals there all along?

They were. Hedging language. Vague quantifiers. Answers that never actually answered the question. Executives contradicting things they'd filed with the SEC months earlier. The problem wasn't that the signals didn't exist; it was that no one was catching them live, in the room, before the market moved.

That gap felt criminal. Institutional investors have armies of analysts. Retail investors have nothing. We wanted to fix that asymmetry.
What it does
EarningsLens is a real-time earnings call deception detector. You stream a live call (or replay a recorded one) and it:

- Transcribes the audio with speaker diarization (CEO / CFO / Analyst labels) via a Deepgram WebSocket
- Cross-references statements against pre-indexed SEC 10-K / 10-Q filings using ChromaDB semantic search, surfacing contradictions with a similarity score ≥ 0.85 (sketched below)
- Displays a live per-speaker credibility gauge, contradiction cards, and a credibility timeline
- Exports a full PDF evidence report at call end
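For anyone curious what that contradiction lookup looks like, here is a minimal sketch, not our exact production code: it assumes a filings collection named after the `{ticker}_filings` convention described below and a collection configured for cosine distance, so similarity is simply one minus the returned distance.

```python
# Minimal sketch of the contradiction lookup against a pre-indexed filings collection.
# Assumes the collection was created with cosine distance (metadata={"hnsw:space": "cosine"}).
import chromadb

client = chromadb.PersistentClient(path="./chroma")   # storage path is an assumption
collection = client.get_collection("WDI_filings")     # follows the {ticker}_filings convention

SIMILARITY_THRESHOLD = 0.85

def find_filing_matches(claim: str, n_results: int = 5) -> list[dict]:
    """Return filing chunks whose cosine similarity to the claim is >= 0.85."""
    res = collection.query(query_texts=[claim], n_results=n_results)
    matches = []
    for doc, dist, meta in zip(res["documents"][0], res["distances"][0], res["metadatas"][0]):
        similarity = 1.0 - dist  # cosine distance -> cosine similarity
        if similarity >= SIMILARITY_THRESHOLD:
            matches.append({"chunk": doc, "similarity": similarity, "source": meta})
    return matches
```

Chunks that clear the threshold become candidates for the contradiction cards; a second Claude pass (described under "What we learned") decides whether the candidate actually conflicts with the statement.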
How we built it
The stack is purpose-built for latency. Every layer was chosen to keep end-to-end analysis under 2 seconds per statement. And almost all of it was written inside Cursor.

Cursor as the core development environment: We used Cursor from the very first line of code. The project spans a React 18 frontend, a Python FastAPI backend, a ChromaDB vector store, and a Claude API integration, four distinct codebases that needed to talk to each other perfectly. Cursor's ability to hold the entire project context across all these files simultaneously meant we could ask questions like "why is the WebSocket dropping frames when the FastAPI endpoint is under load" and get answers that actually understood both sides of the connection. That kind of cross-file reasoning would have taken hours of manual debugging without it.

Frontend: React 18 + Vite + Tailwind CSS. The Web Audio API captures system audio or mic input and streams PCM chunks to Deepgram via WebSocket. Recharts renders the live credibility timeline. All animations are CSS transitions only. We wrote the entire frontend in Cursor: the live transcript feed, the animated credibility gauge, the contradiction cards that slide in from the right, all of it. Cursor's autocomplete understood our component architecture after the first few files and started suggesting the exact prop shapes and state patterns we needed.

Transcription: Deepgram's streaming STT API with speaker diarization. Words arrive in real time, color-coded by speaker. Cursor helped us write the WebSocket reconnection logic and the diarization smoothing layer in a single session.

AI inference: Claude (claude-sonnet-4-5) with structured tool use. Each segment is sent to a FastAPI /analyze endpoint. We designed the 11-signal taxonomy in Cursor by iterating the prompt directly in the editor alongside the Python handler. Cursor suggested tightening the JSON schema after noticing we were doing fragile string parsing downstream. It was right.

Vector search: ChromaDB stores chunked SEC filings (512-token chunks, 64-token overlap) embedded with sentence-transformers/all-MiniLM-L6-v2. Contradiction queries run in parallel with Claude inference (sketched below). Cursor wrote the parallel async pattern for this: we described what we wanted in a comment and it produced a working asyncio.gather implementation that we barely touched.

Ingestion: A /ingest/{ticker} endpoint pulls 10-K and 10-Q filings from SEC EDGAR, chunks them, embeds them, and stores them under a {ticker}_filings collection. Cursor generated the full EDGAR API integration from a single prompt describing the data structure we needed.

Demo mode: We pre-indexed Wirecard's filings and replay their Q4 2019 earnings audio at 1× speed through the same pipeline. Cursor helped us build the deterministic replay engine in under an hour.
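To make the parallel pattern concrete, here is a minimal sketch of the idea rather than our exact handler: the Claude call and the ChromaDB lookup run concurrently instead of back to back. The analyze_with_claude and search_filings helpers are placeholders; only the asyncio.gather structure is the point.

```python
# Sketch of running vector search concurrently with Claude inference.
# analyze_with_claude() and search_filings() stand in for the real handlers.
import asyncio

async def analyze_with_claude(segment: str) -> dict:
    ...  # call Claude with the 11-signal tool-use prompt, return parsed JSON signals

def search_filings(segment: str) -> list[dict]:
    ...  # synchronous ChromaDB query against the {ticker}_filings collection

async def analyze_segment(segment: str) -> dict:
    # Kick off both stages at once. ChromaDB's client is synchronous, so the
    # query is pushed onto a worker thread with asyncio.to_thread.
    signals, filing_matches = await asyncio.gather(
        analyze_with_claude(segment),
        asyncio.to_thread(search_filings, segment),
    )
    return {"signals": signals, "filing_matches": filing_matches}
```

The FastAPI /analyze endpoint simply awaits analyze_segment for each incoming transcript segment, so the slower of the two stages, not their sum, sets the per-statement latency.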
Challenges we ran into
Latency was brutal. Getting Claude inference + ChromaDB search + the WebSocket round-trip under 2 seconds required aggressive parallelization. We run the vector search concurrently with Claude's tool-use call rather than sequentially; that alone cut ≈ 800 ms off the pipeline. Cursor helped us identify the bottleneck by analyzing the entire async call chain across the frontend and backend simultaneously, something that would have taken a full debugging session on its own.
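Finding where the 2-second budget actually went came down to per-stage timing. A rough sketch of the kind of instrumentation involved, with illustrative stage names rather than our real code:

```python
# Rough per-stage timing to see which part of the pipeline eats the 2 s budget.
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, timings: dict):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = (time.perf_counter() - start) * 1000  # milliseconds

timings: dict[str, float] = {}
with timed("vector_search", timings):
    ...  # ChromaDB query for the segment
with timed("claude_inference", timings):
    ...  # Claude tool-use call for the segment
print(timings)  # per-stage milliseconds for this segment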
Defining deception scientifically. "Hedging" sounds obvious until you try to write a prompt that catches it reliably without false-positiving on legitimate uncertainty. We iterated through dozens of prompt versions entirely inside Cursor, using it to refactor and test each variant against sample transcripts without ever leaving the editor.

Speaker diarization drift. Deepgram's diarization occasionally mis-labels speakers mid-call, especially when multiple people talk over each other. We built a smoothing layer that uses rolling context to correct obvious mis-assignments. Cursor drafted the initial algorithm and then helped us tune the window size when we noticed it was over-correcting on rapid back-and-forth exchanges.

SEC filing parsing. 10-K filings can be 300+ pages of dense legalese with embedded tables, footnotes, and XBRL tags, and naive chunking produces terrible embeddings. Cursor helped us write the HTML stripping and prose-extraction pipeline in a fraction of the time it would have taken manually: we described the problem, it generated a working BeautifulSoup-based extractor, and we iterated from there. (A rough sketch of this step appears below.)

Demo reliability. Hackathon demos need to work every single time. Cursor helped us build a deterministic replay mode that feeds pre-transcribed segments through the live pipeline at controlled intervals, guaranteeing that the three most impressive contradiction cards always fire within the first five minutes. We stress-tested the whole thing inside Cursor by generating synthetic edge-case transcripts and running them through the pipeline.
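Here is the promised sketch of the filing preparation step: strip the non-prose HTML with BeautifulSoup, then split into overlapping chunks. The 512/64 figures come from the setup described above; the word-based splitting here is a simplification of the token-based chunking we actually used.

```python
# Sketch: strip a 10-K/10-Q HTML filing down to prose and chunk it with overlap.
# Word-based splitting is a stand-in for token-based 512/64 chunking.
from bs4 import BeautifulSoup

def extract_prose(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    # Drop markup that produces useless embeddings: scripts, styles, raw tables.
    for tag in soup(["script", "style", "table"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
    return chunks
```

Each chunk is then embedded and written into the {ticker}_filings collection by the /ingest/{ticker} endpoint.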
Accomplishments that we're proud of
- Built a genuinely working real-time pipeline (not a mock, not hardcoded outputs) that processes live audio end-to-end in under 2 seconds, written almost entirely in Cursor over the course of the hackathon
- Defined and implemented all 11 deception signal types with distinct detection logic for each, grounded in real linguistic and regulatory research; the taxonomy itself was refined through dozens of Cursor sessions
- The Wirecard demo fires 6 real contradictions between the Q4 2019 call transcript and the company's actual 10-K/10-Q filings; these are genuine, documented discrepancies from a company that later collapsed in a €1.9B fraud
- Built a PDF evidence report that would actually be usable in a compliance or legal context
- The entire frontend (live transcript feed, credibility gauge, contradiction cards, timeline chart) runs on CSS transitions only, no animation libraries, and was scaffolded and iterated entirely in Cursor
- Cursor's cross-file context let a small team operate as if it were much larger; we were effectively pair-programming with the best engineer in the room at all times
What we learned
Cursor fundamentally changes what's possible in a hackathon. The ability to ask questions that span the entire codebase (frontend, backend, prompts, database schema) without constantly switching context is not a marginal improvement. It's the difference between finishing and not finishing a project this complex in 48 hours.

Claude's structured tool use is remarkably well-suited to this problem. Having the model return a typed JSON schema rather than free text made the entire downstream pipeline deterministic and debuggable. We designed the schema in Cursor, which noticed a type inconsistency in our first draft before we'd even run it.

Semantic similarity alone isn't enough for contradiction detection. Two statements can be semantically similar (cosine similarity > 0.9) while saying opposite things. We had to add a second Claude pass that reads both statements and judges directional contradiction: similarity finds the candidate, Claude judges the conflict. Cursor helped us implement this two-stage architecture cleanly.
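As a minimal sketch of that second stage (the tool name, schema, and prompt are illustrative, not our exact production schema), the judging pass forces a structured tool call so the verdict comes back as typed JSON rather than free text:

```python
# Sketch of stage two: Claude judges whether a call statement contradicts a
# filing chunk that similarity search surfaced as a candidate.
import anthropic

client = anthropic.Anthropic()

CONTRADICTION_TOOL = {
    "name": "report_contradiction",
    "description": "Judge whether the two statements directly contradict each other.",
    "input_schema": {
        "type": "object",
        "properties": {
            "contradicts": {"type": "boolean"},
            "explanation": {"type": "string"},
        },
        "required": ["contradicts", "explanation"],
    },
}

def judge_contradiction(call_statement: str, filing_chunk: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # model string as named in this writeup
        max_tokens=512,
        tools=[CONTRADICTION_TOOL],
        tool_choice={"type": "tool", "name": "report_contradiction"},
        messages=[{
            "role": "user",
            "content": f"Earnings call statement:\n{call_statement}\n\n"
                       f"SEC filing excerpt:\n{filing_chunk}\n\n"
                       "Do these directly contradict each other?",
        }],
    )
    # With a forced tool choice, the returned content is the tool-use block.
    return response.content[0].input
```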
Earnings call language is its own dialect. "Robust," "meaningful," "constructive": these words have almost no information content in this context. Building signal detectors calibrated to this register took most of our time, and Cursor's ability to iterate on prompt logic inline with the handler code made that process far faster.

Real-time UX is unforgiving. If the credibility score doesn't update within a second of a statement ending, the demo feels broken even if the analysis is correct. Cursor helped us profile and fix three separate performance issues in the React rendering layer in a single afternoon.
What's next for EarningsLens
- Real-time market data integration: overlay the credibility score timeline against the stock's intraday price movement to show the predictive correlation
- Multi-ticker concurrent monitoring: a single dashboard watching 5 simultaneous earnings calls during the busiest days of earnings season
- Historical backtesting: run EarningsLens across 10 years of earnings transcripts and correlate deception scores with 30-day post-call price movements; we expect r² > 0.3 between deception score and negative price drift
- API access for funds: a clean REST API so quantitative funds can pipe EarningsLens signals directly into their models
- Regulatory dashboard: a read-only view for SEC analysts that logs contradictions with full source citations, designed for enforcement workflow
- Deeper Cursor integration: we want to build a Cursor extension that lets analysts annotate transcripts and flag new deception patterns directly in the editor, feeding back into the model's training signal