TenK — Your Always-On Earnings Analyst
The Problem
Every quarter, analysts spend 3–4 hours reading an 80-page 10-K to form a single view. Existing tools parse text but ignore embedded charts. None of them let you talk back.
What We Built
TenK is a voice-first AI analyst that reads a company's earnings report — including every chart and table — and lets you interrogate it in real-time conversation.
Drop in a PDF. Start talking. Interrupt mid-answer. Get spoken responses from an agent that has read everything.
How It Works
- Upload — Drop in any 10-K or earnings PDF
- Ingest — Gemini 2.0 Flash reads the full document multimodally: text, tables, and charts
- Talk — Gemini Live API opens a bidirectional voice stream
- Interrogate — Ask anything, interrupt anytime, drill into any number
- Export — Session ends with an auto-generated investment memo and shareable marksheet card
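The upload-and-ingest steps above can be sketched as one multimodal request: the entire PDF goes to the model inline, next to a text prompt, so charts and tables are read in the same pass as the text. The payload below follows the Gemini REST `generateContent` format; the prompt wording and helper name are illustrative, not TenK's actual code.

```python
import base64
import json

def build_ingest_request(pdf_bytes: bytes, question: str) -> dict:
    """Assemble a Gemini generateContent payload that sends the whole
    10-K as inline PDF data alongside a text prompt, so the model reads
    text, tables, and charts in a single pass (no chunking)."""
    return {
        "contents": [{
            "role": "user",
            "parts": [
                # Inline PDF part: Gemini 2.0 Flash interprets it multimodally.
                {"inline_data": {
                    "mime_type": "application/pdf",
                    "data": base64.b64encode(pdf_bytes).decode("ascii"),
                }},
                # Text part: the analyst-style instruction.
                {"text": question},
            ],
        }]
    }

req = build_ingest_request(
    b"%PDF-1.7 ...",  # placeholder bytes; a real 10-K goes here
    "Summarize revenue drivers, including any trends shown in charts.",
)
print(json.dumps(req)[:60])
```

This payload would be POSTed to the `models/gemini-2.0-flash:generateContent` endpoint (or built via the `google-genai` SDK, which wraps the same request shape).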
The Multimodal Stack
| Layer | Technology | Role |
|---|---|---|
| Vision | Gemini 2.0 Flash | Reads PDF text + interprets charts visually |
| Voice | Gemini Live API | Bidirectional audio, native interruption |
| Agent | Google ADK | Orchestrates document context + Q&A tools |
| Hosting | Google Cloud Run | Serverless backend, WebSocket support |
| Storage | Cloud Storage + Firestore | PDF store + session state |
Why Live API Changes Everything
Most voice tools are STT → LLM → TTS pipelines bolted together. Gemini Live API is natively bidirectional — the model speaks back, handles interruptions mid-sentence, and maintains conversational context throughout.
This is what makes TenK feel like talking to a human analyst, not a search bar.
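A minimal sketch of why this works at the wire level: the Live API is a single WebSocket whose first client frame is a `setup` message, after which audio flows in both directions on the same connection; interruption is just the client sending new audio while a reply is still streaming. The model name and config values below are assumptions for illustration.

```python
import json

def build_live_setup(model: str = "models/gemini-2.0-flash-exp") -> str:
    """Build the first frame sent over the Live API WebSocket.
    Everything after this frame is interleaved audio/text chunks in
    both directions, which is what enables mid-sentence interruption:
    the server drops its in-flight reply when new user audio arrives."""
    return json.dumps({
        "setup": {
            "model": model,
            "generation_config": {"response_modalities": ["AUDIO"]},
        }
    })

frame = json.loads(build_live_setup())
print(frame["setup"]["model"])
```

In practice the `google-genai` SDK wraps this handshake behind an async `live.connect(...)` session, so application code only sees send/receive streams, not raw frames.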
Demo
User: "What drove iPhone revenue this quarter?"
TenK: "iPhone came in at $69 billion, up 1% year over year. The real story is —"
User (interrupting): "Wait, what about China?"
TenK: "China was down 11% year over year. That's the key risk heading into Q2..."
No typing. No waiting. Full interruption support. Exactly like the Live Agents category describes.
What We Learned
- Gemini's long context window is genuinely powerful for financial documents — feeding an entire 10-K in one shot works better than chunking
- Spoken-first prompting matters enormously — the agent needs to speak like an analyst, not read a report aloud
- Live API + ADK wiring is the hardest part — getting document context into a live voice session requires careful session state management
- Vision on charts is underutilized — Gemini 2.0 Flash caught margin trends in embedded graphs that pure text extraction would have missed entirely
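The "hardest part" above, carrying document context into a live voice session, can be sketched as a small per-call session store whose contents get folded into the system instruction on each turn. The names here (`SessionState`, `to_system_instruction`) are ours for illustration, not ADK or Firestore APIs; in production this state would live in Firestore keyed by session ID.

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Per-call state: the digest produced at ingest time plus a
    rolling transcript, folded into the system instruction so the live
    model always answers with the filing in context."""
    doc_digest: str
    transcript: list[str] = field(default_factory=list)

    def record(self, speaker: str, text: str) -> None:
        self.transcript.append(f"{speaker}: {text}")

    def to_system_instruction(self) -> str:
        recent = "\n".join(self.transcript[-10:])  # cap context growth
        return (
            "You are a spoken-first equity analyst. Answer from the filing "
            "below; speak naturally, do not read it aloud.\n\n"
            f"FILING DIGEST:\n{self.doc_digest}\n\n"
            f"RECENT TURNS:\n{recent}"
        )

state = SessionState(doc_digest="Apple 10-K FY2024: iPhone $69B, +1% YoY ...")
state.record("user", "What drove iPhone revenue?")
print(state.to_system_instruction().splitlines()[0])
```

Keeping the digest (not the raw PDF) in session state is what makes each voice turn cheap while still grounding every spoken answer in the document.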
Built With
- Gemini Live API — real-time voice stream
- Google ADK — agent orchestration
- Gemini 2.0 Flash — multimodal document ingestion
- Google Cloud Run — serverless backend
- Next.js 14 — frontend
The One-Liner
The analyst that never sleeps — drop in a 10-K, start talking.
Built for the Gemini Live Agent Challenge · Live Agents category · #GeminiLiveAgentChallenge