Inspiration

Most people drown in raw CSV files. You open one, see hundreds of rows and columns, and immediately hit a wall — unless you know Python, SQL, or Tableau, the data is effectively meaningless to you. That gap between having data and understanding data is the problem NarrAI was built to close.

The MLH AI Hackfest's focus on multi-agent AI architectures gave us the push we needed. Instead of building yet another chatbot, we asked: what if you could drop a CSV and just hear what it means? No code. No dashboards. No technical skills required. Just upload, listen, and ask questions.


What It Does

NarrAI turns any CSV file into an AI voice briefing in seconds. The core pipeline can be expressed as a function composition:

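$$ \underbrace{\text{Input}}_{\text{CSV}} \;\longrightarrow\; \underbrace{\text{Insight}}_{\text{Gemini 2.5 Flash}} \;+\; \underbrace{\text{Audio}}_{\text{gTTS}} \;+\; \underbrace{\text{Chart}}_{\text{Chart.js}} $$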

You drag and drop a file, and the app:

  • Generates a plain-English insight about your data using Gemini 2.5 Flash
  • Automatically narrates that insight aloud using text-to-speech audio
  • Renders a Chart.js visualisation of the most interesting trend in the data
  • Lets you ask follow-up questions in natural language and get spoken answers back

No setup. No code. Anyone can use it.


How We Built It

NarrAI is a layered multi-agent pipeline. Each AI service is its own isolated module:

Browser
  └─ uploads CSV via drag-and-drop
       └─ FastAPI /analyze endpoint
            ├─ validates file (extension, size, encoding)
            ├─ pandas → DataFrame → table summary
            ├─ gemini_agent.py → Gemini 2.5 Flash
            │    └─ generates insight (prose) + chart_data (labels, values, type)
            ├─ tts_agent.py → gTTS
            │    └─ converts insight → MP3 bytes → base64
            └─ response JSON → browser
                 ├─ renders insight text
                 ├─ plays audio briefing automatically
                 ├─ renders Chart.js visualisation
                 └─ enables follow-up Q&A loop (/followup)

File validation enforces hard constraints at both the frontend and backend — two independent gates:

$$ \text{valid}(f) = \big[\text{ext}(f) = \texttt{.csv}\big] \;\wedge\; \big[\text{size}(f) \leq S_{\max}\big] \;\wedge\; \big[\text{encoding}(f) \in \mathcal{E}\big] $$
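
The predicate above could be sketched on the backend roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the write-up does not give values for the size limit or the encoding set, so `MAX_SIZE_BYTES`, `ALLOWED_ENCODINGS`, and the function names here are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

# Assumed values: the write-up names S_max and the encoding set without defining them.
MAX_SIZE_BYTES = 5 * 1024 * 1024
ALLOWED_ENCODINGS = ("utf-8-sig", "cp1252")


def detect_encoding(data: bytes) -> Optional[str]:
    """Return the first allowed encoding that decodes the payload, else None."""
    for encoding in ALLOWED_ENCODINGS:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


def validate_upload(filename: str, data: bytes) -> List[str]:
    """Return the names of the failed checks; an empty list means valid(f) holds."""
    errors = []
    if Path(filename).suffix.lower() != ".csv":
        errors.append("extension")
    if len(data) > MAX_SIZE_BYTES:
        errors.append("size")
    if detect_encoding(data) is None:
        errors.append("encoding")
    return errors
```

Returning the list of failed checks rather than a bare boolean lets the endpoint tell the user which constraint was violated, while `not validate_upload(...)` still gives the conjunction in the formula.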

The frontend is a single index.html — no framework, no build step. The backend is FastAPI. Each agent (gemini_agent.py, tts_agent.py, supabase_agent.py) is deliberately isolated so any one can fail without crashing the others.
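
A minimal sketch of that isolation, assuming each agent is exposed as a plain callable. The names `run_isolated` and `analyze`, the callable signatures, and the response keys are hypothetical and chosen only for illustration.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("narrai")


def run_isolated(name: str, step: Callable[[], Any]) -> Optional[Any]:
    """Run one agent call; log the failure and return None instead of raising."""
    try:
        return step()
    except Exception:
        logger.exception("%s agent failed", name)
        return None


def analyze(summary: str,
            insight_agent: Callable[[str], Dict[str, Any]],
            audio_agent: Callable[[str], str],
            log_agent: Callable[[str, Dict[str, Any]], None]) -> Dict[str, Any]:
    """Call each agent behind its own guard and assemble a partial response."""
    result = run_isolated("gemini", lambda: insight_agent(summary)) or {}
    insight = result.get("insight")
    # Audio depends on the insight, so skip it when there is nothing to narrate.
    audio = run_isolated("tts", lambda: audio_agent(insight)) if insight else None
    # Logging is best-effort: its outcome never changes the response.
    run_isolated("supabase", lambda: log_agent(summary, result))
    return {"insight": insight, "chart_data": result.get("chart_data"), "audio": audio}
```

With this shape, a failing TTS or logging call leaves `audio` as `None` while the insight and chart data still reach the browser.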


Challenges We Ran Into

The ElevenLabs Pivot — Our Defining Hackathon Moment

Halfway through the build, with the app already live on Render, ElevenLabs started returning 401: detected_unusual_activity. Render's shared IP ranges had been flagged by ElevenLabs' abuse detection system. Every TTS call was silently failing. The audio feature that made the app distinctive was dead.

We pivoted in under an hour — ripped out the ElevenLabs client, replaced it with gTTS (completely free, no API key, no credentials), kept the exact same text_to_audio(text) -> bytes interface, and redeployed. main.py never changed. The frontend never changed. Only the agent module swapped. The architecture made the pivot possible.
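
The swap can be sketched like this, assuming the gTTS package's `gTTS(...).write_to_fp(...)` call for the new backend. Only the `text_to_audio(text) -> bytes` signature comes from the project; `gtts_text_to_audio`, `audio_as_base64`, and the `lang="en"` choice are hypothetical.

```python
import base64
from io import BytesIO
from typing import Callable

# The stable contract: any TTS backend is a function from text to MP3 bytes.
TextToAudio = Callable[[str], bytes]


def gtts_text_to_audio(text: str) -> bytes:
    """gTTS-backed implementation; needs the gtts package and network access."""
    from gtts import gTTS  # imported lazily so callers can load without gtts installed

    buffer = BytesIO()
    gTTS(text=text, lang="en").write_to_fp(buffer)
    return buffer.getvalue()


def audio_as_base64(text: str, text_to_audio: TextToAudio = gtts_text_to_audio) -> str:
    """Encode whatever the chosen backend returns for embedding in the JSON response."""
    return base64.b64encode(text_to_audio(text)).decode("ascii")
```

Because callers depend only on the `TextToAudio` signature, replacing one provider with another means changing a single function, and a stub backend can stand in during tests.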

Getting Gemini to Return Reliable Structured Data

Gemini would sometimes wrap JSON in markdown code blocks, prepend filler text, or truncate mid-object. We hardened the parser with re.search block extraction, a fallback regex to salvage just the insight string, and an empty-array guard on the frontend so Chart.js never renders a blank chart.
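
A simplified sketch of that kind of hardened parsing is shown below. The regular expressions, the function name `parse_model_reply`, and the choice to return `None` for missing chart data are assumptions made for illustration; the project's own parser may differ.

```python
import json
import re
from typing import Any, Dict

# Grab everything from the first "{" to the last "}", ignoring fences and filler text.
JSON_BLOCK = re.compile(r"\{.*\}", re.DOTALL)
# Fallback: pull out just the insight string, even from a truncated object.
INSIGHT_FIELD = re.compile(r'"insight"\s*:\s*"((?:[^"\\]|\\.)*)', re.DOTALL)


def parse_model_reply(raw: str) -> Dict[str, Any]:
    """Return {'insight': str, 'chart_data': dict or None} without ever raising."""
    match = JSON_BLOCK.search(raw)
    if match:
        try:
            data = json.loads(match.group(0))
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and isinstance(data.get("insight"), str):
            chart = data.get("chart_data")
            return {"insight": data["insight"],
                    "chart_data": chart if isinstance(chart, dict) else None}
    salvaged = INSIGHT_FIELD.search(raw)
    if salvaged:
        text = salvaged.group(1)
        try:
            text = json.loads('"' + text + '"')  # decode JSON escapes when possible
        except json.JSONDecodeError:
            pass  # keep the raw capture if it is not a valid JSON string
        return {"insight": text, "chart_data": None}
    return {"insight": "", "chart_data": None}
```

When the reply is cut off mid-object, the first pattern finds no closing brace, so the function falls through to the insight-only fallback and the frontend receives text with no chart data.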

The follow-up Q&A loop is itself a conditional pipeline:

$$ \text{answer}(q \mid \text{insight}) = \begin{cases} \text{Gemini}(q,\, \text{insight}) & \text{if } q \neq \emptyset \\ \emptyset & \text{otherwise} \end{cases} $$

CSV Edge Cases Are Brutal

Symbol-only column names normalising to "", corrupted files, all-null columns, zero-row files — every one was a potential 500 error. We caught them all.
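
Some of these guards can be illustrated with a standard-library stand-in. The project itself reads files with pandas, so this is a sketch of the idea rather than its code; `CsvRejected`, `normalise_columns`, `load_table`, and the `column_N` fallback naming are all hypothetical.

```python
import csv
import io
import re
from typing import List, Tuple


class CsvRejected(ValueError):
    """Raised for inputs that should become a clean client error, not a 500."""


def normalise_columns(names: List[str]) -> List[str]:
    """Lower-case and underscore names; symbol-only names get a positional fallback."""
    used = set()
    result = []
    for position, raw in enumerate(names, start=1):
        name = re.sub(r"[^0-9A-Za-z]+", "_", raw).strip("_").lower()
        if not name:
            name = f"column_{position}"  # e.g. "$$$" would otherwise become ""
        candidate, suffix = name, 2
        while candidate in used:  # keep names unique after normalisation
            candidate = f"{name}_{suffix}"
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result


def load_table(text: str) -> Tuple[List[str], List[List[str]]]:
    """Parse CSV text, rejecting empty or zero-row files and dropping all-empty columns."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        raise CsvRejected("file is empty")
    columns = normalise_columns(header)
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        raise CsvRejected("file has a header but no data rows")
    keep = [i for i in range(len(columns))
            if any(i < len(row) and row[i].strip() for row in rows)]
    if not keep:
        raise CsvRejected("every column is empty")
    return ([columns[i] for i in keep],
            [[row[i] if i < len(row) else "" for i in keep] for row in rows])
```

Raising one dedicated exception type gives the endpoint a single place to translate bad input into a readable error response.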


Accomplishments We're Proud Of

  • The pivot story. ElevenLabs died mid-hackathon and we shipped a working replacement in under an hour without touching the API contract. That's the architecture paying off.
  • Zero frontend framework. The entire UI is vanilla JS in a single index.html — fast, lightweight, and fully functional.
  • Genuine agent isolation. Each service fails independently. If TTS fails, the insight still renders. If Supabase goes down, the analysis still completes. Graceful degradation by design.
  • Defence-in-depth validation. File checks run at two independent layers — frontend and backend — so malformed inputs never reach the AI agents.

What We Learned

  • Multi-agent orchestration requires blast-radius thinking. Every agent needs its own try/except. One failure should never cascade into a 500.
  • Gemini needs a contract, not a prompt. Structured JSON output from an LLM requires hardened parsing — re.search, inner try/except, and regex fallbacks, not just json.loads().
  • FastAPI multipart file handling has sharp edges around large files and malformed CSVs. Guard every pd.read_csv() call explicitly.
  • Pivoting under pressure is a skill. The ElevenLabs → gTTS switch taught us more about resilient architecture than any tutorial.

What's Next for NarrAI

  • Re-enable Supabase logging with a fixed schema — store every analysis with a session_id for usage analytics
  • Multi-file support — compare two CSVs side by side with a single insight covering both
  • Shareable reports — generate a static HTML summary page per analysis that can be sent to a colleague
  • Voice selection — let users pick from multiple TTS voices or accents
  • Streaming insights — stream Gemini's response token by token so users see the insight building in real time instead of waiting
