Expenso — The Voice-Native Financial Copilot
💡 Inspiration: The Financial Literacy Barrier
Expenso was born from a problem every Indian household, college dorm, and small business silently suffers through: money is confusing, not because it's complex, but because every tool
assumes you already know what you're doing.
The spark was small — a chaotic shared rent-and-grocery spreadsheet in a college house that no one wanted to maintain. But behind it sat a much larger truth:
▎ 70% of Indians never track personal finances. Not from lack of intent, but from cognitive overload.
Existing apps (Splitwise, Walnut, MoneyView, even global tools like Mint and YNAB) demand the same ritual: open app → tap → categorize → repeat. That friction is why financial literacy in India is stuck at 27% (NCFE, 2023), even though the country has the world's largest digital payment infrastructure (UPI).
We didn't want to build another tracker. We wanted to remove the keyboard, the menu, the form — and replace them with the most natural human interface: speech.
Expenso's thesis is simple: the next billion users won't type their way to financial wellness. They'll talk.
🚀 What it does: The Voice-Native Financial Copilot
Expenso is a voice-first financial OS built around Niva, an AI assistant that doesn't just answer — it acts.
A user says:
▎ "Niva, I spent ₹800 on groceries with Saurav, split it equally, and remind me if I spend more than ₹3,000 on food this week."
In under 2 seconds, Niva:
- Logs the expense with the right category
- Splits it 50/50 and updates Saurav's shared balance in real time
- Sets a contextual budget guardrail
- Recalculates the user's Financial Health Score (0–100) — a single, gamified number that replaces twelve confusing dashboards
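The split and guardrail steps come down to small, exact arithmetic. Here is a minimal Python sketch, assuming amounts are stored as integer paise so that shares always add back up to the total. The names `split_equally` and `over_budget` are illustrative, not Expenso's actual API.

```python
from decimal import Decimal


def split_equally(amount_rupees: str, people: list) -> dict:
    """Return each person's share in paise; shares sum exactly to the total."""
    total_paise = int(Decimal(amount_rupees) * 100)
    base, remainder = divmod(total_paise, len(people))
    # The first `remainder` people each absorb one extra paisa.
    return {p: base + (1 if i < remainder else 0) for i, p in enumerate(people)}


def over_budget(spent_paise: int, limit_paise: int) -> bool:
    """Guardrail check: true once spending exceeds the limit."""
    return spent_paise > limit_paise


shares = split_equally("800", ["me", "Saurav"])
```

Working in paise avoids floating-point drift when an amount does not divide evenly, for example ₹100 across three people.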
Niva is not a chatbot. It's an agentic financial operator that works across three integrated layers:
┌──────────────┬───────────────────────────────────────────────────────────────────┬──────────────────────────────────────────────┐
│ Layer │ What it does │ Who it's for │
├──────────────┼───────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────┤
│ Personal │ Voice-driven expense capture, budgeting, insights, multi-currency │ Individuals, students, professionals │
├──────────────┼───────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────┤
│ Expenso Biz │ Voice bookkeeping ("Mark ₹2,000 cash sale, GST 18%") │ Kirana stores, freelancers, micro-businesses │
├──────────────┼───────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────┤
│ Shared Rooms │ Real-time group expense sync with AI-suggested settlements │ Roommates, families, trips, teams │
└──────────────┴───────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────┘
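For Shared Rooms, a settlement suggestion can start from something as plain as a greedy pass over net balances. The sketch below is a deterministic baseline under our own assumptions (net balances in paise, positive means "is owed"). It is not the AI-driven logic itself, and `suggest_settlements` plus the names Asha and Ravi are hypothetical.

```python
def suggest_settlements(balances: dict) -> list:
    """Return (debtor, creditor, paise) transfers that clear all balances."""
    if sum(balances.values()) != 0:
        raise ValueError("net balances must sum to zero")
    creditors = sorted(((amt, n) for n, amt in balances.items() if amt > 0), reverse=True)
    debtors = sorted(((-amt, n) for n, amt in balances.items() if amt < 0), reverse=True)
    transfers = []
    i = j = 0
    while i < len(debtors) and j < len(creditors):
        owe, debtor = debtors[i]
        due, creditor = creditors[j]
        pay = min(owe, due)
        transfers.append((debtor, creditor, pay))
        debtors[i] = (owe - pay, debtor)
        creditors[j] = (due - pay, creditor)
        # Advance past anyone who is now fully settled.
        if debtors[i][0] == 0:
            i += 1
        if creditors[j][0] == 0:
            j += 1
    return transfers


plan = suggest_settlements({"Asha": 50000, "Saurav": -30000, "Ravi": -20000})
```

This pass always clears the room but does not guarantee the fewest possible transfers; it is a starting point that a smarter suggester could improve on.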
Niva also answers:
- "Am I overspending this month compared to last?"
- "How much can I afford to save if I cancel two subscriptions?"
- "What did I spend on food deliveries in March?"
…in conversation, with charts rendered on demand, and citations to the actual transactions.
⚙️ How we built it: Enterprise Tech for Everyday Users
Expenso is engineered as a production-ready, offline-first, privacy-aware mobile platform.
Mobile & Sync
- Flutter — single codebase, Android + iOS
- Hive — offline-first local store; every action works without connectivity and reconciles on reconnect
- Supabase (Postgres + Realtime + Row-Level Security) — multi-device sync with millisecond latency for shared rooms
- pgvector — semantic memory of past transactions so Niva can answer fuzzy questions ("that big dinner last week")
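The "works offline, reconciles on reconnect" behaviour can be pictured as an outbox pattern. The Python sketch below uses in-memory stand-ins for the local store and the server; `Outbox` and `FakeServer` are hypothetical names, and nothing here calls the real Hive or Supabase APIs.

```python
import uuid


class Outbox:
    """Writes land locally first; unsent writes wait in a queue."""

    def __init__(self):
        self.local = {}    # stand-in for the on-device store
        self.pending = []  # writes not yet acknowledged by the server

    def add_expense(self, amount_paise: int, category: str) -> str:
        record = {"id": str(uuid.uuid4()), "amount_paise": amount_paise, "category": category}
        self.local[record["id"]] = record
        self.pending.append(record)
        return record["id"]

    def flush(self, push) -> int:
        """Send queued writes in order; stop at the first failure."""
        sent = 0
        while self.pending:
            if not push(self.pending[0]):
                break
            self.pending.pop(0)
            sent += 1
        return sent


class FakeServer:
    """Stand-in for the cloud database."""

    def __init__(self):
        self.rows = {}
        self.online = False

    def push(self, record: dict) -> bool:
        if not self.online:
            return False
        self.rows[record["id"]] = record  # upsert by id, so retries are harmless
        return True
```

Generating the record id on the device is the key design choice: a retry after a dropped connection overwrites the same row instead of creating a duplicate expense.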
The Niva AI Stack
- Gemini 2.0 + Llama 3 (via Groq) — dual-model routing: Gemini for complex reasoning and tool planning, Groq for sub-300 ms quick responses
- Custom ToolExecutor framework — a strict, schema-validated bridge between natural language and the database. Niva cannot hallucinate a transaction; every action is a typed, auditable tool call
- Deepgram (STT) + ElevenLabs (TTS) — streaming voice in both directions, so Niva starts speaking before it finishes thinking
- Vapi — real-time voice orchestration with <500 ms turn latency
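To make the ToolExecutor idea concrete, here is a minimal Python sketch of a schema-checked tool call. It assumes a simplified schema (argument name mapped to a Python type) rather than full JSON Schema, and the `Tool` class, the `log_expense` handler, and the in-memory `ledger` are illustrative stand-ins, not the production framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    schema: dict  # argument name -> required Python type
    handler: Callable[..., Any]


@dataclass
class ToolExecutor:
    tools: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def execute(self, call: dict) -> Any:
        name = call.get("tool")
        args = call.get("args", {})
        tool = self.tools.get(name)
        if tool is None:
            raise ValueError(f"unknown tool: {name!r}")
        if not isinstance(args, dict) or set(args) != set(tool.schema):
            raise ValueError(f"{name}: arguments must be exactly {sorted(tool.schema)}")
        for key, expected in tool.schema.items():
            value = args[key]
            # bool is a subclass of int in Python, so reject it for non-bool fields.
            if (isinstance(value, bool) and expected is not bool) or not isinstance(value, expected):
                raise TypeError(f"{name}.{key}: expected {expected.__name__}")
        result = tool.handler(**args)  # only reached after validation passes
        self.audit_log.append({"tool": name, "args": dict(args)})
        return result


ledger = []  # stand-in for the real database


def log_expense(amount_paise: int, category: str) -> int:
    ledger.append({"amount_paise": amount_paise, "category": category})
    return len(ledger)


executor = ToolExecutor()
executor.register(Tool("log_expense", {"amount_paise": int, "category": str}, log_expense))
```

The model only ever emits a dictionary like `{"tool": "log_expense", "args": {...}}`. Anything malformed raises before the handler runs, so the ledger and the audit log stay untouched.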
Privacy & Edge ML
- On-device TF-IDF + Logistic Regression classifier for instant, private expense categorization — no transaction text leaves the phone unless the user invokes Niva
- Local-first encryption for sensitive fields; cloud sync uses zero-knowledge keys
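For a feel of how small an on-device categorizer can be, here is a dependency-free Python sketch of TF-IDF features feeding a binary logistic regression trained with plain stochastic gradient descent. The training phrases, the two-class setup (food vs. transport), and the hyperparameters are assumptions for illustration; the shipped model and its category set are not shown here.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    return text.lower().split()


class TfidfLogReg:
    """Binary classifier: label 1 = food, label 0 = transport (illustrative)."""

    def fit(self, texts, labels, epochs=300, lr=0.5):
        docs = [tokenize(t) for t in texts]
        df = Counter(tok for d in docs for tok in set(d))
        n = len(docs)
        # Smoothed inverse document frequency.
        self.idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
        self.w = {tok: 0.0 for tok in self.idf}
        self.b = 0.0
        vectors = [self._vectorize(d) for d in docs]
        for _ in range(epochs):
            for x, y in zip(vectors, labels):
                err = self._prob(x) - y  # gradient of log loss w.r.t. the logit
                for tok, v in x.items():
                    self.w[tok] -= lr * err * v
                self.b -= lr * err
        return self

    def _vectorize(self, tokens) -> dict:
        tf = Counter(t for t in tokens if t in self.idf)  # unknown words are dropped
        vec = {t: c * self.idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        return {t: v / norm for t, v in vec.items()}  # L2-normalized

    def _prob(self, x: dict) -> float:
        z = self.b + sum(self.w[t] * v for t, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def predict_proba(self, text: str) -> float:
        return self._prob(self._vectorize(tokenize(text)))


texts = ["zomato dinner order", "swiggy lunch biryani", "grocery vegetables milk",
         "uber ride airport", "metro card recharge", "ola cab ride"]
labels = [1, 1, 1, 0, 0, 0]
model = TfidfLogReg().fit(texts, labels)
```

Everything above is a few dictionaries and a loop, which is why this style of model fits comfortably on a phone and never needs to send transaction text anywhere.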
Why this matters
The combination — agentic LLM + strict tool schemas + offline-first sync + on-device ML — is what lets Expenso work in a Tier-3 town with patchy 4G and feel as fast as ChatGPT in
Bangalore.
🚧 Challenges we ran into
- Agentic Hallucination on Money — LLMs love to invent. We built a typed ToolExecutor with JSON-schema validation, dry-run previews, and undo on every mutation. Niva can never silently
corrupt a balance.
- Sub-second Voice Loop — Stitching STT → LLM → tool call → DB write → TTS in under 1.5 seconds required streaming at every layer and speculative TTS playback while tool calls were still
resolving.
- Real-time Multi-user State — Shared Rooms needed conflict-free updates across 5+ devices simultaneously. We used Supabase Realtime + a CRDT-inspired merge layer for offline edits.
- Trust on Day One — Users won't say "transfer ₹5,000" to an AI they just met. We solved this with an explicit confirm-before-execute layer, transaction previews, and a visible audit log
of every Niva action.
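One way to picture a CRDT-inspired merge for offline edits is a last-writer-wins map keyed by expense id. The Python sketch below is an assumption about the general shape, not the actual merge layer: each entry carries a timestamp and device id, and `merge` and `room_total` are hypothetical names.

```python
def merge(a: dict, b: dict) -> dict:
    """Merge two replicas mapping expense_id -> (amount_paise, timestamp, device_id).

    For each id, the entry with the higher (timestamp, device_id) pair wins,
    so the result is the same no matter which side merges first.
    """
    merged = dict(a)
    for key, entry in b.items():
        current = merged.get(key)
        if current is None or (entry[1], entry[2]) > (current[1], current[2]):
            merged[key] = entry
    return merged


def room_total(state: dict) -> int:
    """Balances are derived from the merged records, never merged directly."""
    return sum(amount for amount, _, _ in state.values())


phone = {"e1": (80000, 10, "phone"), "e2": (12000, 11, "phone")}
tablet = {"e1": (85000, 12, "tablet"), "e3": (5000, 9, "tablet")}
```

Merging individual expense records and recomputing totals afterwards sidesteps the classic bug where two devices overwrite each other's running balance.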
🏆 Accomplishments that we're proud of
- Zero-UI Mode — a user can onboard, log a month of expenses, settle a group, and check their financial health without ever touching a button
- Financial Health Score — collapses 30+ metrics into one explainable number, with Niva narrating why it changed
- Offline-first AI — categorization, balances, and basic queries all work with no internet; when offline, Niva degrades gracefully to a local small language model (SLM)
- One-shot multi-action commands — most assistants do one thing per turn; Niva chains 4–6 tool calls in a single utterance
- Multi-currency, zero extra cost — built into the schema from day one, no exchange-rate API tax
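An explainable score can be as simple as a weighted sum whose per-metric contributions are kept around for narration. The Python sketch below uses three hypothetical metrics and made-up weights in place of the 30+ real ones; `health_score` and `explain_change` are illustrative names.

```python
# Points each metric can contribute; the weights sum to 100. All hypothetical.
WEIGHTS = {"savings_rate": 40, "budget_adherence": 40, "debts_settled": 20}


def health_score(metrics: dict):
    """Return (score 0-100, per-metric contributions). Metrics are fractions in [0, 1]."""
    contributions = {
        name: weight * min(max(metrics[name], 0.0), 1.0)
        for name, weight in WEIGHTS.items()
    }
    return round(sum(contributions.values())), contributions


def explain_change(before: dict, after: dict):
    """Return the metric that moved the score most, and by how many points."""
    _, old = health_score(before)
    _, new = health_score(after)
    name = max(WEIGHTS, key=lambda n: abs(new[n] - old[n]))
    return name, new[name] - old[name]


last_week = {"savings_rate": 0.5, "budget_adherence": 1.0, "debts_settled": 0.5}
this_week = {"savings_rate": 0.5, "budget_adherence": 0.75, "debts_settled": 0.5}
```

Because every point in the score traces back to a named metric, the assistant can say which metric moved and by how much instead of reporting a bare number.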
🧠 What we learned
- The interface is the product. Removing the form was worth more than any new feature.
- LLMs are decision engines, not conversation toys. The magic is in the tool layer, not the prompt.
- Trust compounds. Every confirmed action makes the user delegate more next time.
- Solving boring problems beats inventing exciting ones. "Did I pay Saurav back?" matters more than another chart.
🔮 What's next for Expenso: Social Impact & Scalability
- Proactive Niva — moves from reactive ("you asked") to anticipatory ("your electricity bill is 22% higher this month, want me to investigate?")
- Receipt Intelligence — point camera → Niva extracts items, categorizes, splits, files for warranty/tax
- Vernacular Voice — full conversational support in Hindi, Tamil, Telugu, Bengali, Marathi using Sarvam AI / IndicTrans for the next 500M users
- Decentralized Financial Identity (DFI) — a portable, user-owned credit profile for the 190M unbanked Indians who have spending history but no CIBIL score
Built With
- flutter
- groq
- hive
- postgresql
- supabase
- vapi