Inspiration
bunq has one of the best public banking APIs in Europe and almost no consumer ever touches it. Every "AI agent" demo today is either read-only or YOLO-executes your money. We wanted the missing middle: an agent that proposes a precise change to your bank state, and a human who executes it. Voice is the natural input — money intents are faster spoken than tapped through six screens.
What it does
Vox is a voice-first control plane for your bunq account.
- Hold the mic, say "move €400 to rent, split my next salary 60/30/10 across rent/groceries/savings, freeze my travel card after 10pm".
- An LLM planner converts the transcript into a typed Plan: sub-account transfers, recurring splits, conditional card freezes, per-tx limits.
- The plan renders as diff cards. You tick the actions you want and approve.
- Only then does the backend hit the real bunq API. The LLM never moves a euro.
- Active rules live server-side; when one fires (salary lands, bar spend, large tx) the UI hot-flashes the affected sub-account and toasts the event over SSE.
- Demo buttons fire fake salary / bar spend / large tx so the rule engine reacts live on stage.
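A minimal sketch of what a typed Plan and its validation could look like, using only the Python standard library. The class names, field names, JSON shape, the "main" source account, and the `ask_llm` callback are all illustrative assumptions, not the project's actual schema. The sketch does not check field value types; a schema library would normally cover that.

```python
import json
from dataclasses import dataclass


@dataclass
class Transfer:
    from_account: str
    to_account: str
    amount_eur: float


@dataclass
class RecurringSplit:
    trigger: str        # e.g. "salary"
    shares: dict        # sub-account name -> percentage, must sum to 100


@dataclass
class ConditionalFreeze:
    card: str
    after_hour: int     # 24-hour clock, 22 means 10pm


@dataclass
class TxLimit:
    card: str
    max_eur: float


# Hypothetical mapping from the planner's "type" tag to an action class.
ACTION_TYPES = {
    "transfer": Transfer,
    "recurring_split": RecurringSplit,
    "conditional_freeze": ConditionalFreeze,
    "tx_limit": TxLimit,
}


class PlanError(ValueError):
    """Raised when planner output does not fit the schema."""


def check_accounts(action, known_accounts):
    """Reject account names that are not in the user's real account list."""
    named = []
    if isinstance(action, Transfer):
        named = [action.from_account, action.to_account]
    elif isinstance(action, RecurringSplit):
        if not isinstance(action.shares, dict):
            raise PlanError("shares must be an object")
        if sum(action.shares.values()) != 100:
            raise PlanError("split shares must sum to 100")
        named = list(action.shares)
    unknown = [name for name in named if name not in known_accounts]
    if unknown:
        raise PlanError(f"unknown accounts: {unknown}")


def parse_plan(raw_json, known_accounts):
    """Parse planner output into typed actions; reject anything off-schema."""
    try:
        items = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise PlanError(f"not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise PlanError("plan must be a JSON list of actions")
    actions = []
    for item in items:
        if not isinstance(item, dict):
            raise PlanError("each action must be a JSON object")
        fields = dict(item)
        kind = fields.pop("type", None)
        if not isinstance(kind, str) or kind not in ACTION_TYPES:
            raise PlanError(f"unknown action type: {kind!r}")
        try:
            action = ACTION_TYPES[kind](**fields)  # missing/extra fields raise TypeError
        except TypeError as exc:
            raise PlanError(f"bad fields for {kind}: {exc}") from exc
        check_accounts(action, known_accounts)
        actions.append(action)
    return actions


def plan_with_retry(ask_llm, transcript, known_accounts, attempts=3):
    """Re-ask the planner, feeding back the last error, until the output parses."""
    error = None
    for _ in range(attempts):
        try:
            return parse_plan(ask_llm(transcript, error), known_accounts)
        except PlanError as exc:
            error = str(exc)
    raise PlanError(f"planner failed after {attempts} attempts: {error}")
```

Because `parse_plan` only accepts account names from `known_accounts`, a hallucinated sub-account fails before any diff card is rendered.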
How we built it
- Backend: Python · FastAPI · bunq SDK · LLM planner · SQLite. Two core endpoints: /plan (text → typed Plan) and /execute (selected indices → bunq calls). /events is a Server-Sent Events stream for live rule firings.
- Web: React + Vite + TypeScript + Tailwind + framer-motion, Web Speech API for transcription, animated diff cards, status pills for bunq / llm / events.
- Mobile: native Android + iOS apps sharing one backend, each using the platform-native speech recognizer (android.speech.SpeechRecognizer, SFSpeechRecognizer).
- Shape: voice → transcript → /plan → diff cards → user-selected indices → /execute → bunq → SSE firings → toasts. Same loop on every client.
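The /plan and /execute split can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the in-memory `PLANS` store, `store_plan`, and the `FakeBank` stand-in for the bunq client are all hypothetical, and only transfers are modelled to keep it short.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Transfer:
    from_account: str
    to_account: str
    amount_eur: float


class FakeBank:
    """Hypothetical stand-in for the bunq client; records calls instead of making them."""

    def __init__(self):
        self.calls = []

    def transfer(self, from_account, to_account, amount_eur):
        self.calls.append(("transfer", from_account, to_account, amount_eur))


# Plans stay on the server (assumption), so the client only ever sends indices.
PLANS = {}


def store_plan(plan):
    """What a /plan handler might do after the planner returns a validated plan."""
    plan_id = uuid.uuid4().hex
    PLANS[plan_id] = list(plan)
    return plan_id


def execute(plan_id, selected, bank):
    """What an /execute handler might do: run only the ticked actions, in plan order."""
    plan = PLANS[plan_id]
    bad = [i for i in selected if not 0 <= i < len(plan)]
    if bad:
        raise IndexError(f"indices outside the plan: {bad}")
    done = []
    for index in sorted(set(selected)):
        action = plan[index]
        bank.transfer(action.from_account, action.to_account, action.amount_eur)
        done.append(index)
    return done
```

In this shape the planner never receives the `bank` object, and the client cannot introduce an action that was not in the stored plan; it can only choose among indices.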
Challenges we ran into
- Forcing the LLM to only emit a structured diff (no tool calls, no side effects) without hallucinating account IDs took aggressive schema constraints + retry-on-parse-fail.
- bunq sandbox quirks: sub-account rate limits, the OAuth/installation/device-server dance, and callbacks needing a publicly reachable URL during a hackathon.
- No first-class SSE client on mobile: we parsed event:/data: framing by hand off the raw HTTP body channel.
- Native speech parity: Android partial results, iOS permission prompts, Safari's silent Web Speech, all hidden behind one SpeechRecognizer interface.
- Per-platform backend URLs (10.0.2.2 vs localhost vs LAN IP) solved with a runtime-overridable config.
- Designing four different action types (transfer, recurring split, conditional freeze, tx limit) so they all render as scannable, tickable cards without bespoke components per type.
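Hand-parsing the event:/data: framing is short enough to sketch. The version below is in Python for consistency with the backend; the mobile clients would do the equivalent in Kotlin and Swift. It assumes the HTTP body has already been split into decoded text lines, ignores the id: and retry: fields, and uses a hypothetical event name, `rule_fired`.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch only if data was seen.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if name == "event":
            event = value or "message"
        elif name == "data":
            data.append(value)  # multiple data lines are joined with newlines
        # id: and retry: are ignored in this sketch
```

Feeding it `["event: rule_fired\n", "data: {}\n", "\n"]` yields one `("rule_fired", "{}")` pair; an event with no data lines is dropped rather than dispatched.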
Accomplishments that we're proud of
- Three clients (web + Android + iOS) on one backend, feature-parity, in 70 hours.
- A real plan → diff → approve → execute loop against the live bunq API — not a mock.
- The "LLM proposes, human executes" boundary is enforced architecturally, not by prompting.
- End-to-end live rule firings: demo button → backend rule engine → SSE → mobile toast → sub-account hot-flash, sub-second.
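The path from a demo button to a toast can be sketched as a tiny rule engine that turns matching events into SSE frames. Everything named here is an assumption made for illustration: the rule names, the event dictionary shape, the €500 threshold, and the use of plain lists as subscriber queues.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]


# Hypothetical rules; the threshold and event fields are illustrative.
RULES = [
    Rule("salary_split", lambda e: e.get("kind") == "salary"),
    Rule(
        "large_tx_alert",
        lambda e: e.get("kind") == "payment" and e.get("amount_eur", 0) >= 500,
    ),
]


def fire(event, rules=RULES):
    """Return the names of all rules that match this event."""
    return [rule.name for rule in rules if rule.matches(event)]


def sse_frame(event_name, payload):
    """Format one Server-Sent Events frame; json.dumps keeps the data on one line."""
    return f"event: {event_name}\ndata: {json.dumps(payload)}\n\n"


def on_event(event, subscribers):
    """Push one frame per fired rule to every subscriber queue."""
    for name in fire(event):
        frame = sse_frame("rule_fired", {"rule": name, "account": event.get("account")})
        for queue in subscribers:
            queue.append(frame)
```

A demo button would then amount to calling `on_event` with a synthetic event such as `{"kind": "salary", "account": "main"}`, with each connected client draining its own queue over the long-lived HTTP response.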
What we learned
- For agentic apps that touch real-world state, the diff is the product. Voice is just the input modality.
- Structured outputs + a strict schema beat clever prompting for consumer-grade reliability.
- SSE is still the lowest-friction way to push events to a mobile app — no socket infra, no FCM/APNs round-trip, just a long-lived HTTP stream.
- bunq's sandbox teaches you what a well-designed banking API looks like; most of our planner schema is shaped by what bunq endpoints actually accept.
What's next for Vox
- Production bunq OAuth instead of sandbox installation/device-server.
- On-device LLM for the planner so transcripts never leave the phone — the executor stays server-side because it needs bunq credentials.
- Richer rule grammar: time windows, geofencing, payee allowlists, merchant-category limits.
- Undo / rewind 5 minutes — reverse the last executed plan in one tap, since every action is already a typed diff.
- Shared household accounts where any member can speak an intent but execution requires the owner's approval.
- Plan templates — save a frequently-spoken intent ("monthly bills") as a one-tap card.
- Open the planner — publish the action schema so other Open Banking providers (Revolut, Monzo, N26) plug in behind the same voice + diff UX.
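The undo idea above leans on every action being a typed diff, which a short sketch can make concrete. This is speculative, since the feature is not built: the `SetCardFrozen` action and the `invert` helper are hypothetical, and inverting a freeze this way assumes the card was in the opposite state beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    from_account: str
    to_account: str
    amount_eur: float


@dataclass(frozen=True)
class SetCardFrozen:
    card: str
    frozen: bool


def invert(action):
    """Return the action that would reverse the given one."""
    if isinstance(action, Transfer):
        # Moving the same amount back only works between the user's own sub-accounts.
        return Transfer(action.to_account, action.from_account, action.amount_eur)
    if isinstance(action, SetCardFrozen):
        return SetCardFrozen(action.card, not action.frozen)
    raise ValueError(f"no inverse defined for {type(action).__name__}")


def undo_plan(executed):
    """Build the reversing plan: inverse actions, applied in reverse order."""
    return [invert(action) for action in reversed(executed)]
```

The reversing plan is itself a list of typed actions, so it could go through the same diff-and-approve step as any other plan.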
Built With
- anthropic
- claude
- fastapi
- kmp
- kotlin
- python
- swift
- typescript