Inspiration
Banking takes too many taps. That's why your Tikkie never goes out, your EU261 claim never gets filed, and your Spotify auto-renews for a third year you will never use. The cost of doing the right thing is higher than the cost of putting it off, and the difference becomes silent money: the €400 EU261 compensation you didn't claim, the €45 your flatmate never paid back, the subscription now in its third year. We built OpenFinn so doing the right thing stops costing willpower.
📱 What it does
OpenFinn lives in your messaging app. Say "hey bunq" and it:
- 💶 splits a €3k bonus across your sub-accounts in three real transfers, in under two seconds
- 🧾 reads your receipts, categorises them, and fires the savings sweep your past self promised
- ✈️ files real EU261 flight-delay claims on airline portals; €400 you weren't going to claim, claimed
- 🎙️ nudges the flatmate who still owes you for October, in your voice
- 💬 chats back like a friend who quietly knows your numbers
All of it runs on a real bunq sandbox account, in the WhatsApp thread you already use. Nothing simulated. Public URL: hey-bunq-akram.fly.dev.
🛠️ How we built it
OpenFinn is a stack of best-in-class pieces wired end to end.
The brain. Claude Opus 4.7 reads every message and decides what to do. It hands the work to one of four specialist agents: one for bunq operations, one for receipt vision, one for the spending rules you set in plain English, one for chasing refunds. The planner is the conductor; the specialists are the players.
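A minimal sketch of the conductor-and-players routing, assuming the planner emits a small plan object naming one specialist. The specialist identifiers, the `Plan` shape, and the stub handlers are hypothetical, not OpenFinn's actual code; in the real system each handler would be an agent call.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical identifiers for the four specialist roles described above.
SPECIALISTS = ("bunq_ops", "receipt_vision", "spending_rules", "refund_chaser")


@dataclass
class Plan:
    """What the planner decided: which specialist, and what to ask it."""
    specialist: str
    task: str


def dispatch(plan: Plan, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Hand the task to exactly one specialist; reject unknown names loudly."""
    if plan.specialist not in handlers:
        raise ValueError(f"unknown specialist: {plan.specialist}")
    return handlers[plan.specialist](plan.task)


# Stub handlers that just echo the task, standing in for real subagents.
handlers = {
    name: (lambda task, name=name: f"[{name}] {task}") for name in SPECIALISTS
}
```

Failing on an unknown specialist name, rather than guessing, keeps a planner mistake from turning into an unintended money action.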
Real bunq, real money. Every action that touches money hits the real bunq API. We integrated nine bunq endpoints directly into the agent, including balance reads, internal transfers, payment requests, scheduled payments, card freezes, and transaction lookups. Real money moves on a real sandbox account. bunq's webhook fires back into our service in under one second whenever a payment lands or a request arrives. That's how the scam-pattern intercept catches a fake package request before the user's screen even lights up.
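A rough sketch of what a scam-pattern check on an incoming payment request could look like. The `IncomingRequest` shape is a simplified stand-in, not bunq's webhook schema, and the patterns and the known-counterparty rule are illustrative assumptions rather than the rules OpenFinn ships.

```python
import re
from dataclasses import dataclass
from typing import Set


@dataclass
class IncomingRequest:
    """Simplified, hypothetical view of a payment request from a webhook."""
    counterparty: str
    amount_eur: float
    description: str


# Illustrative patterns for "pay a small fee to release your package" scams.
SCAM_PATTERNS = [
    re.compile(r"\b(package|parcel|delivery)\b.*\b(fee|customs)\b", re.I),
    re.compile(r"\b(fee|customs)\b.*\b(package|parcel|delivery)\b", re.I),
]


def should_intercept(req: IncomingRequest, known_counterparties: Set[str]) -> bool:
    """Flag requests from strangers whose description matches a scam pattern."""
    if req.counterparty in known_counterparties:
        return False
    return any(p.search(req.description) for p in SCAM_PATTERNS)
```

Because the check is pure string matching, it adds negligible latency on top of the webhook round trip.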
The user channel. WhatsApp via Twilio handles incoming and outgoing messages. Voice in goes through OpenAI Whisper for transcription. Voice out goes through ElevenLabs for synthesis and FFmpeg for transcoding, so replies arrive as native WhatsApp voice notes, indistinguishable from a friend sending one.
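A sketch of the transcoding step, assuming the synthesised audio arrives as an MP3 file and the target is mono OGG/Opus, the format voice notes are generally sent in. The file names, bitrate, and helper names are assumptions; the FFmpeg flags themselves are standard.

```python
import subprocess
from typing import List


def build_voice_note_cmd(src: str, dst: str) -> List[str]:
    """Build an FFmpeg command that re-encodes src as mono Opus in an OGG file."""
    return [
        "ffmpeg", "-y",      # overwrite dst if it already exists
        "-i", src,           # input file (assumed MP3 from the TTS step)
        "-vn",               # drop any video or cover-art stream
        "-c:a", "libopus",   # encode audio with Opus
        "-b:a", "32k",       # assumed bitrate, plenty for speech
        "-ac", "1",          # mono
        "-ar", "48000",      # Opus's native sample rate
        dst,                 # the .ogg extension selects the OGG container
    ]


def transcode(src: str, dst: str) -> None:
    """Run FFmpeg and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_voice_note_cmd(src, dst), check=True)
```

Keeping the command builder separate from the `subprocess` call makes the flags easy to unit-test without FFmpeg installed.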
Memory you can read. Each user gets a folder of plain markdown files. We mirror them to Supabase Storage and index them in Qdrant with two kinds of search built in: text search for chat content, image search for receipts and photos. The agent can find that Bar Americain receipt later just by describing it. You can open the markdown and check what the agent thinks it knows about you. Most agent products store memory in vector databases you cannot audit. We don't.
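A minimal sketch of the readable-memory idea: one folder per user, one markdown file per topic, one dated bullet per fact. The folder layout, file naming, and the `remember` helper are assumptions for illustration; mirroring to Supabase Storage and indexing in Qdrant are left out.

```python
from datetime import date
from pathlib import Path


def remember(user_dir: Path, topic: str, fact: str, today: date) -> Path:
    """Append a dated bullet to the user's markdown file for this topic."""
    user_dir.mkdir(parents=True, exist_ok=True)
    path = user_dir / f"{topic}.md"
    if not path.exists():
        # Start each file with a heading so it reads well when opened by hand.
        path.write_text(f"# {topic}\n\n", encoding="utf-8")
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {today.isoformat()}: {fact}\n")
    return path
```

Append-only bullets keep the file diffable, so a user can see exactly what changed and when.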
The escape hatch. When the API doesn't reach the task (a claim portal, a retailer refund form, a price-match request, anywhere without a public API), the same agent opens a real browser and fills it out. Same loop, different tool.
End-to-end tested. We wrote a smoke harness that runs the full inbound-to-outbound cycle against every external service we depend on: bunq, Twilio, OpenAI, Anthropic, ElevenLabs, Supabase, Qdrant. One command exercises the entire stack in one pass. If something is down, we know before the demo starts, not during it.
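The shape of such a harness can be sketched in a few lines. This assumes each dependency is wrapped in a zero-argument check that raises on failure; the `run_smoke` name and the check functions are hypothetical, and a real check would make a live call to the service.

```python
from typing import Callable, Dict


def run_smoke(checks: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run every check, recording 'ok' or the failure instead of stopping early."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # a smoke run should report all failures
            results[name] = f"FAIL: {exc}"
    return results
```

Collecting every result, rather than aborting on the first failure, means one run shows the state of the whole stack.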
🥊 Challenges we ran into
Some of what OpenFinn does is not reachable through any bank API. Filing a flight-delay claim worth hundreds of euros, chasing a retailer refund, asking for a price match: these happen on websites with no public API at all. The agent needs to drive a real browser like a human would.
Playwright was the obvious tool. It's fast and we know it well. But the kinds of sites that don't have an API are exactly the ones that get redesigned constantly, and a Playwright integration means writing CSS selectors that break the next time the cookie banner moves. We didn't want to ship software that needs maintenance every quarter. So we used Claude Computer Use instead. The agent screenshots the page, reasons about what it sees, and clicks where it needs to. Slower than Playwright, but it doesn't care when a form gets shuffled. The agent reads the rendered page, not a CSS selector frozen to last December's layout.
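The screenshot, reason, act cycle can be sketched as a plain loop. Everything here is a hypothetical skeleton: `take_screenshot`, `decide`, and `perform` are injected stand-ins for the browser and the model call, and the `Action` shape is not the actual Computer Use tool schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """A single step chosen from the rendered page (hypothetical shape)."""
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def drive(
    goal: str,
    take_screenshot: Callable[[], bytes],
    decide: Callable[[str, bytes], Action],
    perform: Callable[[Action], None],
    max_steps: int = 20,
) -> bool:
    """Loop: look at the page, pick one action, do it. True if the goal is reached."""
    for _ in range(max_steps):
        shot = take_screenshot()
        action = decide(goal, shot)   # in practice, a model call on the screenshot
        if action.kind == "done":
            return True
        perform(action)
    return False  # step budget exhausted without finishing
```

The loop never references a selector, which is why a moved cookie banner changes the screenshot but not the code. The step budget is an assumed safeguard against a page the agent cannot finish.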
A smaller story worth telling: WhatsApp's 24-hour customer-service window. Outside that window, the agent can't speak first. We started by writing scheduler code to work around it. Then we realised the constraint was a feature in disguise. The agent never opens conversations cold; it surfaces inside windows the user already opened. No scheduler, no proactive spam vector, less code, cleaner security story. We deleted the scheduler.
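With the scheduler gone, the rule reduces to a single comparison. A minimal sketch, assuming the service records the timestamp of each user's last inbound message; the `can_reply` name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)


def can_reply(last_inbound_at: datetime, now: datetime) -> bool:
    """True while we are still inside the window opened by the user's last message."""
    return now - last_inbound_at < WINDOW


last_message = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
inside = can_reply(last_message, last_message + timedelta(hours=3))
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the check deterministic in tests.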
🏆 Accomplishments we're proud of
Memory you can read. The agent edits a plain markdown file as it learns, and you can open it and check. The bunq webhook path returns in under one second, so scam intercepts happen before the user's screen lights up. And the whole stack is end-to-end tested against real services, not mocks. Every integration we depend on has been hit live before showtime.
💡 What we learned
Don't mix models across an agent and its subagents. Ask us how we know. Our first cut had the planner reading its model from env and the subagents hard-coded to Opus. During an Anthropic 529 spike we ran the planner on Sonnet 4.6 via env override; the subagents kept spawning on Opus, silently failed, and the planner hallucinated tool errors back to the user. Unifying everything to one env-controlled model fixed it. One knob, one rollback path, one less way to be on fire at 3am.
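The "one knob" fix can be sketched as a single resolver that every agent goes through. The `AGENT_MODEL` variable name, the placeholder default, and the agent names are assumptions for illustration, not the project's actual configuration.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_MODEL = "model-id-placeholder"  # hypothetical default, set per deployment
AGENT_NAMES = ("planner", "bunq_ops", "receipt_vision", "spending_rules", "refund_chaser")


def resolve_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the one model setting; nothing else in the codebase names a model."""
    env = os.environ if env is None else env
    return env.get("AGENT_MODEL", DEFAULT_MODEL)


def build_agents(env: Optional[Mapping[str, str]] = None) -> Dict[str, Dict[str, str]]:
    """Give the planner and every subagent the same resolved model."""
    model = resolve_model(env)
    return {name: {"model": model} for name in AGENT_NAMES}
```

An override during an outage then moves the planner and all subagents together, so they cannot drift apart.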
🚀 What's next
Give us a real bunq Stocks API. The auto-invest flow already does the work; it sweeps to a sub-account tagged "Invest" because Stocks isn't exposed yet. The day bunq ships that endpoint, the same agent places real orders without changing a line of the conversation.