JIFFY | Less Shit. More Life.
Screenshots
- Jiffy
- Jiffy Multimodal Agent
- Capabilities
- Checking account balance by text and sending money by voice command
- Parsing a receipt and identifying participants from a text message
- Owner claim form
- Receipt dashboard
- Automatic payment requests sent over WhatsApp
- Abhinav's share claim page
- Bill page for Abhinav
- Login to Bunq
- PIN for Bunq login
- Payment page
- Live updates from Jiffy
- Settled dashboard
Jiffy is an agentic, multimodal payment assistant that simplifies payments, gets shit done effortlessly, and frees up time for users to enjoy life. All in a Jiffy!
01 The Problem
Every group payment starts the same way: one person taps their card, covers the bill, and quietly inherits a second job. They must reconstruct who ordered what, calculate individual shares, dispatch separate payment requests, and then wait, follow up, and chase. 61% of adults have covered a group expense expecting repayment, and 59% report a negative outcome — lost money, strained relationships, or worse (CreditCards.com, 2024). And even for a simple one-to-one transfer, the friction is real: opening a banking app, locating a contact, entering an account number, typing an amount, and confirming adds up to five to seven manual steps for a routine €20 payment. The infrastructure exists. The P2P market is heading toward $14.5 billion by 2034 (Allied Market Research, 2024). What has not existed, until now, is a tool that removes the burden entirely. Jiffy does.
02 The Solution
Jiffy is a multimodal payment assistant. Talk to it, show it things, and it handles the rest.
Send money
Say who to pay and how much — by voice, text, or both. Jiffy resolves the recipient, shows a confirmation, and waits for a thumbs up to send via Bunq.
"Send €50 to Barnali" → Recipient confirmed → thumbs up → payment sent
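The confirm-then-send step above can be pictured as a small state transition. The following is a minimal sketch, not Jiffy's actual code: `PaymentState`, `PendingPayment`, `apply_gesture`, and the gesture labels `"thumbs_up"` / `"thumbs_down"` are hypothetical names, and the Bunq call is only marked by a comment.

```python
from dataclasses import dataclass
from enum import Enum


class PaymentState(Enum):
    AWAITING_CONFIRMATION = "awaiting_confirmation"
    SENT = "sent"
    CANCELLED = "cancelled"


@dataclass
class PendingPayment:
    recipient: str
    amount_cents: int  # integer cents, to avoid float rounding
    state: PaymentState = PaymentState.AWAITING_CONFIRMATION


def apply_gesture(payment: PendingPayment, gesture: str) -> PendingPayment:
    """Advance a pending payment based on a recognised gesture.

    Only a payment that is still awaiting confirmation can change state;
    unrecognised gestures leave it untouched.
    """
    if payment.state is not PaymentState.AWAITING_CONFIRMATION:
        return payment
    if gesture == "thumbs_up":
        # Assumption: the real system would call the Bunq API at this point.
        payment.state = PaymentState.SENT
    elif gesture == "thumbs_down":
        payment.state = PaymentState.CANCELLED
    return payment
```

Once a payment has reached `SENT` or `CANCELLED`, later gestures are ignored, so a stray thumbs up cannot trigger a second transfer.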
Receive money
Show Jiffy the bill, name who was there. Jiffy reads the receipt and sends each person a personal WhatsApp payment link. A thumbs up dispatches the links. Everyone marks their own items and pays. You track it all on a live dashboard.
"Collect from Abhinav and Barnali" → Receipt scanned → thumbs up → links sent → live dashboard
Each person selects only what they consumed. The payer never calculates anything.
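To illustrate how shares could follow directly from what each person marks, here is a rough sketch. The function name `compute_shares` and the data shapes are hypothetical, and the rule for items claimed by several people (an even split in cents, with leftover cents assigned in alphabetical order) is an assumption; the write-up does not say how shared items are handled.

```python
from collections import defaultdict


def compute_shares(items: dict[str, int], claims: dict[str, list[str]]) -> dict[str, int]:
    """Return the amount owed per person, in cents.

    items:  item id -> price in cents
    claims: item id -> names of the people who marked that item
    """
    shares: defaultdict[str, int] = defaultdict(int)
    for item_id, claimants in claims.items():
        if not claimants:
            continue  # unclaimed items are simply skipped in this sketch
        base, remainder = divmod(items[item_id], len(claimants))
        # Spread leftover cents deterministically so the shares add up exactly.
        for position, name in enumerate(sorted(claimants)):
            shares[name] += base + (1 if position < remainder else 0)
    return dict(shares)
```

For example, if Abhinav claims a €12.00 pizza, Barnali a €5.00 beer, and both claim €7.00 nachos, this yields 1550 cents for Abhinav and 850 cents for Barnali.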
End-to-end flow
| 01 Capture | 02 Parse | 03 Distribute | 04 Pay | 05 Track |
|---|---|---|---|---|
| Photograph receipt | AI reads line items | WhatsApp links sent | Each person pays own share | Live dashboard updates |
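The Parse step hands line items to the rest of the pipeline as structured JSON. The sketch below shows one way such output might be validated before links are sent. The JSON shape (`items`, `name`, `price`, optional `total`) and the names `LineItem` and `parse_receipt` are assumptions made for illustration; the write-up does not specify the schema.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    name: str
    price: Decimal


def parse_receipt(raw_json: str) -> list[LineItem]:
    """Turn model output (assumed JSON shape) into typed line items."""
    # parse_float=Decimal keeps prices exact instead of going through float.
    data = json.loads(raw_json, parse_float=Decimal)
    items = [
        LineItem(name=str(entry["name"]).strip(), price=Decimal(entry["price"]))
        for entry in data["items"]
    ]
    total = data.get("total")
    # If the receipt carries a total, check that the line items add up to it.
    if total is not None and sum(item.price for item in items) != Decimal(total):
        raise ValueError("line items do not add up to the receipt total")
    return items
```

A total mismatch is a cheap signal that the model misread a line, and it can be surfaced to the payer before anyone is asked to pay.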
WhatsApp automation
No app download, no sign-up. Jiffy sends each person a personalised link via WhatsApp Web (Playwright), using the organiser's existing account. One QR code scan connects the session; it persists from there.
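A personalised link needs something that ties it to one participant and one receipt. One plausible approach, sketched here under assumptions, is an unguessable token per person. `BASE_URL`, `build_claim_links`, and `build_message` are hypothetical; the actual URL scheme and the Playwright automation that delivers the message are not shown.

```python
import secrets
from urllib.parse import quote

BASE_URL = "https://jiffy.example/claim"  # hypothetical; not the real deployment URL


def build_claim_links(receipt_id: str, participants: list[str]) -> dict[str, str]:
    """Return one unguessable claim link per participant."""
    links = {}
    for name in participants:
        token = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe encoding
        links[name] = f"{BASE_URL}/{quote(receipt_id, safe='')}?t={token}"
    return links


def build_message(name: str, link: str) -> str:
    """Compose the text that would be sent to one participant."""
    return f"Hi {name}, pick your items and pay your share here: {link}"
```

In a design like this, the token would be stored next to the receipt (for example in Redis) so the claim page can recognise who opened the link.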
03 How AI Is Used
Each AI component is purpose-built for its task, using the most efficient model at each layer.
| Component | Detail |
|---|---|
| Intent classification — Claude Haiku (claude-haiku-4-5) | Classifies messages into 8 intents in real time, extracting names, amounts, and receipt IDs as structured slots. Supports English, Hindi, and Dutch. |
| Receipt parsing — Claude Sonnet (claude-sonnet-4-6) | Claude Vision reads receipt photos and returns line items and prices as structured JSON. No OCR or templates needed. |
| Participant extraction — Claude Sonnet (claude-sonnet-4-6) | Natural language ("Dinner with Priya and Dev") becomes a structured recipient list in one inference pass. |
| Conversational agent — Redis state machine | Session-aware state machine retains context across turns, inheriting prior slots (e.g. amount) automatically. |
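Slot inheritance across turns can be sketched as a merge of each new classification into the stored session. In this illustration an in-memory dict stands in for Redis, and `SESSIONS`, `merge_turn`, and the intent name `send_money` are hypothetical; the real intent labels and key layout are not given in the write-up.

```python
from typing import Optional

# Stand-in for Redis: session id -> {"intent": ..., "slots": {...}}
SESSIONS: dict[str, dict] = {}


def merge_turn(session_id: str, intent: Optional[str], slots: dict) -> dict:
    """Merge one classified turn into the session, inheriting prior slots."""
    prior = SESSIONS.get(session_id, {"intent": None, "slots": {}})
    merged = dict(prior["slots"])
    # Only slots the classifier actually filled overwrite earlier values.
    merged.update({key: value for key, value in slots.items() if value is not None})
    state = {"intent": intent or prior["intent"], "slots": merged}
    SESSIONS[session_id] = state
    return state
```

With this merge, a first turn such as "Send €50" followed by "to Barnali" ends with both the amount and the recipient filled, even though neither message contained both.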
04 Multimodal Input
Four input modes (image, voice, text, and gesture) all feed the same agent pipeline. A payment can be completed without touching a keyboard.
| Modality | Implementation & Role |
|---|---|
| Image | Receipt photo processed by Claude Vision. Returns structured line items instantly. No OCR needed. |
| Voice — faster-whisper | Hold-to-talk, transcribed locally via faster-whisper (CPU, int8). No external API. English, Hindi, Dutch. |
| Text | Natural language chat input for sending payments, naming participants, and checking status. |
| Gesture — MediaPipe Hands | Thumbs up confirms, thumbs down cancels. Triggered at every payment confirmation step across both send and receive flows. Runs in-browser, no hardware needed. |
Photograph bill → name participants by voice → thumbs up to confirm. Zero keyboard interaction.
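The gesture step depends on mapping hand landmarks to a confirm or cancel signal. The sketch below shows one simple heuristic over the 21 landmarks that MediaPipe Hands reports (index 4 is the thumb tip, 3 the thumb IP joint, and 8, 12, 16, 20 the other fingertips), using normalised image coordinates where y grows downward. It is written in Python to show the logic only; Jiffy runs gesture detection in the browser, and its actual classification rule is not described. The function name `classify_thumb` and the `margin` threshold are assumptions.

```python
from typing import Optional, Sequence, Tuple

THUMB_TIP, THUMB_IP = 4, 3
OTHER_FINGERTIPS = (8, 12, 16, 20)  # index, middle, ring, pinky tips


def classify_thumb(
    landmarks: Sequence[Tuple[float, float]], margin: float = 0.05
) -> Optional[str]:
    """Return "thumbs_up", "thumbs_down", or None for one detected hand.

    landmarks: 21 (x, y) points in normalised image coordinates (y points down).
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    tip_y = landmarks[THUMB_TIP][1]
    ip_y = landmarks[THUMB_IP][1]
    other_ys = [landmarks[i][1] for i in OTHER_FINGERTIPS]
    # Thumb extended upward and clearly above every other fingertip.
    if tip_y < ip_y and tip_y < min(other_ys) - margin:
        return "thumbs_up"
    # Thumb extended downward and clearly below every other fingertip.
    if tip_y > ip_y and tip_y > max(other_ys) + margin:
        return "thumbs_down"
    return None
```

A production version would likely also require the gesture to persist over several frames before treating it as a confirmation, so a passing hand does not approve a payment.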
05 Infrastructure
Fully containerised on AWS EC2, deployed as a single Docker Compose stack.
| Component | Role |
|---|---|
| Frontend (Next.js) + Backend (FastAPI) | Separate Docker images, stored in ECR, pulled to EC2 on deploy. |
| nginx | Reverse proxy with SSL termination. Routes /api/* to FastAPI, all else to Next.js. |
| Redis | Session state, receipt data, and payment tokens. Persisted via Docker volume. |
| CloudFront CDN | Global distribution and HTTPS at the edge. |
| Persistent volumes | WhatsApp session, Whisper cache, and Bunq auth survive container restarts. |
06 Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS 4 |
| Backend | FastAPI, Python 3.11 |
| Storage | Redis |
| AI | Claude Sonnet (receipt scan, participant extraction), Claude Haiku (intent), faster-whisper (voice), MediaPipe (gesture) |
| Payments | Bunq API (sandbox) |
| Messaging | WhatsApp Web via Playwright |
| Infra | Docker Compose, nginx, AWS EC2 + ECR + CloudFront |
Built With
- amazon-web-services
- aws-ec2
- aws-ecr
- bunq-api
- claude-haiku
- claude-sonnet
- cloudfront
- docker-compose
- fastapi
- faster-whisper
- mediapipe
- next.js-16
- nginx
- playwright
- python-3.11
- react-19
- redis
- tailwind-css-4
- typescript
- whatsapp-web

