- Real-time view of all detected subscriptions, their billing cadence, seat count, usage levels, and Otto's confidence in each detection
- Otto surfaces a prioritized list of actions: cancel underused tools, downgrade oversized plans, or renegotiate contracts
- For vendors with supported portals, Otto uses browser automation to execute cancellations directly
- Proof of completion: a screenshot captured from the Zoom vendor portal confirming the subscription was cancelled
- When browser automation isn't available for a vendor, Otto falls back to making a real phone call
- Full 12-turn conversation transcript: Otto calls the vendor, handles retention offers, and confirms the cancellation number
Inspiration
The average company spends \(\$49M-\$55M\) annually on SaaS, and large enterprises spend \(\$123M-\$375M+\), according to Zylo. Yet about \(30-50\%\) of that spend is wasted on unused licenses, duplicate tools, forgotten renewals, and the like, and \(77-78\%\) of companies report unexpected charges, according to TechRadar and Zylo.
What it does
Ottofii is an autonomous AI agent that finds wasteful recurring spending, evaluates it, and cancels it for you once you approve.
Detect: Ottofii analyzes your financial signals, such as bank transactions, email receipts, and SaaS usage, and identifies recurring charges using deterministic cadence detection and merchant normalization.
Decide: Ottofii analyzes every detected stream against your usage patterns and generates a ranked action plan with plain-English reasoning: "You haven't used Notion in \(32\) days. Canceling saves you \(\$155\)/year."
Act: With one click to approve, Ottofii executes the cancellation automatically via multi-channel automation: browser, email, or an AI-powered phone agent (ElevenLabs) that handles retention offers and verification on your behalf.
Verify: Every completed action produces proof: a confirmation ID and screenshot, stored in a full audit trail.
How we built it
We split the system into four clear ownership layers with strict API contracts defined early in a shared TEAM.md. This allowed everyone to work in parallel without blocking each other.
Frontend: Next.js \(14\) (App Router & TS), React \(19\), TanStack Query, Shadcn UI library.
Detection Layer: Python + FastAPI — deterministic rules for merchant normalization, cadence detection (monthly/annual/quarterly), confidence scoring, and blocklisting of payroll, rent, and other protected categories.
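As a rough illustration of what deterministic merchant normalization and cadence detection can look like, here is a minimal sketch. The noise-token list, the tolerance windows, the three-charge minimum, and the function names `normalize_merchant` and `detect_cadence` are all assumptions made for this example, not the project's actual rules.

```python
import re
from datetime import date
from statistics import median
from typing import List, Optional, Tuple

# Hypothetical noise tokens; a real list would be tuned on actual bank data.
_NOISE = re.compile(r"\b(inc|llc|ltd|com|www|pos|purchase|payment|recurring)\b")

def normalize_merchant(raw: str) -> str:
    """Collapse a raw bank descriptor into a stable merchant key."""
    text = raw.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # drop digits, punctuation, store IDs
    text = _NOISE.sub(" ", text)           # drop boilerplate tokens
    return " ".join(text.split())

# (label, expected gap in days, tolerance in days); assumed values.
_CADENCES = [("monthly", 30, 4), ("quarterly", 91, 10), ("annual", 365, 15)]

def detect_cadence(charge_dates: List[date]) -> Optional[Tuple[str, float]]:
    """Return (cadence, confidence), or None if no cadence fits.

    Confidence is the share of gaps between consecutive charges that
    fall within tolerance of the matched cadence.
    """
    if len(charge_dates) < 3:
        return None  # too few charges to call it recurring
    days = sorted(charge_dates)
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    typical = median(gaps)
    for label, expected, tolerance in _CADENCES:
        if abs(typical - expected) <= tolerance:
            hits = sum(1 for gap in gaps if abs(gap - expected) <= tolerance)
            return label, hits / len(gaps)
    return None
```

Using the median gap keeps a single late or early charge from changing the detected cadence, while the confidence score still drops when individual gaps fall outside the window.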
Intelligence Layer: Before calling any LLM, we pre-score each subscription deterministically on monthly-equivalent spend, savings rank, and regret risk, so the model only handles explanation and prioritization. We use OpenAI gpt-4o-mini with JSON mode for structured output, with automatic fallback to Google Gemini \(2.0\) Flash if OpenAI fails. We validate the response on our end: hallucinated stream IDs are stripped, savings are capped at actual spend, and duplicates are removed before the plan is returned.
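The three validation steps on the model's response can be sketched as a single pass over the returned recommendations. The field names `stream_id` and `annual_savings`, and the function name `validate_plan`, are hypothetical; the real response schema is not shown in this write-up.

```python
from typing import Any, Dict, List

def validate_plan(
    recommendations: List[Dict[str, Any]],
    annual_spend_by_stream: Dict[str, float],
) -> List[Dict[str, Any]]:
    """Keep only recommendations that survive three deterministic checks."""
    seen = set()
    cleaned = []
    for rec in recommendations:
        stream_id = rec.get("stream_id")
        # 1. Strip IDs the model invented (not among the detected streams).
        if not isinstance(stream_id, str) or stream_id not in annual_spend_by_stream:
            continue
        # 2. Drop duplicates, keeping the first recommendation per stream.
        if stream_id in seen:
            continue
        seen.add(stream_id)
        # 3. Cap claimed savings at what the stream actually costs.
        try:
            claimed = float(rec.get("annual_savings", 0.0))
        except (TypeError, ValueError):
            claimed = 0.0
        capped = max(0.0, min(claimed, annual_spend_by_stream[stream_id]))
        cleaned.append({**rec, "annual_savings": capped})
    return cleaned
```

Because the spend figures come from the detection layer rather than from the model, the cap holds even when the model's arithmetic is wrong.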
Execution Layer: Two channels. Playwright automates our demo cancellation portal for browser-based flows, capturing a confirmation ID and screenshot as proof. For merchants that require a phone call, we integrated ElevenLabs Conversational AI with Twilio to make real outbound calls. The voice agent handles the full conversation including retention offers, and we parse the transcript afterward to extract the confirmation number. Users can abort a live call from the UI at any point.
Database: SQLAlchemy ORM backed by Supabase (PostgreSQL). Seven tables cover the full lifecycle, including detected subscriptions, LLM-generated plans, recommendations, an action state machine (proposed → approved → executing → succeeded/failed), proof artifacts, and a full audit log. Every action carries an idempotency key to prevent double-cancellations on retries.
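The action state machine and idempotency key can be illustrated with a small in-memory sketch. The state names come from the description above; everything else is an assumption for the example, including treating both `succeeded` and `failed` as terminal, deriving the key from a hash of user, stream, and action type, and the helper names `transition`, `idempotency_key`, and `should_execute`. In a database-backed version, the in-memory set would more likely be a unique constraint on the key column.

```python
import hashlib
from enum import Enum
from typing import Set

class ActionState(str, Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    EXECUTING = "executing"
    SUCCEEDED = "succeeded"
    FAILED = "failed"

# Allowed moves; both end states are treated as terminal (an assumption).
_ALLOWED = {
    ActionState.PROPOSED: {ActionState.APPROVED},
    ActionState.APPROVED: {ActionState.EXECUTING},
    ActionState.EXECUTING: {ActionState.SUCCEEDED, ActionState.FAILED},
    ActionState.SUCCEEDED: set(),
    ActionState.FAILED: set(),
}

def transition(current: ActionState, target: ActionState) -> ActionState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in _ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

def idempotency_key(user_id: str, stream_id: str, action_type: str) -> str:
    """Derive a stable key so a retried request maps to the same action."""
    raw = f"{user_id}:{stream_id}:{action_type}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def should_execute(key: str, seen_keys: Set[str]) -> bool:
    """Return True only the first time a key is seen."""
    if key in seen_keys:
        return False
    seen_keys.add(key)
    return True
```

A retry of the same cancellation produces the same key, so `should_execute` returns False the second time and the vendor is not contacted twice.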
Challenges we ran into
ElevenLabs conversation polling: Detecting when a voice call completed and extracting the confirmation number from the transcript required normalizing spoken numbers ("three four seven" to \(347\)), handling variable transcript formats, and building a robust polling mechanism that didn't time out prematurely.
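The spoken-number normalization can be sketched as follows. This is a simplified illustration rather than the project's parser: it only handles single-digit words, it only collapses runs of two or more digits (so "one moment" is left alone), and the confirmation phrase pattern and function names are assumptions.

```python
import re
from typing import List, Optional, Tuple

_DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def normalize_spoken_digits(text: str) -> str:
    """Collapse runs of two or more spoken digits into a digit string."""
    out: List[str] = []
    run: List[Tuple[str, str]] = []  # (original token, digit) pairs

    def flush() -> None:
        if len(run) >= 2:
            out.append("".join(digit for _, digit in run))
        else:
            out.extend(token for token, _ in run)  # lone digit word: keep as spoken
        run.clear()

    for token in text.split():
        word = token.strip(".,;:!?").lower()
        digit = _DIGIT_WORDS.get(word, word if word.isdigit() else None)
        if digit is None:
            flush()
            out.append(token)
        else:
            run.append((token, digit))
    flush()
    return " ".join(out)

# Assumed phrasing; real transcripts would need more variants.
_CONFIRMATION = re.compile(
    r"confirmation (?:number|code|id)(?: is|:)?\s+(\d+)", re.IGNORECASE
)

def extract_confirmation(transcript: str) -> Optional[str]:
    """Return the confirmation number found in a transcript, if any."""
    match = _CONFIRMATION.search(normalize_spoken_digits(transcript))
    return match.group(1) if match else None
```

Normalizing first and matching second means the same regular expression covers transcripts that spell the number out and transcripts that already contain digits.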
Demo portal automation: Rather than fighting real vendor CAPTCHAs and 2FA, we built a realistic fake cancellation portal with the same UX shape and data-attributes for Playwright to target.
Parallel development under time pressure: Four people building four layers simultaneously meant any interface mismatch would cascade. We wrote shared Pydantic schemas and a strict API contract doc before touching implementation code, which saved us from a lot of integration pain at the end.
Accomplishments that we're proud of
End-to-end loop that actually works: The full flow, connect → detect → plan → approve → execute → confirmation number + screenshot, runs without manual intervention. Watching Otto call a phone number, navigate a retention offer, and return a confirmation ID felt like the moment the project became real.
AI that calls people: Most hackathon projects demo a chat interface. Ours makes outbound phone calls. The ElevenLabs + Twilio integration handles live conversations, adapts to whatever the support rep says, and parses the result.
What we learned
Separate deterministic logic from LLM logic. When the model was responsible for calculations or pattern detection, the results were inconsistent. Pre-scoring the data before sending it to the model, and validating the output afterward, made the system much more reliable.
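To make the split concrete, here is a minimal sketch of a deterministic pre-scoring step that would run before any model call. The `Stream` fields, the 30-day regret horizon, and the `regret_risk` formula are hypothetical choices for illustration; only the idea of computing monthly-equivalent spend, a savings rank, and a regret score in plain code comes from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

_PERIODS_PER_YEAR = {"monthly": 12, "quarterly": 4, "annual": 1}

@dataclass
class Stream:
    stream_id: str
    amount: float             # charge per billing period
    cadence: str              # "monthly", "quarterly", or "annual"
    days_since_last_use: int

def regret_risk(days_since_last_use: int, horizon_days: int = 30) -> float:
    """Assumed heuristic: 1.0 if used today, falling to 0.0 at the horizon."""
    return max(0.0, 1.0 - days_since_last_use / horizon_days)

def pre_score(streams: List[Stream]) -> List[Dict[str, Union[str, float, int]]]:
    """Compute all numbers in code so the model never does arithmetic."""
    rows = []
    for s in streams:
        annual = s.amount * _PERIODS_PER_YEAR[s.cadence]
        rows.append({
            "stream_id": s.stream_id,
            "monthly_equivalent": round(annual / 12, 2),
            "annual_savings": round(annual, 2),
            "regret_risk": round(regret_risk(s.days_since_last_use), 2),
        })
    # Rank 1 is the largest potential saving.
    rows.sort(key=lambda row: row["annual_savings"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["savings_rank"] = rank
    return rows
```

The model then receives these rows as fixed inputs and is asked only to explain and prioritize them.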
Team contracts matter as much as code. Writing down shared types and API shapes before anyone wrote implementation code meant four people could build in parallel and integrate cleanly at the end. At a hackathon, that's the difference between a working demo and a merge conflict disaster.
What's next for Ottofii
Ottofii Roadmap (Post-Hackathon → Production)
1. Data Connectors
Live Financial Integrations
- OAuth with Gmail and Plaid
- Ingest transactions and email receipts
Enterprise Ingestion
- Integrate with ERP, invoicing, and accounting systems
- Move from mock/demo data to real workflows
B2B Expansion
- Target enterprise SaaS spend \(10–100\times\) consumer scale
- Incorporate signals like team size, revenue, and usage
2. Data Analysis & Intelligence
Advanced Ingestion Pipelines
- Structured parsing of transactions, invoices, and subscriptions
Predictive Waste Detection
- Replace heuristics with real usage signals
- Example: detect inactivity (e.g., no usage in \(30+\) days)
Deterministic Pre-Scoring
- Separate:
  - deterministic logic (spend, frequency, savings rank)
  - LLM reasoning (context, recommendations)
- Improves reliability and reduces hallucinations
3. Agent Actions (Execution Layer)
Intelligent Negotiation
- AI-driven renegotiation via email or voice agents
Smart Downsizing
- Recommend plan downgrades instead of only cancellations
Free Trial Watcher
- Detect trial start and notify or auto-cancel before billing
Auto Mode (Premium)
- Automatically cancel unused subscriptions after a set threshold
Contract Renewal Management
- Track renewals and compute value metrics (e.g., cost per use)
4. Verification & Reliability
Anti-Fragile Execution Layer
- Hybrid automation:
  - APIs
  - browser automation
  - email flows
- Resilient to UI and API changes
Audit Trails
- Full transparency:
  - logs
  - confirmation IDs
  - screenshots of actions
Core Direction
Move from insight → execution platform
Build an autonomous agent for spend optimization, not just a dashboard.
Built With
- elevenlabs
- fastapi
- gemini
- next.js
- openai
- playwright
- postgresql
- python
- react
- sqlalchemy
- supabase
- twilio
- typescript