Inspiration

The average company spends \(\$49M-\$55M\) annually on SaaS, with large enterprises spending \(\$123M-\$375M+\), according to Zylo. Yet about \(30-50\%\) of this spend is wasted (unused licenses, duplicate tools, forgotten renewals, etc.), and \(77-78\%\) of companies report unexpected charges, according to TechRadar and Zylo.

What it does

Ottofii is an autonomous AI agent that finds wasteful recurring spending, evaluates it, and cancels it for you once you approve.

Detect: Ottofii analyzes your financial signals, such as bank transactions, email receipts, and SaaS usage, and identifies recurring charges using deterministic cadence detection and merchant normalization.

Decide: Ottofii analyzes every detected stream against your usage patterns and generates a ranked action plan with plain-English reasoning: "You haven't used Notion in \(32\) days. Canceling saves you \(\$155\)/year."

Act: With one click to approve, Ottofii executes the cancellation automatically via multi-channel automation: browser, email, or an AI-powered phone agent (ElevenLabs) that handles retention offers and verification on your behalf.

Verify: Every completed action produces proof: a confirmation ID and screenshot, stored in a full audit trail.

How we built it

We split the system into four clear ownership layers with strict API contracts defined early in a shared TEAM.md. This allowed everyone to work in parallel without blocking each other.

Frontend: Next.js \(14\) (App Router & TS), React \(19\), TanStack Query, Shadcn UI library.

Detection Layer: Python + FastAPI — deterministic rules for merchant normalization, cadence detection (monthly/annual/quarterly), confidence scoring, and blocklisting of payroll, rent, and other protected categories.
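
To make the detection rules concrete, here is a minimal sketch of merchant normalization and cadence detection. The day windows, the noise-word list, the three-charge minimum, and the function names are all illustrative assumptions, not the project's actual rules.

```python
from __future__ import annotations

import re
from datetime import date
from statistics import median

# Hypothetical cadence windows in days: (label, shortest gap, longest gap).
CADENCES = [("monthly", 26, 35), ("quarterly", 85, 97), ("annual", 350, 380)]

# Hypothetical noise words that often appear in bank descriptors.
NOISE = r"\b(inc|llc|ltd|com|www|pos|purchase)\b"


def normalize_merchant(raw: str) -> str:
    """Collapse a raw bank descriptor into a comparable merchant key."""
    name = raw.lower()
    name = re.sub(r"[^a-z ]+", " ", name)  # drop digits, punctuation, store IDs
    name = re.sub(NOISE, " ", name)        # drop legal suffixes and URL parts
    return " ".join(name.split())          # squeeze repeated whitespace


def detect_cadence(dates: list[date]) -> str | None:
    """Return a cadence label if the median gap between charges fits a window."""
    if len(dates) < 3:  # assumed minimum before a stream counts as recurring
        return None
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical_gap = median(gaps)
    for label, shortest, longest in CADENCES:
        if shortest <= typical_gap <= longest:
            return label
    return None
```

Using the median gap rather than the mean keeps a single late or early charge from changing the detected cadence. The approach is fully deterministic, so the same transactions always produce the same result.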

Intelligence Layer: Before calling any LLM, we pre-score each subscription deterministically based on monthly equivalent spend, savings rank, and regret risk, so the model only handles explanation and prioritization. We use OpenAI gpt-4o-mini with JSON mode for structured output, with automatic fallback to Google Gemini \(2.0\) Flash if OpenAI fails. The response is validated on our end: hallucinated stream IDs are stripped, savings are capped at actual spend, and duplicates are removed before the plan is returned.
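
The three validation steps can be sketched as a single pass over the model's output. The field names (`stream_id`, `annual_savings`) and the shape of `known_spend` are assumptions made for illustration; the real schema may differ.

```python
def validate_plan(recommendations: list, known_spend: dict) -> list:
    """Clean an LLM-generated plan against the streams we actually detected.

    known_spend maps each detected stream ID to its annual spend (assumed shape).
    """
    seen = set()
    cleaned = []
    for rec in recommendations:
        stream_id = rec.get("stream_id")
        if stream_id not in known_spend:
            continue  # hallucinated ID: the model invented a stream
        if stream_id in seen:
            continue  # duplicate: keep only the first recommendation
        seen.add(stream_id)
        claimed = float(rec.get("annual_savings", 0.0))
        # Savings can never be negative or exceed what is actually spent.
        capped = min(max(claimed, 0.0), known_spend[stream_id])
        cleaned.append({**rec, "annual_savings": capped})
    return cleaned
```

For example, a plan that claims \(\$500\) of savings on a stream costing \(\$155\) per year comes back capped at \(155\), and any recommendation for an unknown stream is dropped.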

Execution Layer: Two channels. Playwright automates our demo cancellation portal for browser-based flows, capturing a confirmation ID and screenshot as proof. For merchants that require a phone call, we integrated ElevenLabs Conversational AI with Twilio to make real outbound calls. The voice agent handles the full conversation including retention offers, and we parse the transcript afterward to extract the confirmation number. Users can abort a live call from the UI at any point.

Database: SQLAlchemy ORM backed by Supabase (PostgreSQL). Seven tables covering the full lifecycle, including detected subscriptions, LLM-generated plans, recommendations, an action state machine (proposed → approved → executing → succeeded/failed), proof artifacts, and a full audit log. Every action carries an idempotency key to prevent double-cancellations on retries.
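
The action lifecycle and idempotency key can be illustrated with a small in-memory sketch. The class names and the dictionary-backed store are hypothetical stand-ins for the SQLAlchemy models and tables, which are not shown here.

```python
from dataclasses import dataclass, field

# Allowed transitions, following proposed -> approved -> executing -> succeeded/failed.
ALLOWED_TRANSITIONS = {
    "proposed": {"approved"},
    "approved": {"executing"},
    "executing": {"succeeded", "failed"},
    "succeeded": set(),  # terminal
    "failed": set(),     # terminal
}


class InvalidTransition(ValueError):
    """Raised when an action is asked to skip or repeat a lifecycle step."""


@dataclass
class Action:
    idempotency_key: str
    stream_id: str
    state: str = "proposed"
    history: list = field(default_factory=list)

    def advance(self, target: str) -> None:
        if target not in ALLOWED_TRANSITIONS[self.state]:
            raise InvalidTransition(f"{self.state} -> {target} is not allowed")
        self.history.append((self.state, target))  # audit trail of transitions
        self.state = target


class ActionStore:
    """In-memory stand-in for the actions table (assumption for this sketch)."""

    def __init__(self) -> None:
        self._actions = {}

    def get_or_create(self, idempotency_key: str, stream_id: str) -> Action:
        # A retry with the same key returns the existing action instead of
        # creating a second cancellation for the same subscription.
        if idempotency_key not in self._actions:
            self._actions[idempotency_key] = Action(idempotency_key, stream_id)
        return self._actions[idempotency_key]
```

Because a retried request resolves to the same action, and an action already in `executing` cannot be moved back to `approved`, a second cancellation attempt is rejected rather than repeated.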

Challenges we ran into

ElevenLabs conversation polling: Detecting when a voice call had completed, and then extracting the confirmation number from the transcript, required normalizing spoken numbers ("three four seven" to \(347\)), handling variable transcript formats, and building a robust polling mechanism that didn't time out prematurely.
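
The spoken-number step can be sketched as follows. This is a simplified heuristic written for illustration: it takes the longest run of consecutive digit words or numerals in the transcript, and the `min_digits` threshold is an assumption to keep a stray "one" from being read as a confirmation number.

```python
from __future__ import annotations

import re

DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def extract_confirmation(transcript: str, min_digits: int = 3) -> str | None:
    """Return the longest run of spoken or written digits, or None if too short."""
    tokens = re.findall(r"[a-z]+|\d+", transcript.lower())
    best = ""
    run = ""
    for token in tokens:
        if token in DIGIT_WORDS:
            run += DIGIT_WORDS[token]  # "three" -> "3"
        elif token.isdigit():
            run += token               # already numeric, e.g. "347"
        else:
            if len(run) > len(best):
                best = run
            run = ""                   # any other word ends the current run
    if len(run) > len(best):
        best = run
    return best if len(best) >= min_digits else None
```

A transcript line such as "Your confirmation number is three four seven" yields `"347"`. Returning a string rather than an integer preserves any leading zeros.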

Demo portal automation: Rather than fighting real vendor CAPTCHAs and \(2FA\), we built a realistic fake cancellation portal with the same UX shape and data-attributes for Playwright to target.

Parallel development under time pressure: Four people building four layers simultaneously meant any interface mismatch would cascade. We wrote shared Pydantic schemas and a strict API contract doc before touching implementation code, which saved us from a lot of integration pain at the end.

Accomplishments that we're proud of

End-to-end loop that actually works: The full flow, connect → detect → plan → approve → execute → confirmation number + screenshot, runs without manual intervention. Watching Otto call a phone number, navigate a retention offer, and return a confirmation ID felt like the moment the project became real.

AI that calls people: Most hackathon projects demo a chat interface. Ours makes outbound phone calls. The ElevenLabs + Twilio integration handles live conversations, adapts to whatever the support rep says, and parses the result.

What we learned

Separate deterministic logic from LLM logic. When the model was responsible for calculations or pattern detection, the results were inconsistent. Pre-scoring the data before sending it to the model, and validating the output afterward, made the system much more reliable.
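
As an illustration of the deterministic side, here is a minimal pre-scoring sketch. The field names are assumptions, and the regret-risk rule in particular (low risk after \(30+\) days without use) is a hypothetical placeholder, since the actual scoring rule is not described here.

```python
# Months covered by one charge at each cadence.
CADENCE_MONTHS = {"monthly": 1, "quarterly": 3, "annual": 12}


def pre_score(streams: list) -> list:
    """Compute spend figures and a savings rank before any LLM call."""
    scored = []
    for stream in streams:
        monthly = stream["amount"] / CADENCE_MONTHS[stream["cadence"]]
        scored.append({
            **stream,
            "monthly_equivalent": round(monthly, 2),
            "annual_savings": round(monthly * 12, 2),
            # Hypothetical rule: long-unused subscriptions are low regret risk.
            "regret_risk": "low" if stream["days_since_last_use"] >= 30 else "high",
        })
    # Rank 1 is the stream whose cancellation would save the most per year.
    scored.sort(key=lambda item: item["annual_savings"], reverse=True)
    for rank, item in enumerate(scored, start=1):
        item["savings_rank"] = rank
    return scored
```

With the arithmetic done up front, the model receives figures it only needs to explain, so an inconsistent calculation in its output can be detected by comparing against these values.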

Team contracts matter as much as code. Writing down shared types and API shapes before anyone wrote implementation code meant four people could build in parallel and integrate cleanly at the end. At a hackathon, that's the difference between a working demo and a merge conflict disaster.

What's next for Ottofii

Ottofii Roadmap (Post-Hackathon → Production)

1. Data Connectors

  • Live Financial Integrations

    • OAuth with Gmail and Plaid
    • Ingest transactions and email receipts
  • Enterprise Ingestion

    • Integrate with ERP, invoicing, and accounting systems
    • Move from mock/demo data to real workflows
  • B2B Expansion

    • Target enterprise SaaS spend \(10–100\times\) consumer scale
    • Incorporate signals like team size, revenue, and usage

2. Data Analysis & Intelligence

  • Advanced Ingestion Pipelines

    • Structured parsing of transactions, invoices, and subscriptions
  • Predictive Waste Detection

    • Replace heuristics with real usage signals
    • Example: detect inactivity (e.g., no usage in \(30+\) days)
  • Deterministic Pre-Scoring

    • Separate deterministic logic (spend, frequency, savings rank) from LLM reasoning (context, recommendations)
    • Improves reliability and reduces hallucinations

3. Agent Actions (Execution Layer)

  • Intelligent Negotiation

    • AI-driven renegotiation via email or voice agents
  • Smart Downsizing

    • Recommend plan downgrades instead of only cancellations
  • Free Trial Watcher

    • Detect trial start and notify or auto-cancel before billing
  • Auto Mode (Premium)

    • Automatically cancel unused subscriptions after a set threshold
  • Contract Renewal Management

    • Track renewals and compute value metrics (e.g., cost per use)

4. Verification & Reliability

  • Anti-Fragile Execution Layer

    • Hybrid automation: APIs, browser automation, and email flows
    • Resilient to UI and API changes
  • Audit Trails

    • Full transparency: logs, confirmation IDs, and screenshots of actions

Core Direction

Move from insight → execution platform

Build an autonomous agent for spend optimization, not just a dashboard.
