Inspiration

Every hackathon has its “moment of clarity.” Mine came when I watched a finance teammate slog through a stack of travel receipts—zooming into blurry photos, cross‑checking policy PDFs, and guessing whether a $63 dinner was above the regional cap. The modern enterprise runs on AI, yet expense auditing (a perfect candidate for reasoning + vision) still feels like clerical work. The Google Cloud Run challenge nudged me to ask:

Why can’t an autonomous agent audit a receipt end-to-end in under 10 seconds?

AuditAI was born from that question: blend Google’s Agent Development Kit, Gemini thinking models, and an event-driven Cloud Run backbone to eliminate the drudgery.


What it does

AuditAI is an autonomous expense audit system that uses 5 specialized AI agents to audit an expense receipt in ~7 seconds, a task that takes 10-15 minutes manually. Built with Google's Agent Development Kit (ADK) and Gemini, and deployed across all 3 Cloud Run resource types (services, worker pools, and jobs), it addresses a major pain point for finance teams that spend 10-15 hours/week on manual reviews.

The 5 AI agents work in parallel to:

  • Extract structured data using Gemini 2.5 Flash Preview (global endpoint)
  • Check policy compliance using ADK + Gemini 2.5 Flash-Lite Preview
  • Flag anomalies
  • Draft remediation steps
  • Synthesize the final audit result

The system provides a 250x cost reduction 🎯.

  • Manual Review: At $50/hour (15 min/audit), a manual audit costs $12.50.
  • AuditAI: The total monthly cost for 1,000 audits is ~$53, or about $0.05 per audit.
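
The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted above; the variable names are illustrative only. Note that the 250x figure relies on the per-audit cost rounded to $0.05; with the unrounded ~$0.053, the ratio comes out at roughly 236x.

```python
# Inputs taken from the figures quoted in the text.
HOURLY_RATE_USD = 50.0
MINUTES_PER_MANUAL_AUDIT = 15
MONTHLY_COST_USD = 53.0
AUDITS_PER_MONTH = 1000

# Manual review: 15 minutes of a $50/hour reviewer.
manual_cost = HOURLY_RATE_USD * MINUTES_PER_MANUAL_AUDIT / 60  # 12.50

# AuditAI: monthly bill spread over the monthly audit volume.
auto_cost = MONTHLY_COST_USD / AUDITS_PER_MONTH  # about 0.053

# Ratio with the unrounded and the rounded per-audit cost.
ratio_unrounded = manual_cost / auto_cost          # roughly 236
ratio_rounded = manual_cost / round(auto_cost, 2)  # roughly 250

print(f"manual ${manual_cost:.2f} vs AuditAI ${auto_cost:.3f} per audit")
print(f"reduction: ~{ratio_unrounded:.0f}x unrounded, ~{ratio_rounded:.0f}x rounded")
```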

How we built it

  • Architecture Modeling. I drafted the multi-agent flow (ingest → extraction → policy → anomaly → remediation → synthesis) in docs/ARCHITECTURE_AUDITAI.md, mapping each stage to a Pub/Sub topic.
  • Backend Orchestrator. Built with FastAPI, it handles uploads, creates Firestore docs, and streams progress via SSE. The stream_expense endpoint dynamically adjusts polling cadence once status moves past ingestion.
  • Worker Engine. A single agents/worker/main.py image parameterizes agent behavior via env vars (AGENT_TYPE, SUBSCRIPTION, TOPIC_OUT). Shared utilities include:
    • QPSTokenBucket for per-instance throttling.
    • call_with_retry with exponential backoff/jitter + Retry-After support.
    • _parse_json_response, a resilient JSON extractor for Gemini “thinking” responses.
  • Gemini Integration.
    • Extraction worker calls gemini-2.5-flash-preview-09-2025 via Part.from_uri for images.
    • Policy worker calls gemini-2.5-flash-lite-preview-09-2025 with the policy RAG prompt.
    • Both run through _resolve_model_name, so setting MODEL_LOCATION=global is enough.
  • Frontend. Next.js 14 + Tailwind CSS deliver a polished UI with gradient theming, drag-and-drop uploads, SSE-driven status chips, and fallback messages if the stream lags.
  • Deployment.
    • Cloud Build pipelines (deploy.sh / deploy.ps1).
    • 3 Cloud Run services (frontend, orchestrator, synthesis).
    • 4 Worker Pools (extraction, policy, anomaly, remediation) with dedicated queues (expenses.ingested.sub, expenses.extracted.policy, expenses.evaluated.anomaly, expenses.analyzed.remediation).
    • 1 Cloud Run job for nightly CSV reports.
  • Documentation (DEPLOYMENT_COMPLETE.md, README_AUDITAI.md, HACKATHON_SUBMISSION_READY.md) captures the exact commands, service accounts, and submission checklist.
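
To make the shared worker utilities above more concrete, here is a minimal sketch of per-instance throttling and retry with exponential backoff, jitter, and Retry-After support. Only the names QPSTokenBucket and call_with_retry come from the project; the signatures, defaults, the injectable sleep parameter, and the RetryableError class are assumptions made for illustration and are not the code in agents/worker/main.py.

```python
import random
import threading
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class QPSTokenBucket:
    """Per-instance throttle allowing roughly `qps` calls per second (assumed design)."""

    def __init__(self, qps: float, burst: int = 1) -> None:
        self.qps = qps
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill in proportion to elapsed time, capped at capacity.
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.qps)
                self.last = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.qps
            time.sleep(wait)  # sleep outside the lock so other threads can proceed


class RetryableError(Exception):
    """Hypothetical stand-in for a 429-style error that may carry a Retry-After hint."""

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


def call_with_retry(
    fn: Callable[[], T],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RetryableError with backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if exc.retry_after is not None:
                delay = exc.retry_after  # honor the server's hint when present
            else:
                # Exponential backoff ceiling with a random ("full jitter") delay.
                delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

In a worker, the two would plausibly be combined by calling bucket.acquire() before each model request and wrapping the request itself in call_with_retry, so that retries also pass through the throttle.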

Challenges we ran into

  • Model 404s
    • What happened: Preview endpoints aren’t active in us-central1.
    • Fix: Introduced MODEL_LOCATION + redeployed workers against global, updated docs.
  • Resource Exhausted (429)
    • What happened: Concurrent workers + retries triggered burst throttling.
    • Fix: Added token bucket, capped Pub/Sub flow, respectful retries, throttled env vars in scripts/YAML.
  • JSON Decode Failures
    • What happened: Gemini 2.5 emits <think> blocks; plain json.loads failed.
    • Fix: Wrote _parse_json_response to strip fences, remove thinking tags, brace-match the payload.
  • Infinite retries on duplicate messages
    • What happened: Pub/Sub redeliveries triggered double processing.
    • Fix: Pre-flight check: load Firestore doc, skip if status finalized or stage already populated.
  • Docs drift
    • What happened: Model IDs and env vars changed repeatedly.
    • Fix: Final submission sweep updated README, deployment scripts, architecture diagrams, social/blog drafts.

Accomplishments that we're proud of

We are proud of building an end-to-end, event-driven, multi-agent system that solves a real-world business problem.

  • Massive ROI: We achieved a 250x cost reduction ($12.50 vs $0.05 per audit).
  • Incredible Speed: We reduced audit times from 10-15 minutes to ~7 seconds.
  • Resilient Architecture: We built a scalable system using a modern Google Cloud stack (Cloud Run Services, Worker Pools, and Jobs), ADK, and Gemini.
  • End-to-End Solution: The result is a system that audits any receipt (image or text), cites policy sections, flags anomalies, and drafts remediation—all autonomously.

What we learned

  • Thinking models change the UX game. Gemini 2.5 Flash Preview + Flash-Lite Preview offer structured, “reasoned” outputs that still require defensive parsing. I learned how to strip <think> traces, honor Retry-After, and keep the JSON clean for Firestore writes.
  • Cloud Run Worker Pools are the missing middle tier. They bridge the gap between Functions and Services, letting Pub/Sub drive long-running AI calls without over-provisioning.
  • Global vs. regional Vertex endpoints matter. Preview models aren’t everywhere; understanding access, quota, and location constraints is critical. A single env var, MODEL_LOCATION=global, saved hours of debugging.
  • Observability for agents is different. Tracking expenseId across five stages and adding bounded audit logs in Firestore helped keep event replay safe and interpretable.
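
Safe event replay depends on the pre-flight check each worker runs before processing a message. A minimal sketch follows, with a plain dict standing in for the Firestore document store. The field names ("status", "stages"), the "finalized" status value, and the function names are all assumptions for illustration; the project's real schema may differ.

```python
from typing import Any, Callable, Dict, Mapping, Optional

FINAL_STATUSES = frozenset({"finalized"})  # assumed status value


def should_process(doc: Optional[Mapping[str, Any]], stage: str) -> bool:
    """Return True only if this stage still needs to run for the expense."""
    if doc is None:
        return False  # assumption: an unknown expenseId is skipped, not created
    if doc.get("status") in FINAL_STATUSES:
        return False  # audit already complete: a redelivery is a no-op
    if stage in (doc.get("stages") or {}):
        return False  # this stage already wrote its output
    return True


def handle_message(
    store: Dict[str, Dict[str, Any]],
    expense_id: str,
    stage: str,
    run_stage: Callable[[Mapping[str, Any]], Any],
) -> str:
    """Process one message idempotently; the caller acks in both outcomes."""
    doc = store.get(expense_id)
    if not should_process(doc, stage):
        return "skipped"
    result = run_stage(doc)
    doc.setdefault("stages", {})[stage] = result
    return "processed"


# Usage: a redelivered message for the same stage is skipped.
store = {"exp-1": {"status": "extracting", "stages": {}}}
print(handle_message(store, "exp-1", "policy", lambda d: {"compliant": True}))
print(handle_message(store, "exp-1", "policy", lambda d: {"compliant": True}))
```

The check and the write are not atomic here, so two concurrent deliveries could both pass the check. Against a real database, a transaction around the read and the write would be one way to close that gap.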

What's next for AuditAI

  • Deeper Integration: Integrate with major accounting platforms (like QuickBooks, NetSuite, and SAP) to automatically trigger reimbursements.
  • Advanced Policy Engine: Expand the policy agent to handle highly complex, multi-level conditional rules and international compliance.
  • Proactive Analytics: Add a new agent that provides proactive insights to finance teams, such as identifying high-risk departments or common policy violations.
  • Enhanced UI: Build out a full-fledged admin dashboard for managing policies, reviewing flagged expenses, and monitoring agent performance.

Built With

  • cloud-build
  • cloud-storage
  • event-driven
  • fastapi
  • firestore
  • gemini-2.5-flash-and-flash-lite
  • google-cloud-run
  • microservices
  • next.js-14
  • node.js
  • pub/sub
  • python
  • tailwind-css
  • typescript
  • vertex-ai