Inspiration
Every hackathon has its “moment of clarity.” Mine came when I watched a finance teammate slog through a stack of travel receipts—zooming into blurry photos, cross‑checking policy PDFs, and guessing whether a $63 dinner was above the regional cap. The modern enterprise runs on AI, yet expense auditing (a perfect candidate for reasoning + vision) still feels like clerical work. The Google Cloud Run challenge nudged me to ask:
Why can’t an autonomous agent audit a receipt end-to-end in under 10 seconds?
AuditAI was born from that question: blend Google’s Agent Development Kit, Gemini thinking models, and an event-driven Cloud Run backbone to eliminate the drudgery.
What it does
AuditAI is an autonomous expense audit system that uses 5 specialized AI agents to audit expense receipts in ~7 seconds, a task that takes 10-15 minutes manually. Built with Google's Agent Development Kit (ADK), Gemini AI, and deployed across all 3 Cloud Run resource types, it solves a major pain point for finance teams who waste 10-15 hours/week on manual reviews.
The 5 AI agents work in parallel to:
- Extract structured data using Gemini 2.5 Flash Preview (global endpoint)
- Check policy compliance using ADK + Gemini 2.5 Flash-Lite Preview
The system provides a 250x cost reduction 🎯.
- Manual Review: At $50/hour (15 min/audit), a manual audit costs $12.50.
- AuditAI: The total monthly cost for 1,000 audits is ~$53, or about $0.05 per audit.
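The arithmetic behind these figures can be checked in a few lines. This is only a sanity check using the numbers quoted above; the variable names are illustrative. Note that the 250x headline follows from the rounded $0.05 per-audit cost, while the unrounded $0.053 gives roughly 236x.

```python
# Figures quoted in the write-up; variable names are illustrative only.
HOURLY_RATE_USD = 50.0
MINUTES_PER_MANUAL_AUDIT = 15
MONTHLY_PLATFORM_COST_USD = 53.0
AUDITS_PER_MONTH = 1000

# Manual review: 15 minutes of a $50/hour reviewer.
manual_cost = HOURLY_RATE_USD * MINUTES_PER_MANUAL_AUDIT / 60

# Automated review: monthly bill spread over the monthly audit volume.
automated_cost = MONTHLY_PLATFORM_COST_USD / AUDITS_PER_MONTH
rounded_automated_cost = round(automated_cost, 2)

# Ratio with the unrounded and the rounded per-audit cost.
exact_ratio = manual_cost / automated_cost
headline_ratio = manual_cost / rounded_automated_cost

print(f"manual: ${manual_cost:.2f}, automated: ${automated_cost:.3f}")
print(f"ratio (unrounded): {exact_ratio:.0f}x, ratio (rounded): {headline_ratio:.0f}x")
```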
How we built it
- Architecture Modeling. I drafted the multi-agent flow (ingest → extraction → policy → anomaly → remediation → synthesis) in `docs/ARCHITECTURE_AUDITAI.md`, mapping each stage to a Pub/Sub topic.
- Backend Orchestrator. Built with FastAPI, it handles uploads, creates Firestore docs, and streams progress via SSE. The `stream_expense` endpoint dynamically adjusts polling cadence once status moves past ingestion.
- Worker Engine. A single `agents/worker/main.py` image parameterizes agent behavior via env vars (`AGENT_TYPE`, `SUBSCRIPTION`, `TOPIC_OUT`). Shared utilities include:
  - `QPSTokenBucket` for per-instance throttling.
  - `call_with_retry` with exponential backoff/jitter + `Retry-After` support.
  - `_parse_json_response`, a resilient JSON extractor for Gemini “thinking” responses.
- Gemini Integration.
  - Extraction worker calls `gemini-2.5-flash-preview-09-2025` via `Part.from_uri` for images.
  - Policy worker calls `gemini-2.5-flash-lite-preview-09-2025` with the policy RAG prompt.
  - Both run through `_resolve_model_name`, so setting `MODEL_LOCATION=global` is enough.
- Frontend. Next.js 14 + Tailwind CSS deliver a polished UI with gradient theming, drag-and-drop uploads, SSE-driven status chips, and fallback messages if the stream lags.
- Deployment.
  - Cloud Build pipelines (`deploy.sh` / `deploy.ps1`).
  - 3 Cloud Run services (frontend, orchestrator, synthesis).
  - 4 Worker Pools (extraction, policy, anomaly, remediation) with dedicated queues (`expenses.ingested.sub`, `expenses.extracted.policy`, `expenses.evaluated.anomaly`, `expenses.analyzed.remediation`).
  - 1 Cloud Run job for nightly CSV reports.
- Documentation (`DEPLOYMENT_COMPLETE.md`, `README_AUDITAI.md`, `HACKATHON_SUBMISSION_READY.md`) captures the exact commands, service accounts, and submission checklist.
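The per-instance throttling mentioned under Worker Engine can be illustrated with a small token bucket. This is a minimal sketch of the general technique, not the project's `QPSTokenBucket`: the class name `TokenBucket`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Hypothetical per-instance rate limiter (not the project's QPSTokenBucket)."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate_per_sec <= 0 or capacity <= 0:
            raise ValueError("rate_per_sec and capacity must be positive")
        self._rate = rate_per_sec
        self._capacity = capacity
        self._clock = clock          # injectable so tests can control time
        self._tokens = capacity      # start full so a cold worker can burst
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Take tokens if available; return False instead of blocking."""
        with self._lock:
            self._refill()
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False

    def acquire(self, tokens: float = 1.0) -> None:
        """Block (by polling) until the requested tokens are available."""
        while not self.try_acquire(tokens):
            time.sleep(1.0 / self._rate)
```

A worker would call `bucket.acquire()` before each model request. Because the bucket lives in process memory, the limit applies per instance, so the effective fleet-wide rate scales with the number of running workers.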
Challenges we ran into
| Challenge | What Happened | Fix |
|---|---|---|
| Model 404s | Preview endpoints aren’t active in us-central1. | Introduced `MODEL_LOCATION`, redeployed workers against global, and updated docs. |
| Resource Exhausted (429) | Concurrent workers + retries triggered burst throttling. | Added token bucket, capped Pub/Sub flow, respectful retries, throttled env vars in scripts/YAML. |
| JSON Decode Failures | Gemini 2.5 emits `<think>` blocks; plain `json.loads` failed. | Wrote `_parse_json_response` to strip fences, remove thinking tags, and brace-match the payload. |
| Infinite retries on duplicate messages | Pub/Sub redeliveries triggered double processing. | Pre-flight check: load Firestore doc, skip if status finalized or stage already populated. |
| Docs drift | Model IDs and env vars changed repeatedly. | Final submission sweep updated README, deployment scripts, architecture diagrams, social/blog drafts. |
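The JSON fix in the table (strip fences, remove thinking tags, brace-match the payload) could look roughly like the sketch below. It is an assumption-laden illustration, not the project's `_parse_json_response`: the function names, the regexes, and the choice to raise `ValueError` when nothing parses are all hypothetical.

```python
import json
import re
from typing import Any, Dict, Optional

FENCE = "`" * 3  # markdown code-fence marker, built this way to keep the example readable
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def _balanced_object(text: str, start: int) -> Optional[str]:
    """Return the brace-balanced substring starting at text[start], or None."""
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            # Braces inside JSON strings must not affect the depth counter.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start : i + 1]
    return None


def parse_json_response(raw: str) -> Dict[str, Any]:
    """Hypothetical extractor: drop think blocks, unwrap a fence, brace-match."""
    text = _THINK_RE.sub("", raw)
    fenced = _FENCE_RE.search(text)
    if fenced:
        text = fenced.group(1)
    start = text.find("{")
    while start != -1:
        candidate = _balanced_object(text, start)
        if candidate is not None:
            try:
                return json.loads(candidate)
            except json.JSONDecodeError:
                pass  # not valid JSON; try the next opening brace
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model response")
```

Tracking string state during brace matching matters for receipts, since merchant names or notes may legitimately contain `{` or `}` characters.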
Accomplishments that we're proud of
We are proud of building an end-to-end, event-driven, multi-agent system that solves a real-world business problem.
- Massive ROI: We achieved a 250x cost reduction ($12.50 vs $0.05 per audit).
- Incredible Speed: We reduced audit times from 10-15 minutes to ~7 seconds.
- Resilient Architecture: We built a scalable system using a modern Google Cloud stack (Cloud Run Services, Worker Pools, and Jobs), ADK, and Gemini.
- End-to-End Solution: The result is a system that audits any receipt (image or text), cites policy sections, flags anomalies, and drafts remediation—all autonomously.
What we learned
- Thinking models change the UX game. Gemini 2.5 Flash Preview + Flash-Lite Preview offer structured, “reasoned” outputs that still require defensive parsing. I learned how to strip `<think>` traces, honor `Retry-After`, and keep the JSON clean for Firestore writes.
- Cloud Run Worker Pools are the missing middle tier. They bridge the gap between Functions and Services, letting Pub/Sub drive long-running AI calls without over-provisioning.
- Global vs. regional Vertex endpoints matter. Preview models aren’t everywhere; understanding access, quota, and location constraints is critical. A single env var, `MODEL_LOCATION=global`, saved hours of debugging.
- Observability for agents is different. Tracking `expenseId` across five stages and adding bounded audit logs in Firestore helped keep event replay safe and interpretable.
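Honoring `Retry-After` alongside exponential backoff with jitter can be sketched as follows. This is a generic illustration under stated assumptions, not the project's `call_with_retry`: `RetryableError`, `backoff_delay`, and `retry_with_backoff` are hypothetical names, and the error class stands in for whatever exception a throttled (429) model call raises in practice.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for a throttling (429) error from a model call."""

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after  # seconds suggested by the server, if any


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full-jitter exponential backoff; a server Retry-After acts as a floor."""
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn, retrying on RetryableError up to max_attempts total tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            sleep(backoff_delay(attempt, base, cap, exc.retry_after, rng))
    raise AssertionError("unreachable")
```

Injecting `sleep` and `rng` keeps the helper deterministic under test. Jitter is what prevents several workers that were throttled together from retrying in lockstep.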
What's next for Audit AI
- Deeper Integration: Integrate with major accounting platforms (like QuickBooks, NetSuite, and SAP) to automatically trigger reimbursements.
- Advanced Policy Engine: Expand the policy agent to handle highly complex, multi-level conditional rules and international compliance.
- Proactive Analytics: Add a new agent that provides proactive insights to finance teams, such as identifying high-risk departments or common policy violations.
- Enhanced UI: Build out a full-fledged admin dashboard for managing policies, reviewing flagged expenses, and monitoring agent performance.
Built With
- and
- cloud-build
- cloud-storage
- event-driven
- fastapi
- firestore
- gemini-2.5-flash-and-flash-lite
- google-cloud-run
- microservices
- next.js-14
- node.js
- pub/sub
- python
- tailwind-css
- typescript
- vertex-ai