Audit AI


Inspiration

Every hackathon has its “moment of clarity.” Mine came when I watched a finance teammate slog through a stack of travel receipts—zooming into blurry photos, cross‑checking policy PDFs, and guessing whether a $63 dinner was above the regional cap. The modern enterprise runs on AI, yet expense auditing (a perfect candidate for reasoning + vision) still feels like clerical work. The Google Cloud Run challenge nudged me to ask:

Why can’t an autonomous agent audit a receipt end-to-end in under 10 seconds?

AuditAI was born from that question: blend Google’s Agent Development Kit, Gemini thinking models, and an event-driven Cloud Run backbone to eliminate the drudgery.

What it does

AuditAI is an autonomous expense audit system that uses 5 specialized AI agents to audit expense receipts in ~7 seconds, a task that takes 10-15 minutes manually. Built with Google's Agent Development Kit (ADK), Gemini AI, and deployed across all 3 Cloud Run resource types, it solves a major pain point for finance teams who waste 10-15 hours/week on manual reviews.

The 5 AI agents work in parallel to:

Extract structured data using Gemini 2.5 Flash Preview (global endpoint)
Check policy compliance using ADK + Gemini 2.5 Flash-Lite Preview
Flag anomalies
Draft remediation
Synthesize the results into a final response

The system provides a 250x cost reduction 🎯.

Manual Review: At $50/hour (15 min/audit), a manual audit costs $12.50.
AuditAI: 1,000 audits cost about $53 per month in total, or roughly $0.05 per audit ($0.053 before rounding).
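
The arithmetic behind these figures can be checked in a few lines. This is a sketch that uses only the numbers quoted above; the variable names are illustrative. Note that the 250x figure follows from the rounded $0.05 per audit, while the unrounded $0.053 gives a ratio closer to 236x.

```python
# Figures quoted in the write-up; names are illustrative only.
HOURLY_RATE_USD = 50.0
MINUTES_PER_MANUAL_AUDIT = 15
MONTHLY_COST_USD = 53.0
AUDITS_PER_MONTH = 1000

# Manual cost: a quarter of an hour at $50/hour.
manual_cost = HOURLY_RATE_USD * MINUTES_PER_MANUAL_AUDIT / 60

# Automated cost: total monthly spend divided by audit volume.
automated_cost = MONTHLY_COST_USD / AUDITS_PER_MONTH

# Ratio with the unrounded and the rounded per-audit cost.
reduction_unrounded = manual_cost / automated_cost
reduction_rounded = manual_cost / round(automated_cost, 2)

print(f"manual=${manual_cost:.2f} automated=${automated_cost:.3f}")
print(f"reduction: {reduction_unrounded:.0f}x unrounded, {reduction_rounded:.0f}x rounded")
```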

How we built it

Architecture Modeling. I drafted the multi-agent flow (ingest → extraction → policy → anomaly → remediation → synthesis) in docs/ARCHITECTURE_AUDITAI.md, mapping each stage to a Pub/Sub topic.
Backend Orchestrator. Built with FastAPI, it handles uploads, creates Firestore docs, and streams progress via SSE. The stream_expense endpoint dynamically adjusts polling cadence once status moves past ingestion.
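
To make the streaming idea concrete, here is a minimal, standard-library sketch of status polling that emits Server-Sent Event frames. It is not the project's stream_expense code: the status names, the two intervals, and the direction of the cadence change (slower during ingestion, faster afterwards) are all assumptions, and fetch_status stands in for a Firestore read.

```python
import json
import time
from typing import Callable, Iterator

TERMINAL = {"completed", "failed"}  # hypothetical terminal status names


def poll_interval(status: str) -> float:
    """Assumed cadence: slow while ingesting, faster once agents are running."""
    return 1.0 if status == "ingesting" else 0.25


def format_sse(event: str, payload: dict) -> str:
    """Serialize one SSE frame: an event name, a JSON data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_status(fetch_status: Callable[[], dict], sleep=time.sleep) -> Iterator[str]:
    """Poll fetch_status and yield a frame whenever the status changes."""
    last = None
    while True:
        doc = fetch_status()
        status = doc["status"]
        if status != last:
            yield format_sse("status", doc)
            last = status
        if status in TERMINAL:
            return
        sleep(poll_interval(status))
```

In a FastAPI app, a generator like this would typically be wrapped in a StreamingResponse with media_type="text/event-stream".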
Worker Engine. A single agents/worker/main.py image parameterizes agent behavior via env vars (AGENT_TYPE, SUBSCRIPTION, TOPIC_OUT). Shared utilities include:
- QPSTokenBucket for per-instance throttling.
- call_with_retry with exponential backoff/jitter + Retry-After support.
- _parse_json_response, a resilient JSON extractor for Gemini “thinking” responses.
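
The two throttling utilities above can be sketched as follows. This is an illustration of the general techniques (a token bucket and exponential backoff with jitter), not the project's actual code: the constructor parameters, the RetryableError class, and its retry_after attribute are hypothetical stand-ins for whatever the real worker uses.

```python
import random
import threading
import time
from typing import Optional


class QPSTokenBucket:
    """Per-instance limiter: `qps` calls per second on average, bursts up to `burst`."""

    def __init__(self, qps: float, burst: int = 1, clock=time.monotonic, sleep=time.sleep):
        self.qps = qps
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            with self.lock:
                now = self.clock()
                # Refill in proportion to elapsed time, capped at capacity.
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.qps)
                self.last = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.qps
            self.sleep(wait)  # sleep outside the lock so other threads can proceed


class RetryableError(Exception):
    """Hypothetical stand-in for a 429/503 error; may carry a Retry-After hint."""

    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after


def call_with_retry(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                    sleep=time.sleep, rng=random.random):
    """Call fn, retrying RetryableError with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetryableError as exc:
            if attempt == max_attempts:
                raise
            backoff = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay = backoff * rng()  # "full jitter": uniform in [0, backoff)
            if exc.retry_after is not None:
                delay = max(delay, exc.retry_after)  # never retry sooner than asked
            sleep(delay)
```

A worker would call bucket.acquire() before each model request and wrap the request itself in call_with_retry.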
Gemini Integration.
- Extraction worker calls gemini-2.5-flash-preview-09-2025 via Part.from_uri for images.
- Policy worker calls gemini-2.5-flash-lite-preview-09-2025 with the policy RAG prompt.
- Both run through _resolve_model_name, so setting MODEL_LOCATION=global is enough.
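
The write-up does not show what _resolve_model_name does internally, so the following is only a guess at its shape: read MODEL_LOCATION and build a Vertex AI publisher-model resource path. The function name without the underscore, the project argument, the us-central1 default, and the pass-through for already-qualified names are all assumptions.

```python
import os
from typing import Mapping


def resolve_model_name(model_id: str, project: str,
                       default_location: str = "us-central1",
                       env: Mapping[str, str] = os.environ) -> str:
    """Return a fully qualified model path, honoring MODEL_LOCATION if set."""
    if model_id.startswith("projects/"):
        return model_id  # already fully qualified; leave untouched
    location = env.get("MODEL_LOCATION", default_location)
    return (
        f"projects/{project}/locations/{location}"
        f"/publishers/google/models/{model_id}"
    )


# Example: with MODEL_LOCATION=global the path points at the global endpoint.
print(resolve_model_name(
    "gemini-2.5-flash-preview-09-2025",
    project="my-project",  # hypothetical project ID
    env={"MODEL_LOCATION": "global"},
))
```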
Frontend. Next.js 14 + Tailwind CSS deliver a polished UI with gradient theming, drag-and-drop uploads, SSE-driven status chips, and fallback messages if the stream lags.
Deployment.
- Cloud Build pipelines (deploy.sh / deploy.ps1).
- 3 Cloud Run services (frontend, orchestrator, synthesis).
- 4 Worker Pools (extraction, policy, anomaly, remediation) with dedicated queues (expenses.ingested.sub, expenses.extracted.policy, expenses.evaluated.anomaly, expenses.analyzed.remediation).
- 1 Cloud Run job for nightly CSV reports.
Documentation (DEPLOYMENT_COMPLETE.md, README_AUDITAI.md, HACKATHON_SUBMISSION_READY.md) captures the exact commands, service accounts, and submission checklist.
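
Since all four worker pools share one image and differ only in environment variables, the startup step can be sketched as a small config loader. The env var names (AGENT_TYPE, SUBSCRIPTION, TOPIC_OUT) come from the description above; the WorkerConfig dataclass, the validation rules, and the treatment of TOPIC_OUT as optional are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Agent types inferred from the four worker pools listed above.
AGENT_TYPES = {"extraction", "policy", "anomaly", "remediation"}


@dataclass(frozen=True)
class WorkerConfig:
    agent_type: str
    subscription: str
    topic_out: Optional[str]  # assumed optional for a stage with no downstream topic


def load_worker_config(env: Mapping[str, str] = os.environ) -> WorkerConfig:
    """Build the worker configuration from environment variables."""
    agent_type = env.get("AGENT_TYPE", "").strip().lower()
    if agent_type not in AGENT_TYPES:
        raise ValueError(f"AGENT_TYPE must be one of {sorted(AGENT_TYPES)}, got {agent_type!r}")
    subscription = env.get("SUBSCRIPTION", "").strip()
    if not subscription:
        raise ValueError("SUBSCRIPTION is required")
    topic_out = env.get("TOPIC_OUT", "").strip() or None
    return WorkerConfig(agent_type, subscription, topic_out)
```

Failing fast on a bad AGENT_TYPE keeps a misconfigured pool from silently pulling messages it cannot handle.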

Challenges we ran into

Challenge	What Happened	Fix
Model 404s	Preview endpoints aren’t active in `us-central1`.	Introduced `MODEL_LOCATION` + redeployed workers against `global`, updated docs.
Resource Exhausted (429)	Concurrent workers + retries triggered burst throttling.	Added token bucket, capped Pub/Sub flow, respectful retries, throttled env vars in scripts/YAML.
JSON Decode Failures	Gemini 2.5 emits `<think>` blocks; plain `json.loads` failed.	Wrote `_parse_json_response` to strip fences, remove thinking tags, brace-match the payload.
Infinite retries on duplicate messages	Pub/Sub redeliveries triggered double processing.	Pre-flight check: load Firestore doc, skip if status finalized or stage already populated.
Docs drift	Model IDs and env vars changed repeatedly.	Final submission sweep updated README, deployment scripts, architecture diagrams, social/blog drafts.
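
The JSON fix in the table (strip fences, remove thinking tags, brace-match the payload) can be sketched like this. It is a reconstruction from that one-line description, not the project's _parse_json_response; the function name without the underscore and the exact regular expressions are assumptions.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the sketch readable
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def parse_json_response(text: str) -> dict:
    """Extract the first balanced JSON object from a noisy model response."""
    cleaned = _THINK_RE.sub("", text)        # 1. drop <think> blocks
    fenced = _FENCE_RE.search(cleaned)       # 2. prefer the fenced payload, if any
    if fenced:
        cleaned = fenced.group(1)
    start = cleaned.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    in_string = False
    escaped = False
    # 3. brace-match, ignoring braces that appear inside JSON strings
    for i in range(start, len(cleaned)):
        ch = cleaned[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(cleaned[start:i + 1])
    raise ValueError("unbalanced braces in response")
```

Tracking string state matters because a receipt note or vendor name can legitimately contain a brace.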

Accomplishments that we're proud of

We are proud of building an end-to-end, event-driven, multi-agent system that solves a real-world business problem.

Massive ROI: We achieved a 250x cost reduction ($12.50 vs $0.05 per audit).
Incredible Speed: We reduced audit times from 10-15 minutes to ~7 seconds.
Resilient Architecture: We built a scalable system using a modern Google Cloud stack (Cloud Run Services, Worker Pools, and Jobs), ADK, and Gemini.
End-to-End Solution: The result is a system that audits any receipt (image or text), cites policy sections, flags anomalies, and drafts remediation—all autonomously.

What we learned

Thinking models change the UX game. Gemini 2.5 Flash Preview + Flash-Lite Preview offer structured, “reasoned” outputs that still require defensive parsing. I learned how to strip <think> traces, honor Retry-After, and keep the JSON clean for Firestore writes.
Cloud Run Worker Pools are the missing middle tier. They bridge the gap between Functions and Services, letting Pub/Sub drive long-running AI calls without over-provisioning.
Global vs. regional Vertex endpoints matter. Preview models aren’t everywhere; understanding access, quota, and location constraints is critical. A single env var, MODEL_LOCATION=global, saved hours of debugging.
Observability for agents is different. Tracking expenseId across five stages and adding bounded audit logs in Firestore helped keep event replay safe and interpretable.
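
Replay safety of this kind usually rests on an idempotency check before each stage runs, plus a cap on how much history each document keeps. The sketch below uses plain dicts in place of Firestore documents; the status names, the field layout (one key per stage), the choice to skip unknown expenses, and the limit of 50 entries are assumptions.

```python
from typing import Any, Dict, List, Mapping, Optional

FINAL_STATUSES = frozenset({"approved", "rejected", "failed"})  # hypothetical names


def should_process(doc: Optional[Mapping[str, Any]], stage: str) -> bool:
    """Decide whether a (possibly redelivered) message still needs work."""
    if doc is None:
        return False  # unknown expenseId: nothing to update (assumed policy)
    if doc.get("status") in FINAL_STATUSES:
        return False  # audit already finished
    if doc.get(stage):
        return False  # this stage already wrote its result
    return True


def append_bounded(log: List[Dict[str, Any]], entry: Dict[str, Any],
                   limit: int = 50) -> List[Dict[str, Any]]:
    """Return a new audit log with entry appended, keeping the newest `limit` items."""
    return (log + [entry])[-limit:]


# Example: a redelivered policy message is skipped once the stage is populated.
doc = {"expenseId": "exp-123", "status": "processing", "policy": {"compliant": True}}
print(should_process(doc, "policy"))   # stage already populated, so skipped
print(should_process(doc, "anomaly"))  # next stage has not run yet
```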

What's next for Audit AI

Deeper Integration: Integrate with major accounting platforms (like QuickBooks, NetSuite, and SAP) to automatically trigger reimbursements.
Advanced Policy Engine: Expand the policy agent to handle highly complex, multi-level conditional rules and international compliance.
Proactive Analytics: Add a new agent that provides proactive insights to finance teams, such as identifying high-risk departments or common policy violations.
Enhanced UI: Build out a full-fledged admin dashboard for managing policies, reviewing flagged expenses, and monitoring agent performance.

Built With

and
cloud-build
cloud-storage
event-driven
fastapi
firestore
gemini-2.5-flash-and-flash-lite
google-cloud-run
microservices
next.js-14
node.js
pub/sub
python
tailwind-css
typescript
vertex-ai

Updates

KRISHNA MEWARA started this project — Nov 10, 2025 07:13 PM EST
