Inspiration

We noticed that most businesses struggle to search invoice data manually. Every data question requires writing SQL or exporting files into Excel, which wastes time, demands technical skills, and slows decision-making. We wanted to remove that barrier completely, so that anyone in finance, operations, accounting, or procurement could simply ask a question in plain English and instantly get the exact insight they need.

What it does

Invoice Insights is an AI agent that converts natural-language queries into executable SQL, runs them on BigQuery, and automatically returns the results as a downloadable CSV. The flow: upload an invoice dataset → ask a question in plain English → the AI generates SQL → the SQL runs → a CSV comes back, ready for analysis, audit, reporting, and reconciliation.

How we built it

The frontend is a React UI deployed on Cloud Run, where the user uploads a CSV and asks questions. The backend is a FastAPI service, also deployed on Cloud Run, which uses service account authentication instead of an API key. The uploaded CSV is stored in Google Cloud Storage, and a BigQuery external table is created dynamically from it. Gemini 2.0 Flash generates SQL agentically from the natural-language question. BigQuery executes the SQL and automatically writes the final CSV output back to GCS. Cloud Run handles the complete serverless execution end to end.
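As an illustration of the dynamic external table step, here is a minimal sketch that builds the BigQuery DDL for an external table over one uploaded CSV. The helper name `external_table_ddl`, the dataset and table names, and the bucket path are all hypothetical; the project's real code may well use the BigQuery client library instead of raw DDL. Leaving out the column list lets BigQuery auto-detect the schema from the file.

```python
import re

# Plain identifiers only, so user-controlled names cannot inject extra SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def external_table_ddl(dataset: str, table: str, gcs_uri: str) -> str:
    """Build DDL for an external table over one CSV file in GCS (hypothetical helper)."""
    if not (_IDENT.match(dataset) and _IDENT.match(table)):
        raise ValueError("dataset and table must be plain identifiers")
    if not gcs_uri.startswith("gs://") or "'" in gcs_uri or "\\" in gcs_uri:
        raise ValueError("expected a gs:// URI without quotes or backslashes")
    # No column list is given, so BigQuery infers the schema from the CSV.
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{dataset}.{table}`\n"
        "OPTIONS (\n"
        "  format = 'CSV',\n"
        f"  uris = ['{gcs_uri}'],\n"
        "  skip_leading_rows = 1\n"
        ")"
    )
```

The resulting statement could then be submitted as an ordinary query job by the backend before the generated SQL runs against the same table.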

Challenges we ran into

Gemini access issues: solved by switching to Vertex AI Gemini. The SQL safety checker blocked legitimate functions: fixed with better token matching. The AI returned full JSON instead of just SQL: solved with robust response parsing. Cloud Run couldn't generate signed URLs: worked around with direct CSV streaming. GCS bucket security blocked public URLs: bypassed by sending the data directly. The frontend expected URLs but received raw data: fixed with client-side blob downloads. BigQuery schema detection sometimes failed: mitigated with better error handling.
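Two of these fixes, response parsing and token matching, can be sketched roughly as follows. This is not the project's actual code: the function names, the `"sql"` JSON key, and the forbidden keyword list are assumptions made for illustration. The idea is to pull the query out of whatever wrapper the model returns, then compare whole identifiers against a blocklist so that a function such as `DATE_TRUNC` or a column such as `updated_at` is not rejected for merely containing a forbidden word.

```python
import json
import re

TICKS = "`" * 3  # Markdown code fence marker
FENCE = re.compile(TICKS + r"(?:sql|json)?\s*(.*?)" + TICKS, re.DOTALL | re.IGNORECASE)
STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")
# Assumed blocklist; adjust to whatever statements the service must refuse.
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER",
             "CREATE", "TRUNCATE", "MERGE", "GRANT"}


def extract_sql(raw: str) -> str:
    """Return the SQL from a model reply that may be fenced and/or JSON-wrapped."""
    text = raw.strip()
    fenced = FENCE.search(text)
    if fenced:
        text = fenced.group(1).strip()
    if text.startswith("{"):
        try:
            payload = json.loads(text)
        except json.JSONDecodeError:
            payload = None
        # The "sql" key is an assumed response shape.
        if isinstance(payload, dict) and isinstance(payload.get("sql"), str):
            text = payload["sql"]
    return text.strip().rstrip(";").strip()


def is_safe_select(sql: str) -> bool:
    """Accept a single read-only statement, judged on whole tokens."""
    # Blank out string literals so a value like 'DELETE' is not read as a keyword.
    stripped = STRING_LITERAL.sub("''", sql)
    if ";" in stripped:  # more than one statement
        return False
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", stripped.upper())
    if not tokens or tokens[0] not in {"SELECT", "WITH"}:
        return False
    return not (set(tokens) & FORBIDDEN)
```

A token-level check like this is only a first line of defense; running the query under a service account with read-only BigQuery permissions would remain the stronger safeguard.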

Accomplishments that we're proud of

A fully autonomous NL → SQL → query → CSV pipeline with zero manual SQL writing. A 100% serverless architecture that scales on demand. A workflow designed for production use, not just a demo. Non-trivial agent behavior that performs tasks with real business value.

What we learned

Vertex AI Gemini excels at structured data extraction and SQL generation. Gemini performs best when its instructions are grounded in the real invoice schema and column context. Cloud Run's default credentials have limitations when generating signed URLs. Direct data streaming is more reliable than file-based downloads for web apps. Uniform Bucket-Level Access requires different permission strategies. BigQuery external tables work well for quick CSV analysis without loading data.
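The direct streaming approach can be sketched as a generator that serializes query rows into CSV text in batches. The name `iter_csv` and the batch size are assumptions; in a FastAPI backend, a generator like this could be passed to `StreamingResponse` with a `text/csv` media type, although the project's actual handler is not shown in this write-up.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def iter_csv(
    header: Sequence[str],
    rows: Iterable[Sequence[object]],
    batch_size: int = 500,
) -> Iterator[str]:
    """Yield CSV text in chunks of up to batch_size rows (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    pending = 0
    for row in rows:
        writer.writerow(row)
        pending += 1
        if pending >= batch_size:
            yield buffer.getvalue()
            # Reset the buffer so memory use stays bounded by one batch.
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    tail = buffer.getvalue()
    if tail:  # remaining rows, or just the header when there were no rows
        yield tail
```

Because the response carries the data itself, no signed URL or public bucket access is needed, and the frontend can turn the received bytes into a blob for download.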

What's next for Invoice Insights

Add chained invoice reasoning (multi-step, multi-query agents) and a fraud/anomaly detection agent mode. Allow direct integration with ERP invoice streams (SAP, Sage, Zoho, Oracle). Add secure multi-user, role-based access with audit logs and a SOC 2 compliant mode.
