The Problem: A $75B Industry Flying Blind

Clinical trials are a financial black hole. Finance teams waste 20+ hours per week manually entering invoice data from PDFs. Worse, the industry has zero cost transparency: no one knows what trials actually cost, whether they're overpaying vendors, or how to benchmark spending.

What We Built in 72 Hours

TrialSync AI automates invoice processing using Snowflake's Document AI and Cortex LLM, transforming manual data entry into intelligent financial insights:

✅ Document AI Extraction - Upload invoice PDFs, extract structured data in 2.3 seconds (vs 20 minutes manually)

✅ Smart Matching - Cortex LLM automatically matches invoices to contracts using natural language reasoning

✅ Financial Intelligence - Chat interface answers questions like "Which vendor costs most?" using real data

✅ Real-Time Dashboard - Track $21M+ in processed invoices across multiple CRO vendors
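A question like "Which vendor costs most?" comes down to an aggregation over the extracted invoice rows. The sketch below shows that aggregation in plain Python, assuming each extracted invoice is a record with hypothetical `vendor` and `amount` fields; the vendor names and amounts are made-up illustration values, and this is not the TrialSync implementation.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extracted invoice records; field names and values are assumptions.
invoices = [
    {"vendor": "CRO Alpha", "amount": Decimal("1250000.00")},
    {"vendor": "CRO Beta", "amount": Decimal("830000.50")},
    {"vendor": "CRO Alpha", "amount": Decimal("410000.00")},
]


def spend_by_vendor(records):
    """Sum invoice amounts per vendor and return (vendor, total) pairs, largest first."""
    totals = defaultdict(Decimal)  # Decimal() starts each vendor at 0
    for rec in records:
        totals[rec["vendor"]] += rec["amount"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


ranking = spend_by_vendor(invoices)
top_vendor, top_total = ranking[0]
print(f"Highest spend: {top_vendor} ({top_total})")
```

Using `Decimal` rather than `float` avoids rounding drift when summing currency amounts.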

The Technical Challenge

Hard problems we solved:

  • Extracting unstructured data from messy 86-page invoice PDFs
  • Using Cortex LLM to match invoices to contracts with fuzzy logic (vendor names, PO numbers, dates)
  • Building a conversational AI that queries financial data in natural language
  • Creating production-ready workflows in Snowflake's Streamlit environment
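To make the fuzzy matching on vendor names, PO numbers, and dates concrete, here is a minimal deterministic sketch using only the Python standard library. It is a simple scoring baseline of the kind that could sit next to an LLM-based matcher, not the Cortex LLM approach itself. The `Invoice` and `Contract` fields, the weights, and the threshold are all illustrative assumptions.

```python
import re
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass
class Contract:  # hypothetical schema
    contract_id: str
    vendor: str
    po_number: str
    start: date
    end: date


@dataclass
class Invoice:  # hypothetical schema
    vendor: str
    po_number: str
    invoice_date: date


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_score(inv: Invoice, con: Contract) -> float:
    """Weighted score in [0, 1]; the weights are assumptions, not tuned values."""
    name_sim = SequenceMatcher(None, normalize(inv.vendor), normalize(con.vendor)).ratio()
    po_match = 1.0 if normalize(inv.po_number) == normalize(con.po_number) else 0.0
    in_window = 1.0 if con.start <= inv.invoice_date <= con.end else 0.0
    return 0.5 * name_sim + 0.3 * po_match + 0.2 * in_window


def best_contract(inv, contracts, threshold=0.6):
    """Return the highest-scoring contract, or None if nothing clears the threshold."""
    if not contracts:
        return None
    best = max(contracts, key=lambda c: match_score(inv, c))
    return best if match_score(inv, best) >= threshold else None


contracts = [
    Contract("C-001", "Acme Clinical Research, Inc.", "PO-10023", date(2024, 1, 1), date(2025, 12, 31)),
    Contract("C-002", "Beta Trials Ltd", "PO-20440", date(2024, 1, 1), date(2024, 12, 31)),
]
invoice = Invoice("ACME Clinical Research Inc", "PO 10023", date(2024, 6, 15))
matched = best_contract(invoice, contracts)
unmatched = best_contract(Invoice("Unknown Vendor", "PO-99999", date(2020, 1, 1)), contracts)
```

Normalizing before comparison lets "ACME Clinical Research Inc" and "PO 10023" line up with their punctuated contract counterparts. Invoices that fall below the threshold return `None` and can be routed to manual review or to an LLM for a second opinion.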

Why This Matters

This isn't just an invoice tool; it's the data capture engine for building the industry's first cost benchmarking platform. Every processed invoice feeds our intelligence layer:

  • Actual costs by therapeutic area, phase, geography
  • Vendor performance metrics (delivery, quality, pricing)
  • Predictive models for what trials will really cost
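As a rough illustration of what benchmarking by therapeutic area and phase could look like once enough invoices are captured, the sketch below groups per-trial totals and compares one trial against its group median. The record fields and every number are made-up assumptions for the example, not real trial data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-trial totals in USD; values are invented for illustration.
trials = [
    {"area": "oncology", "phase": "II", "total_cost": 4_200_000},
    {"area": "oncology", "phase": "II", "total_cost": 5_100_000},
    {"area": "oncology", "phase": "II", "total_cost": 3_900_000},
    {"area": "cardiology", "phase": "III", "total_cost": 9_800_000},
]


def benchmark(records):
    """Group trial costs by (area, phase) and report count and median per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["area"], rec["phase"])].append(rec["total_cost"])
    return {key: {"n": len(costs), "median": median(costs)} for key, costs in groups.items()}


def pct_vs_benchmark(cost, bench_median):
    """Percent above (positive) or below (negative) the group median."""
    return (cost - bench_median) / bench_median * 100


table = benchmark(trials)
onc = table[("oncology", "II")]
delta = pct_vs_benchmark(5_100_000, onc["median"])
print(f"Oncology phase II median: {onc['median']:,} (n={onc['n']}); trial is {delta:+.1f}% vs median")
```

The median is used instead of the mean so that a single unusually expensive trial does not skew the benchmark.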

Built With

  • ai-parsing-document
  • ai-powered
  • backend
  • cortex
  • database
  • ocr
  • python
  • snowpark
  • sql
  • streamlit