Inspiration

Healthcare data is complex and often siloed across different systems. We wanted to make it accessible to everyone, not just data engineers. By pairing natural language input with a generative AI model, we envisioned a system where anyone could ask questions about patient data in plain English and get instant insights without writing SQL or learning complex FHIR schemas.


What it does

Healthcare Interoperability Connector is an AI-powered natural language query engine for healthcare data:

  • Ask in English: Users type questions like "Show me all patients with diabetes" or "What are the most common conditions?"
  • AI Generates SQL: Gemini automatically converts natural language to optimized BigQuery SQL queries
  • Instant Results: Queries run against the public FHIR Synthea dataset and typically return in 2-5 seconds
  • Smart Visualization: Results display in interactive tables and charts with intelligent formatting
  • FHIR Compliant: Works with standard FHIR R4 healthcare data structures
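
The flow in the list above can be sketched as a small generate-then-execute pipeline. This is a minimal illustration, not the project's code: answer_question, QueryAnswer, and the two stand-in functions are hypothetical names, and the real Gemini and BigQuery calls are replaced by stubs so the sketch runs without cloud credentials.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class QueryAnswer:
    question: str
    sql: str
    rows: list[dict[str, Any]]


def answer_question(
    question: str,
    generate_sql: Callable[[str], str],
    run_query: Callable[[str], list[dict[str, Any]]],
) -> QueryAnswer:
    """Run one question through the generate-then-execute flow."""
    sql = generate_sql(question)  # in the project, this step would call Gemini
    rows = run_query(sql)         # in the project, this step would call BigQuery
    return QueryAnswer(question=question, sql=sql, rows=rows)


# Hypothetical stand-ins for the model and the warehouse.
def fake_generate_sql(question: str) -> str:
    return "SELECT 'diabetes' AS condition, 3 AS patients"


def fake_run_query(sql: str) -> list[dict[str, Any]]:
    return [{"condition": "diabetes", "patients": 3}]


answer = answer_question(
    "Show me all patients with diabetes", fake_generate_sql, fake_run_query
)
print(answer.sql)
print(answer.rows)
```

Passing the two steps in as callables keeps the flow testable on its own; swapping the stubs for real clients would not change answer_question.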

How we built it

Architecture:

  • Backend: Python Flask API with Vertex AI (Gemini) for SQL generation
  • Frontend: React with Material-UI for a modern, responsive interface
  • Data: Google BigQuery with the public FHIR Synthea dataset (17 FHIR resource types)
  • Integration: Fivetran for data pipeline orchestration
  • Deployment: Google Cloud Run for serverless scalability

Key Components:

  1. NLQueryService - Converts natural language to SQL using Gemini
  2. QueryResultsVisualization - Smart formatting of complex FHIR nested objects
  3. React Dashboard - Clean, intuitive UI for healthcare professionals
  4. BigQuery Integration - Direct access to FHIR data at scale
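
As a rough illustration of the NLQueryService step, the sketch below shows one way to assemble a schema-aware prompt and tidy the model's reply before running it. It is not the project's actual code: build_prompt, extract_sql, and SCHEMA_HINTS are hypothetical names, the table names are assumptions, and the call to Gemini itself is left out. Only the two field paths come from this writeup.

```python
import re

# Assumed, illustrative schema hints. The field paths appear in this writeup;
# the table names and their pairing with those paths are hypothetical.
SCHEMA_HINTS = {
    "condition": ["code.coding[0].display"],
    "observation": ["code.coding[0].display", "value.quantity.value"],
}

FENCE = "`" * 3  # Markdown code fence that models often wrap SQL in


def build_prompt(question: str, schema_hints: dict[str, list[str]]) -> str:
    """Pair the user's question with explicit schema context."""
    lines = [
        "You translate questions about FHIR R4 data into BigQuery SQL.",
        "Use only the tables and field paths listed below.",
        "",
        "Schema:",
    ]
    for table, paths in schema_hints.items():
        lines.append(f"- {table}: {', '.join(paths)}")
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


def extract_sql(model_reply: str) -> str:
    """Strip an optional Markdown fence and trailing semicolon from a reply."""
    pattern = FENCE + r"(?:sql)?\s*(.*?)" + FENCE
    match = re.search(pattern, model_reply, re.DOTALL | re.IGNORECASE)
    sql = match.group(1) if match else model_reply
    return sql.strip().rstrip(";")


print(build_prompt("What are the most common conditions?", SCHEMA_HINTS))
```

Keeping the schema hints in a plain dictionary makes it easy to add field paths as new query failures show up, without touching the prompt template.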

Challenges we ran into

  1. FHIR Complexity: FHIR R4 has deeply nested structures. We had to teach Gemini the exact schema paths (e.g., code.coding[0].display, value.quantity.value)

  2. Object Formatting: React was displaying [object Object] for nested FHIR records. We built a smart formatter that extracts readable values from complex structures

  3. Schema Context: Gemini needed comprehensive schema information to generate correct queries. We created detailed prompts with FHIR-specific examples

  4. Data Source Mismatch: Early queries pointed to local tables that did not exist. We switched to the public FHIR Synthea dataset in BigQuery

  5. Query Accuracy: Some generated queries had incorrect field names. We improved the prompt with explicit FHIR field documentation
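
The object-formatting fix from item 2 can be sketched as a recursive function that reduces a nested value to a short display string. The project's formatter lives in the React frontend; this is a Python rendering of the same idea under our own assumptions. The readable function and the PREFERRED_KEYS priority list are hypothetical, not the rules the project actually uses.

```python
from typing import Any

# Keys tried in order when looking for a human-readable value. This priority
# list is an assumption for illustration only.
PREFERRED_KEYS = ("display", "text", "value", "code")


def readable(value: Any) -> str:
    """Reduce a nested FHIR-style value to a short display string."""
    if value is None:
        return ""
    if isinstance(value, (str, int, float, bool)):
        return str(value)
    if isinstance(value, list):
        parts = [readable(item) for item in value]
        return ", ".join(part for part in parts if part)
    if isinstance(value, dict):
        # Quantity-like objects: show the number together with its unit.
        if "value" in value and "unit" in value:
            return f"{readable(value['value'])} {value['unit']}"
        for key in PREFERRED_KEYS:
            if value.get(key) not in (None, "", [], {}):
                return readable(value[key])
        # No preferred key: use the first nested field that yields text.
        for nested in value.values():
            text = readable(nested)
            if text:
                return text
        return ""
    return str(value)


condition_code = {"coding": [{"code": "44054006", "display": "Diabetes"}]}
print(readable(condition_code))
print(readable({"quantity": {"value": 7.2, "unit": "%"}}))
```

A fallback to the first non-empty nested field means an unfamiliar structure still renders as some text rather than [object Object], at the cost of occasionally picking a less useful field.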


Accomplishments that we're proud of

End-to-End MVP: Fully functional system from natural language input to visualized results

Smart Data Formatting: Intelligent extraction of readable values from complex FHIR nested objects with hover tooltips

Production-Ready Code: Clean architecture, proper error handling, structured logging

Cloud Deployment: Successfully deployed to Google Cloud Run with public endpoint

Comprehensive Documentation: Single README covering architecture, setup, usage, and troubleshooting

Realistic Data: Works with the public FHIR Synthea dataset of synthetic patients (1M+ patient records)

Fast Performance: Queries execute in 2-5 seconds on average

User-Friendly UI: Clean React interface with example queries and result visualization


What we learned

  1. FHIR is Powerful but Complex: Understanding nested FHIR structures was crucial for generating correct queries

  2. AI Needs Context: Gemini performs much better with detailed schema documentation and examples

  3. User Experience Matters: Smart formatting of results is as important as generating correct queries

  4. Prompt Engineering is Key: Small changes to AI prompts significantly impact query quality

  5. BigQuery is Scalable: Public datasets enable rapid prototyping without data setup overhead

  6. React + MUI = Productivity: Material-UI components accelerated frontend development

  7. Cloud-Native Architecture: Serverless deployment (Cloud Run) simplified DevOps


What's next for FHIR NLP System

🚀 Short Term:

  • Add query result export (CSV, JSON)
  • Implement saved query history
  • Add advanced filtering and sorting
  • Support for custom datasets
