Inspiration
Healthcare data is complex and often siloed in different systems. We wanted to make it accessible to everyone—not just data engineers. By combining natural language processing with AI, we envisioned a system where anyone could ask questions about patient data in plain English and get instant insights, without needing to write SQL or understand complex FHIR schemas.
What it does
Healthcare Interoperability Connector is an AI-powered natural language query engine for healthcare data:
- Ask in English: Users type questions like "Show me all patients with diabetes" or "What are the most common conditions?"
- AI Generates SQL: Gemini automatically converts natural language to optimized BigQuery SQL queries
- Instant Results: Queries run against the public FHIR Synthea dataset and typically return in 2-5 seconds
- Smart Visualization: Results display in interactive tables and charts with intelligent formatting
- FHIR Compliant: Works with standard FHIR R4 healthcare data structures
How we built it
Architecture:
- Backend: Python Flask API with Vertex AI (Gemini) for SQL generation
- Frontend: React with Material-UI for a modern, responsive interface
- Data: Google BigQuery with the public FHIR Synthea dataset (17 FHIR resource types)
- Integration: Fivetran for data pipeline orchestration
- Deployment: Google Cloud Run for serverless scalability
Key Components:
- NLQueryService - Converts natural language to SQL using Gemini
- QueryResultsVisualization - Smart formatting of complex FHIR nested objects
- React Dashboard - Clean, intuitive UI for healthcare professionals
- BigQuery Integration - Direct access to FHIR data at scale
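As a rough illustration of the NLQueryService step, the sketch below assembles a schema-aware prompt and cleans up a model reply. It is not the project's actual code: the names build_sql_prompt and extract_sql, the prompt wording, and the grouping of field paths under resource names are all assumptions, and the Gemini call itself is left out so the example runs without credentials.

```python
import re

# Illustrative schema hints only. The two field paths come from the write-up;
# the resource names they are grouped under are assumptions.
SCHEMA_HINTS = {
    "Condition": ["code.coding[0].display"],
    "Observation": ["value.quantity.value"],
}

# Built at runtime so the pattern can match a Markdown code fence.
FENCE = "`" * 3
FENCED_SQL = re.compile(FENCE + r"(?:sql)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def build_sql_prompt(question: str, schema_hints: dict) -> str:
    """Assemble a prompt that gives the model explicit FHIR field paths."""
    lines = [
        "You translate questions about FHIR R4 data into BigQuery SQL.",
        "Use only the field paths listed below.",
        "",
        "Schema:",
    ]
    for resource, paths in schema_hints.items():
        lines.append(f"- {resource}: {', '.join(paths)}")
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


def extract_sql(model_text: str) -> str:
    """Return the SQL from a model reply, dropping any code fence around it."""
    match = FENCED_SQL.search(model_text)
    sql = match.group(1) if match else model_text
    return sql.strip().rstrip(";")
```

In a setup like this, the string returned by build_sql_prompt would be sent to the model, and extract_sql would be applied to the reply before the query is handed to BigQuery.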
Challenges we ran into
FHIR Complexity: FHIR R4 has deeply nested structures. We had to teach Gemini the exact schema paths (e.g., code.coding[0].display, value.quantity.value)
Object Formatting: React was displaying [object Object] for nested FHIR records. We built a smart formatter that extracts readable values from complex structures
Schema Context: Gemini needed comprehensive schema information to generate correct queries. We created detailed prompts with FHIR-specific examples
Data Source Mismatch: Initial queries pointed to non-existent local tables. We pivoted to use the public BigQuery FHIR Synthea dataset
Query Accuracy: Some generated queries had incorrect field names. We improved the prompt with explicit FHIR field documentation
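The object-formatting problem can be illustrated with a small recursive extractor. The project's formatter lives in the React frontend; the version below is a Python sketch of the same idea only, and the function name format_fhir_value, the order in which keys are preferred, and the truncation limit are assumptions rather than the project's actual behavior.

```python
def format_fhir_value(value, max_items=3):
    """Return a short, readable string for a nested FHIR-style value."""
    if value is None:
        return ""
    if isinstance(value, (str, int, float, bool)):
        return str(value)
    if isinstance(value, list):
        # Show the first few entries and mark the rest as truncated.
        parts = [format_fhir_value(item, max_items) for item in value[:max_items]]
        suffix = ", ..." if len(value) > max_items else ""
        return ", ".join(p for p in parts if p) + suffix
    if isinstance(value, dict):
        # Prefer human-readable labels when the record carries one.
        for key in ("display", "text"):
            if value.get(key):
                return str(value[key])
        # CodeableConcept-style record: look inside its codings.
        if "coding" in value:
            return format_fhir_value(value["coding"], max_items)
        # Quantity-style record: join the number with its unit, if any.
        if "value" in value and not isinstance(value["value"], (dict, list)):
            return f"{value['value']} {value.get('unit', '')}".strip()
        # Fallback: format whatever nested values are present.
        parts = [format_fhir_value(v, max_items) for v in value.values()]
        return ", ".join(p for p in parts if p)
    return str(value)
```

For example, a record shaped like {"coding": [{"code": "44054006", "display": "Diabetes"}]} is rendered as Diabetes instead of [object Object], and {"quantity": {"value": 7.2, "unit": "%"}} is rendered as 7.2 %.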
Accomplishments that we're proud of
✅ End-to-End MVP: Fully functional system from natural language input to visualized results
✅ Smart Data Formatting: Intelligent extraction of readable values from complex FHIR nested objects with hover tooltips
✅ Production-Ready Code: Clean architecture, proper error handling, structured logging
✅ Cloud Deployment: Successfully deployed to Google Cloud Run with public endpoint
✅ Comprehensive Documentation: Single README covering architecture, setup, usage, and troubleshooting
✅ Real Data: Works with the public FHIR Synthea dataset (1M+ patient records)
✅ Fast Performance: Queries execute in 2-5 seconds on average
✅ User-Friendly UI: Clean React interface with example queries and result visualization
What we learned
FHIR is Powerful but Complex: Understanding nested FHIR structures was crucial for generating correct queries
AI Needs Context: Gemini performs much better with detailed schema documentation and examples
User Experience Matters: Smart formatting of results is as important as generating correct queries
Prompt Engineering is Key: Small changes to AI prompts significantly impact query quality
BigQuery is Scalable: Public datasets enable rapid prototyping without data setup overhead
React + MUI = Productivity: Material-UI components accelerated frontend development
Cloud-Native Architecture: Serverless deployment (Cloud Run) simplified DevOps
What's next for FHIR NLP System
🚀 Short Term:
- Add query result export (CSV, JSON)
- Implement saved query history
- Add advanced filtering and sorting
- Support custom datasets
