AI Healthcare Symptom & Facility Assistant

💡 Inspiration

Navigating the healthcare system is daunting. When patients experience symptoms like a "fluttering heart" or "blurry vision", they rarely know which medical specialty, procedure, or diagnostic equipment is required. Traditional hospital directories rely on exact keyword matching, failing to bridge the gap between patient symptoms and structured medical capabilities. We wanted to build a conversational assistant that lets users describe their symptoms in plain English, semantically maps them to the right facilities, and ranks them by actual credibility and capability rather than marketing copy.

🛠️ What it does

The AI Healthcare Symptom & Facility Assistant is a Retrieval-Augmented Generation (RAG) chatbot that guides users to the most relevant local healthcare facilities:
- Conversational Chatbot: Patients describe their symptoms or ask direct questions (e.g. "which hospital has an emergency room?" or "is there a pediatric doctor nearby?").
- Semantic Symptom Mapping: Uses Databricks Vector Search to convert symptoms into embeddings and query the facility database. It links symptoms semantically to specialties (e.g., matching "blurry vision" to ophthalmology clinics).
- Blended Credibility Ranking: Evaluates facility credibility based on structured signals (affiliated staff, established year, page updates, social media presence) and blends it with semantic similarity to sort recommendations.
- Reference Dashboard: Displays structured, expandable facility cards in a side panel with contact details, address, procedures, and equipment.
- Empathetic & Cited Guidance: Powered by Meta Llama 3.3 70B, the assistant provides supportive replies with inline citations to the recommended facilities.

🏗️ How we built it

We leveraged the Databricks Lakehouse Platform to construct a secure and scalable RAG application:
- Databricks SQL Warehouse: Served as the storage and processing engine for the structured healthcare facilities dataset (10,088 records).
- Materialized Source Table (CDF enabled): Prepared a consolidated Unity Catalog table, workspace.foundation.facilities_vector_source, combining name, specialties, procedures, equipment, capabilities, and descriptions. Change Data Feed (CDF) is enabled on the table so the Delta Sync index can pick up changes.
- Databricks Vector Search: Built a managed Delta Sync index (facilities_vector_index) using the databricks-gte-large-en embedding model for semantic capability retrieval.
- Llama 3.3 70B Chat Completion: Connected via Databricks Model Serving (databricks-meta-llama-3-3-70b-instruct) for healthcare reasoning and response synthesis.
- Streamlit: The front end is built with Streamlit and deployed directly on Databricks Apps with automatic SSO and service principal credential mapping.

🚧 Challenges we ran into

- Index Endpoint Binding: Managed Delta Sync indexes on standard endpoints take some time to initialize and bind. We designed a fallback mechanism in Python so the application degrades gracefully to keyword-based SQL LIKE matching whenever the index is initializing or offline.
- Service Principal UC Permissions: We encountered an "Insufficient permissions for UC entity" error in the deployed Databricks App container. We resolved this by executing direct SQL grants to the App's Service Principal Client ID on the vector index and source tables.
- SDK Object Typing: The Databricks Python SDK requires typed ChatMessage and ChatMessageRole classes for serving endpoint queries. We resolved list-to-dict serialization bugs by mapping raw dicts into these structured classes.

🎉 Accomplishments that we're proud of

- Zero-Cold-Start Search: Built a functional semantic RAG chat assistant that keeps answering through the keyword fallback, with no downtime, even while the index pipeline sync is finalizing.
- Credibility Engine: Engineered a multi-signal SQL credibility score that factors in years of operation, staff listings, and verified data recency, so trust signals come from structured data rather than being hallucinated by the model.
- Clean UI/UX: Delivered a dual-column layout (Chat history + Reference Dashboard) where the visual references update instantly in response to the chat state.

🧠 What we learned

- How to leverage Unity Catalog to share and govern search indexes securely.
- How to implement standard Databricks Model Serving chat endpoints inside Streamlit apps using typed SDK parameters.
- The power of combining vector search candidate generation with SQL-level analytical sorting to merge unstructured semantic search with structured credibility metadata.

🚀 What's next for the project

- Geolocation Routing: Integrate GPS coordinates or ZIP code distances into the SQL warehouse queries to prioritize the physically closest clinics.
- Appointment Scheduling: Connect the chatbot to a booking database to schedule appointments directly with recommended facilities.
- Multilingual Symptom Support: Scale the model serving setup to support multilingual inputs so users can describe symptoms in their native languages.

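
To make the blended credibility ranking described above more concrete, here is a minimal Python sketch of the idea. The project computes its score in SQL; everything specific below (the `Facility` fields, the caps, the signal weights, and the `alpha` blend factor) is an illustrative assumption, not the values we actually shipped.

```python
from dataclasses import dataclass


@dataclass
class Facility:
    # All field names are hypothetical stand-ins for the structured signals.
    name: str
    similarity: float        # semantic similarity from vector search, assumed in [0, 1]
    established_year: int
    staff_count: int         # number of affiliated staff listed
    days_since_update: int   # age of the most recent page update
    has_social_media: bool


def credibility_score(f: Facility, current_year: int = 2025) -> float:
    """Combine structured signals into a score in [0, 1] (weights are assumed)."""
    years = min(max(current_year - f.established_year, 0), 50) / 50   # cap at 50 years
    staff = min(f.staff_count, 20) / 20                               # cap at 20 staff
    recency = max(0.0, 1.0 - f.days_since_update / 365)               # stale after a year
    social = 1.0 if f.has_social_media else 0.0
    return 0.3 * years + 0.3 * staff + 0.3 * recency + 0.1 * social


def blended_rank(facilities, alpha: float = 0.7, current_year: int = 2025):
    """Sort by alpha * similarity + (1 - alpha) * credibility, best first."""
    def blended(f: Facility) -> float:
        return alpha * f.similarity + (1 - alpha) * credibility_score(f, current_year)
    return sorted(facilities, key=blended, reverse=True)
```

With `alpha` below 1, a long-established, well-documented facility can outrank a slightly closer semantic match that has thin structured data, which is the behavior the blend is meant to produce. Setting `alpha=1.0` reduces the ranking to pure semantic similarity.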
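
The index-binding fallback described under the challenges can be sketched in the same spirit. This is an assumption-laden outline rather than our production code: `vector_search` and `keyword_search` are hypothetical callables standing in for the Vector Search query and the SQL LIKE query, and `like_pattern` assumes backslash is the LIKE escape character.

```python
import logging
from typing import Callable, List, Sequence, Tuple

logger = logging.getLogger(__name__)

# Hypothetical signature: a search function takes the user query and returns rows.
SearchFn = Callable[[str], Sequence[dict]]


def like_pattern(term: str) -> str:
    """Build a '%term%' pattern, escaping LIKE wildcards in user input."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def search_with_fallback(
    query: str,
    vector_search: SearchFn,
    keyword_search: SearchFn,
) -> Tuple[List[dict], str]:
    """Try semantic search first; on any failure, fall back to keyword search.

    Returns the rows plus a label naming the path that produced them, so the
    UI can tell the user when results come from the keyword fallback.
    """
    try:
        return list(vector_search(query)), "vector"
    except Exception as exc:  # e.g. index still initializing or endpoint offline
        logger.warning("Vector search unavailable, using keyword fallback: %s", exc)
        return list(keyword_search(query)), "keyword"
```

Passing the two search paths in as callables keeps the fallback logic testable without a live workspace. The pattern from `like_pattern` is best sent to the warehouse as a bound query parameter rather than interpolated into the SQL string.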
Built With
- databricks
- databricks-apps-(serverless-app-platform)
- databricks-gte-large-en
- databricks-meta-llama-3-3-70b-instruct-(meta-llama-3.3-70b)
- databricks-model-serving
- databricks-python-sdk
- databricks-sql-statement-execution-api
- databricks-sql-warehouse
- databricks-vector-search-api
- delta-lake-(with-change-data-feed)
- openai-compatible-chat-completions-api
- python
- sql
- streamlit
- unity-catalog