💡 Inspiration
Healthcare accessibility remains a critical challenge worldwide. Many people struggle to:
- Get immediate medical guidance during off-hours or emergencies
- Find the right healthcare provider for their specific needs
- Overcome language or communication barriers when describing symptoms
- Navigate complex medical information spread across the web
We envisioned a world where quality healthcare guidance is available 24/7, where finding the right doctor is as simple as having a conversation, and where technology breaks down barriers rather than creating them. This vision inspired us to build Medi AI - an intelligent medical assistant that combines cutting-edge AI with human expertise to make healthcare more accessible and responsive.
🎯 What it does
Medi AI is a comprehensive healthcare platform featuring:
AIRA - AI Responsive & Intelligent Assistant
An advanced medical AI assistant powered by OpenAI's GPT-4o that provides:
- 24/7 Medical Guidance: Instant symptom analysis and health recommendations
- Natural Voice Conversations: Speak naturally using Whisper speech-to-text and ElevenLabs voice synthesis
- Contextual Understanding: Maintains conversation history for coherent, personalized interactions
- Emergency Support: Recognizes urgent situations and provides appropriate guidance
Intelligent Caregiver Matching
A sophisticated matching algorithm that connects patients with healthcare providers based on:
- Geographic Proximity: City, state, and country-based filtering
- Specialization Matching: Aligns symptoms with medical specialties
- Consultation Preferences: Chat, video, or in-person options
- Match Scoring: Weighted algorithm considering experience, ratings, and availability
Real-Time Communication Platform
- WebSocket-Based Chat: Instant messaging between patients and caregivers
- Voice Call Integration: Real-time voice conversations with AI transcription
- Multi-Modal Support: Text, voice, and audio communication channels
Web-Augmented Intelligence
- Firecrawl Integration: Real-time web search for latest medical information
- Content Scraping: Extract reliable health information from verified sources
- Knowledge Synthesis: Combines AI reasoning with current medical literature
🔨 How we built it
Architecture & Technology Stack
Backend (FastAPI + Python)
# Core Technologies
- FastAPI 0.115.0 # High-performance async API framework
- SQLAlchemy 2.0.23 # ORM for database management
- OpenAI API 1.54.0 # GPT-4o chat & Whisper transcription
- ElevenLabs 1.8.0 # Neural voice synthesis
- Firecrawl-py 1.5.0 # Intelligent web scraping
- Pydantic 2.9.0 # Data validation & settings
- Python-Jose # JWT authentication
- Bcrypt & Passlib # Password hashing
Frontend (Next.js + React)
// Modern Web Stack
- Next.js 16.0.1 # React framework with App Router
- React 19.2.0 # Latest React with compiler optimizations
- TypeScript 5 # Type-safe development
- shadcn/ui # Accessible component library
- Radix UI # Headless UI primitives
- TailwindCSS 4 # Utility-first styling
- Lucide React # Modern icon library
Implementation Highlights
Real-Time Voice System
- WebSocket connection for bidirectional streaming
- Audio chunking and base64 encoding for transmission
- Pipeline:
User Voice → Whisper → GPT-4o → ElevenLabs → Audio Response - Conversation context management (last 10 messages)
Smart Caregiver Matching
- Multi-factor scoring algorithm:
match_score = f(location, specialization, experience, ratings, mode) - SQL query optimization with joins and filters
- Location-based proximity using city/state/country hierarchies
- Symptom-to-specialty mapping
- Multi-factor scoring algorithm:
Secure Authentication
- JWT token-based authentication with refresh tokens
- Role-based access control (Patient, Caregiver)
- Bcrypt password hashing (12 rounds)
- Multi-step caregiver onboarding with verification
Database Design
- User model with polymorphic roles
- CaregiverProfile with specialization metadata
- Conversation & Message models for chat history
- SQLite for development, PostgreSQL-ready schema
Modern Frontend Architecture
- Server-side and client-side rendering (SSR/CSR)
- Protected routes with AuthContext
- Real-time WebSocket integration
- Responsive design with mobile-first approach
- Component composition with shadcn/ui primitives
🚧 Challenges we ran into
1. Real-Time Audio Processing
Challenge: Streaming audio over WebSocket with low latency while maintaining quality.
Solution:
- Implemented base64 encoding/decoding for binary audio data
- Chunked audio processing to reduce memory overhead
- Added ping/pong heartbeat to maintain connection stability
- Optimized buffer sizes for smooth playback
2. Context-Aware AI Responses
Challenge: Maintaining conversation context across multiple turns without token limit issues.
Solution:
- Implemented rolling window of last 10 messages
- Crafted system prompts specific to medical domain
- Balanced context length vs. response quality
- Added conversation reset mechanisms
3. Caregiver Matching Accuracy
Challenge: Creating a fair and accurate scoring system for diverse caregiver profiles.
Solution:
# Weighted scoring algorithm
score = (
location_match * 0.35 + # Proximity is crucial
specialization_match * 0.30 + # Right expertise matters
experience_score * 0.15 + # Years of practice
rating_score * 0.15 + # Patient satisfaction
consultation_mode * 0.05 # Mode preference
)
4. Authentication & Authorization
Challenge: Implementing secure authentication with dual user roles (patients vs. caregivers).
Solution:
- JWT with HttpOnly cookies for web security
- Separate registration flows with role-specific validation
- Multi-step caregiver verification process
- Protected routes with role-based middleware
5. TypeScript & Type Safety
Challenge: Ensuring type safety across frontend-backend communication.
Solution:
- Defined comprehensive TypeScript interfaces
- Created Pydantic schemas matching frontend types
- Used Zod for runtime validation
- Implemented API client with type inference
6. Voice Call State Management
Challenge: Coordinating audio recording, playback, and UI state.
Solution:
- Implemented finite state machine for call states
- Used React hooks for audio lifecycle management
- Added visual indicators for recording/speaking states
- Graceful error handling and reconnection logic
🏆 Accomplishments that we're proud of
Fully Functional Voice AI: Successfully integrated Whisper, GPT-4o, and ElevenLabs into a seamless voice conversation experience that feels natural and responsive.
Production-Ready Architecture: Built a scalable, maintainable codebase with proper separation of concerns, error handling, and documentation.
Smart Matching Algorithm: Developed a sophisticated caregiver matching system that considers multiple factors to find the best healthcare provider match.
Real-Time Communication: Implemented WebSocket-based real-time chat and voice features that work reliably across different network conditions.
Beautiful, Accessible UI: Created a modern, professional healthcare interface using shadcn/ui that's both visually appealing and accessible to all users.
Comprehensive Feature Set: Despite time constraints, delivered a complete ecosystem including:
- AI chat assistant
- Voice conversations
- Caregiver matching
- Patient-caregiver messaging
- Authentication system
- Web search integration
Security Best Practices: Implemented industry-standard security with JWT, bcrypt, CORS, and environment-based configuration.
📚 What we learned
Technical Learnings
WebSocket Mastery: Deepened understanding of real-time bidirectional communication, connection lifecycle management, and binary data streaming.
AI Integration Patterns: Learned effective strategies for:
- Prompt engineering for medical contexts
- Managing API rate limits and costs
- Handling AI service failures gracefully
- Balancing response quality vs. latency
Modern React Patterns: Explored React 19 features, server components, and the new Next.js App Router architecture.
Audio Processing: Gained hands-on experience with Web Audio API, audio encoding/decoding, and streaming audio data.
Type-Safe Full-Stack Development: Appreciated the value of TypeScript and Pydantic for catching errors at development time rather than runtime.
Domain Learnings
Healthcare UX: Understood the unique requirements of medical interfaces - clarity, accessibility, and trustworthiness are paramount.
Medical Information Handling: Learned about the importance of disclaimers, accuracy, and appropriate escalation to human professionals.
User Privacy: Recognized the critical importance of HIPAA-like considerations in healthcare applications.
Process Learnings
Iterative Development: Started with MVP features and progressively added complexity based on what worked.
API-First Design: Designing the API contracts early streamlined parallel frontend-backend development.
Documentation Matters: Comprehensive README and inline comments saved countless debugging hours.
🚀 What's next for Medi AI
Short-term Enhancements (1-3 months)
Enhanced AI Capabilities
- Multi-language support for global accessibility
- Image recognition for analyzing medical reports/symptoms
- Integration with more AI models (Anthropic Claude, Google Med-PaLM)
Advanced Features
- Appointment scheduling and calendar integration
- Prescription management and reminders
- Health records storage with encryption
- Integration with wearable devices (FitBit, Apple Health)
Improved Matching
- Machine learning-based recommendation system
- Patient-caregiver compatibility scoring
- Insurance and payment integration
- Availability calendar for real-time booking
Medium-term Goals (3-6 months)
Mobile Applications
- React Native apps for iOS and Android
- Push notifications for messages and reminders
- Offline mode for viewing past conversations
Telemedicine Integration
- Video consultation with WebRTC
- Screen sharing for reviewing reports
- Virtual waiting rooms
Analytics & Insights
- Dashboard for caregivers showing patient analytics
- Health trend tracking for patients
- AI-powered health predictions
Compliance & Security
- HIPAA compliance certification
- End-to-end encryption for messages
- Audit logs and compliance reporting
- Multi-factor authentication
Long-term Vision (6-12 months)
AI Hospital
- Full virtual hospital experience
- Multi-specialty consultations
- Emergency triage system
- Integration with physical hospitals
Research & Development
- Contribute to medical AI research
- Partner with healthcare institutions
- Clinical trials for AI-assisted diagnosis
Global Expansion
- Multi-country deployment
- Localization for different healthcare systems
- Partnerships with insurance providers
- Integration with national health systems
Community Features
- Patient support groups
- Health education content
- Caregiver collaboration tools
- Medical professional networking
Built With
- axios
- bcrypt
- elevenlabs
- fastapi
- firecrawl
- html
- javascript
- jwt-authentication
- lucide-react
- next.js
- node.js
- openai-gpt-4o
- openai-whisper
- passlib
- postgresql
- pydantic
- python
- python-jose
- radix-ui
- react
- rest-api
- shadcn/ui
- sqlalchemy
- sqlite
- tailwindcss
- typescript
- uvicorn
- websocket
Log in or sign up for Devpost to join the conversation.