💡 Inspiration

Healthcare accessibility remains a critical challenge worldwide. Many people struggle to:

  • Get immediate medical guidance during off-hours or emergencies
  • Find the right healthcare provider for their specific needs
  • Overcome language or communication barriers when describing symptoms
  • Navigate complex medical information spread across the web

We envisioned a world where quality healthcare guidance is available 24/7, where finding the right doctor is as simple as having a conversation, and where technology breaks down barriers rather than creating them. That vision inspired us to build Medi AI, an intelligent medical assistant that combines cutting-edge AI with human expertise to make healthcare more accessible and responsive.

🎯 What it does

Medi AI is a comprehensive healthcare platform featuring:

AIRA - AI Responsive & Intelligent Assistant

An advanced medical AI assistant powered by OpenAI's GPT-4o that provides:

  • 24/7 Medical Guidance: Instant symptom analysis and health recommendations
  • Natural Voice Conversations: Speak naturally using Whisper speech-to-text and ElevenLabs voice synthesis
  • Contextual Understanding: Maintains conversation history for coherent, personalized interactions
  • Emergency Support: Recognizes urgent situations and provides appropriate guidance

Intelligent Caregiver Matching

A sophisticated matching algorithm that connects patients with healthcare providers based on:

  • Geographic Proximity: City, state, and country-based filtering
  • Specialization Matching: Aligns symptoms with medical specialties
  • Consultation Preferences: Chat, video, or in-person options
  • Match Scoring: Weighted algorithm considering experience, ratings, and availability

Real-Time Communication Platform

  • WebSocket-Based Chat: Instant messaging between patients and caregivers
  • Voice Call Integration: Real-time voice conversations with AI transcription
  • Multi-Modal Support: Text, voice, and audio communication channels

Web-Augmented Intelligence

  • Firecrawl Integration: Real-time web search for the latest medical information
  • Content Scraping: Extracts reliable health information from verified sources
  • Knowledge Synthesis: Combines AI reasoning with current medical literature

🔨 How we built it

Architecture & Technology Stack

Backend (FastAPI + Python)

# Core Technologies
- FastAPI 0.115.0         # High-performance async API framework
- SQLAlchemy 2.0.23       # ORM for database management
- OpenAI API 1.54.0       # GPT-4o chat & Whisper transcription
- ElevenLabs 1.8.0        # Neural voice synthesis
- Firecrawl-py 1.5.0      # Intelligent web scraping
- Pydantic 2.9.0          # Data validation & settings
- python-jose             # JWT authentication
- bcrypt & passlib        # Password hashing

Frontend (Next.js + React)

// Modern Web Stack
- Next.js 16.0.1          # React framework with App Router
- React 19.2.0            # Latest React with compiler optimizations
- TypeScript 5            # Type-safe development
- shadcn/ui               # Accessible component library
- Radix UI                # Headless UI primitives
- TailwindCSS 4           # Utility-first styling
- Lucide React            # Modern icon library

Implementation Highlights

  1. Real-Time Voice System

    • WebSocket connection for bidirectional streaming
    • Audio chunking and base64 encoding for transmission
    • Pipeline: User Voice → Whisper → GPT-4o → ElevenLabs → Audio Response
    • Conversation context management (last 10 messages)
  2. Smart Caregiver Matching

    • Multi-factor scoring algorithm: match_score = f(location, specialization, experience, ratings, mode)
    • SQL query optimization with joins and filters
    • Location-based proximity using city/state/country hierarchies
    • Symptom-to-specialty mapping
  3. Secure Authentication

    • JWT token-based authentication with refresh tokens
    • Role-based access control (Patient, Caregiver)
    • Bcrypt password hashing (12 rounds)
    • Multi-step caregiver onboarding with verification
  4. Database Design

    • User model with polymorphic roles
    • CaregiverProfile with specialization metadata
    • Conversation & Message models for chat history
    • SQLite for development, PostgreSQL-ready schema
  5. Modern Frontend Architecture

    • Server-side and client-side rendering (SSR/CSR)
    • Protected routes with AuthContext
    • Real-time WebSocket integration
    • Responsive design with mobile-first approach
    • Component composition with shadcn/ui primitives
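The symptom-to-specialty mapping in highlight 2 can be sketched as a keyword lookup. This is a minimal illustration; the keyword table and function name below are simplified stand-ins, not the full mapping used in the app:

```python
# Illustrative symptom-to-specialty lookup; the keyword table is a
# simplified stand-in for the real mapping.
SYMPTOM_SPECIALTIES = {
    "chest pain": "Cardiology",
    "rash": "Dermatology",
    "headache": "Neurology",
    "cough": "Pulmonology",
    "joint pain": "Orthopedics",
}

def map_symptoms_to_specialties(symptom_text: str) -> list[str]:
    """Return the specialties whose trigger keywords appear in the text."""
    text = symptom_text.lower()
    return sorted({spec for kw, spec in SYMPTOM_SPECIALTIES.items() if kw in text})
```

The matched specialties then feed into the caregiver query as a filter alongside location and consultation mode.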

🚧 Challenges we ran into

1. Real-Time Audio Processing

Challenge: Streaming audio over WebSocket with low latency while maintaining quality.

Solution:

  • Implemented base64 encoding/decoding for binary audio data
  • Chunked audio processing to reduce memory overhead
  • Added ping/pong heartbeat to maintain connection stability
  • Optimized buffer sizes for smooth playback
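The encode/chunk steps above can be sketched as follows. The chunk size is illustrative (in practice we tuned buffer sizes empirically), and the function names are for illustration only:

```python
import base64

CHUNK_SIZE = 32_768  # bytes per WebSocket message; illustrative value

def encode_audio_chunks(audio: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split raw audio bytes into base64 text chunks safe for JSON/WebSocket frames."""
    return [
        base64.b64encode(audio[i:i + chunk_size]).decode("ascii")
        for i in range(0, len(audio), chunk_size)
    ]

def decode_audio_chunks(chunks: list[str]) -> bytes:
    """Reassemble the original audio from its base64 chunks."""
    return b"".join(base64.b64decode(c) for c in chunks)
```

Encoding each chunk independently lets the receiver start decoding before the full clip arrives, which keeps memory overhead low.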

2. Context-Aware AI Responses

Challenge: Maintaining conversation context across multiple turns without token limit issues.

Solution:

  • Implemented rolling window of last 10 messages
  • Crafted system prompts specific to medical domain
  • Balanced context length vs. response quality
  • Added conversation reset mechanisms
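The rolling window above can be sketched with a bounded deque. Class and attribute names here are illustrative, not our exact implementation:

```python
from collections import deque

MAX_TURNS = 10  # keep only the last 10 messages, per the rolling window above

class ConversationContext:
    """Rolling window of chat messages sent to the model each turn."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.history = deque(maxlen=MAX_TURNS)  # old messages fall off automatically

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

    def messages(self) -> list[dict]:
        # The system prompt is always first and is never evicted.
        return [{"role": "system", "content": self.system_prompt}, *self.history]

    def reset(self) -> None:
        self.history.clear()
```

Because `deque(maxlen=...)` evicts automatically, token growth stays bounded without any explicit trimming logic.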

3. Caregiver Matching Accuracy

Challenge: Creating a fair and accurate scoring system for diverse caregiver profiles.

Solution:

# Weighted scoring algorithm (each factor normalized to the 0..1 range)
def match_score(location, specialization, experience, rating, mode):
    return (
        location * 0.35 +        # Proximity is crucial
        specialization * 0.30 +  # Right expertise matters
        experience * 0.15 +      # Years of practice
        rating * 0.15 +          # Patient satisfaction
        mode * 0.05              # Mode preference
    )

4. Authentication & Authorization

Challenge: Implementing secure authentication with dual user roles (patients vs. caregivers).

Solution:

  • JWT with HttpOnly cookies for web security
  • Separate registration flows with role-specific validation
  • Multi-step caregiver verification process
  • Protected routes with role-based middleware

5. TypeScript & Type Safety

Challenge: Ensuring type safety across frontend-backend communication.

Solution:

  • Defined comprehensive TypeScript interfaces
  • Created Pydantic schemas matching frontend types
  • Used Zod for runtime validation
  • Implemented API client with type inference

6. Voice Call State Management

Challenge: Coordinating audio recording, playback, and UI state.

Solution:

  • Implemented finite state machine for call states
  • Used React hooks for audio lifecycle management
  • Added visual indicators for recording/speaking states
  • Graceful error handling and reconnection logic
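The finite state machine above can be sketched as an explicit transition table. State names and transitions here are illustrative, not the exact set in our frontend hooks:

```python
from enum import Enum, auto

class CallState(Enum):
    IDLE = auto()
    CONNECTING = auto()
    RECORDING = auto()      # user is speaking
    PROCESSING = auto()     # Whisper / GPT-4o / ElevenLabs pipeline running
    PLAYING = auto()        # AI audio response playing back
    ERROR = auto()

# Allowed transitions; anything else is rejected instead of corrupting UI state.
TRANSITIONS = {
    CallState.IDLE: {CallState.CONNECTING},
    CallState.CONNECTING: {CallState.RECORDING, CallState.ERROR},
    CallState.RECORDING: {CallState.PROCESSING, CallState.ERROR},
    CallState.PROCESSING: {CallState.PLAYING, CallState.ERROR},
    CallState.PLAYING: {CallState.RECORDING, CallState.IDLE, CallState.ERROR},
    CallState.ERROR: {CallState.CONNECTING, CallState.IDLE},  # reconnection path
}

class CallStateMachine:
    def __init__(self):
        self.state = CallState.IDLE

    def transition(self, new_state: "CallState") -> bool:
        """Apply a transition if legal; return False (and stay put) otherwise."""
        if new_state in TRANSITIONS[self.state]:
            self.state = new_state
            return True
        return False
```

Rejecting illegal transitions at one choke point is what keeps the recording/speaking indicators consistent with what the audio layer is actually doing.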

🏆 Accomplishments that we're proud of

  1. Fully Functional Voice AI: Successfully integrated Whisper, GPT-4o, and ElevenLabs into a seamless voice conversation experience that feels natural and responsive.

  2. Production-Ready Architecture: Built a scalable, maintainable codebase with proper separation of concerns, error handling, and documentation.

  3. Smart Matching Algorithm: Developed a sophisticated caregiver matching system that considers multiple factors to find the best healthcare provider match.

  4. Real-Time Communication: Implemented WebSocket-based real-time chat and voice features that work reliably across different network conditions.

  5. Beautiful, Accessible UI: Created a modern, professional healthcare interface using shadcn/ui that's both visually appealing and accessible to all users.

  6. Comprehensive Feature Set: Despite time constraints, delivered a complete ecosystem including:

    • AI chat assistant
    • Voice conversations
    • Caregiver matching
    • Patient-caregiver messaging
    • Authentication system
    • Web search integration
  7. Security Best Practices: Implemented industry-standard security with JWT, bcrypt, CORS, and environment-based configuration.

📚 What we learned

Technical Learnings

  1. WebSocket Mastery: Deepened understanding of real-time bidirectional communication, connection lifecycle management, and binary data streaming.

  2. AI Integration Patterns: Learned effective strategies for:

    • Prompt engineering for medical contexts
    • Managing API rate limits and costs
    • Handling AI service failures gracefully
    • Balancing response quality vs. latency
  3. Modern React Patterns: Explored React 19 features, server components, and the new Next.js App Router architecture.

  4. Audio Processing: Gained hands-on experience with Web Audio API, audio encoding/decoding, and streaming audio data.

  5. Type-Safe Full-Stack Development: Appreciated the value of TypeScript and Pydantic for catching errors at development time rather than runtime.

Domain Learnings

  1. Healthcare UX: Understood the unique requirements of medical interfaces: clarity, accessibility, and trustworthiness are paramount.

  2. Medical Information Handling: Learned about the importance of disclaimers, accuracy, and appropriate escalation to human professionals.

  3. User Privacy: Recognized the critical importance of HIPAA-like considerations in healthcare applications.

Process Learnings

  1. Iterative Development: Started with MVP features and progressively added complexity based on what worked.

  2. API-First Design: Designing the API contracts early streamlined parallel frontend-backend development.

  3. Documentation Matters: Comprehensive README and inline comments saved countless debugging hours.

🚀 What's next for Medi AI

Short-term Enhancements (1-3 months)

  1. Enhanced AI Capabilities

    • Multi-language support for global accessibility
    • Image recognition for analyzing medical reports/symptoms
    • Integration with more AI models (Anthropic Claude, Google Med-PaLM)
  2. Advanced Features

    • Appointment scheduling and calendar integration
    • Prescription management and reminders
    • Health records storage with encryption
    • Integration with wearable devices (Fitbit, Apple Health)
  3. Improved Matching

    • Machine learning-based recommendation system
    • Patient-caregiver compatibility scoring
    • Insurance and payment integration
    • Availability calendar for real-time booking

Medium-term Goals (3-6 months)

  1. Mobile Applications

    • React Native apps for iOS and Android
    • Push notifications for messages and reminders
    • Offline mode for viewing past conversations
  2. Telemedicine Integration

    • Video consultation with WebRTC
    • Screen sharing for reviewing reports
    • Virtual waiting rooms
  3. Analytics & Insights

    • Dashboard for caregivers showing patient analytics
    • Health trend tracking for patients
    • AI-powered health predictions
  4. Compliance & Security

    • HIPAA compliance certification
    • End-to-end encryption for messages
    • Audit logs and compliance reporting
    • Multi-factor authentication

Long-term Vision (6-12 months)

  1. AI Hospital

    • Full virtual hospital experience
    • Multi-specialty consultations
    • Emergency triage system
    • Integration with physical hospitals
  2. Research & Development

    • Contribute to medical AI research
    • Partner with healthcare institutions
    • Clinical trials for AI-assisted diagnosis
  3. Global Expansion

    • Multi-country deployment
    • Localization for different healthcare systems
    • Partnerships with insurance providers
    • Integration with national health systems
  4. Community Features

    • Patient support groups
    • Health education content
    • Caregiver collaboration tools
    • Medical professional networking
