Inspiration

We wanted to create a virtual doctor that can see your face, detect your emotions, and have a natural conversation about your health concerns. The goal was to make medical consultations more accessible while using facial recognition and emotion detection to provide personalized, empathetic responses.

What it does

Auralis is an AI-powered virtual doctor that conducts face-to-face video consultations:

  • 3D Avatar Doctor: Interactive, lip-synced doctor avatar with customizable appearance
  • Real-time Emotion Detection: Analyzes facial expressions during consultation to understand patient emotional state
  • Intelligent Conversations: Powered by Google Gemini AI for natural, context-aware medical consultations
  • Voice Interaction: Natural speech-to-text and text-to-speech via ElevenLabs for seamless conversation
  • Emotion-Aware Responses: AI adapts its communication style based on detected emotions
  • Session Insights: Post-consultation summary with emotion timeline, key concerns, and recommendations
  • Professional PDF Report: Exports the consultation summary, key concerns, and recommendations as a formatted PDF
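
As an illustration of how the emotion timeline in Session Insights might be derived, the sketch below merges per-frame emotion labels into contiguous segments. The `EmotionSample` type and `build_timeline` function are hypothetical names introduced here, not taken from the Auralis codebase. The labels are assumed to be the expression classes that face-api.js reports (neutral, happy, sad, angry, fearful, disgusted, surprised).

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class EmotionSample:
    t: float       # seconds since the consultation started (assumed unit)
    emotion: str   # dominant expression label for that frame


def build_timeline(samples: list[EmotionSample]) -> list[tuple[str, float, float]]:
    """Merge consecutive samples with the same label into (emotion, start, end) segments."""
    ordered = sorted(samples, key=lambda s: s.t)
    timeline = []
    for emotion, group in groupby(ordered, key=lambda s: s.emotion):
        run = list(group)
        timeline.append((emotion, run[0].t, run[-1].t))
    return timeline


samples = [
    EmotionSample(0.0, "neutral"),
    EmotionSample(1.0, "neutral"),
    EmotionSample(2.0, "sad"),
    EmotionSample(3.0, "sad"),
    EmotionSample(4.0, "neutral"),
]
timeline = build_timeline(samples)
```

A segment list like this is compact enough to include in a post-consultation summary, and each segment can be matched against what the patient was saying at that time.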

How we built it

Frontend:

  • Next.js 16
  • Three.js (3D avatar rendering)
  • face-api.js (real-time emotion detection)
  • Framer Motion (animations)
  • Tailwind CSS
  • Cloudflare Pages for deployment

Backend:

  • FastAPI
  • Google Gemini 2.5 Flash (AI conversations)
  • ElevenLabs (TTS/STT)

Challenges we ran into

  1. Audio Synchronization: Coordinating speech recognition, AI responses, and lip-sync animation required careful state management to prevent overlapping audio

  2. Emotion Detection Accuracy: Tuning the sensitivity of emotion mismatch detection so that it catches important signals without raising false positives

  3. API Response Times: Gemini's latency could break conversation flow. We switched to Gemini 2.5 Flash for faster responses

  4. Medical Content Filtering: Gemini's safety filters blocked legitimate medical discussions. We configured safety thresholds to allow medical conversations while maintaining boundaries

  5. Branch Coordination: Four team members worked on parallel features (frontend UI, 3D avatar, backend API, audio pipeline), which required careful synchronization
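
The state management mentioned in the first challenge can be pictured as a small turn-taking state machine. This is a minimal sketch under assumptions of our own: the state names, the `TurnManager` class, and the transition table are illustrative and not the project's actual implementation.

```python
from enum import Enum, auto


class Turn(Enum):
    IDLE = auto()
    LISTENING = auto()   # microphone open, speech-to-text running
    THINKING = auto()    # waiting for the AI response
    SPEAKING = auto()    # TTS audio playing, avatar lip-syncing


# Allowed transitions (assumed); anything else is rejected so that
# listening and speaking never run at the same time.
ALLOWED = {
    Turn.IDLE: {Turn.LISTENING},
    Turn.LISTENING: {Turn.THINKING, Turn.IDLE},
    Turn.THINKING: {Turn.SPEAKING, Turn.IDLE},
    Turn.SPEAKING: {Turn.LISTENING, Turn.IDLE},
}


class TurnManager:
    def __init__(self) -> None:
        self.state = Turn.IDLE

    def transition(self, target: Turn) -> bool:
        """Move to target if the transition is allowed; otherwise stay put and return False."""
        if target in ALLOWED[self.state]:
            self.state = target
            return True
        return False
```

Routing every audio event through a single `transition` call means a late TTS callback that arrives while the patient is already speaking is refused instead of starting playback over the microphone input.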

Accomplishments that we're proud of

  • Emotion Intelligence: Implemented emotion mismatch detection that identifies when patients say "I'm fine" while looking distressed

  • Natural Conversation Flow: Created a seamless voice interface that feels responsive and natural

  • Complete End-to-End System: Built a fully functional system spanning speech processing, AI reasoning, emotion analysis, 3D rendering, and medical summarization

  • Production-Ready Architecture: Modular FastAPI backend and component-based frontend that are maintainable and scalable

  • Cloudflare Deployment: Although it was our first time using Cloudflare and we hit some challenges, we successfully deployed the frontend to Cloudflare Pages for fast, global access
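
The emotion mismatch idea from the first accomplishment could be approximated as follows. This is a hedged sketch, not the Auralis implementation: the phrase list, the set of negative emotions, the `detect_mismatch` function, and the 0.6 threshold are all assumptions chosen for illustration. The emotion scores are assumed to be per-expression probabilities such as those produced by face-api.js.

```python
# Hypothetical phrase list and emotion grouping, for illustration only.
REASSURING_PHRASES = ("i'm fine", "i am fine", "i'm ok", "nothing wrong")
NEGATIVE_EMOTIONS = ("sad", "fearful", "angry", "disgusted")


def detect_mismatch(
    transcript: str,
    emotion_scores: dict[str, float],
    threshold: float = 0.6,
) -> bool:
    """Return True when the words are reassuring but the face looks distressed."""
    # Normalize case and curly apostrophes before matching phrases.
    text = transcript.lower().replace("\u2019", "'")
    says_fine = any(phrase in text for phrase in REASSURING_PHRASES)
    # Total probability mass on negative expressions.
    distress = sum(emotion_scores.get(e, 0.0) for e in NEGATIVE_EMOTIONS)
    return says_fine and distress >= threshold
```

The `threshold` parameter is where the sensitivity trade-off lives: a lower value catches more subtle distress, while a higher value reduces false positives.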

What we learned

  • Real-time audio processing and synchronization
  • 3D rendering and animation with Three.js
  • Prompt engineering for medical AI applications
  • Multi-modal AI integration (text, voice, vision)
  • Git branching strategies for parallel development
  • API contract design for frontend-backend integration
  • Healthcare AI must balance accuracy with empathy
  • Emotion detection adds a critical dimension to remote healthcare
  • A good Git workflow is essential for team collaboration

What's next for Auralis

  • Emergency Detection: Keyword-based flagging for critical symptoms with escalation protocols
  • Multilingual Support: Leverage ElevenLabs' 29-language support
  • Visual Diagnosis: Integrate Gemini's vision capabilities to analyze photos of rashes, injuries, or swelling
  • Health Data Integration: Connect with Apple Health/Google Fit
  • HIPAA Compliance: Implement end-to-end encryption and audit logging
  • Clinical Validation: Partner with medical institutions to validate AI recommendations
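
To make the planned emergency detection more concrete, here is a minimal sketch of keyword-based flagging. It is an assumption about how such a feature could start, not an existing part of Auralis: the `EMERGENCY_PATTERNS` table and `flag_emergency` function are hypothetical, and the keyword list is illustrative and not clinically validated.

```python
import re

# Hypothetical keyword-to-category table; not clinically validated.
EMERGENCY_PATTERNS = {
    "chest pain": "cardiac",
    "can't breathe": "respiratory",
    "cannot breathe": "respiratory",
    "suicidal": "mental_health",
}


def flag_emergency(utterance: str) -> list[str]:
    """Return the sorted categories whose keywords appear in the utterance."""
    # Normalize case and curly apostrophes before matching.
    text = utterance.lower().replace("\u2019", "'")
    hits = {
        category
        for phrase, category in EMERGENCY_PATTERNS.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", text)
    }
    return sorted(hits)
```

Any non-empty result would be the trigger for an escalation step, such as interrupting the consultation to show emergency contact information.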
