Inspiration

Millions of people search for health information online but encounter misinformation or confusing medical jargon, especially in regional languages. We wanted to build an open-source AI tool that makes reliable healthcare knowledge accessible to everyone, regardless of language or technical background.

What it does

DoctorAI-QA v2 is a multilingual AI healthcare assistant that:

  • Answers health questions in 5 languages — English, Hindi, Telugu, Marathi, and Arabic
  • Auto-detects the user's language from both native script and romanized text
  • Reads answers aloud using text-to-speech in the detected language
  • Provides a 3-step Symptom Checker to assess urgency (High / Medium / Low)
  • Shows confidence scores and reasoning for every response
  • Adjusts safety disclaimers to context, with more urgent wording for serious symptoms
  • Tracks consultation history with a confidence trend graph
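
As a rough illustration of how an urgency level can drive the disclaimer wording, here is a minimal sketch. The keyword sets, the disclaimer texts, and the names `assess` and `Assessment` are all hypothetical and exist only for this example; they are not the project's actual rules and are not clinical guidance. Only the three levels (High / Medium / Low) come from the feature list above.

```python
from dataclasses import dataclass

# Hypothetical keyword sets, for illustration only (not clinical guidance,
# and not the rules used by the project).
HIGH_KEYWORDS = {"chest pain", "difficulty breathing", "unconscious"}
MEDIUM_KEYWORDS = {"fever", "vomiting", "persistent cough"}

# Hypothetical disclaimer wording: stronger text for higher urgency.
DISCLAIMERS = {
    "High": "This may be urgent. Please seek emergency care now. This tool does not diagnose.",
    "Medium": "Consider seeing a doctor soon. This tool does not diagnose.",
    "Low": "For educational purposes only. This tool does not diagnose.",
}


@dataclass
class Assessment:
    urgency: str      # "High", "Medium", or "Low"
    disclaimer: str   # disclaimer text matched to the urgency level


def assess(symptoms: str) -> Assessment:
    """Map free-text symptoms to an urgency level and a matching disclaimer."""
    text = symptoms.lower()
    if any(keyword in text for keyword in HIGH_KEYWORDS):
        level = "High"
    elif any(keyword in text for keyword in MEDIUM_KEYWORDS):
        level = "Medium"
    else:
        level = "Low"
    return Assessment(urgency=level, disclaimer=DISCLAIMERS[level])
```

Checking the High set first means a message that mentions both a serious and a mild symptom is treated as urgent, which is the conservative choice for this kind of tool.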

How we built it

  • Stage 1: Fine-tuned GPT-OSS 20B on healthcare datasets using LoRA + Unsloth in Google Colab. Deployed a working Gradio demo on HuggingFace Spaces.
  • Stage 2: Rebuilt the entire interface with Streamlit. Added multilingual NLP detection, gTTS voice output, symptom checker flow, severity detection, and an analytics dashboard.

Tech Stack:

  • Base model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
  • Fine-tuning: LoRA via Unsloth + HuggingFace TRL
  • Interface: Streamlit + custom CSS (glassmorphism dark theme)
  • Multilingual TTS: gTTS (supports hi, te, mr, ar, en)
  • API: OpenRouter (LLaMA 3 8B for live inference)
  • Hosting: HuggingFace (model weights) + Streamlit Cloud (demo)
  • License: Apache-2.0
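
The voice output step can be sketched roughly as follows, assuming the gTTS package is installed and the machine has network access. The helper names `tts_lang` and `synthesize_mp3` and the fallback to English for unsupported codes are assumptions made for this example; only the five language codes come from the stack listed above.

```python
import io

# Language codes listed in the tech stack.
SUPPORTED_TTS_LANGS = {"en", "hi", "te", "mr", "ar"}


def tts_lang(detected: str) -> str:
    """Normalize a detected language code; falling back to English is an assumption."""
    code = detected.strip().lower()
    return code if code in SUPPORTED_TTS_LANGS else "en"


def synthesize_mp3(text: str, detected_lang: str) -> bytes:
    """Return MP3 bytes for `text`. Requires the gTTS package and network access."""
    from gtts import gTTS  # imported lazily so tts_lang() works without gTTS installed

    buffer = io.BytesIO()
    gTTS(text=text, lang=tts_lang(detected_lang)).write_to_fp(buffer)
    return buffer.getvalue()
```

In a Streamlit app, the returned bytes could then be handed to `st.audio(..., format="audio/mp3")` so that playback happens in the user's browser.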

Challenges we ran into

  • Romanized language detection (e.g. "naku jwaram undi" → Telugu) required custom keyword mapping, since standard language-detection libraries often misclassify romanized text
  • Making TTS work across 5 languages including Arabic and Telugu in a browser environment
  • Keeping the model lightweight enough for real-time responses while maintaining answer quality
  • Designing a UI that feels premium inside Streamlit's constraints
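
The romanized-detection challenge above can be sketched as a two-stage check: look for native-script characters first, then fall back to keyword matching on Latin-script text. This is a minimal sketch, not the project's implementation. The keyword lists, the Marathi marker words, and the function name `detect_language` are hypothetical, and a real list would be far larger. Because Hindi and Marathi share the Devanagari script, the sketch separates them with a few marker words and defaults to Hindi.

```python
# Unicode block ranges for the native scripts involved.
SCRIPT_RANGES = {
    "telugu": (0x0C00, 0x0C7F),
    "arabic": (0x0600, 0x06FF),
    "devanagari": (0x0900, 0x097F),
}

# Hypothetical romanized keyword lists (illustrative, far from complete).
ROMANIZED_KEYWORDS = {
    "te": {"naku", "jwaram", "undi"},
    "hi": {"mujhe", "bukhar", "hai"},
    "mr": {"mala", "aahe", "zala"},
    "ar": {"ana", "andi", "harara"},
}

# Hypothetical marker words used to tell Marathi from Hindi in Devanagari.
MARATHI_MARKERS = {"आहे", "मला"}


def detect_language(text: str) -> str:
    """Return one of 'en', 'hi', 'te', 'mr', 'ar' for the given text."""
    # Stage 1: count characters that fall inside each native-script block.
    counts = {name: 0 for name in SCRIPT_RANGES}
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[name] += 1
    script, hits = max(counts.items(), key=lambda item: item[1])
    if hits > 0:
        if script == "telugu":
            return "te"
        if script == "arabic":
            return "ar"
        # Devanagari is shared by Hindi and Marathi; default to Hindi.
        return "mr" if set(text.split()) & MARATHI_MARKERS else "hi"

    # Stage 2: Latin-script input, scored by keyword overlap per language.
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & words) for lang, words in ROMANIZED_KEYWORDS.items()}
    best, score = max(scores.items(), key=lambda item: item[1])
    return best if score > 0 else "en"
```

With these lists, the example from the bullet above, "naku jwaram undi", matches three Telugu keywords and no others, so it resolves to Telugu. Text with no keyword hits falls through to English.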

Accomplishments we're proud of

  • Auto-detects 5 languages from both native script and romanized text
  • Built a fully working multilingual healthcare AI accessible from any browser
  • Symptom checker provides actionable urgency guidance without any medical diagnosis
  • Complete open-source stack — anyone can fork, run, or extend it

What we learned

  • Romanized text detection requires domain-specific keyword lists, not generic language models
  • Educational framing + safety disclaimers are essential for responsible AI in healthcare
  • Streamlit can support surprisingly rich UIs with custom CSS injection
  • LoRA fine-tuning on domain-specific data significantly improves response relevance
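
For the custom CSS point above, a common pattern is to build a style block as a string and pass it to `st.markdown` with `unsafe_allow_html=True`. The sketch below assumes that pattern. The class name `glass-card`, the default values, and the helper names are hypothetical and not taken from the project's stylesheet.

```python
def glass_css(blur_px: int = 12, alpha: float = 0.08) -> str:
    """Build a <style> block for a translucent 'glass' card (hypothetical class name)."""
    return (
        "<style>\n"
        ".glass-card {\n"
        f"  background: rgba(255, 255, 255, {alpha});\n"
        f"  backdrop-filter: blur({blur_px}px);\n"
        "  border-radius: 16px;\n"
        "}\n"
        "</style>"
    )


def inject_css() -> None:
    """Inject the style block into the running Streamlit page."""
    import streamlit as st  # imported lazily so glass_css() works without Streamlit

    st.markdown(glass_css(), unsafe_allow_html=True)
```

Calling `inject_css()` once near the top of the app script makes the class available to any HTML rendered later through `st.markdown`.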

What's next for DoctorAI-QA

  • Expand to Bangla, Tamil, and Punjabi
  • Add mental health and preventive care datasets
  • Integrate voice input (speech-to-text) for hands-free use
  • Deploy a dedicated API for third-party integrations
