DoctorAI-QA v2

Inspiration

Millions of people search for health information online but encounter misinformation or confusing medical jargon, especially in regional languages. We wanted to build an open-source AI tool that makes reliable healthcare knowledge accessible to everyone, regardless of language or technical background.

What it does

DoctorAI-QA v2 is a multilingual AI healthcare assistant that:

Answers health questions in 5 languages — English, Hindi, Telugu, Marathi, and Arabic
Auto-detects the user's language from both native script and romanized text
Reads answers aloud using text-to-speech in the detected language
Provides a 3-step Symptom Checker to assess urgency (High / Medium / Low)
Shows confidence scores and reasoning for every response
Shows context-aware safety disclaimers, with stronger warnings for serious symptoms
Tracks consultation history with a confidence trend graph
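
To illustrate how a 3-step symptom check could map to an urgency level and a matching disclaimer, here is a minimal sketch. The three steps (main symptom, duration, self-rated severity), the thresholds, the red-flag terms, and the disclaimer wording are all assumptions made for this example and are not taken from the project's code.

```python
from dataclasses import dataclass

# Hypothetical red-flag phrases; a real list would need clinical review.
RED_FLAGS = {"chest pain", "difficulty breathing", "unconscious", "severe bleeding"}

# Hypothetical disclaimer text, worded more strongly as urgency rises.
DISCLAIMERS = {
    "High": "Seek emergency medical care now. This tool does not provide a diagnosis.",
    "Medium": "Consider seeing a doctor soon. This tool does not provide a diagnosis.",
    "Low": "For education only. See a doctor if symptoms persist or worsen.",
}


@dataclass
class SymptomReport:
    symptom: str        # step 1: main symptom, free text
    duration_days: int  # step 2: how long the symptom has lasted
    severity: int       # step 3: self-rated severity, 1 (mild) to 10 (worst)


def assess_urgency(report: SymptomReport) -> str:
    """Return "High", "Medium", or "Low" using assumed thresholds."""
    text = report.symptom.lower()
    if any(flag in text for flag in RED_FLAGS) or report.severity >= 8:
        return "High"
    if report.severity >= 5 or report.duration_days > 7:
        return "Medium"
    return "Low"


report = SymptomReport("Chest pain when walking", duration_days=1, severity=4)
level = assess_urgency(report)
print(level, "-", DISCLAIMERS[level])
```

Keeping the rules in plain data (a set and a dictionary) makes them easy to review and adjust without touching the flow of the checker.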

How we built it

Stage 1: Fine-tuned GPT-OSS 20B on healthcare datasets using LoRA + Unsloth in Google Colab. Deployed a working Gradio demo on HuggingFace Spaces.
Stage 2: Rebuilt the entire interface with Streamlit. Added multilingual NLP detection, gTTS voice output, symptom checker flow, severity detection, and an analytics dashboard.

Tech Stack:

Base model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
Fine-tuning: LoRA via Unsloth + HuggingFace TRL
Interface: Streamlit + custom CSS (glassmorphism dark theme)
Multilingual TTS: gTTS (supports hi, te, mr, ar, en)
API: OpenRouter (LLaMA 3 8B for live inference)
Hosting: HuggingFace (model weights) + Streamlit Cloud (demo)
License: Apache-2.0
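
As a rough illustration of the gTTS entry in the stack, the sketch below maps a detected language to one of the five codes listed above and produces MP3 bytes in memory. The helper names `tts_code` and `synthesize_mp3` and the English fallback are assumptions for this example; only the `gTTS(text=..., lang=...)` constructor and its `write_to_fp` method come from the gTTS library.

```python
import io

# Language names used in this writeup mapped to the gTTS codes listed above.
TTS_CODES = {
    "English": "en",
    "Hindi": "hi",
    "Telugu": "te",
    "Marathi": "mr",
    "Arabic": "ar",
}


def tts_code(language: str) -> str:
    """Return the gTTS code for a language name (assumed fallback: English)."""
    return TTS_CODES.get(language, "en")


def synthesize_mp3(text: str, language: str) -> bytes:
    """Return MP3 bytes for `text`. Needs network access, since gTTS calls Google Translate."""
    from gtts import gTTS  # imported lazily so the mapping works without gTTS installed

    buffer = io.BytesIO()
    gTTS(text=text, lang=tts_code(language)).write_to_fp(buffer)
    return buffer.getvalue()
```

In a Streamlit app, the returned bytes could then be passed to `st.audio(audio_bytes, format="audio/mp3")` so that playback happens in the browser.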

Challenges we ran into

Romanized language detection (e.g. "naku jwaram undi" → Telugu) required custom keyword mapping, since standard language-detection libraries did not handle Latin-script transliterations reliably
Making TTS work across 5 languages including Arabic and Telugu in a browser environment
Keeping the model lightweight enough for real-time responses while maintaining answer quality
Designing a UI that feels premium inside Streamlit's constraints
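
The romanized detection challenge can be sketched as a two-stage check: look for native script first using Unicode ranges, then fall back to keyword scoring for Latin-script input. The keyword lists, the Marathi markers, and the function name `detect_language` below are hypothetical and kept tiny for illustration; the project's actual mapping is not shown in this writeup.

```python
import re

# Unicode blocks for the native scripts. Hindi and Marathi share Devanagari.
SCRIPT_RANGES = {
    "Telugu": (0x0C00, 0x0C7F),
    "Arabic": (0x0600, 0x06FF),
    "Devanagari": (0x0900, 0x097F),
}

# Hypothetical Devanagari words that suggest Marathi rather than Hindi.
MARATHI_MARKERS = {"आहे", "मला", "नाही"}

# Hypothetical romanized keyword lists; a usable system would need far larger ones.
ROMANIZED_KEYWORDS = {
    "Telugu": {"naku", "undi", "jwaram", "noppi", "ledu"},
    "Hindi": {"mujhe", "hai", "bukhar", "dard", "nahi"},
    "Marathi": {"mala", "aahe", "ahe", "taap"},
    "Arabic": {"ana", "andi", "sudaa", "alam"},
}


def detect_language(text: str) -> str:
    """Guess the language from native script, then from romanized keywords."""
    for ch in text:
        code_point = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                if script != "Devanagari":
                    return script
                tokens = set(text.split())
                return "Marathi" if tokens & MARATHI_MARKERS else "Hindi"

    # No native script found: count keyword hits per language.
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {
        lang: sum(token in keywords for token in tokens)
        for lang, keywords in ROMANIZED_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "English"


print(detect_language("naku jwaram undi"))
```

Checking the script first keeps native-script input cheap and unambiguous, so the keyword lists only have to cover the romanized case.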

Accomplishments we're proud of

Auto-detects 5 languages from both native script and romanized text
Built a fully working multilingual healthcare AI accessible from any browser
Symptom checker provides actionable urgency guidance without making a medical diagnosis
Complete open-source stack — anyone can fork, run, or extend it

What we learned

Romanized text detection requires domain-specific keyword lists; generic language-detection models are not enough
Educational framing + safety disclaimers are essential for responsible AI in healthcare
Streamlit can support surprisingly rich UIs with custom CSS injection
LoRA fine-tuning on domain-specific data significantly improves response relevance