Inspiration
In high-volume clinics across South Asia, doctors face two problems at once: patients describe symptoms in local dialects while records are kept in formal English, and time pressure raises the risk of missing a dangerous drug-drug interaction. We built PulseScriptAI to bridge this gap.
What it does
PulseScriptAI is an intelligent medical scribe. It listens to consultations in Urdu, Hindi, or English, translates medical terms (e.g., “Bukhar” → Fever), and structures them into professional clinical notes.
Most importantly, it uses Gemini 3’s reasoning to perform live Drug-Drug Interaction (DDI) checks, flagging risky medication combinations while the consultation is still in progress.
How we built it
- Frontend: Next.js 14 with Tailwind CSS for a fast, responsive doctor dashboard
- Backend: FastAPI (Python) managing the logic and data flow
- AI Engine: Google Gemini 3 API, leveraging its long-context window and advanced reasoning for medical translation and safety enforcement
- Database: SQLite for secure patient history tracking
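As a rough illustration of the database layer, here is a minimal sketch of patient medication history in SQLite using Python's standard `sqlite3` module. The table name, column names, and helper functions are assumptions made for this example, not the project's actual schema.

```python
import sqlite3


def open_history_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per medication recorded for a patient.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS medications (
               id INTEGER PRIMARY KEY,
               patient_id TEXT NOT NULL,
               drug TEXT NOT NULL,
               dosage TEXT,
               recorded_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def record_medication(conn, patient_id: str, drug: str, dosage=None) -> None:
    """Store a medication, normalizing the drug name to lowercase."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO medications (patient_id, drug, dosage) VALUES (?, ?, ?)",
            (patient_id, drug.strip().lower(), dosage),
        )


def medication_history(conn, patient_id: str) -> list[str]:
    """Return the patient's recorded drug names in insertion order."""
    rows = conn.execute(
        "SELECT drug FROM medications WHERE patient_id = ? ORDER BY id",
        (patient_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Parameterized queries (`?` placeholders) keep patient identifiers out of the SQL string itself, which matters once the data comes from a web request.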
Challenges we ran into
Handling “Hinglish” and “Urdu-English” code-switching was tough. We had to iterate on our system prompts until Gemini reliably identified clinical entities in informal speech.
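To give a sense of what such a prompt might look like, here is a small sketch that assembles a system prompt with one few-shot example of code-switched speech. The prompt wording, the example transcript, and the `build_prompt` helper are all hypothetical. They are not the prompts we actually shipped, and no API call is made here.

```python
import json

# Hypothetical instruction text, written for this example only.
SYSTEM_PROMPT = (
    "You are a clinical scribe. The transcript may mix Urdu, Hindi, and "
    "English in one sentence. Extract clinical entities, translate them "
    "into standard English medical terms, and reply with JSON only."
)

# One illustrative few-shot pair showing Hinglish input and the English output.
FEW_SHOT = [
    {
        "transcript": "Patient ko do din se bukhar hai aur sar mein dard hai.",
        "entities": {"symptoms": ["fever", "headache"], "medications": []},
    }
]


def build_prompt(transcript: str) -> str:
    """Combine instructions, few-shot examples, and the new transcript."""
    if not transcript.strip():
        raise ValueError("transcript is empty")
    parts = [SYSTEM_PROMPT, ""]
    for shot in FEW_SHOT:
        parts.append("Transcript: " + shot["transcript"])
        parts.append("JSON: " + json.dumps(shot["entities"], ensure_ascii=False))
        parts.append("")
    parts.append("Transcript: " + transcript.strip())
    parts.append("JSON:")
    return "\n".join(parts)
```

Ending the prompt with a bare `JSON:` cue nudges the model to answer in the same shape as the example rather than in free text.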
Accomplishments that we're proud of
- Successfully implemented a real-time safety guard that flags dangerous medication combinations in seconds, with the potential to save lives in high-pressure clinical environments
What we learned
We discovered the power of Gemini 3’s multimodal reasoning, especially its ability to handle complex JSON extraction from messy, multilingual audio transcripts better than previous models.
What's next for PulseScriptAI
- Deploying the system as a secure web application for real-world clinical testing
- Expanding drug safety checks with richer medical knowledge sources
- Improving speech-to-text accuracy for noisy clinic environments
- Exploring integration with Electronic Health Record (EHR) systems
Gemini 3 Integration
PulseScriptAI is built around the Gemini 3 API as its core intelligence engine. The system uses Gemini 3’s advanced reasoning capabilities to transform unstructured, multilingual doctor-patient conversations into structured clinical documentation in real time.
Gemini 3 processes mixed-language transcripts containing Urdu, Hindi, and English, including informal and code-switched speech commonly used in South Asian clinics. Using its long-context understanding, the model identifies clinical entities such as symptoms, diagnoses, medications, dosages, and patient history, and outputs them in a structured JSON format suitable for electronic medical records.
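The structured output described above could be validated on the backend before it is stored. The sketch below parses a JSON string into typed records. The field names (`symptoms`, `diagnoses`, `medications`, `history`) are assumed from the entity list in the previous paragraph and may not match the real schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Medication:
    name: str
    dosage: str = ""


@dataclass
class ClinicalNote:
    symptoms: list = field(default_factory=list)
    diagnoses: list = field(default_factory=list)
    medications: list = field(default_factory=list)
    history: list = field(default_factory=list)


def _string_list(data: dict, key: str) -> list:
    """Read an optional list of strings, rejecting any other shape."""
    value = data.get(key, [])
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"'{key}' must be a list of strings")
    return [v.strip() for v in value]


def parse_note(raw: str) -> ClinicalNote:
    """Parse model output; raises ValueError on malformed JSON or wrong types."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    meds = []
    for item in data.get("medications", []):
        if not isinstance(item, dict) or not isinstance(item.get("name"), str):
            raise ValueError("each medication needs a string 'name'")
        meds.append(Medication(item["name"].strip(), str(item.get("dosage", ""))))
    return ClinicalNote(
        symptoms=_string_list(data, "symptoms"),
        diagnoses=_string_list(data, "diagnoses"),
        medications=meds,
        history=_string_list(data, "history"),
    )
```

Rejecting malformed output early means a bad model response surfaces as an error the dashboard can show, instead of a silently incomplete record.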
A key capability enabled by Gemini 3 is real-time Drug-Drug Interaction (DDI) analysis. As medications are extracted, Gemini 3 reasons over the patient’s existing medication history and flags potentially unsafe combinations instantly, allowing doctors to intervene during the consultation.
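The model does the reasoning in this flow, but the shape of the check (each newly extracted drug compared against the patient's history and against the other new drugs) can be sketched with a plain lookup table. This is a deliberately simplified stand-in, not how Gemini evaluates interactions. The two table entries are well-known textbook examples included for illustration only and are not a clinical reference.

```python
# Illustrative interaction table keyed by unordered drug pairs.
# Assumption: names are already normalized to lowercase generic names.
INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
    frozenset({"sildenafil", "nitroglycerin"}): "risk of severe hypotension",
}


def check_interactions(new_drugs: list[str], history: list[str]) -> list[tuple]:
    """Return (pair, warning) for each known interaction involving a new drug."""
    new_norm = [d.strip().lower() for d in new_drugs]
    hist_norm = [d.strip().lower() for d in history]
    flags = []
    seen = set()
    for drug in new_norm:
        # Compare against existing history and the other drugs in this visit.
        for other in hist_norm + new_norm:
            pair = frozenset({drug, other})
            if len(pair) == 2 and pair in INTERACTIONS and pair not in seen:
                seen.add(pair)
                flags.append((tuple(sorted(pair)), INTERACTIONS[pair]))
    return flags
```

For example, `check_interactions(["Aspirin"], ["warfarin", "metformin"])` flags the aspirin and warfarin pair and ignores metformin, since that pair is not in the table.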
Gemini 3’s low-latency responses make live clinical use possible, while its reasoning capabilities are essential for safety-critical medical decision support.
Built With
- ai
- fastapi
- gemini-3-api
- next.js
- python
- sqlite
- studio
- tailwind-css
- typescript