Inspiration

In high-volume clinics across South Asia, doctors face two recurring problems: patients describe symptoms in local dialects while records are kept in formal English, and time pressure raises the risk of missing drug-drug interactions. We built PulseScriptAI to address both.

What it does

PulseScriptAI is an intelligent medical scribe. It listens to consultations in Urdu, Hindi, or English, translates colloquial medical terms (e.g., “Bukhar” → Fever), and structures the conversation into professional clinical notes.

Most importantly, it uses Gemini 3’s reasoning to perform live Drug-Drug Interaction (DDI) checks, so that potentially unsafe prescriptions are flagged while the patient is still in the room.

How we built it

  • Frontend: Next.js 14 with Tailwind CSS for a fast, responsive doctor dashboard
  • Backend: FastAPI (Python) managing the logic and data flow
  • AI Engine: Google Gemini 3 API
    • Long-context window for reasoning over transcripts and medication history
    • Advanced reasoning for medical translation and safety checks
  • Database: SQLite for secure patient history tracking
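
As a rough illustration of the database layer, the sketch below stores and retrieves a patient's medication history with Python's built-in sqlite3 module. The table name, columns, and helper functions are assumptions made for this example, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per medication recorded for a patient.
SCHEMA = """
CREATE TABLE IF NOT EXISTS medication_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    patient_id TEXT NOT NULL,
    drug_name TEXT NOT NULL,
    dosage TEXT,
    recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_medication(conn, patient_id: str, drug_name: str, dosage=None) -> None:
    """Record one medication; names are normalized to lowercase."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO medication_history (patient_id, drug_name, dosage) "
            "VALUES (?, ?, ?)",  # parameterized to avoid SQL injection
            (patient_id, drug_name.strip().lower(), dosage),
        )


def get_medications(conn, patient_id: str) -> list[str]:
    """Return the distinct drug names on record for a patient, sorted."""
    rows = conn.execute(
        "SELECT DISTINCT drug_name FROM medication_history "
        "WHERE patient_id = ? ORDER BY drug_name",
        (patient_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Normalizing drug names on write keeps later comparisons simple. A production system would more likely key on a standard drug identifier than on free-text names.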

Challenges we ran into

Handling “Hinglish” and “Urdu-English” code-switching was tough. We had to iterate on our system prompts repeatedly before Gemini reliably identified clinical entities in informal speech.
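
To give a sense of what this prompt work can look like, here is a minimal sketch of a system-prompt builder that primes the model with known colloquial terms. The wording of the prompt, the GLOSSARY contents beyond "bukhar", and the function name are all assumptions for illustration; the prompts actually used in PulseScriptAI are not shown here.

```python
# Illustrative glossary of romanized Urdu/Hindi terms and their clinical
# English equivalents. "bukhar" comes from the write-up above; the other
# entries are assumptions added for this example.
GLOSSARY = {
    "bukhar": "fever",
    "khansi": "cough",
    "sar dard": "headache",
}


def build_system_prompt(glossary: dict[str, str]) -> str:
    """Assemble a system prompt that primes the model for code-switched speech."""
    examples = "\n".join(
        f'- "{term}" -> {meaning}' for term, meaning in sorted(glossary.items())
    )
    return (
        "You are a medical scribe for clinics in South Asia.\n"
        "Transcripts may mix Urdu, Hindi, and English within one sentence.\n"
        "Map colloquial terms to standard clinical English before extracting "
        "symptoms, diagnoses, medications, and dosages.\n"
        "If a term is ambiguous, keep the original wording and mark it as "
        "unverified instead of guessing.\n"
        "Known term mappings:\n"
        f"{examples}\n"
    )
```

Keeping the glossary as data rather than hard-coding it in the prompt text makes it easier to extend as new dialect terms turn up in real consultations.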

Accomplishments that we're proud of

  • Implemented a real-time safety guard that flags dangerous medication combinations within seconds
  • Built a tool that could help prevent harm, and potentially save lives, in high-pressure clinical environments

What we learned

We discovered the power of Gemini 3’s multimodal reasoning, especially its ability to handle complex JSON extraction from messy, multilingual audio transcripts better than previous models.

What's next for PulseScriptAI

  • Deploying the system as a secure web application for real-world clinical testing
  • Expanding drug safety checks with richer medical knowledge sources
  • Improving speech-to-text accuracy for noisy clinic environments
  • Exploring integration with Electronic Health Record (EHR) systems

Gemini 3 Integration

PulseScriptAI is built around the Gemini 3 API as its core intelligence engine. The system uses Gemini 3’s advanced reasoning capabilities to transform unstructured, multilingual doctor-patient conversations into structured clinical documentation in real time.

Gemini 3 processes mixed-language transcripts containing Urdu, Hindi, and English, including informal and code-switched speech commonly used in South Asian clinics. Using its long-context understanding, the model identifies clinical entities such as symptoms, diagnoses, medications, dosages, and patient history, and outputs them in a structured JSON format suitable for electronic medical records.
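
The exact JSON schema is not documented here, so the following is only a sketch of how the backend might validate the model's output before saving it. The field names (symptoms, diagnoses, medications, patient_history) mirror the entities listed above, but the class names and the shape of each entry are assumptions.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Medication:
    name: str
    dosage: str = ""


@dataclass
class ClinicalNote:
    symptoms: list[str] = field(default_factory=list)
    diagnoses: list[str] = field(default_factory=list)
    medications: list[Medication] = field(default_factory=list)
    patient_history: list[str] = field(default_factory=list)


def parse_note(raw: str) -> ClinicalNote:
    """Parse model output into a ClinicalNote; raise ValueError if malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    def str_list(key: str) -> list[str]:
        value = data.get(key, [])
        if not isinstance(value, list):
            raise ValueError(f"{key!r} must be a list")
        return [str(item) for item in value]

    raw_meds = data.get("medications", [])
    if not isinstance(raw_meds, list):
        raise ValueError("'medications' must be a list")
    medications = []
    for item in raw_meds:
        # A medication without a name is useless for safety checks: reject it.
        if not isinstance(item, dict) or not item.get("name"):
            raise ValueError(f"medication entry missing a name: {item!r}")
        medications.append(
            Medication(name=str(item["name"]), dosage=str(item.get("dosage", "")))
        )

    return ClinicalNote(
        symptoms=str_list("symptoms"),
        diagnoses=str_list("diagnoses"),
        medications=medications,
        patient_history=str_list("patient_history"),
    )
```

Rejecting malformed output outright, rather than silently repairing it, means an extraction failure is visible to the doctor instead of producing an incomplete note.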

A key capability enabled by Gemini 3 is real-time Drug-Drug Interaction (DDI) analysis. As medications are extracted, Gemini 3 reasons over the patient’s existing medication history and flags potentially unsafe combinations within seconds, allowing doctors to intervene during the consultation.
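
The DDI step described above could be wired up roughly as follows. This is a sketch under stated assumptions: ask_model is a hypothetical callable standing in for the Gemini request (no real client API is shown), and the prompt text and response shape are invented for the example.

```python
import json
from typing import Callable


def build_ddi_prompt(new_meds: list[str], history: list[str]) -> str:
    """Build a prompt asking the model to check new drugs against the history."""
    return (
        "You are a clinical decision-support assistant.\n"
        "Check every newly prescribed drug against the patient's current "
        "medications and against the other newly prescribed drugs.\n"
        'Respond with JSON only, in the form {"interactions": [{"drugs": '
        '["a", "b"], "severity": "low|moderate|high", "reason": "..."}]}.\n'
        'Return {"interactions": []} if you find none.\n'
        f"Newly prescribed: {json.dumps(new_meds)}\n"
        f"Current medications: {json.dumps(history)}\n"
    )


def check_interactions(
    new_meds: list[str],
    history: list[str],
    ask_model: Callable[[str], str],  # hypothetical wrapper around the model call
) -> list[dict]:
    """Return the interactions reported by the model for this prescription."""
    distinct = {d.strip().lower() for d in new_meds + history if d.strip()}
    if len(distinct) < 2:
        return []  # fewer than two drugs: nothing can interact, skip the call

    data = json.loads(ask_model(build_ddi_prompt(new_meds, history)))
    interactions = data.get("interactions") if isinstance(data, dict) else None
    if not isinstance(interactions, list):
        # Fail loudly: an unreadable answer must not look like "no interactions".
        raise ValueError("model response did not contain an interactions list")
    return interactions
```

Passing the model call in as a parameter keeps the function testable with a stub and makes it easy to surface a "check unavailable" state in the dashboard when the call or the parsing fails.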

Gemini 3’s low-latency responses make live clinical use possible, while its reasoning capabilities are essential for safety-critical medical decision support.

Updates

Technical Improvements

  • Optimized AI prompts for informal clinical speech
  • Leveraged long-context reasoning for better medication history analysis
  • Improved response latency for near real-time doctor feedback

Latest Update

  • Implemented end-to-end multilingual medical note generation (Urdu, Hindi, English)
  • Added real-time Drug-Drug Interaction (DDI) safety checks during note creation
  • Improved handling of mixed-language (Hinglish / Urdu-English) consultations
  • Finalized FastAPI backend + Next.js dashboard integration
  • Polished prompts and JSON extraction for more accurate clinical structuring
