Inspiration

Doctor–patient conversations are often rushed, and patients forget critical instructions about medicines, diet, and follow-ups. Language barriers and handwritten prescriptions make it even harder. We wanted to build an AI assistant that listens, understands, and helps patients follow medical advice correctly.

What it does

CareConnect is an AI-powered doctor–patient companion that records consultations, converts voice to text, and generates a clear summary for patients. It analyzes prescriptions using the camera, translates medical instructions across languages, and provides smart medication reminders based on context (e.g., reminding patients to take BP medicine after meals). It also learns from historical prescriptions to provide helpful insights and recommendations.

How we built it

Gemini CareConnect: The Persistent Medical Companion

Gemini CareConnect is an immersive, real-time AI medical companion designed to solve the universal "continuity problem" in healthcare: the moment a patient leaves a clinic, they often forget up to 80% of their doctor's instructions. Unlike traditional medical "scribes" that simply record notes for after-visit review, CareConnect uses a proactive, multi-agent architecture to support both doctors and patients during the consultation and long after they return home.

Core Features and Functionality

Real-Time Clinical Scribe: Using the low-latency Gemini Live API, the agent records and summarizes natural doctor-patient dialogue, converting complex consultations into actionable summaries.

Vision-Based Prescription Analysis: A "killer feature" that allows patients to scan physical prescriptions. The agent instantly extracts drug names, dosages, and frequencies, eliminating the need for manual data entry.
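As a rough illustration of what could happen after extraction, the sketch below validates one extracted line item and converts common frequency shorthand into doses per day. It is not the project's actual pipeline: `PrescriptionItem`, `normalize_item`, and the `DOSES_PER_DAY` table are hypothetical names introduced here, and the input is assumed to be text fields already returned by the vision model.

```python
from dataclasses import dataclass

# Common prescription shorthand mapped to doses per day.
# Hypothetical lookup table for illustration; a real app would need a fuller list.
DOSES_PER_DAY = {
    "OD": 1, "QD": 1,
    "BD": 2, "BID": 2,
    "TDS": 3, "TID": 3,
    "QDS": 4, "QID": 4,
}


@dataclass(frozen=True)
class PrescriptionItem:
    drug: str
    dose: str
    doses_per_day: int


def normalize_item(drug: str, dose: str, frequency: str) -> PrescriptionItem:
    """Validate one extracted line item; raise ValueError if it looks incomplete."""
    drug = drug.strip()
    dose = dose.strip()
    # Accept "o.d.", "OD", " od " and similar variants.
    key = frequency.strip().upper().replace(".", "")
    if not drug or not dose:
        raise ValueError("drug name and dose are required")
    if key not in DOSES_PER_DAY:
        raise ValueError(f"unrecognized frequency: {frequency!r}")
    return PrescriptionItem(drug.title(), dose, DOSES_PER_DAY[key])


if __name__ == "__main__":
    print(normalize_item(" amlodipine ", "5 mg", "o.d."))
```

Rejecting unrecognized frequencies instead of guessing keeps ambiguous handwriting from silently becoming a reminder schedule; such items could be sent back to the patient or doctor for confirmation.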

Proactive Safety Orchestrator: The system acts as an active third participant. If a doctor prescribes a medication that conflicts with the patient’s historical health data or allergies, the agent can "politely interrupt" to flag the conflict in real time.
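The conflict check described above might look something like the following minimal sketch. Everything in it is an assumption for illustration: `PatientRecord`, `find_conflicts`, and the one-entry `KNOWN_INTERACTIONS` table are hypothetical, and a real system would query a clinical interaction source rather than a hard-coded dictionary.

```python
from dataclasses import dataclass, field

# Illustrative entry only; not a clinical reference.
KNOWN_INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
}


@dataclass
class PatientRecord:
    allergies: set[str] = field(default_factory=set)
    current_medications: set[str] = field(default_factory=set)


def find_conflicts(new_drug: str, patient: PatientRecord) -> list[str]:
    """Return human-readable warnings for a newly prescribed drug."""
    drug = new_drug.lower()
    warnings = []
    # Allergy check: case-insensitive match against recorded allergies.
    if drug in {a.lower() for a in patient.allergies}:
        warnings.append(f"Patient has a recorded allergy to {drug}.")
    # Interaction check: look up each (new drug, current medication) pair.
    for med in sorted(patient.current_medications):
        reason = KNOWN_INTERACTIONS.get(frozenset({drug, med.lower()}))
        if reason:
            warnings.append(f"{drug} with {med.lower()}: {reason}.")
    return warnings


if __name__ == "__main__":
    patient = PatientRecord(allergies={"Penicillin"}, current_medications={"Warfarin"})
    for warning in find_conflicts("Aspirin", patient):
        print(warning)
```

An empty list means nothing was flagged by this table, not that the prescription is safe; the decision stays with the doctor.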

Context-Aware Reminders: Instead of static alarms, CareConnect provides intelligent reminders. For example, it might ask, "I see you're about to take your BP medicine. Did you eat lunch yet?" to ensure adherence to clinical instructions.
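One simple way to picture a context-aware reminder is a function that picks its wording from the medication's instructions and the patient's last logged meal. This is a sketch under assumptions, not the app's logic: `Medication`, `build_reminder`, and the one-hour `meal_window` default are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Medication:
    name: str
    take_after_meal: bool


def build_reminder(
    med: Medication,
    last_meal_at: Optional[datetime],
    now: datetime,
    meal_window: timedelta = timedelta(hours=1),
) -> str:
    """Choose reminder text based on whether a recent meal was logged."""
    if not med.take_after_meal:
        return f"Time to take your {med.name}."
    # A meal counts as recent if it was logged within the window before now.
    if last_meal_at is not None and timedelta(0) <= now - last_meal_at <= meal_window:
        return f"You ate recently, so this is a good time to take your {med.name}."
    return f"Your {med.name} should be taken after a meal. Have you eaten yet?"


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 13, 0)
    bp = Medication("BP medicine", take_after_meal=True)
    print(build_reminder(bp, last_meal_at=None, now=now))
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the rule easy to test.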

Multilingual Support: Live translation features (e.g., English to Mandarin or Spanish) enable the AI to act as a bridge for non-native speakers in high-stress medical environments.

Technologies Used

The project is built on a Google Cloud-native stack designed for scale and security.

AI Frameworks: Agent Development Kit (ADK) for multi-agent orchestration and the Gemini Live API for bidirectional voice and video streaming.

Protocols: Model Context Protocol (MCP) for connecting the agent to medical tools and the Agent2Agent (A2A) Protocol for communication between specialized sub-agents.

Backend Hosting: Google Cloud Run for automatic scaling, with Cloud Functions serving as a secure gateway for AI requests to keep API keys server-side.

Data & Identity: Firestore for real-time medical history storage and Firebase Authentication with Role-Based Access Control (RBAC) to strictly separate Doctor and Patient interfaces.
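The role separation can be pictured as a small permission table checked on the server before any request is served. The sketch below is only an assumption about how such a check could look: the `Role` enum, the action names, and the `require` helper are hypothetical, and the write-up does not say how roles are actually stored or enforced (in practice the role would come from a verified auth token, not from client input).

```python
from enum import Enum


class Role(Enum):
    DOCTOR = "doctor"
    PATIENT = "patient"


# Hypothetical action names; each role sees only its own set.
PERMISSIONS = {
    Role.DOCTOR: {"read_history", "write_prescription", "read_summary"},
    Role.PATIENT: {"read_summary", "scan_prescription", "manage_reminders"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role's permission set includes the action."""
    return action in PERMISSIONS.get(role, set())


def require(role: Role, action: str) -> None:
    """Raise PermissionError when the role may not perform the action."""
    if not is_allowed(role, action):
        raise PermissionError(f"{role.value} may not {action}")


if __name__ == "__main__":
    require(Role.DOCTOR, "write_prescription")  # passes silently
    print(is_allowed(Role.PATIENT, "write_prescription"))
```

Keeping the table in one place means the Doctor and Patient interfaces can share backend routes while each request is still checked against the caller's role.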

Frontend: A high-performance mobile UI built with Flutter.

Data Sources

Historical Health Records: The agent uses the patient’s past prescriptions and medical reports stored in the Firebase database to identify trends and conflicts.

Live Multimodal Input: Real-time audio from consultations and camera-based "Vision" input from physical documents.

Medical Knowledge Bases: Integrated via tools to ensure safety checks are grounded in established clinical interaction data.

Findings and Learnings

Throughout development, our primary finding was that proactive intervention is the true "Gemini Moment". We discovered that:

Continuity is More Than a Chatbot: Moving from a "text-box" paradigm to a persistent "companion" creates a meaningfully different user experience that fosters patient confidence.

Multimodal Integration is Technically Challenging but Essential: Orchestrating voice, vision, and text simultaneously over WebSockets requires precise handling, but it is what allows the agent to handle natural interruptions gracefully, which we treated as a key metric for technical execution.

Safety and Trust are Paramount: In medical AI, transparency is vital. We learned to implement strict "informational assistant" disclaimers, ensuring clinical decisions remain with the human provider while the AI focuses on improving adherence and understanding.

Challenges we ran into

  • Interpreting handwritten prescriptions accurately
  • Maintaining context during long doctor–patient conversations
  • Ensuring translation keeps medical meaning intact
  • Designing two different experiences for doctors and patients while keeping the workflow simple

Accomplishments that we're proud of

  • Real-time consultation summarization
  • Camera-based prescription understanding
  • Context-aware medication reminders
  • Seamless doctor and patient interfaces powered by live AI

What we learned

Healthcare interactions require clarity, accuracy, and trust. Even simple AI features like summarizing instructions or reminding patients at the right time can significantly improve treatment adherence.

What's next for CareConnect

  • Integrate wearable and health report data for deeper insights
  • Add follow-up question suggestions for patients after consultations
  • Enable EHR integrations with clinics
  • Expand multilingual support for global accessibility
