Inspiration
Doctor–patient conversations are often rushed, and patients forget critical instructions about medicines, diet, and follow-ups. Language barriers and handwritten prescriptions make it even harder. We wanted to build an AI assistant that listens, understands, and helps patients follow medical advice correctly.
What it does
CareConnect is an AI-powered doctor–patient companion that records consultations, converts voice to text, and generates a clear summary for patients. It analyzes prescriptions using the camera, translates medical instructions across languages, and provides smart medication reminders based on context (e.g., reminding patients to take BP medicine after meals). It also learns from historical prescriptions to provide helpful insights and recommendations.
How we built it
Gemini CareConnect: The Persistent Medical Companion
Gemini CareConnect is an immersive, real-time AI medical companion designed to solve the universal "continuity problem" in healthcare: the moment patients leave a clinic, they often forget up to 80% of their doctor's instructions. Unlike traditional medical "scribes" that simply record notes for after-visit review, CareConnect uses a proactive, multi-agent architecture to support both doctors and patients during the consultation and long after they return home.
Core Features and Functionality
- Real-Time Clinical Scribe: Using the low-latency Gemini Live API, the agent records and summarizes natural doctor-patient dialogue, converting complex consultations into actionable summaries.
- Vision-Based Prescription Analysis: A "killer feature" that allows patients to scan physical prescriptions. The agent instantly extracts drug names, dosages, and frequencies, eliminating the need for manual data entry.
- Proactive Safety Orchestrator: The system acts as an active third participant. If a doctor prescribes a medication that conflicts with the patient’s historical health data or allergies, the agent can "politely interrupt" to flag the interaction in real time.
- Context-Aware Reminders: Instead of static alarms, CareConnect provides intelligent reminders. For example, it might ask, "I see you're about to take your BP medicine. Did you eat lunch yet?" to ensure adherence to clinical instructions.
- Multilingual Support: Live translation features (e.g., English to Mandarin or Spanish) enable the AI to act as a bridge for non-native speakers in high-stress medical environments.
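The safety check and the context-aware reminder described above can be sketched in a few lines of Python. This is only an illustration, not CareConnect's actual code: the `PrescriptionItem` schema, the function names, the one-hour meal window, and the placeholder drug names are all assumptions, and a simple name match stands in for a real interaction check grounded in clinical data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PrescriptionItem:
    """One entry extracted from a scanned prescription (hypothetical schema)."""
    drug: str
    dosage: str
    frequency: str
    take_after_meal: bool = False


def find_allergy_conflicts(items, allergies):
    """Return prescribed drugs whose name matches a recorded allergy.

    Naive case-insensitive name match, for illustration only.
    """
    known = {a.strip().lower() for a in allergies}
    return [item.drug for item in items if item.drug.strip().lower() in known]


def build_reminder(item, last_meal_at, now, meal_window=timedelta(hours=1)):
    """Compose a reminder; ask about food when the drug must follow a meal."""
    base = f"Time to take {item.drug} ({item.dosage})."
    if not item.take_after_meal:
        return base
    ate_recently = last_meal_at is not None and now - last_meal_at <= meal_window
    if ate_recently:
        return base
    return f"{base} This one should be taken after a meal. Have you eaten yet?"


# Placeholder data: "DrugA" and "DrugB" are not real medications.
items = [
    PrescriptionItem("DrugA", "5 mg", "once daily", take_after_meal=True),
    PrescriptionItem("DrugB", "500 mg", "three times daily"),
]
now = datetime(2025, 1, 1, 13, 0)
print(find_allergy_conflicts(items, ["drugb"]))
print(build_reminder(items[0], last_meal_at=None, now=now))
```

Keeping the extracted prescription as structured data is what lets the same record drive both the conflict flag during the consultation and the reminder wording at home.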
Technologies Used
The project is built on a robust, Google Cloud-native stack designed for scale and security.
- AI Frameworks: Agent Development Kit (ADK) for multi-agent orchestration and the Gemini Live API for bidirectional voice and video streaming.
- Protocols: Model Context Protocol (MCP) for connecting the agent to medical tools and the Agent2Agent (A2A) Protocol for seamless communication between specialized sub-agents.
- Backend Hosting: Hosted on Google Cloud Run for automatic scaling, with Cloud Functions serving as a secure gateway for AI requests to keep API keys server-side.
- Data & Identity: Firestore for real-time medical history storage and Firebase Authentication with Role-Based Access Control (RBAC) to strictly separate Doctor and Patient interfaces.
- Frontend: A high-performance mobile UI built with Flutter.
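The Doctor/Patient separation mentioned under Data & Identity comes down to a small permission check. The sketch below shows that logic in plain Python. The role names follow the text, but the action names, the `PERMISSIONS` table, and the idea of an "assigned patients" set are assumptions made for illustration; in a Firebase project this kind of rule would more likely be enforced through authentication claims and database security rules than in application code.

```python
from enum import Enum


class Role(Enum):
    DOCTOR = "doctor"
    PATIENT = "patient"


# Hypothetical permission table: which actions each role may perform.
PERMISSIONS = {
    Role.DOCTOR: {"read_history", "write_prescription", "read_summary"},
    Role.PATIENT: {"read_summary", "scan_prescription", "manage_reminders"},
}


def is_allowed(role, action):
    """Return True if the given role may perform the action."""
    return action in PERMISSIONS.get(role, set())


def can_read_record(role, user_id, record_patient_id, assigned_patient_ids=frozenset()):
    """Patients see only their own record; doctors see only assigned patients."""
    if role is Role.PATIENT:
        return user_id == record_patient_id
    if role is Role.DOCTOR:
        return record_patient_id in assigned_patient_ids
    return False


print(is_allowed(Role.PATIENT, "write_prescription"))
print(can_read_record(Role.DOCTOR, "doc-1", "pat-7", frozenset({"pat-7"})))
```

Denying by default (an unknown role or action returns `False`) keeps the two interfaces separated even if a new action is added and the table is not updated.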
Data Sources
- Historical Health Records: The agent leverages the patient’s past prescriptions and medical reports stored within the Firebase database to identify trends and conflicts.
- Live Multimodal Input: Real-time audio from consultations and camera-based "Vision" input from physical documents.
- Medical Knowledge Bases: Integrated via tools to ensure safety checks are grounded in established clinical interaction data.
Findings and Learnings
Throughout development, our primary finding was that proactive intervention is the true "Gemini Moment". We discovered that:
- Continuity is More Than a Chatbot: Moving from a "text-box" paradigm to a persistent "companion" creates a meaningfully different user experience that fosters patient confidence.
- Multimodal Integration is Technically Challenging but Essential: Orchestrating voice, vision, and text simultaneously through WebSockets requires precise handling, but it is what allows the agent to handle natural interruptions gracefully, which we treated as a key measure of technical execution.
- Safety and Trust are Paramount: In medical AI, transparency is vital. We learned to implement strict "informational assistant" disclaimers, ensuring clinical decisions remain with the human provider while the AI focuses on improving adherence and understanding.
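To make the point about natural interruptions concrete, here is a minimal sketch of "barge-in" handling using only Python's standard `asyncio`. It is not the project's streaming code and does not call the Gemini Live API: `speak` merely simulates audio playback, and the function names, chunk delay, and interruption timing are assumptions chosen for the demo.

```python
import asyncio


async def speak(chunks, spoken, chunk_delay=0.05):
    """Play response chunks one at a time (stand-in for streaming audio out)."""
    for chunk in chunks:
        await asyncio.sleep(chunk_delay)  # simulates the time to play one chunk
        spoken.append(chunk)


async def speak_until_interrupted(chunks, interrupted, spoken):
    """Play chunks, stopping as soon as the `interrupted` event is set.

    Returns True if playback finished, False if the user barged in.
    """
    playback = asyncio.create_task(speak(chunks, spoken))
    barge_in = asyncio.create_task(interrupted.wait())
    done, pending = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # stop whichever side lost the race
    await asyncio.gather(*pending, return_exceptions=True)
    return playback in done


async def demo(interrupt_after=None):
    interrupted = asyncio.Event()
    spoken = []
    chunks = ["Take", "one", "tablet", "after", "lunch"]
    if interrupt_after is not None:
        # Simulate the user starting to talk partway through playback.
        asyncio.get_running_loop().call_later(interrupt_after, interrupted.set)
    finished = await speak_until_interrupted(chunks, interrupted, spoken)
    return finished, spoken


print(asyncio.run(demo(interrupt_after=0.12)))
```

Racing the playback task against an interruption event, then cancelling the loser, is one simple way to let the user cut in without waiting for the agent to finish its sentence.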
Challenges we ran into
- Interpreting handwritten prescriptions accurately
- Maintaining context during long doctor–patient conversations
- Ensuring translation keeps medical meaning intact
- Designing two different experiences for doctors and patients while keeping the workflow simple
Accomplishments that we're proud of
- Real-time consultation summarization
- Camera-based prescription understanding
- Context-aware medication reminders
- Seamless doctor and patient interfaces powered by live AI
What we learned
Healthcare interactions require clarity, accuracy, and trust. Even simple AI features like summarizing instructions or reminding patients at the right time can significantly improve treatment adherence.
What's next for CareConnect
- Integrate wearable and health report data for deeper insights
- Add follow-up question suggestions for patients after consultations
- Enable EHR integrations with clinics
- Expand multilingual support for global accessibility
