Inspiration
As college students living away from home, we often found ourselves unsure of what to do in medical situations—whether it was managing symptoms, navigating care, or dealing with emergencies. We wanted a tool that could support us in moments of uncertainty, keep track of our health when we might forget, and help bridge the gap between us and professional care when we need it most.
On top of that, we’ve seen family members struggle with diagnoses that were either confusing or missed key context. We built this assistant to bridge that gap - making care more accessible, understandable, and accurate for both patients and providers. Thus, MediSync was born.
What it does
Our AI voice assistant passively captures key health-related audio throughout the day—tracking symptoms, medications, and events to build a personal health memory. It supports patients with real-time care guidance, helps doctors retrieve relevant context from a patient’s history, and automates post-visit summaries in clear, simple language to improve understanding and care outcomes.
How we built it
We built a comprehensive Medical AI Assistant using a modern tech stack designed for healthcare professionals. The system is a Python Flask web application that integrates Vapi for speech processing, Letta's agent-based AI platform, and Google's Gemini 2.5 Flash model for intelligent medical conversations.
Core Architecture:
- Pipeline: Audio captured by a wearable microphone is transcribed through Vapi using Groq. The transcript is filtered so that only medical-related text is passed to the Letta agent, which uses the Gemini 2.5 Flash model. The agent's responses are displayed in the user-facing chat. The Doctor View lets doctors process patient information and communicate with patients remotely; the Patient View lets patients interact with the model to build context and memory around their medical condition. Both views enable the agent to assist in better medical care.
- Backend: Python Flask application with a patient management system that creates a dedicated AI agent for each patient from their medical history and demographic information. Deployed via an ngrok tunnel.
- AI Engine: Integrated with the Letta API to create persistent, context-aware AI agents that maintain patient-specific memory and can attach medical documents for enhanced context.
- Model: Google's Gemini 2.5 Flash for cost-effective, high-performance medical AI assistance with strong reasoning capabilities.
- Document Processing: A robust file upload system that processes medical documents (PDFs, text files) and attaches them to patient agents as memory blocks for contextual awareness.
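The filtering step in the pipeline above can be sketched as a lightweight keyword gate that decides whether a transcribed utterance is health-related before it is forwarded to the per-patient Letta agent. The term list here is purely illustrative, not the one used in the actual system:

```python
import re

# Illustrative set of health-related terms; a real deployment would use a
# much larger vocabulary or a classifier.
MEDICAL_TERMS = {
    "pain", "headache", "fever", "medication", "dose", "pill",
    "symptom", "allergy", "nausea", "doctor", "prescription", "injury",
}

def is_medical(transcript: str) -> bool:
    """Return True if the transcript mentions any medical keyword."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    return bool(words & MEDICAL_TERMS)

def filter_transcripts(chunks: list[str]) -> list[str]:
    """Keep only the chunks worth sending to the patient's agent."""
    return [c for c in chunks if is_medical(c)]
```

Only the chunks that pass this gate would then be posted to the Letta agent, which keeps irrelevant everyday audio out of the patient's health memory.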
Key Features:
- Patient-Specific Agents: each patient gets a dedicated AI agent with personalized medical history and context.
- Secure Data Management: environment-based API key management with a comprehensive .gitignore to protect sensitive data.
- Multi-Interface Access: a command-line interface for healthcare providers and a web interface for broader accessibility.
- Document Integration: healthcare providers can upload patient records, lab results, and medical documents directly into patient contexts.
- Real-time Interaction: an interactive chat interface with the AI assistant for immediate medical consultation support.

We also implemented comprehensive error handling and retry logic so that API interactions stay reliable in clinical environments.
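The retry logic mentioned above can be sketched as a small exponential-backoff wrapper. The attempt count and delays are assumptions for illustration; `api_call` stands in for any Letta or Vapi request:

```python
import random
import time

def with_retries(api_call, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `api_call`, retrying on exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return api_call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Exponential backoff plus a little jitter so concurrent
            # clients don't all retry at the same instant.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Wrapping each external API call this way keeps transient network errors (e.g. through the ngrok tunnel) from bubbling up to the chat interface.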
Challenges we ran into
We ran into several challenges during development.

First, audio transcription quality was inconsistent, especially with medical terminology, varying accents, and background noise, which impacted the accuracy of downstream summarization and memory logging. We had to experiment with different transcription engines and apply post-processing filters to improve reliability.

Second, we encountered hardware-related issues while prototyping a small wearable audio device. Capturing clean audio data in real-world environments was difficult due to interference and mic sensitivity. Sending that data wirelessly to our backend introduced latency and occasional packet loss, which disrupted real-time processing. Generating transcripts from the wearable in real time required careful optimization of both network protocols and buffer handling to reduce delays.

Additionally, document and file uploads, such as past medical records, were slow to process, which added latency to the memory update pipeline. To mitigate this, we parallelized the preprocessing and upload logic.

Finally, working with large LLMs like Claude 4 required us to be mindful of both cost and speed. We implemented fast filtering and token-trimming strategies on inputs and outputs to ensure only the most relevant content was processed, significantly improving efficiency without compromising quality.
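The token-trimming strategy above can be sketched as a simple budget cap that keeps the most recent content, which is usually the most relevant to the current turn. A real system would count tokens with the model's tokenizer; here whitespace-separated words stand in as an approximation, and the budget is an assumed value:

```python
def trim_to_budget(text: str, max_tokens: int = 512) -> str:
    """Keep at most `max_tokens` whitespace-separated tokens, newest last.

    Approximates token counting by splitting on whitespace; swap in the
    model's own tokenizer for accurate budgeting.
    """
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text
    return " ".join(tokens[-max_tokens:])
```

Applying this to both the prompt context and the retrieved memory before each LLM call bounds per-request cost and latency.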
Accomplishments that we're proud of
We’re proud of building a full-stack, functional AI healthcare assistant in just a few days - from integrating real-time voice processing with LLM-powered medical reasoning to designing a user interface that works for both patients and doctors. We created a system that can intelligently log health data, personalize care, and support clinical decision-making. We also tackled technical challenges like wearable audio streaming and efficient document memory attachment, and still delivered a smooth end-to-end demo experience.
What we learned
We learned how to navigate the complexity of building real-time, privacy-sensitive AI systems in healthcare—from selecting fast, accurate transcription tools to designing LLM prompts that avoid hallucination in clinical contexts. We also gained deeper insights into agent memory, token efficiency, and user-centered design for patients and doctors. Most importantly, we learned how much potential AI has in making healthcare feel more human, accessible, and proactive.
What's next for MediSync
Next, we plan to refine the wearable experience to make passive health monitoring more reliable and user-friendly. We’re exploring HIPAA-compliant cloud integrations to securely sync data with electronic health records (EHRs), and adding multilingual support for broader accessibility.
Built With
- css
- gemini
- groq
- hardware
- html
- javascript
- letta
- ngrok
- python
- scss
- typescript
- vapi