🌟Inspiration
The idea for CareCase was inspired by a belief that clinical education could be made more accessible, more human, and more empowering.
As medical training increasingly moves toward technology, we wanted to build something that not only sharpens diagnostic skills, but also trains the one thing AI can't replace — empathetic communication.
At the same time, we were deeply inspired by students in less-privileged parts of the world who fight every day to learn and grow despite limited resources.
We saw how strong their drive was — even without stable Wi-Fi, expensive devices, or perfect facilities — and it made us ask: what if we could give them a full interactive training platform in one small, portable box?
CareCase was designed to meet that challenge — a lightweight, offline-ready simulation system that brings real-world medical scenarios to students anywhere, bridging the gap between technology and opportunity.
Because we believe the future of healthcare shouldn’t be limited by geography or privilege — it should be accessible to anyone with the heart to serve.
💡What it does
CareCase is an AI-powered clinical simulation platform designed to help medical students, nursing trainees, and healthcare learners sharpen their diagnostic reasoning and patient communication skills. Students interact with virtual patients in realistic clinical scenarios, using live voice input and receiving dynamic AI-generated responses. CareCase evaluates both what the student says and how they engage, offering real-time feedback on medical knowledge, empathy, and communication skills.
Beyond the core simulation, CareCase now includes two major enhancements:
- **Medicine Appendix:** A searchable, categorized reference of thousands of medicines, allowing students to quickly review drug details during or after simulations, just like in real-world clinical environments.
- **Diagnosis Trainer:** A quiz-style feature where students are presented with dynamically generated patient profiles (using GPT when online) and must make diagnostic decisions based on symptoms, vitals, and clinical indicators. This challenges students to practice critical thinking in a fast-paced, interactive way.
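To give a feel for how the Diagnosis Trainer evaluates an answer, here is a minimal sketch. The field names (`symptoms`, `vitals`, `diagnosis`) and the `check_diagnosis` helper are illustrative assumptions, not the actual CareCase schema:

```python
def check_diagnosis(profile: dict, answer: str) -> dict:
    """Compare a student's diagnosis against the profile's expected answer.

    Hypothetical helper: a real trainer would likely use fuzzy or
    embedding-based matching rather than exact string comparison.
    """
    expected = profile["diagnosis"].strip().lower()
    guess = answer.strip().lower()
    return {
        "correct": guess == expected,
        "expected": profile["diagnosis"],
        # Only the clues the student was shown, never the answer itself
        "shown": {k: profile[k] for k in ("symptoms", "vitals")},
    }

# Example profile of the kind GPT might generate when online
profile = {
    "symptoms": ["polyuria", "polydipsia", "weight loss"],
    "vitals": {"glucose_mg_dl": 240},
    "diagnosis": "Type 1 diabetes",
}
result = check_diagnosis(profile, "type 1 diabetes")
```

Keeping the check as a pure function makes it easy to drive the same logic from either GPT-generated or locally cached profiles.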
We also built in accessibility features from the ground up, including font scaling, contrast toggling for better visibility, and language adaptability. CareCase is designed to accommodate learners of all backgrounds, languages, and abilities, making medical education more inclusive and accessible for everyone.
🛠️How we built it
We built CareCase as a full-stack application focused on real-time, offline-capable AI interaction.
The backend is powered by FastAPI, with MongoDB Atlas handling patient scenarios, medical datasets, interaction logs, and more. We integrated OpenAI's GPT for dynamic, context-sensitive dialogue generation, while also developing a custom offline scoring system using sentence-transformers (MiniLM).
On the frontend, we built an interactive desktop application with Tkinter, using Whisper.cpp for local speech-to-text (STT) and Edge-TTS for fast, realistic text-to-speech (TTS) output.
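Whisper.cpp is typically driven through its command-line binary, so a desktop app like this would shell out to it. A minimal sketch of how such a call might be assembled (the binary and model paths are assumptions; `-m`, `-f`, `-t`, `-otxt`, and `-nt` are standard whisper.cpp flags for model, input file, thread count, plain-text output, and no timestamps):

```python
def whisper_cmd(audio_wav: str,
                model: str = "models/ggml-base.en.bin",
                binary: str = "./main",
                threads: int = 4) -> list[str]:
    """Build a whisper.cpp CLI invocation for offline transcription.

    Paths are placeholders; swap in wherever the binary and model
    actually live on the device.
    """
    return [
        binary,
        "-m", model,          # GGML model file
        "-f", audio_wav,      # input WAV to transcribe
        "-t", str(threads),   # worker threads (keep low on a Pi 5)
        "-otxt",              # write a plain .txt transcript
        "-nt",                # omit timestamps from output
    ]

cmd = whisper_cmd("clip.wav")
# A real app would then run: subprocess.run(cmd, check=True)
```

Separating command construction from execution keeps the transcription step easy to test without audio hardware.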
We incorporated computer vision with MediaPipe and FER models to track gaze behavior and emotional expression, adding richer feedback and analysis.
CareCase was designed with lightweight, hardware-aware architecture to ensure smooth operation on ARM-based devices like the Raspberry Pi 5, supporting both online and offline modes seamlessly.
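Supporting both online and offline modes needs a cheap connectivity check before each AI call. One plausible sketch, with injectable probing so the choice is testable (the backend labels are placeholders for the real GPT and MiniLM call sites):

```python
import socket

def is_online(host: str = "8.8.8.8", port: int = 53,
              timeout: float = 1.5) -> bool:
    """Cheap connectivity probe: try a TCP connection to a public
    DNS server. Host/port/timeout are illustrative defaults."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_responder(online_fn=is_online) -> str:
    """Route to GPT when online, the local MiniLM scorer otherwise.

    Returns a label standing in for the real backend call.
    """
    return "gpt" if online_fn() else "local_minilm"
```

Passing `online_fn` as a parameter lets the app force offline mode (for example, in a classroom with no network) without touching the routing logic.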
🚧Challenges we ran into
Throughout the development of CareCase, we ran into some major challenges that pushed us to adapt quickly.
Getting smooth and accurate speech-to-text was harder than expected, especially when trying to keep conversations natural and responsive.
We also struggled with TTS voice compatibility — a lot of voice models we wanted to use were restricted by region, so we had to find workarounds to make sure users could still hear high-quality output.
Camera calibration for different setups became a big focus too; we needed a fast way to map facial landmarks to calibration dots without slowing the user down, which forced us to rethink our whole timing and detection pipeline.
Mapping gaze direction in real time for attention tracking took a lot of tuning. Getting facial expression analysis to run accurately on the Raspberry Pi 5 was another huge hurdle; we ended up building a custom Python 3.9 environment just to make TensorFlow and FER work reliably.
Even after that, building the offline scoring system brought its own problems.
Our first approach with plain cosine similarity didn't handle full sentences well, so we switched to sentence-transformer embeddings. These worked better but sometimes produced negative similarity scores, so we built a clamping step that rounds them into a safe range to keep grading consistent.
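The clamping idea is simple enough to show directly. A minimal sketch, assuming the scorer hands back a raw cosine similarity in [-1, 1] from the embedding comparison:

```python
def clamp_score(similarity: float) -> float:
    """Map a raw embedding cosine similarity, which can be negative,
    into the 0.0-1.0 grading range, rounded for stable display."""
    return round(min(max(similarity, 0.0), 1.0), 2)
```

Clamping before rounding guarantees a student never sees a negative grade, no matter how dissimilar their answer was from the reference response.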
Every step taught us something new about building AI interactions that feel natural, even when the system has to run completely offline.
🏆Accomplishments that we're proud of
One of the biggest accomplishments we're proud of is building a fully functional offline audio simulation loop — where users can speak naturally to a virtual patient, and get dynamic, intelligent responses without needing an internet connection.
We successfully integrated real-time speech recognition, AI-driven conversation, and feedback scoring into a lightweight system that runs entirely on local hardware.
We also designed a calibration workflow that personalizes gaze and attention tracking per user, bringing a new level of realism to simulation feedback.
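One way such per-user calibration can work is to fit a least-squares affine map from facial-landmark coordinates to on-screen calibration-dot positions. This is a hypothetical sketch of that idea, not the actual CareCase pipeline:

```python
import numpy as np

def fit_affine(landmarks: np.ndarray, dots: np.ndarray) -> np.ndarray:
    """Fit an affine map from landmark coords to calibration-dot coords.

    landmarks, dots: (N, 2) arrays of corresponding points, N >= 3.
    Returns a 3x2 matrix T such that [x, y, 1] @ T ~= [u, v].
    """
    n = len(landmarks)
    A = np.hstack([landmarks, np.ones((n, 1))])    # (N, 3) homogeneous coords
    T, *_ = np.linalg.lstsq(A, dots, rcond=None)   # least-squares solve
    return T

def apply_affine(T: np.ndarray, point) -> np.ndarray:
    """Project one landmark point through the fitted calibration map."""
    x, y = point
    return np.array([x, y, 1.0]) @ T
```

Because the fit is a single batched solve, recalibrating for a new user is fast enough not to slow down the session, which was exactly the timing constraint described above.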
Beyond the technical side, we're proud that CareCase is more than just a tool — it’s a portable platform that can open doors for students anywhere, even in environments with limited or no access to Wi-Fi or expensive resources.
Bringing all of these complex pieces together into a small, responsive, and human-centered system is something we’re truly excited about.
📝What we learned
Throughout the development of CareCase, we learned how to integrate multiple AI tools across speech, vision, and language to create a dynamic and realistic clinical simulation platform. We gained experience working with:
- **Speech-to-Text** with Whisper.cpp for lightweight, offline voice transcription, optimized for low-resource devices like the Raspberry Pi 5.
- **Text-to-Speech** with Edge-TTS, enabling fast, realistic audio responses from AI-generated lines, with caching for offline use.
- **SentenceTransformer (MiniLM)** models to perform offline matching and scoring of user responses based on meaning rather than exact wording, building a more intelligent feedback system.
- **Facial Expression Recognition (FER)** models to track emotion trends during simulation sessions without interrupting the user experience.
- **Computer Vision and Gaze Tracking** to detect real-time eye contact and gaze direction, allowing us to assess engagement during clinical interactions.
This project challenged us to think deeply about how to make clinical education more accessible, realistic, and intelligent, and how to engineer AI systems that are practical to deploy in real-world, constrained environments.
🚀What's next for CareCase
Looking ahead, we’re excited to push CareCase beyond just software improvements and into real-world hardware innovation.
We plan to explore making the system even more accessible and sustainable by incorporating features like solar-powered operation, allowing it to be deployed in areas with limited electricity.
We’re also aiming to expand CareCase into a connected platform by building end-to-end video streaming capabilities — letting students and instructors from different parts of the world train, collaborate, and share simulations in real-time.
Beyond technical upgrades, we hope to deepen the AI models driving patient behavior, adding even more realistic clinical reasoning challenges and communication nuances.
Ultimately, we see CareCase growing into a truly global, lightweight training ecosystem that helps bridge gaps in medical education wherever they're found.