Inspiration
Imagine you're in a foreign country, surrounded by people speaking a language that sounds like random noise to you. Now picture yourself in a healthcare setting, trying to understand critical information about your own health in that same incomprehensible language. This is the reality for millions of patients every day. MediSpeak is here to change that. By combining real-time AI-powered language translation with culturally adaptive health literacy tools, MediSpeak ensures patients can understand and engage with their healthcare providers, breaking down language barriers and empowering informed decision-making. It's not just translation: it's clarity, trust, and care.
What it does
MediSpeak revolutionizes communication in healthcare by bridging the language gap between doctors and patients. Powered by advanced AI integration, MediSpeak provides real-time translation across 194 languages, ensuring that patients and providers can understand each other effortlessly.
But MediSpeak goes beyond simple translation. For patients, it helps articulate their thoughts clearly, enabling them to express symptoms and concerns in ways that healthcare providers can easily interpret. For doctors, MediSpeak simplifies the cognitive burden of tailoring information delivery to non-native speakers by offering three levels of text transformations:
No Background Knowledge: Simplified explanations for patients with minimal understanding of medical concepts.
Intermediate Background Knowledge: Balanced detail for patients with some familiarity with healthcare terminology.
High Domain Knowledge: Technical language for patients or professionals with advanced understanding.
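The three levels above lend themselves to a simple prompt-template pattern. The sketch below is our illustration of that idea, not MediSpeak's actual prompts: the level names, instruction wording, and `build_prompt` function are all hypothetical.

```python
# Hypothetical prompt builder for the three literacy levels.
# Level names and instruction wording are illustrative only.
LEVEL_INSTRUCTIONS = {
    "none": "Explain in plain, everyday words. Avoid all medical jargon.",
    "intermediate": "Use common healthcare terms, briefly defining anything technical.",
    "high": "Use precise clinical terminology appropriate for a professional.",
}

def build_prompt(text: str, level: str) -> str:
    """Wrap the doctor's message in a level-appropriate rewriting instruction."""
    if level not in LEVEL_INSTRUCTIONS:
        raise ValueError(f"unknown literacy level: {level!r}")
    return (
        f"{LEVEL_INSTRUCTIONS[level]}\n"
        f"Rewrite the following for the patient:\n{text}"
    )
```

Keeping the level-specific wording in one dictionary means adding a fourth level later is a one-line change rather than a new code path.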
Additionally, MediSpeak provides confidence scores for translations, reassuring both parties that their communication is accurate and meaningful. Finally, relevant figures are shown to the doctor to support communication with the patient. By combining translation, articulation assistance, and tailored content delivery, MediSpeak empowers both patients and doctors to make informed decisions while fostering trust and clarity in healthcare interactions.
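One common way to approximate a translation confidence score is round-trip consistency: translate the text back to the source language and measure how similar the result is to the original. The writeup doesn't say how MediSpeak computes its scores, so the token-overlap heuristic below is only a toy sketch of that idea.

```python
def token_overlap(original: str, back_translated: str) -> float:
    """Jaccard similarity between token sets, as a rough 0-to-1 confidence proxy.

    A score of 1.0 means the back-translation uses exactly the same words;
    lower scores suggest meaning may have drifted during translation.
    """
    a = set(original.lower().split())
    b = set(back_translated.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

In practice an embedding-based similarity would be far more robust than raw token overlap, which penalizes harmless rephrasing.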
How we built it
We started by defining the core functionalities of MediSpeak and creating a Minimum Viable Product (MVP) that we would feel confident submitting. This MVP focused on real-time translation, basic health literacy tools, and diagram support, ensuring the app could effectively bridge language barriers in healthcare.
To ensure cross-platform compatibility, we chose Flutter for the frontend and Django for the backend. Flutter allowed us to create a sleek, responsive interface for both web and mobile platforms, while Django provided a robust backend for managing data and integrating advanced AI models.
As a team, we adopted a collaborative workflow, rotating between frontend and backend development to maintain equal contributions. Our thorough initial planning process helped us stay focused throughout the 24-hour hackathon, setting clear milestones and prioritizing tasks.
Once the MVP was complete, we shifted our focus to implementing additional features that significantly enhanced the overall user experience. These included:
Confidence Scores: Providing users with feedback on translation accuracy to foster trust in communication.
Three Levels of Text Transformation: Allowing doctors to choose between simplified, intermediate, or technical explanations based on their patients' understanding.
Thought Articulation Tools: Helping patients express their concerns clearly to healthcare providers.
Challenges we ran into
Flutter
Our primary challenge was working with Flutter's static asset system. Flutter requires all assets to be declared in the pubspec.yaml file and bundled at compile time, making dynamic content loading extremely difficult. This created significant hurdles when we needed to generate and display images on-the-fly, as Flutter doesn't support runtime directory listing or dynamically loading files added after the app is built.
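A common workaround for this limitation is to bypass the asset bundle entirely: the backend serves generated images over HTTP, and the Flutter client loads them with `Image.network()` after fetching a manifest of what's available. The helper below is our own sketch of that pattern (the `image_manifest` function and the `/media/` URL prefix are assumptions, not MediSpeak's actual code).

```python
import json
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}

def image_manifest(directory: str, url_prefix: str = "/media/") -> str:
    """Return a JSON list of URLs for every image currently in `directory`.

    Because the list is built at request time, images generated after the
    app was compiled are still discoverable, which a bundled Flutter asset
    never could be.
    """
    names = sorted(
        p.name for p in Path(directory).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )
    return json.dumps([url_prefix + name for name in names])
```

In a Django backend this would typically sit behind a small JSON view, with the media directory itself served as static files.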
API Integration Complexities
Integrating Google's Gemini APIs with our Django backend required substantial engineering effort. We encountered inconsistencies between Gemini's responses in AI Studio versus the actual API implementations, where API results were often lower quality than what we observed in the development environment. These discrepancies required additional validation and error handling throughout our codebase.
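Validation of this kind usually takes the shape of a schema check plus a retry loop around the model call. The sketch below illustrates that pattern under assumed field names (`translation`, `confidence`); the writeup doesn't specify MediSpeak's actual response schema.

```python
import time

def validate_translation(resp: dict) -> dict:
    """Reject model responses missing the fields downstream code relies on."""
    text = resp.get("translation")
    conf = resp.get("confidence")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("missing or empty 'translation' field")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        raise ValueError("'confidence' must be a number in [0, 1]")
    return {"translation": text.strip(), "confidence": float(conf)}

def call_with_retry(call, attempts: int = 3, delay: float = 0.0):
    """Invoke the model, re-trying when the response fails validation."""
    last_error = None
    for _ in range(attempts):
        try:
            return validate_translation(call())
        except ValueError as exc:
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"no valid response after {attempts} attempts: {last_error}")
```

Centralizing the checks in one validator keeps the "API sometimes returns lower-quality output than AI Studio" problem contained to a single choke point instead of scattered defensive code.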
Image Generation Limitations
Initially, we experimented with Google's Gemini for image generation to create medical diagrams and illustrations. However, we found several limitations:
Accuracy issues with the generated images, particularly for medical content
Policy violations when requesting certain types of medical illustrations
Inconsistent quality in the generated outputs
After extensive testing, we determined that scraping Google search results provided more reliable and medically accurate images than generating them with AI. This decision required implementing an additional scraping component but ultimately delivered better results for our users.
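The parsing half of a scraping component like this can be done with the standard library alone. The sketch below extracts image URLs from an already-fetched HTML page; fetching the search results themselves (and respecting the site's terms of service and robots.txt) is deliberately left out, and the class and function names are our own, not MediSpeak's.

```python
from html.parser import HTMLParser

class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            # Skip inline data: URIs and relative paths; keep fetchable URLs.
            if src and src.startswith("http"):
                self.sources.append(src)

def extract_image_urls(html: str) -> list:
    parser = ImgSrcParser()
    parser.feed(html)
    return parser.sources
```

Filtering out `data:` URIs matters in practice, since search result pages often inline thumbnail placeholders that way.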
Cross-Platform Compatibility
Ensuring our application worked consistently across web and iOS platforms presented additional challenges. Code that functioned well in web browsers often failed on iOS due to differences in file system access and asset handling, requiring platform-specific implementations and extensive testing.
Accomplishments that we're proud of
One of our biggest accomplishments was successfully leveraging technologies we had little to no prior experience with, from building the frontend in Flutter to integrating Google's Gemini APIs and Google Cloud Translate.
Another accomplishment was creating a polished and intuitive user interface. Despite having minimal design experience, we dedicated significant time to crafting an engaging UI that enhances the user experience. The gradient backgrounds, typewriter animations, and interactive elements were definitely challenging to figure out, but we're super proud that we pulled it off in the end.
What we learned
We developed essential UI/UX design skills despite having limited prior experience in this area. Creating the smooth typewriter animation effect, designing the custom language dropdown, and implementing the stacked card interface taught us how to balance aesthetics with functionality.
Our experience integrating with Google Gemini and Google Cloud Translate provided insights into working with state-of-the-art model APIs. We learned to effectively prompt the models to produce accurate, well-structured responses.
Perhaps most importantly, we developed a deeper understanding of the planning process for technical projects. Starting with a clear MVP definition and then expanding to additional features proved to be an effective approach for a time-constrained hackathon. This methodology kept us focused while allowing us to create a more polished final product.
What's next for MediSpeak
First, we plan to curate our own dataset of medical images and illustrations that could help both doctors and patients better understand medical situations. This specialized visual library would complement our text translations by providing visual context for complex medical concepts.
Second, we intend to expand MediSpeak to Android devices. While our Flutter implementation already provides cross-platform capabilities, we need to optimize the application specifically for Android.
Finally, we aim to expand our speech-to-text capabilities to support more languages. Our application can already process a wide range of languages, but we want to enhance our speech recognition models to handle even more.