💡 Inspiration

AI is increasingly being leveraged in various aspects of life, including disease diagnosis. However, doctors still spend a significant amount of time talking with patients to identify their symptoms. According to research, doctors spend an average of 18 minutes per patient (Neprash et al., 2021). This face-to-face interaction can be time-consuming and inefficient, as it often relies heavily on patients' memories to provide information about past symptoms and illnesses.

To address this issue, MedTrak offers a convenient tool for both patients and doctors that reduces the time required for diagnosis and improves the accuracy of information. MedTrak enables the real-time recording of symptoms and visualizes patient data, potentially saving doctors up to roughly 4.5 hours each day, given that most doctors see between 11 and 20 patients daily (The Physicians Foundation, 2018). With accurate information at their fingertips, patients are better equipped to share their medical history and stay connected with their doctors.

Our primary inspiration for developing MedTrak stems from two key observations: first, the current pipeline for patients receiving analysis during appointments is highly inefficient, primarily relying on memory; second, these inefficiencies waste valuable time during appointments, as doctors spend a majority of their time assessing symptoms and root causes.

By creating MedTrak, we aim to streamline this process, allowing patients to receive better care and follow-up support during appointments. Our tool provides doctors with essential information upfront, helping narrow down potential causes so they can concentrate on offering precise support instead of spending excessive time on analysis. As a result, doctors can see more patients, and patients can engage meaningfully by asking critical questions that contribute to their health.

🤒 What it does

MedTrak is a patient-clinician engagement tool designed to significantly enhance how patients track their symptoms and how clinicians review that information before and during appointments. This tool addresses the issue of "fuzzy recall," where patients struggle to remember details about symptoms that occurred days or weeks prior. By improving symptom tracking, MedTrak reduces the time clinicians spend assessing primary causes during visits, allowing them to focus more on health promotion efforts and care for more patients. With an overview of a patient's symptoms and a concise summary of potential causes, the evaluation process becomes more efficient.

MedTrak serves two primary functions: assisting patients in capturing their symptom data easily and converting that raw data into a concise, actionable summary for healthcare providers.

For Patients: Easy Symptom Capture

MedTrak gives patients a simple, visual way to capture symptoms as they happen in real life. This raw data is automatically time-stamped and logged, creating a comprehensive "media vault."

  • Capture Tools: Patients can use one-tap options to take photos, record videos, or use voice logs to capture specific symptoms, such as the progression of a rash, a cough or wheeze, or changes in their gait.
  • Structured Notes: Voice recordings are transcribed and turned into structured notes that highlight essential details like the onset, duration, severity, and triggers of the symptoms.
  • Guidance: The app can even guide patients with "before/after" guided photos to ensure consistency in angle and lighting, which is crucial for tracking changes.

For Clinicians: Smart Timeline and Triage

The real strength of MedTrak lies in its use of Natural Language Processing (NLP) as an AI backbone, which transforms the patient's raw media into a valuable clinical tool.

  • Smart Timeline: The media is stitched together into a chronological strip, and the system automatically performs change detection.
  • Triage Brain: A back-end clinical chatbot acts as a "triage brain." It asks the patient a few targeted questions to narrow down the likely root cause (e.g., infectious rash vs. drug eruption).
  • Clinician Summary: This process produces a concise summary the provider can read in about 60 seconds. It gives the clinician a quick overview of the symptom journey, including an urgency recommendation (e.g., Urgent, Not Urgent) and the top three possible differentials (non-diagnostic possibilities).
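The clinician summary described above could be represented as a small structured record. The sketch below is illustrative only: the field names and urgency labels are assumptions, not MedTrak's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical shape for the clinician summary; field names are
# illustrative assumptions, not MedTrak's real data model.
@dataclass
class ClinicianSummary:
    symptom_journey: str                  # chronological narrative of symptoms
    urgency: str                          # e.g. "Urgent" or "Not Urgent"
    differentials: List[str] = field(default_factory=list)  # non-diagnostic possibilities

    def __post_init__(self):
        # Cap at the top three differentials, per the design above.
        self.differentials = self.differentials[:3]

summary = ClinicianSummary(
    symptom_journey="Rash on left forearm, spreading since day 3; mild itching.",
    urgency="Not Urgent",
    differentials=["contact dermatitis", "drug eruption", "viral exanthem", "other"],
)
```

Keeping the summary this small is what makes the 60-second read target plausible: the clinician sees only the journey, an urgency flag, and at most three differentials.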

💻 How we built it

AI/ML Layer
We leveraged OpenAI’s Whisper for audio transcription. Since Whisper requires audio to be encoded into bytes, we ran into memory challenges: processing longer clips could exceed 1 GB of RAM, so we experimented with running transcription workloads in the cloud. For image analysis, we encoded media into base64 before passing it to LLMs, which improved recognition quality compared to raw file handling. We also explored traditional machine learning approaches for symptom feature extraction, but the datasets available to us were too limited and lacked the complexity needed for reliable performance. In practice, large language models provided better accuracy and scalability, so we used them to generate follow-up questions and symptom summaries. We then integrated these results with real-time user input to construct an end-to-end pipeline from recording to summary and analysis. Most of the AI/ML integration involved porting Python-based workflows into our FastAPI backend.
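The media-preparation side of this pipeline can be sketched as below. The base64 helper is stdlib-only; the transcription function shows the general shape of a Whisper call via the `openai` Python package (it needs an API key and network access, so treat it as a sketch rather than MedTrak's exact code).

```python
import base64
from pathlib import Path

def encode_image_b64(path: str) -> str:
    """Base64-encode an image file so it can be embedded in an LLM request."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")

def transcribe(audio_path: str) -> str:
    """Sketch of a Whisper transcription call (requires the `openai` package
    and an API key; makes a network call, so it is not exercised here)."""
    from openai import OpenAI  # imported lazily so the helper above stays stdlib-only
    client = OpenAI()
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(model="whisper-1", file=f)
    return result.text
```

Reading the whole file into memory is exactly where the 1 GB problem above comes from for long clips; chunking the audio or offloading to a cloud worker are the usual mitigations.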

Frontend
The frontend was built with React Native to support cross-platform deployment across iOS, Android, and web. We wrote the app in TypeScript with JSX for component structure and used lightweight styling solutions instead of heavy CSS frameworks to keep the UI flexible and consistent across devices.

Backend
FastAPI served as the backbone for orchestrating AI/ML workflows and connecting the frontend to the database. This included endpoints for media upload, transcription, and data analysis.

Database
We configured Firebase to manage authentication, real-time updates, and media storage. Its integration simplified secure login for both patients and providers and reduced overhead in managing user state across platforms. In short, we combined cross-platform mobile development (React Native + TypeScript), scalable backend services (FastAPI), and AI-powered workflows (Whisper, LLMs, computer vision pipelines) with Firebase as the data layer to deliver MedTrak’s core functionality.
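The media-storage side of the Firebase integration might look like the sketch below. The path convention and bucket name are assumptions for illustration; the upload function uses the `firebase_admin` SDK and needs service-account credentials, so it is shown but not exercised.

```python
def media_blob_path(uid: str, entry_id: str, filename: str) -> str:
    """Storage path convention for uploaded media (illustrative, not
    necessarily MedTrak's actual layout): one folder per user, one per entry."""
    return f"users/{uid}/entries/{entry_id}/{filename}"

def upload_media(uid: str, entry_id: str, local_path: str) -> str:
    """Sketch of a Firebase Storage upload (requires `firebase_admin` and
    service-account credentials; bucket name is a placeholder)."""
    import firebase_admin
    from firebase_admin import storage
    if not firebase_admin._apps:  # initialize once per process
        firebase_admin.initialize_app(options={"storageBucket": "medtrak.appspot.com"})
    blob = storage.bucket().blob(
        media_blob_path(uid, entry_id, local_path.split("/")[-1])
    )
    blob.upload_from_filename(local_path)
    return blob.name
```

A per-user, per-entry path layout like this also maps cleanly onto Firebase security rules, so patients can only read and write their own media.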

🚧 Challenges we ran into

  • Memory challenges with audio transcription: While using OpenAI’s Whisper for audio transcription, we ran into memory challenges since it requires audio to be encoded into bytes: processing longer clips could exceed 1 GB of RAM. This was a major issue because larger Whisper models transcribe more accurately but also consume more memory, forcing a trade-off between transcription quality and resource usage.

  • Suboptimal results from traditional machine learning: We first built our own pipeline for extracting meaningful information from the records, which involved: 1) calculating cosine similarities of text embeddings to extract the main symptoms, 2) training a logistic regression model for disease prediction on public datasets, and 3) selecting the symptoms that best differentiate candidate cases via entropy minimization. However, the public dataset we used was not a good fit for our problem and gave suboptimal results. We addressed this by leveraging an LLM with carefully crafted prompts for each task.
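The entropy-minimization step (3) can be sketched as follows: given a prior over candidate diseases and per-disease symptom probabilities, pick the symptom whose yes/no answer minimizes the expected posterior entropy. The toy numbers are invented; this is a sketch of the idea, not the team's original implementation.

```python
import numpy as np

def expected_entropy(prior, p_sym_given_dis):
    """Expected posterior entropy (bits) over diseases after asking about one symptom.
    prior: (D,) disease probabilities; p_sym_given_dis: (D,) P(symptom | disease)."""
    h = 0.0
    for present in (p_sym_given_dis, 1.0 - p_sym_given_dis):
        p_answer = float(prior @ present)          # P(yes) or P(no)
        if p_answer == 0.0:
            continue
        post = prior * present / p_answer          # Bayes update on the answer
        nz = post[post > 0]
        h += p_answer * -(nz * np.log2(nz)).sum()  # entropy weighted by answer prob.
    return h

def best_symptom(prior, likelihoods):
    """Pick the symptom (column) whose answer minimizes expected entropy."""
    scores = [expected_entropy(prior, likelihoods[:, j])
              for j in range(likelihoods.shape[1])]
    return int(np.argmin(scores))
```

For two equally likely diseases, a symptom present in 90% of one and 10% of the other splits the cases far better than one present in 50% of both, so `best_symptom` selects it.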

  • Insufficient knowledge and resources in symptom analysis: Doctors often rely on implicit knowledge for diagnoses, which makes it difficult to find explicit sources describing the diagnostic process. Since MedTrak is not intended to provide medical advice or professional diagnoses, it was difficult to decide which intermediate analyses would actually help doctors. We addressed this by keeping our analysis simple: we focused on our core features of summarizing and visualizing rather than producing speculative analysis.

  • Preprocessing Data and Output Validation for LLM: Large Language Models (LLMs) are significantly influenced by their input. Therefore, we spent considerable time experimenting with various input formats and prompts to identify the settings that yield the best results. Additionally, to mitigate the risk of LLMs generating undesirable outputs, we incorporated custom code during the final result generation. This approach helps us ensure the reliability and quality of the outputs.
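The guard-rail code mentioned above might look like this minimal sketch: parse the LLM response as JSON, enforce the expected keys and allowed values, and fall back to a safe default otherwise. The field names and label set are assumptions for illustration.

```python
import json

ALLOWED_URGENCY = {"Urgent", "Not Urgent"}  # assumed label set

def validate_summary(raw: str) -> dict:
    """Parse an LLM response and enforce the expected shape; the field names
    (`urgency`, `differentials`) are illustrative assumptions."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Malformed output: fall back to a safe default instead of crashing.
        return {"urgency": "Not Urgent", "differentials": [], "valid": False}
    urgency = data.get("urgency")
    diffs = data.get("differentials", [])
    valid = urgency in ALLOWED_URGENCY and isinstance(diffs, list)
    if not valid:
        urgency, diffs = "Not Urgent", []
    return {"urgency": urgency, "differentials": diffs[:3], "valid": valid}
```

Clamping the output to a fixed vocabulary and at most three differentials is what keeps occasional LLM misbehavior from reaching the clinician-facing summary.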

🏆 Accomplishments that we're proud of

  • Generate Follow-Up Questions to Specify the User’s Case: In typical record-keeping apps, doctors only receive the information that patients have recorded and cannot ask in-context questions in real time. However, in MedTrak, we utilize large language models (LLMs) to analyze patients’ records and pose follow-up questions to clarify their cases. This approach makes the record more informative, resembling an in-person conversation between a doctor and a patient.

  • Enable Audio and Image Recording for Clear Understanding: Some symptoms, such as cough and rash, can be difficult to describe in text. To provide richer information, MedTrak accepts audio and image inputs in addition to plain text. These inputs are not only stored but also transcribed to text so they can be analyzed alongside other information. Additionally, voice recording simplifies record-keeping for patients, even for longer explanatory entries.

  • Visualization of Text Information: Most symptom descriptions are based on text, which can be hard to interpret and are time-consuming to read and process. MedTrak’s visual summaries and analyses make it easier for doctors to quickly understand patients’ current statuses at a glance, thus reducing the time spent on symptom identification.

🧐 What we learned

  • Navigating a New Programming Language and Service: We learned how to navigate a new programming language and integrate Firebase into our app. This was our first experience with Firebase, and we were unfamiliar with handling data uploads; it took some time to get accustomed to, but we eventually figured it out. From this experience, we discovered that we can confront unfamiliar challenges and overcome them.

  • Narrowing Scope Under Time Pressure: Initially, we had many ideas and wanted to implement numerous features. However, we realized that to build a functional product within 36 hours, it was crucial to define the core features that represent our product and focus only on the minimum necessary functionality.

  • Collaborating Across Roles: Our team had various strengths—some members excelled in frontend development, while others specialized in backend or AI/ML. We had to learn how to divide tasks effectively and communicate our progress under tight deadlines. This experience highlighted the importance of clear role assignments and regular check-ins to keep the project on track.

🔜 What's next for MedTrak

  • Voice/Image feature detection: We plan to enhance our AI models to automatically detect key features in voice and image recordings, such as cough patterns or skin changes. This will make symptom timelines richer and more clinically actionable.

  • Push alerts from doctors when immediate action is needed: We will develop a system for clinicians to trigger instant push notifications when they identify red flags in a patient’s submissions. This will ensure urgent cases receive prompt attention.

  • Link data from health apps (steps, heart rate): We aim to integrate wearable and health app data, including steps, heart rate, and sleep patterns, into the symptom timeline. This will provide doctors with more context regarding lifestyle and physiological factors influencing health.

  • Include a privacy plan for protecting personal data: We plan to establish a robust privacy and governance framework with encryption, consent controls, and strict retention policies. This will ensure that patients have clarity and control over how their data is used and shared.

Built With

React Native · TypeScript · FastAPI · Firebase · OpenAI Whisper · LLMs