SymptoScan
Tagline: AI-driven symptom reporting from voice, touch, and image.
Inspiration
We were inspired by a common and frustrating problem in healthcare: patients often struggle to clearly explain what they’re feeling, and doctors waste valuable time trying to piece together vague or incomplete symptom descriptions. At the same time, family doctors spend up to 70% of their time on paperwork and documentation—time that could be better spent with patients. This inefficiency slows down care and leads to burnout. SymptoScan aims to close that gap by letting patients communicate symptoms in a natural way, while helping clinicians receive structured, medically formatted summaries that dramatically reduce administrative burden.
What it does
SymptoScan is an AI-powered web app that helps users describe their symptoms through natural inputs—speaking, pointing on a 3D body map, and uploading photos of visible issues like rashes. The app listens to a patient describe how they feel, transcribes that speech into text, lets them pinpoint specific body parts where symptoms occur, and analyzes uploaded images using computer vision models. Then, a large language model takes all of that information and generates a structured clinical report in clear, professional medical language. Patients also receive a simplified summary that explains what might be going on, always with a clear disclaimer that it’s not a diagnosis.
How we built it
We built the frontend using React and integrated a rotatable 3D body map using Three.js. The speech input is handled by OpenAI’s Whisper model, which runs locally and provides high-quality transcription. For image uploads, we used a pretrained vision model that generates short descriptive captions, which are passed to the language model along with the speech transcription and body map data. The backend, written in Python, coordinates all these inputs and feeds them into GPT-4 to generate a structured report. We used HTML-to-PDF tools to format and export the final output as a clean, professional-looking PDF.
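The steps above can be sketched as a single prompt-assembly function. This is a minimal, hypothetical illustration of how the backend might merge the three input streams (Whisper transcript, body-map selections, vision-model captions) before calling GPT-4; all names and the prompt wording are invented for the example, not SymptoScan's actual code.

```python
# Hypothetical sketch: combine speech, body-map, and image inputs into
# one prompt for the report-generating LLM. Names are illustrative.

def build_report_prompt(transcript: str,
                        body_regions: list[str],
                        image_captions: list[str]) -> str:
    sections = [
        "You are a clinical scribe. Produce a structured symptom report "
        "with sections: Chief Complaint, History, Affected Regions, "
        "Visual Findings. Do not offer a diagnosis.",
        f"Patient speech transcript:\n{transcript}",
        "Body-map regions selected: "
        + (", ".join(body_regions) if body_regions else "none"),
    ]
    if image_captions:
        sections.append(
            "Vision-model captions of uploaded photos:\n"
            + "\n".join(f"- {c}" for c in image_captions)
        )
    return "\n\n".join(sections)
```

Keeping the assembly step as plain string composition like this makes it easy to log and replay the exact prompt when iterating on report quality.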
Challenges we ran into
One of the biggest challenges was making the 3D body map both intuitive and precise. Figuring out how to map user clicks to anatomically accurate regions, including differentiating front and back, took time and trial and error. Another major hurdle was latency—processing images and LLM outputs can take several seconds, so we had to streamline our API calls and add caching. Getting consistent, clinically appropriate output from the language model also required careful prompt design and iteration.
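To make the click-to-region problem concrete: once Three.js raycasting yields a 3D hit point on the body mesh, that point still has to be named as an anatomical region. The toy sketch below shows the general idea under invented assumptions (y normalized so 0 is the feet and 1 the head, positive z facing front); the real mesh's coordinate system and region boundaries would differ.

```python
# Toy sketch: name the anatomical region for a 3D hit point on the
# body mesh. Bands and coordinates are invented for illustration.

REGION_BANDS = [  # (min_y, region name), checked top-down
    (0.85, "head"),
    (0.60, "torso"),
    (0.45, "abdomen"),
    (0.00, "legs"),
]

def locate_region(y: float, z: float) -> str:
    # The sign of z distinguishes the front of the body from the back.
    side = "front" if z >= 0 else "back"
    for min_y, name in REGION_BANDS:
        if y >= min_y:
            return f"{side} {name}"
    return f"{side} legs"  # y below 0: clamp to the lowest band
```

In practice we found that simple banding like this breaks down for limbs, which is why getting anatomically accurate regions took trial and error.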
Accomplishments that we're proud of
We’re especially proud that SymptoScan doesn’t just work—it actually speeds up the clinical workflow. Several of our friends in medical school and residency reviewed the prototype and confirmed that it would save doctors considerable time by pre-structuring symptom data before the appointment. It was rewarding to see how our system took scattered patient input and converted it into a clean, structured clinical note. Bringing together multiple AI technologies—speech recognition, vision models, and large language models—into one tool that can genuinely support real-world care felt like a major achievement.
What we learned
We learned a lot about multimodal AI integration, particularly how to pass outputs between models in a way that maintains context and accuracy. We also gained insight into how important interface design is for usability—especially in healthcare, where clarity and accessibility are critical. Prompt engineering turned out to be more important than we expected; getting the language model to consistently output structured, medically sound text took experimentation and refinement.
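One prompt-engineering tactic that helps with consistency is asking the model for JSON with fixed keys and validating the result before rendering the report. The sketch below illustrates that pattern with hypothetical key names; it is not the project's actual schema.

```python
import json

# Illustrative sketch: validate that the LLM returned every required
# report section, so the caller can re-prompt on malformed output.
REQUIRED_KEYS = {"chief_complaint", "history",
                 "affected_regions", "visual_findings"}

def parse_report(raw: str) -> dict:
    """Parse LLM output; raise so the caller can retry on bad structure."""
    report = json.loads(raw)
    missing = REQUIRED_KEYS - report.keys()
    if missing:
        raise ValueError(f"report missing sections: {sorted(missing)}")
    return report
```

Failing loudly on missing sections, rather than rendering a partial report, keeps the exported PDF predictable for clinicians.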
What's next for SymptoScan
Next, we’d like to improve the precision of symptom localization on the 3D model and add support for multiple languages. We’re also exploring the integration of diagnostic suggestion models—always with a clear emphasis on safety and disclaimers. Eventually, we want to partner with healthcare providers to test the tool in real-world settings, where it could help patients communicate more clearly and doctors make faster, more informed decisions.
Built With
- fastapi
- javascript
- openai
- pillow
- python
- sqlalchemy
- uvicorn