Inspiration
Healthcare communication often breaks down at the most critical moment — when patients try to explain where and how they’re experiencing pain. This problem becomes even more severe in telemedicine, across language barriers, or for patients with accessibility needs. We were inspired to create Gestura after realizing how unnatural it is to describe physical pain using only words or forms, when in real life we instinctively point. We wanted to build a more human, inclusive, and intuitive way for patients to communicate with clinicians.
What it does
Gestura is an AI-powered, gesture-based medical screening platform that allows patients to communicate pain non-verbally. Using only a camera, patients point to areas of concern on their body and use a simple pinch gesture to confirm selections. These inputs are mapped to anatomical regions, visualized on a 3D human model, and converted into structured, clinician-ready summaries using generative AI. The system supports multiple saved body parts, real-time visual feedback, and multilingual audio guidance to improve accessibility and clarity.
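Before generative AI produces the clinician-facing summary, the gesture pipeline needs to emit structured selection data. The sketch below shows one plausible shape for such a record; the class and field names are hypothetical illustrations, not Gestura's actual schema.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PainSelection:
    region: str          # anatomical region name, e.g. "chest"
    timestamp: float     # seconds since the screening session started
    confirmed: bool      # set True once the pinch gesture confirms it

def to_summary_payload(selections):
    # Forward only pinch-confirmed selections to the AI summarizer,
    # serialized as JSON for the downstream prompt.
    return json.dumps([asdict(s) for s in selections if s.confirmed])
```

Keeping unconfirmed hovers out of the payload means a stray pointing gesture never reaches the summary, which matches the confirm-by-pinch interaction described above.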
How we built it
We built Gestura using a Flask backend that orchestrates real-time computer vision, state management, and AI services. MediaPipe Holistic is used for full-body and hand tracking, allowing us to detect pointing and pinch gestures without any wearables. Custom geometric regions map pose landmarks to anatomical body parts, which are synced with a Three.js-powered 3D digital twin. We integrated Google Gemini to generate structured medical summaries from gesture data and ElevenLabs to provide multilingual voice instructions. Significant effort went into optimizing performance to maintain real-time responsiveness.
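The "custom geometric regions" step can be illustrated with a minimal sketch: a fingertip position in normalized image coordinates (the 0–1 range MediaPipe emits) is tested against rectangular body regions. The region table here is hypothetical and hard-coded; in a real pipeline the boxes would be derived from the detected pose landmarks each frame.

```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

# Hypothetical regions in normalized image coordinates (0..1, origin top-left).
REGIONS = [
    Region("head",    0.40, 0.60, 0.00, 0.15),
    Region("chest",   0.35, 0.65, 0.15, 0.40),
    Region("abdomen", 0.35, 0.65, 0.40, 0.55),
]

def locate(x: float, y: float):
    """Return the first region containing the fingertip, or None."""
    for region in REGIONS:
        if region.contains(x, y):
            return region.name
    return None
```

The resolved region name is what gets highlighted on the Three.js model and stored with the selection.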
Challenges we ran into
One of our biggest challenges was achieving reliable, real-time gesture detection while maintaining smooth performance. Pinch detection needed to work consistently at different distances from the camera, which required scaling gesture thresholds dynamically. Mapping 2D camera input to meaningful anatomical regions and a 3D model also required careful calibration. Additionally, coordinating real-time computer vision with a web-based UI and AI services under hackathon time constraints was a major technical challenge.
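The dynamic-threshold idea above can be sketched as follows: instead of comparing the thumb-to-index distance against a fixed pixel threshold, normalize it by an apparent hand size (here, the wrist-to-index-knuckle distance), which shrinks and grows with distance from the camera. The landmark arguments and the 0.35 ratio are illustrative assumptions, not Gestura's tuned values.

```python
import math

def pinch_detected(thumb_tip, index_tip, wrist, index_mcp, ratio=0.35):
    """Distance-invariant pinch test on normalized (x, y) landmark pairs.

    hand_scale shrinks as the hand moves away from the camera, so
    scaling the threshold by it keeps the same physical pinch working
    at any distance.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    hand_scale = dist(wrist, index_mcp)
    if hand_scale == 0:
        return False  # degenerate landmarks; treat as no pinch
    return dist(thumb_tip, index_tip) < ratio * hand_scale
```

With a fixed absolute threshold, the same finger gap that registers as a pinch up close would fail at arm's length; the ratio test avoids that.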
Accomplishments that we're proud of
We’re proud of building a fully functional, end-to-end system that combines computer vision, 3D visualization, and generative AI into a cohesive user experience. Gestura works with no specialized hardware, supports multiple saved injury points, provides immediate visual feedback, and generates clinician-ready summaries. Most importantly, we created a solution that meaningfully improves accessibility and communication in healthcare — not just a technical demo.
What we learned
Through building Gestura, we learned how to design real-time computer vision systems that balance performance and accuracy, how to map human gestures into structured data, and how to integrate generative AI in a way that adds real value. We also learned the importance of designing with accessibility and human-centered interaction in mind, especially in high-stakes domains like healthcare.
What's next for Gestura
Next, we plan to expand Gestura by adding pain intensity and annotation tools, improving mobile device support, and integrating with electronic health record (EHR) systems via interoperability standards such as HL7 FHIR. We also want to conduct usability testing with real patients and clinicians, explore privacy-preserving deployment options, and enhance the 3D model with richer visualization such as heatmaps and timelines. Our long-term goal is to make Gestura a seamless part of telemedicine and in-person clinical workflows.
