Inspiration
In classrooms, blind and visually impaired students often struggle to follow along when teachers use presentation slides. They cannot see the images, charts, or text that form a key part of the lesson. This gap in accessibility inspired us to build a solution that allows every student to learn equally and independently.
What We Built
We created a mobile app that uses a phone camera and AI-powered slide recognition to make classroom presentations accessible. The app detects slides in real time and reads their content aloud on request. Students can ask questions like “What is in the image?” or “What are the listed points?”, and the app answers with a clear, concise spoken description. No special hardware or setup is needed: a smartphone and our app are enough to turn any classroom into an inclusive learning space.
OpenCV for Computer Vision: Used OpenCV to detect and segment slides from the live camera feed.
OCR and Visual Understanding: Combined EasyOCR and image processing with vision language models to extract text, recognize enumerations, and describe images.
Generative AI: Integrated GenAI models to interpret slide content, summarize information, and answer student queries conversationally.
Audio Output: Used text-to-speech to deliver spoken answers through connected earbuds, ensuring no overlap with the teacher’s voice.
Real-Time Processing: Optimized frame handling and inference pipelines to achieve live responsiveness on mobile devices.
What We Learned
We learned how to combine computer vision, OCR, and generative AI into a single accessible tool. We also realized that accessibility design requires both empathy and technical precision to make technology truly inclusive.