Inspiration
All four of us are engineering students from different universities. We share the same frustrations with slow, disengaging lectures: professors recite memorized lecture notes and often make mistakes on the board, which makes learning difficult for students and leaves professors feeling helpless at times.
We strongly believe that if AI were integrated into lectures, both professors and students would have a far better learning experience.
What it does
🧪 Chemistry Made Visual: Professor sketches a benzene ring? A perfect 3D molecular structure appears instantly, rotatable and zoomable for the entire class to explore.
📊 Math Comes Alive: Write "y = x²" and watch as a precise, interactive 2D graph materializes on the board, complete with labeled axes and key points.
🔍 Error Prevention & Recovery: AI continuously monitors the lecture, gently alerting professors when they skip sections of their notes or make computational errors—like having a teaching assistant that never sleeps.
🎤 Voice-Driven Intelligence: Simply say "I need Newton's second law" while writing "F = " and watch as "F = ma" appears with full explanations and related concepts.
✋ Gesture Control: Navigate, zoom, and manipulate content using natural hand gestures—no additional hardware required.
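To make the gesture idea concrete: MediaPipe's hand model reports 21 landmarks per hand, with the thumb tip at index 4 and the index fingertip at index 8, so a "pinch" can be detected from the distance between those two points. The sketch below is a minimal illustration of that logic, independent of the real MediaPipe API; the threshold value is an assumption for the example, not a tuned constant from our system.

```python
import math

# MediaPipe's hand model reports 21 landmarks in normalized (0..1) image
# coordinates; index 4 is the thumb tip, index 8 the index fingertip.
THUMB_TIP, INDEX_TIP = 4, 8

def is_pinch(landmarks, threshold=0.05):
    """Return True when the thumb tip and index fingertip nearly touch.

    `landmarks` is a list of 21 (x, y) tuples in MediaPipe's landmark
    order; `threshold` is an illustrative distance cutoff.
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_TIP]
    return math.hypot(tx - ix, ty - iy) < threshold

# Fabricated example: thumb and index tips almost touching.
pts = [(0.0, 0.0)] * 21
pts[THUMB_TIP] = (0.50, 0.50)
pts[INDEX_TIP] = (0.51, 0.50)
print(is_pinch(pts))  # True
```

In practice a gesture like this would be debounced over several frames before triggering a zoom or page turn, since single-frame landmark noise causes false positives.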
How we built it
Frontend: React-based web application using MediaPipe for real-time hand tracking, with a webcam mounted alongside the projector for seamless overlay.
Backend: FastAPI server powered by Cohere's Aya Vision for understanding whiteboard content, multimodal AI that combines audio transcription with visual analysis, and a SQLite database for content caching and session management.
AI Integration: Custom knowledge base covering physics, chemistry, mathematics, and biology with intelligent context detection, voice-to-text processing for natural language commands, and spatial mapping for precise overlay positioning.
Key Technologies: Cohere Aya Vision, MediaPipe Hand Tracking, React.js, FastAPI, WebRTC for camera access, and real-time gesture processing.
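The SQLite caching layer can be pictured as a small table keyed by a hash of the board content, so that a repeated sketch skips the AI round trip entirely. Below is a minimal sketch of that pattern; the table name, schema, and use of SHA-256 are illustrative assumptions, not our exact implementation (which uses perceptual hashing so visually similar frames also hit the cache).

```python
import hashlib
import sqlite3

# In-memory database for illustration; a real deployment would use a file.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, response TEXT)")

def content_key(board_bytes: bytes) -> str:
    """Hash raw board content into a cache key (SHA-256 for this sketch)."""
    return hashlib.sha256(board_bytes).hexdigest()

def cached_lookup(board_bytes: bytes, compute):
    """Return a cached AI response, or compute and store it on a miss."""
    key = content_key(board_bytes)
    row = db.execute("SELECT response FROM cache WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]
    response = compute(board_bytes)
    db.execute("INSERT INTO cache VALUES (?, ?)", (key, response))
    return response

calls = []
def fake_model(data):
    """Stand-in for the AI call; records how often it actually runs."""
    calls.append(data)
    return "3D benzene overlay"

print(cached_lookup(b"benzene ring", fake_model))  # computed
print(cached_lookup(b"benzene ring", fake_model))  # served from cache
print(len(calls))  # 1
```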
Challenges we ran into
Precision Calibration: Achieving pixel-perfect alignment between camera input and projector output required developing a sophisticated coordinate transformation system that accounts for perspective distortion and varying hardware setups.

Real-Time Performance: Balancing AI processing speed with accuracy was crucial: we optimized our pipeline to deliver suggestions in under 300 ms while maintaining educational quality through perceptual hashing and smart caching.

Multimodal Integration: Combining voice commands, visual analysis, and gesture control into a cohesive experience required careful prioritization: we made voice input the primary signal and used visual context as supporting information.

Educational Context Understanding: Teaching the AI to differentiate between subjects and provide contextually appropriate suggestions required building custom knowledge bases and training the system to recognize academic patterns.
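The calibration step comes down to a 3×3 homography that maps camera pixels to projector pixels while correcting perspective distortion. The matrix itself would be estimated from calibration correspondences (for example with OpenCV's findHomography); applying it is just homogeneous coordinates plus a perspective divide. The matrix below is an assumed example (scale by 2, translate by (10, 20)), not a real calibration result.

```python
# Assumed camera->projector homography for illustration: in a real
# setup this matrix comes from >= 4 calibration point pairs.
H = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]

def project(H, x, y):
    """Map a camera pixel to projector coordinates.

    Multiply (x, y, 1) by H, then divide by the third component to
    undo the projective scale (the part that corrects perspective).
    """
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

print(project(H, 100, 50))  # (210.0, 120.0)
```

With a keystoned projector the bottom row of H is nonzero, so `w` varies across the image and the divide does real work; the affine example above keeps the arithmetic easy to check by hand.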
Accomplishments that we're proud of
🎯 Sub-300ms Response Time: Achieved real-time AI assistance that feels instantaneous and natural.
🖐️ Touch-Free Interaction: Created a completely gesture-based interface that requires no additional hardware beyond a camera and projector.
🧠 Intelligent Context Switching: Built an AI that automatically recognizes when you're doing physics vs. chemistry and provides relevant suggestions accordingly.
📚 Comprehensive Knowledge Integration: Developed subject-specific knowledge bases that provide not just answers, but educational explanations and related concepts.
🎤 Voice-First Design: Pioneered an audio-driven approach where natural speech commands seamlessly integrate with visual whiteboard content.
⚡ Production-Ready Performance: Built a system that's robust enough for real classroom use, with 60 FPS overlay updates and intelligent caching.
