NeuroCursor was born out of a real accessibility need: hands-free computer control. After seeing how people with injuries, disabilities, and restricted movement struggle with basic computer tasks, needing to open apps, click, type, and scroll without their hands, I knew I wanted to build something that would restore their independence - and all it would require is a laptop and a webcam.
Thus, NeuroCursor was created. NeuroCursor lets users control their computers entirely through facial gestures and head movement, with expressive AI voice feedback. MediaPipe Face Mesh gave me the facial landmarks needed to detect mouth opening, cheek puffing, smiling, and head angles. These gestures are translated into cursor movement, clicks, and scrolls in real time, so no special hardware setup or physical contact is required at all.
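To give a feel for how this works, here is a minimal sketch of one gesture, detecting an open mouth with Face Mesh. The landmark indices come from the standard 468-point Face Mesh topology (13/14 are the inner lips, 61/291 the mouth corners); the threshold value is illustrative, not my actual calibrated one:

```python
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh

# Inner-lip and mouth-corner indices on the 468-point Face Mesh topology.
UPPER_LIP, LOWER_LIP = 13, 14
LEFT_CORNER, RIGHT_CORNER = 61, 291

MOUTH_OPEN_THRESHOLD = 0.35  # illustrative; tuned per user during calibration

def mouth_open_ratio(landmarks):
    """Vertical lip gap normalized by mouth width, so the ratio stays
    roughly invariant to how far the user sits from the camera."""
    gap = abs(landmarks[UPPER_LIP].y - landmarks[LOWER_LIP].y)
    width = abs(landmarks[LEFT_CORNER].x - landmarks[RIGHT_CORNER].x)
    return gap / width if width else 0.0

cap = cv2.VideoCapture(0)
with mp_face_mesh.FaceMesh(max_num_faces=1, refine_landmarks=True) as face_mesh:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            lm = results.multi_face_landmarks[0].landmark
            if mouth_open_ratio(lm) > MOUTH_OPEN_THRESHOLD:
                print("mouth-open gesture -> left click")  # e.g. pyautogui.click()
```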
I learned an incredible amount while creating NeuroCursor. I gained an understanding of facial landmarking and learned to separate intentional gestures from natural facial expressions using thresholding, calibration, and smoothing. I also developed a strong appreciation for gesture-based design that minimizes strain and accidental triggers, and for the difficulty of merging computer vision, human-computer interaction, and real-time input handling.
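The thresholding and calibration side boils down to two small pieces: a per-user threshold derived from a short recording of a neutral face, and a debounce that only fires once a gesture has been held for several frames. A minimal sketch of that idea (the constants `k=4.0` and `hold_frames=6` are illustrative, not my tuned values):

```python
import statistics

def calibrate_threshold(neutral_samples, k=4.0):
    """Per-user trigger threshold: the mean resting ratio over a few
    seconds of neutral face, plus k standard deviations, so ordinary
    expression noise stays safely below the trigger line."""
    return statistics.fmean(neutral_samples) + k * statistics.pstdev(neutral_samples)

class HeldGesture:
    """Debounce: a gesture only counts once its ratio has stayed above
    the threshold for `hold_frames` consecutive frames, rejecting the
    one-frame flickers that natural expressions produce."""
    def __init__(self, threshold, hold_frames=6):
        self.threshold = threshold
        self.hold_frames = hold_frames
        self.count = 0

    def update(self, ratio):
        self.count = self.count + 1 if ratio > self.threshold else 0
        return self.count == self.hold_frames  # fires exactly once per hold
```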
An especially impactful part of the project was integrating Fish Audio. Its expressive TTS lets NeuroCursor announce state changes, actions, alerts, and commands in natural, human-like speech instead of relying on on-screen instructions. This makes the whole experience far less frustrating: the user is told directly when a click was executed, when calibration is complete, or when their face isn't detected. With Fish Audio's voice cloning, NeuroCursor can even speak in the user's own voice for extra comfort. Audio feedback thus becomes a more intuitive and accessible interface throughout the experience - the user might not be able to see where the cursor is going, but they can be told.
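Structurally, the voice feedback is just a mapping from system events to short spoken phrases, with the audio for fixed phrases cached after the first synthesis. A rough sketch of that layer, where `synthesize` is a placeholder standing in for the actual Fish Audio API call (its real endpoint, auth, and voice-clone parameters live in Fish Audio's docs, not here):

```python
import functools

def synthesize(text: str) -> bytes:
    """Placeholder for the Fish Audio TTS request; wire the real API
    call in here. Returns raw audio bytes."""
    raise NotImplementedError("call the Fish Audio TTS API here")

# Each system event maps to one short, fixed spoken phrase.
EVENT_PHRASES = {
    "click": "Click.",
    "calibrated": "Calibration complete.",
    "face_lost": "I can't see your face.",
}

@functools.lru_cache(maxsize=64)
def cached_audio(phrase: str) -> bytes:
    # Fixed phrases are synthesized once, then replayed from memory,
    # so frequent feedback (e.g. every click) has near-zero latency.
    return synthesize(phrase)

def announce(event: str) -> bytes:
    """Fetch (and cache) the audio for an event; the caller streams
    the returned bytes to the speakers."""
    return cached_audio(EVENT_PHRASES[event])
```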
There were many challenges throughout development. Getting gesture detection precise enough required extensive testing, since any small change in lighting, head angle, or individual facial differences could dramatically alter detection. Smooth cursor movement required proper filtering, with dead zones tuned through testing to eliminate jitter (a sketch of that filter follows below). Real-time responsiveness between gesture detection and audio output relied on the caching sketched above and prompt execution to mitigate TTS delays wherever possible. Finally, finding gestures that felt natural yet distinct enough to avoid false positives took many tries.
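For the cursor filtering specifically, the combination that works is a dead zone plus low-pass smoothing. A minimal sketch (the `alpha` and dead-zone values are illustrative; the real ones came out of testing):

```python
class CursorFilter:
    """Exponential smoothing plus a dead zone: tiny head tremors inside
    the dead zone are ignored entirely, and larger motions are low-pass
    filtered so the cursor glides instead of jittering."""
    def __init__(self, alpha=0.25, dead_zone=3.0):
        self.alpha = alpha          # smoothing factor (0..1, lower = smoother)
        self.dead_zone = dead_zone  # radius in pixels below which motion is ignored
        self.x = self.y = None

    def update(self, raw_x, raw_y):
        if self.x is None:
            self.x, self.y = raw_x, raw_y
        dx, dy = raw_x - self.x, raw_y - self.y
        if (dx * dx + dy * dy) ** 0.5 < self.dead_zone:
            return self.x, self.y  # inside the dead zone: hold position
        self.x += self.alpha * dx
        self.y += self.alpha * dy
        return self.x, self.y
```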
But the end result is a fully hands-free control system that runs on an ordinary laptop, with no added sensors, gloves, or specialized devices. By combining AI computer vision with expressive voice feedback and commands, NeuroCursor aspires to make human-computer interaction more intuitive, inclusive, and empowering for those who cannot use traditional input devices. Ultimately, it aims to restore independence and ease of use to those who need it most.