VisLink: Hands-Free Control Software

VisLink is designed to empower individuals with mobility impairments by allowing them to control their computers without physical input devices, requiring only a standard webcam.


Inspiration

Computers are integral to the lives of billions, yet traditional input devices can be a barrier for those with limited hand functionality. While specialized hardware exists, it is often expensive and inaccessible. VisLink needs only a webcam to provide an affordable, hands-free control solution for individuals with mobility disabilities.


What it does

  • Head Movement Tracking: Uses Google MediaPipe to track head movements and map them to mouse movements

  • Blink Detection for Clicks: Detects blinks and translates them into clicks. We use a double-blink system (2 consecutive blinks for a click) to reduce false positives

  • Voice Commands: Allows users to perform actions like typing, clicking, and controlling the software with voice commands - reducing the need for physical interaction with the computer

  • Customizations: We offer adjustable settings for mouse sensitivity, blinking intervals, and voice configurations for individual needs

  • Ease of Setup: Designed to be easy to use: just install and run

VisLink is designed to be set up with initial assistance from a caretaker. Afterwards, it can be fully configured by the user without caretaker intervention.


How we built it

  • Primary Language: Python (for both frontend and backend)

  • Key Libraries & Frameworks:
    • OpenCV & MediaPipe: For real-time head and blink tracking
    • CustomTkinter: To create an intuitive and accessible UI
    • NumPy & SciPy: To optimize movement tracking algorithms

We built VisLink using Python, OpenCV, and MediaPipe's Face Landmarker, integrating real-time facial landmark detection to track head rotation (roll, pitch, yaw) and eye blinks for cursor control (calibrated using blink flags).

Cursor movement is computed from rotation vectors: head angles are mapped to a movement vector, which is then smoothed using exponential smoothing with an adjustable alpha factor.
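As a rough illustration of the smoothing step, here is a minimal sketch. The class name `ExponentialSmoother`, the method `update`, and the default alpha of 0.3 are hypothetical choices for this example, not VisLink's actual code. Each new movement vector is blended with the previous smoothed value, so a higher alpha makes the cursor more responsive and a lower alpha makes it smoother.

```python
class ExponentialSmoother:
    """Blend each new 2D movement vector with the previous smoothed one.

    Hypothetical helper for illustration; alpha=0.3 is an assumed default.
    """

    def __init__(self, alpha=0.3):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state = None  # no smoothed value until the first sample

    def update(self, dx, dy):
        """Return the smoothed (dx, dy) after folding in a new sample."""
        if self._state is None:
            # First sample: nothing to blend with yet.
            self._state = (float(dx), float(dy))
        else:
            sx, sy = self._state
            a = self.alpha
            # smoothed = alpha * new + (1 - alpha) * previous
            self._state = (a * dx + (1 - a) * sx, a * dy + (1 - a) * sy)
        return self._state


smoother = ExponentialSmoother(alpha=0.5)
first = smoother.update(10, 0)   # first sample passes through unchanged
second = smoother.update(0, 0)   # a sudden stop is eased in, not applied at once
```

With alpha = 0.5, a sudden stop after a movement of 10 units decays as 10, 5, 2.5, and so on, which is what removes frame-to-frame jitter.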

Blink detection uses Eye Aspect Ratio (EAR) calculations on specific eye landmarks, dynamically adjusting thresholds for users with glasses. The system includes adaptive blink interval filtering, which lets us differentiate between an intentional blink signal and a natural blink to prevent false mouse input.
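One way the interval filtering could work is sketched below. The class `DoubleBlinkDetector`, the EAR threshold of 0.2, and the 0.1 to 0.6 second gap window are all assumptions made for this example, not VisLink's actual values. A blink is counted when the EAR drops below the threshold and then recovers, and a click fires only when two blinks complete within the allowed gap.

```python
class DoubleBlinkDetector:
    """Report a click only when two blinks occur within a time window.

    Hypothetical sketch; threshold and gap values are assumed, not measured.
    """

    def __init__(self, ear_threshold=0.2, min_gap=0.1, max_gap=0.6):
        self.ear_threshold = ear_threshold
        self.min_gap = min_gap    # gaps shorter than this are treated as noise
        self.max_gap = max_gap    # gaps longer than this are separate natural blinks
        self._eye_closed = False
        self._last_blink_time = None

    def update(self, ear, timestamp):
        """Feed one frame's EAR and time in seconds; True on a double blink."""
        closed = ear < self.ear_threshold
        blink_completed = self._eye_closed and not closed  # eye just reopened
        self._eye_closed = closed
        if not blink_completed:
            return False
        if self._last_blink_time is not None:
            gap = timestamp - self._last_blink_time
            if self.min_gap <= gap <= self.max_gap:
                self._last_blink_time = None  # consume both blinks
                return True
        # First blink of a possible pair, or the previous one was too old.
        self._last_blink_time = timestamp
        return False


detector = DoubleBlinkDetector()
frames = [(0.3, 0.0), (0.1, 0.05), (0.3, 0.10), (0.1, 0.30), (0.3, 0.40)]
clicks = [detector.update(ear, t) for ear, t in frames]
```

In this sequence only the last frame reports a click, because it completes a second blink 0.3 seconds after the first.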

Additionally, we implemented a dead zone filter that eliminates cursor drift from minor head tremors.
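A dead zone filter can be as simple as the sketch below, which discards any movement vector whose magnitude falls under a radius. The function name `apply_dead_zone` and the radius of 1.5 are hypothetical, chosen only for this example.

```python
import math


def apply_dead_zone(dx, dy, radius=1.5):
    """Zero out movement vectors whose magnitude is below `radius`.

    Hypothetical helper; the radius value is an assumption for illustration.
    """
    if math.hypot(dx, dy) < radius:
        return (0.0, 0.0)  # treat as tremor: the cursor stays put
    return (dx, dy)        # deliberate movement passes through unchanged


tremor = apply_dead_zone(0.5, 0.5)      # magnitude ~0.71, suppressed
deliberate = apply_dead_zone(3.0, 4.0)  # magnitude 5.0, kept
```

A larger radius suppresses more tremor but also makes small intentional movements harder, so this is a natural setting to expose to the user.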

Challenges we ran into

Accurate Blink Detection

  • Problem:

    • Blinks, being natural, often caused false positives when used to trigger clicks
    • Users wearing glasses experienced landmark occlusions, distorting our Eye Aspect Ratio (EAR) calculations
  • Solution:

    • Tracked 4 key eye landmarks (top, bottom, inner, outer) and computed the EAR as:

    EAR = (|| top - bottom ||) / (|| inner - outer ||)

    • Implemented a double-blink system to reduce accidental clicks.
    • Dynamically adjusted EAR thresholds during initialization based on the user's natural eye state
    • Compensated for glasses interference by incorporating head tilt data
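The EAR formula and the per-user threshold calibration above can be sketched as follows. The function names, the use of the median, and the 0.7 ratio are assumptions for illustration, and the head tilt compensation for glasses is not shown. Landmarks are taken as (x, y) points.

```python
import math
from statistics import median


def eye_aspect_ratio(top, bottom, inner, outer):
    """EAR = ||top - bottom|| / ||inner - outer|| for one eye."""
    width = math.dist(inner, outer)
    if width == 0:
        return 0.0  # degenerate landmarks; avoid dividing by zero
    return math.dist(top, bottom) / width


def calibrate_threshold(open_eye_ears, ratio=0.7):
    """Set the blink threshold as a fraction of the user's resting EAR.

    Assumed approach: take the median EAR sampled while the eyes are open,
    then scale it by a hypothetical ratio of 0.7.
    """
    return ratio * median(open_eye_ears)


ear = eye_aspect_ratio(top=(0, 1), bottom=(0, -1), inner=(-2, 0), outer=(2, 0))
threshold = calibrate_threshold([0.30, 0.32, 0.28], ratio=0.5)
```

Because the threshold is relative to each user's own resting EAR, a user whose eyes appear narrower to the camera gets a proportionally lower threshold.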

Mapping Head Movements To Cursor Movements

  • Problem:

    • Raw 3D head tracking data needed to be translated into 2D cursor movements seamlessly
    • Direct mapping resulted in jittery and chaotic cursor movement due to minor head tremors
  • Initial Approach:

    • Used movement vectors taken directly from head rotation, which made the cursor jittery and unstable
  • Solution:

    • Implemented exponential smoothing to combine new movement vectors with previous data points
    • This method reduced cursor drift during stillness and gave the mouse a smooth, "linear" motion
    • For more details on exponential smoothing, see this article

Accomplishments that we're proud of

  • Technical Accomplishments: We successfully integrated head movement tracking with blink detection and achieved smooth mouse navigation

  • Team Collaboration: Throughout development, communication stayed constant and collaboration stayed effective between the frontend and backend teams, which minimized code conflicts and streamlined the process so we could deliver high-quality, maintainable code

  • User Empowerment: We created a product that is not only functional but also has the potential to scale and deliver real-world impact in industries such as education, healthcare, and entertainment


What we learned

  • Technical Skills:

    • We deepened our understanding of computer vision and real-time tracking algorithms
    • Enhanced our ability to implement algorithms and solve problems
  • Soft Skills:

    • Recognized the importance of thorough testing and iterative development, never settling for something less
    • Strengthened collaboration and communication across both teams, allowing for a smooth development process

What's next for VisLink (Visual Link)

  • Future Enhancements:

    • Eye Tracking: We plan to implement eye tracking in place of head tracking to create an even more seamless user experience - in line with our project name: Visual Link
    • Automation: Develop a solution that allows VisLink to launch automatically at startup, which removes the need for caretaker setup and further empowers the user
    • Machine Learning Integration: We hope to leverage AI for more adaptive and personalized adjustments based on user movement patterns, to help reduce manual configuration
    • Platform Expansion: We want to expand VisLink beyond PCs to include mobile devices, gaming consoles, and assistive technology platforms through an open API
    • Potential Applications: We genuinely believe in the potential of VisLink to make a broader impact on society beyond everyday computing. VisLink could provide significant benefits in fields such as healthcare and education by lowering physical barriers to technology

VisLink is more than just software - it's a step towards making technology more accessible to everyone, ensuring that the digital landscape serves all members of our community.
