Inspiration
We wanted to try computer vision, especially object detection with YOLO and OpenCV, as a way to interact with our computer hands-free. This could be quite useful for people with disabilities, or it can be used just for fun.
What it does
It is an application that takes in a camera feed and lets you control the mouse with your face: face movement moves the cursor, a blink clicks, and your tongue scrolls.
How we built it
The whole project is built with Python and its packages, such as OpenCV. We also used YOLO to train the object detection model. The application first takes in a live feed from the webcam. A YOLO model detects whether a tongue is visible, while OpenCV detects eye blinks and obtains the face coordinates. These three features control the mouse scroll, mouse clicks, and cursor movement, respectively. The processing output is linked to the mouse input, and a Python GUI designed with the PySide6 framework serves as the frontend.
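The mapping from detected features to mouse actions described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function name `actions_for_frame`, the action tuples, and the `scroll_step` value are all hypothetical, and the detection inputs are assumed to have been computed already by the YOLO and OpenCV steps.

```python
from typing import List, Optional, Tuple

# An action is a (name, payload) pair; the names are hypothetical labels.
Action = Tuple[str, object]


def actions_for_frame(
    face_center: Optional[Tuple[int, int]],
    blinked: bool,
    tongue_visible: bool,
    scroll_step: int = -3,  # assumed scroll amount per frame, not from the project
) -> List[Action]:
    """Translate one frame's detections into a list of mouse actions."""
    actions: List[Action] = []
    if face_center is not None:
        # Face position drives the cursor.
        actions.append(("move", face_center))
    if blinked:
        # A blink triggers a click.
        actions.append(("click", None))
    if tongue_visible:
        # A visible tongue triggers scrolling.
        actions.append(("scroll", scroll_step))
    return actions


if __name__ == "__main__":
    print(actions_for_frame((640, 360), blinked=True, tongue_visible=False))
```

Keeping this step free of any camera or mouse library calls makes it easy to test on its own; a separate layer would then execute each action against the real mouse.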
Challenges we ran into
The YOLO model could not be tuned to sufficient accuracy within the 24 hours, so we were forced to switch to a different method (OpenCV) in the middle of the hackathon. We also faced a challenge when combining the detection part with the cursor implementation part, and used a dataclass to tie them together. Finally, because of how the tracking was set up, the cursor couldn't reach the areas near the sides of the screen, so we implemented a fix.
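As a rough illustration of the last two points, here is a sketch of a dataclass that carries detection results to the cursor code, plus one possible way to make the screen edges reachable. The names `FrameResult`, `to_screen`, and `margin` are hypothetical, the face position is assumed to be normalized to the range 0 to 1 within the camera frame, and the margin-based remapping is only an assumed approach, not necessarily the fix we used.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FrameResult:
    """Hypothetical container passed from the detection part to the cursor part."""
    face_center: Optional[Tuple[float, float]]  # normalized (x, y), each in 0..1
    blinked: bool = False
    tongue_visible: bool = False


def to_screen(
    face_center: Tuple[float, float],
    screen_w: int,
    screen_h: int,
    margin: float = 0.2,  # assumed fraction of the frame ignored on each side
) -> Tuple[int, int]:
    """Map a normalized face position to a screen pixel.

    Only the central part of the camera frame is used, and it is stretched
    to cover the whole screen, so a small head movement can reach the edges.
    """
    def stretch(v: float) -> float:
        t = (v - margin) / (1.0 - 2.0 * margin)
        return min(1.0, max(0.0, t))  # clamp so the cursor stays on screen

    x, y = face_center
    return (round(stretch(x) * (screen_w - 1)), round(stretch(y) * (screen_h - 1)))


if __name__ == "__main__":
    result = FrameResult(face_center=(0.1, 0.9), blinked=True)
    print(to_screen(result.face_center, 1920, 1080))
```

With this remapping, a face that sits anywhere in the outer 20% of the camera frame already pins the cursor to the corresponding screen edge, which is what makes the sides reachable without leaning out of view.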
Accomplishments that we're proud of
Within our application, we are able to control mouse clicks by blinking, mouse scroll with our tongue, and cursor movement with our face.
What we learned
We learned how to use computer vision and train an object detection model with YOLO. We also learned to implement a full application with a Python GUI and to link the frame processing to mouse inputs.
What's next for MUGSHOT: Mouse Using Gestures and Head Orientation Tracker
We could improve the accuracy of the model. More features could be implemented, such as using other facial features to control more functions. Tongue tracking could also be implemented to make scrolling work better. A better GUI would be a plus as well.