Inspiration
Eye-tracking technology allows our eyes to be used as input, transforming the computer navigation experience. Unfortunately, this technology is still behind a paywall and requires high-end hardware to work properly. A task as simple as switching tabs should not need specialized hardware (a high-quality webcam, infrared sensors, etc.). The main idea behind this solution is that even a laptop webcam, with the help of software, can detect whether the pupil is looking to the left or to the right. These inputs can be mapped to the next immediate (2nd) tab or the 3rd tab, respectively.
What it does
The solution uses the laptop's webcam to capture frames of the face. Each frame is passed to MediaPipe's face detection solution, which confirms that a face is present and then applies a face mesh. Only the eye regions of the mesh are of interest: they are cropped out, and an algorithm determines whether the pupil is looking to the left or to the right relative to the eye. The output (left or right) is mapped to switching to the next immediate (2nd) tab or the 3rd tab, respectively.
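As a rough illustration of the crop-and-classify step described above, here is a minimal sketch. It is not the project's actual code. It assumes the eye landmarks arrive as normalized (x, y) pairs, as a face mesh typically provides them, and that the frame is a grayscale NumPy array. The function names, the `threshold` of 50, and the `margin` of 0.1 are all hypothetical choices for illustration.

```python
import numpy as np

def crop_eye(gray_frame, eye_points, pad=2):
    """Crop the bounding box around normalized (x, y) eye landmarks.

    eye_points is assumed to hold coordinates in the 0..1 range.
    """
    h, w = gray_frame.shape[:2]
    xs = [int(x * w) for x, _ in eye_points]
    ys = [int(y * h) for _, y in eye_points]
    # Clamp the padded box so it never leaves the frame.
    x0, x1 = max(min(xs) - pad, 0), min(max(xs) + pad, w)
    y0, y1 = max(min(ys) - pad, 0), min(max(ys) + pad, h)
    return gray_frame[y0:y1, x0:x1]

def classify_gaze(eye_gray, threshold=50, margin=0.1):
    """Return 'left', 'right', 'center', or None if no dark pixels exist.

    Directions are in image coordinates. On an unmirrored webcam frame the
    user's own left appears on the image's right, so the labels may need
    to be swapped.
    """
    # Binary threshold: True where the pixel is dark enough to be pupil.
    dark = eye_gray < threshold
    if not dark.any():
        return None
    cols = np.nonzero(dark)[1]
    # Horizontal centroid of the dark pixels, scaled to 0 (left) .. 1 (right).
    centroid = (cols.mean() + 0.5) / eye_gray.shape[1]
    if centroid < 0.5 - margin:
        return "left"
    if centroid > 0.5 + margin:
        return "right"
    return "center"
```

The `center` result gives the caller a way to ignore ambiguous frames instead of forcing a tab switch on every detection.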
How we built it
I built it using Python. I used the OpenCV library as well as MediaPipe for the face detection and pupil tracking. I also used pyautogui to allow the program to simulate key presses, in this case 'alt' and 'tab'.
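The key-press mapping could look something like the sketch below. It assumes that "left" means one Tab press while Alt is held and "right" means two, which is one plausible reading of the 2nd/3rd tab mapping. `plan_keys` and `switch_tab` are hypothetical names, and only the `pyautogui.keyDown`, `pyautogui.press`, and `pyautogui.keyUp` calls come from the library itself.

```python
def plan_keys(direction):
    """Return the (action, key) steps for a direction, or [] to do nothing."""
    tab_presses = {"left": 1, "right": 2}.get(direction, 0)
    if tab_presses == 0:
        return []
    # Hold Alt, tap Tab once or twice, then release Alt.
    return [("down", "alt")] + [("press", "tab")] * tab_presses + [("up", "alt")]

def switch_tab(direction):
    """Send the planned key presses to the operating system."""
    # Imported here so plan_keys can be used and tested without a display.
    import pyautogui

    for action, key in plan_keys(direction):
        if action == "down":
            pyautogui.keyDown(key)
        elif action == "up":
            pyautogui.keyUp(key)
        else:
            pyautogui.press(key)
```

Keeping the plan separate from the code that sends the keys makes the mapping easy to check without actually switching anything on screen.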
Challenges we ran into
The eye tracking was difficult to implement, as I could not initially figure out exactly how the thresholding works. Perhaps the biggest challenge is one that I have yet to resolve: the program needs to be run every time a tab is to be switched. My original idea was to have the program run as a Windows application in the background, continuously checking whether a certain key is pressed and, if it is, going through the pupil detection and switching tabs accordingly. As it stands, however, my code only does this once, and the program has to be rerun for another tab switch to be made. I have tried putting while loops in various places but have not yet figured out how to make this work. The whole detection and switching process also takes too long. Overcoming this may require drastic changes in the implementation and a better understanding of how events are handled in Python.
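One possible shape for the repeating behavior is sketched below. It is an untested idea rather than a fix confirmed against the project's code. The three callables are hypothetical stand-ins: `wait_for_trigger` blocks until the hotkey fires and returns False to quit, `detect_direction` runs the pupil detection, and `switch` sends the key presses.

```python
def run_loop(wait_for_trigger, detect_direction, switch, max_switches=None):
    """Repeat trigger -> detect -> switch until the trigger asks to stop.

    Returns the number of tab switches performed.
    """
    done = 0
    while max_switches is None or done < max_switches:
        # Block until the user asks for a switch; a falsy value ends the loop.
        if not wait_for_trigger():
            break
        direction = detect_direction()
        # Ignore ambiguous results instead of switching on every trigger.
        if direction in ("left", "right"):
            switch(direction)
            done += 1
    return done
```

If the webcam and the face mesh object are created once before calling `run_loop` and reused inside `detect_direction`, the per-switch delay might also shrink, since camera and model startup would no longer be paid on every switch. That is an assumption worth measuring, not a guarantee.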
Accomplishments that I'm proud of
The idea is partially implemented. The switching actually works, and the detection is quite accurate. However, more than the degree to which the program works, I am proud of the things I learnt, which are discussed below.
What we learned
I learned a lot about OpenCV: how binary thresholding works, how images are actually stored and manipulated in OpenCV, and how to write a rudimentary image segmentation algorithm.
What's next for eyeTab
Continuous eye tracking, so that tab switching can be used multiple times without rerunning the program. Replacing the key press that starts the detection with another visual input such as a 'nod', the idea being that the user can switch tabs without even touching the keyboard. Also, making the entire program run faster.