Inspiration
I've recently been interested in using computer vision to identify human motions such as eye and head movement. I wanted to apply this to a problem identified by IFSI: translating text written in other languages. The result is this project, which combines the two technologies.
What it does
This project uses two neural networks: one to identify the user's gaze location and another to determine whether the user is winking. In addition, a user interface built in Python displays the translated text. A Python wrapper for Google Translate provides the translation service.
How we built it
I used weights that I had trained on large datasets of eyes and their gazes to do the gaze prediction and wink classification. I then combined this with MediaPipe, which is built by Google and identifies the position of the user's face. Additionally, pyautogui is used to programmatically control the keyboard and mouse.
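As a rough illustration of the last step, here is a minimal sketch of how a gaze estimate could be turned into a cursor position. It assumes the gaze network outputs normalized coordinates between 0 and 1, which may not match the actual output format. The function names `gaze_to_pixels` and `move_cursor_to_gaze` are hypothetical; `pyautogui.size()` and `pyautogui.moveTo()` are the only library calls used.

```python
from typing import Tuple


def gaze_to_pixels(gaze_x: float, gaze_y: float,
                   screen_w: int, screen_h: int) -> Tuple[int, int]:
    """Map a normalized gaze estimate (assumed to be in [0, 1]) to pixels."""
    # Clamp first so a noisy prediction never lands outside the screen.
    gx = min(max(gaze_x, 0.0), 1.0)
    gy = min(max(gaze_y, 0.0), 1.0)
    # Scale to the last valid pixel index along each axis.
    x = int(round(gx * (screen_w - 1)))
    y = int(round(gy * (screen_h - 1)))
    return x, y


def move_cursor_to_gaze(gaze_x: float, gaze_y: float) -> None:
    """Move the mouse pointer to the estimated gaze location."""
    # Imported lazily so the mapping above has no GUI dependency.
    import pyautogui

    screen_w, screen_h = pyautogui.size()
    x, y = gaze_to_pixels(gaze_x, gaze_y, screen_w, screen_h)
    pyautogui.moveTo(x, y)
```

Keeping the coordinate mapping separate from the pyautogui call makes it possible to check the math without a display attached.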
Challenges we ran into
One challenge I ran into was finding a simple way for the GUI and the eye tracking to communicate. I solved this with a shared text file: the scripts write data to the file and then listen for changes to it to retrieve the updated data. While this may not be the best solution, it provided a simple alternative to setting up an API.
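The text-file approach could look roughly like the sketch below. This is not the project's actual code: the names `write_message` and `FileListener`, the message format, and the choice to detect changes by comparing file contents are all assumptions made for illustration. It uses only the Python standard library.

```python
import os
import tempfile
from typing import Optional


def write_message(path: str, text: str) -> None:
    """Writer side: replace the file contents in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first, then swap it in, so the reader
    # never sees a half-written message.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(text)
    os.replace(tmp_path, path)


class FileListener:
    """Reader side: report the file contents only when they change."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._last_seen: Optional[str] = None

    def poll(self) -> Optional[str]:
        """Return the new contents, or None if nothing has changed."""
        try:
            with open(self.path, encoding="utf-8") as f:
                text = f.read()
        except FileNotFoundError:
            return None  # the writer has not produced anything yet
        if text == self._last_seen:
            return None
        self._last_seen = text
        return text
```

The GUI would call `poll()` in a loop with a short sleep between calls. One limitation of comparing contents is that two identical messages in a row look like no change, so a counter or timestamp in the message would be needed if repeats matter.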
Accomplishments that we're proud of
I'm proud of the project that I built, and while there are definitely some bugs, I enjoyed the process of building it.
What's next for Eye Based Language Translation
There is significant room for improvement in the gaze estimation network. For example, the data it was trained on comes from a single subject (myself). As a result, the network doesn't generalize well to other people, although with additional data collection this problem can be solved. Additionally, there are occasional errors in the translation service. While the translation itself is provided by Google, certain improvements could be made on my end.