AI Desktop Navigation

Artificial Intelligence
Controlling the mouse with the corresponding gaze direction. Blink for a mouse click.
Gaze estimation using Deep Learning.
Naruto Eyes (Fun mode)
Navigation of desktop using Artificial Intelligence. (GUI)

Inspiration

Millions of great people are not able to use their electronic devices as efficiently as they could, because of circumstances beyond their control. As a result, many potentially brilliant writers, engineers, musicians, and others cannot conveniently follow their passions, create great things for humanity, and enjoy their lives a bit more. We decided that we were willing and able to take action to help those people, so we set out to build software that addresses this problem for physically disabled people and makes the world a little bit better.

What it does

Our first prototype enables people to navigate their desktop using only their eyes. This includes moving the mouse and clicking its buttons. It also has a fun Naruto eyes mode.

How we built it

We built the base in Python 3 and used PyQt5 to create the GUI. We then took pre-trained models from OpenVINO and the dlib library and applied some transfer learning to them in order to estimate gaze direction in real time. We used OpenCV for detecting blinks and the pyautogui library for controlling mouse movements, and we used Docker to containerize our application.
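As a rough illustration of how a per-frame gaze estimate and a blink signal could drive the mouse, here is a minimal sketch. It is not our actual code: the eye aspect ratio (EAR) blink measure, the thresholds, and every name in it (`eye_aspect_ratio`, `gaze_to_delta`, `BlinkClicker`, `step`) are assumptions made for illustration. It also assumes the gaze vector is normalized to roughly [-1, 1] with positive y meaning "looking up".

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def eye_aspect_ratio(eye: Sequence[Point]) -> float:
    """Eye aspect ratio for six eye landmarks ordered p1..p6
    (corner, two upper lid points, corner, two lower lid points).
    The value drops towards 0 as the eye closes."""
    def dist(a: Point, b: Point) -> float:
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    vertical = dist(eye[1], eye[5]) + dist(eye[2], eye[4])
    horizontal = dist(eye[0], eye[3])
    return vertical / (2.0 * horizontal)


def gaze_to_delta(gaze_x: float, gaze_y: float,
                  speed: float = 25.0, dead_zone: float = 0.1) -> Tuple[int, int]:
    """Map a normalized gaze vector to a relative cursor offset in pixels.
    Components smaller than dead_zone are ignored to suppress jitter."""
    def axis(v: float) -> int:
        return 0 if abs(v) < dead_zone else int(round(v * speed))

    # Screen y grows downwards, so an upward gaze (assumed positive) is flipped.
    return axis(gaze_x), -axis(gaze_y)


@dataclass
class BlinkClicker:
    """Reports a blink once the eye reopens after staying closed
    for at least min_frames consecutive frames (hypothetical defaults)."""
    threshold: float = 0.2
    min_frames: int = 3
    _closed: int = 0

    def update(self, ear: float) -> bool:
        if ear < self.threshold:
            self._closed += 1
            return False
        blinked = self._closed >= self.min_frames
        self._closed = 0
        return blinked


def step(gaze: Tuple[float, float], ear: float, clicker: BlinkClicker,
         move: Callable[[int, int], None], click: Callable[[], None]) -> None:
    """Process one frame: move the cursor, then click if a blink just ended."""
    dx, dy = gaze_to_delta(*gaze)
    if dx or dy:
        move(dx, dy)
    if clicker.update(ear):
        click()
```

In a real loop, `move` and `click` could be `pyautogui.moveRel` and `pyautogui.click`, with the gaze vector and eye landmarks coming from the models for each webcam frame. Passing them in as callables keeps the logic testable without a display.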

Challenges we ran into

The first and biggest challenge was optimizing the inference of our deep learning models to make our app run faster. The second was resolving all the version compatibility issues (we ended up using Docker to solve this problem).

Accomplishments that we're proud of

The accomplishment we are proudest of is that we were able to bring what we had imagined to life. We are also very proud that even the first version of our software is readily available and can start helping disabled people right away.

What we learned

We gained more experience working with Deep Learning applications and learned new technologies such as Docker. We also came to appreciate the importance and power of good teamwork.

What's next for NAI - Artificial Intelligence Desktop Navigation

Our planned improvements are:

Packaging the app as a standalone executable (for example, a ".exe" file on Windows), so that anyone can download and run it.
Building a website where people can download our software.
Tuning the sensitivity of the models so our app works better, faster, and more precisely. Training our own model, instead of using OpenVINO models, for better performance.
Adding NLP and voice control options to our application.
Adding more functionality and options to the GUI.
Creating a mobile version of the app, using TensorFlow Lite to run our models on Android and iOS.

Built With

c++
dlib
keras
opencv
openvino
pyautogui
pyqt5
python

Created by

Designed the architecture of the project.
Implemented models for gaze and head pose estimation and for face and facial landmark detection. Worked on improving the models' inference.
Applied transfer learning to the Intel OpenVINO and dlib models to improve quality and speed.
Built the GUI, including its design and color theme.
Added the "Naruto facial filter" :)

Matvei Popov
IT Enthusiast from Los Angeles, CA.
Worked on blink detection.

Sharan Babu
We are in this together!
I helped with the pyautogui mouse click functionality.

Srikar Samudrala