Our goal was to help people with paralysis and severe disabilities surf the web hands-free, using just their face and voice.
What it does
It allows you to surf the web by using face gestures and voice commands!
How we built it
We used Python's OpenCV library to activate your webcam and monitor the motion of your face. When it detects a drastic change, it calls the Google Cloud Vision API to compute the difference in the pan (x-axis) and tilt (y-axis) of your face and determine which browser action to perform. All of the browser actions, including opening, closing, and switching tabs and scrolling, were done using Selenium WebDriver, a browser automation library. Each response received from the Vision API was passed on to the open browser controlled by Selenium. Finally, we used the Google Cloud Speech API to listen to your voice and trigger more browser commands, such as opening a new tab and going back and forth in your browser history.
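To illustrate the pan/tilt step described above, here is a minimal sketch of how a change in face angles could be mapped to a browser action. It is not the project's actual code. The `FacePose` class, the 15-degree threshold, the action names, and the choice of which direction triggers which action are all assumptions made for the example. The `perform` helper uses only standard Selenium WebDriver calls (`execute_script`, `window_handles`, `current_window_handle`, `switch_to.window`) and expects a driver object to be passed in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FacePose:
    """Head angles in degrees, e.g. the pan and tilt angles of a detected face."""
    pan: float   # left/right rotation
    tilt: float  # up/down rotation


# Hypothetical threshold: smaller changes are treated as normal head jitter.
THRESHOLD_DEG = 15.0


def classify_gesture(baseline: FacePose, current: FacePose,
                     threshold: float = THRESHOLD_DEG) -> Optional[str]:
    """Return an action name for a large pan/tilt change, or None otherwise."""
    d_pan = current.pan - baseline.pan
    d_tilt = current.tilt - baseline.tilt

    # Ignore movements below the threshold on both axes.
    if max(abs(d_pan), abs(d_tilt)) < threshold:
        return None

    # The dominant axis decides the action (assumed mapping, not the original).
    if abs(d_pan) >= abs(d_tilt):
        return "next_tab" if d_pan > 0 else "previous_tab"
    return "scroll_up" if d_tilt > 0 else "scroll_down"


def perform(driver, action: str, scroll_px: int = 400) -> None:
    """Apply an action to a Selenium WebDriver instance."""
    if action in ("scroll_up", "scroll_down"):
        dy = -scroll_px if action == "scroll_up" else scroll_px
        driver.execute_script("window.scrollBy(0, arguments[0]);", dy)
    elif action in ("next_tab", "previous_tab"):
        handles = driver.window_handles
        i = handles.index(driver.current_window_handle)
        step = 1 if action == "next_tab" else -1
        # Wrap around so the last tab leads back to the first.
        driver.switch_to.window(handles[(i + step) % len(handles)])
```

In use, the baseline pose would be captured while you look straight at the screen, and each new pose would be compared against it. For example, `classify_gesture(FacePose(0, 0), FacePose(20, 5))` yields `"next_tab"` under the assumed mapping, which `perform(driver, "next_tab")` then applies to the browser.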
Challenges we ran into
Properly recognizing the facial gestures, making many sequential calls to Google's APIs, and getting the browser controls to work were just some of the challenges we surmounted.
Accomplishments that we're proud of
We would actually use this.
What we learned
We learned how to use Python's Selenium and OpenCV libraries, asynchronous threading, and Google Cloud's Vision and Speech APIs.
What's next for NoseGoes
We hope to eventually incorporate eye tracking so you can interact with the web page itself. From there, after polishing up UX and ease of use, we want to release our application to the world and help bring the wonders of the internet to those who struggle to use it.