We wanted to create a tool that would improve the experience of all users of the internet, not just technically proficient ones. Once fully realized, our system could offer a great deal of utility to groups such as blind people and the elderly, as it allows navigation by simple voice command with only minor setup from someone else. We could also implement function sharing, so that users can share their ideas and tools with everyone else and companies can define defaults on their own webpages to improve the user experience.

We created it with the intention of making browsing the web easier, if not possible at all, for the visually impaired. Access to the internet may soon be considered a fundamental human right; it should be accessible to everyone regardless of their abilities.

What it does

Quack is a Chrome extension that lets a user speak a phrase (e.g. "Search for guitar songs" on YouTube) and then enter a series of actions. The extension stores the actions entered, then uses a combination of speech-to-text (STT) and natural language processing to generalize the query for that site, covering all permutations of the same sentence structure. We use Microsoft LUIS, which at a baseline handles synonyms. The more it is used, the better the matching becomes, so it could expand to handle "find piano music" as well. We are also developing a simple interface that lets users easily define their own custom instruction sets.
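To make the idea of generalizing a phrase concrete, here is a minimal sketch. It is not the project's code: the project relies on Microsoft LUIS for this step, and the names `PhraseTemplate`, `generalize`, and `match` are hypothetical. The sketch swaps in a plain regular expression as a stand-in, turning one example phrase into a template with a single slot.

```typescript
// Hypothetical stand-in for the generalization step (the project uses Microsoft LUIS).
interface PhraseTemplate {
  pattern: RegExp; // matches phrases with the same structure
  slot: string;    // name of the variable part, e.g. "query"
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a template from one example phrase and the entity it contains.
function generalize(phrase: string, entity: string, slot: string): PhraseTemplate {
  const normalized = phrase.trim().toLowerCase();
  const target = entity.trim().toLowerCase();
  const index = normalized.indexOf(target);
  if (index < 0) {
    throw new Error("entity not found in phrase");
  }
  const before = escapeRegExp(normalized.slice(0, index));
  const after = escapeRegExp(normalized.slice(index + target.length));
  return { pattern: new RegExp(`^${before}(.+)${after}$`), slot };
}

// Try a new phrase against the template; return the slot value or null.
function match(template: PhraseTemplate, phrase: string): Record<string, string> | null {
  const result = template.pattern.exec(phrase.trim().toLowerCase());
  return result ? { [template.slot]: result[1] } : null;
}
```

With a template built from "Search for guitar songs", the phrase "search for piano music" fills the slot with "piano music". A regex like this cannot bridge "search for" and "find"; that kind of synonym handling is what the language-understanding service contributes.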

How we built it

We used WebKit speech-to-text in a Chrome extension to turn recordings into sentences. We also created a system that tracks mouse and keyboard inputs so they can be replayed. This information is passed to a StdLib staging area, which processes the data and calls Microsoft LUIS to interpret the spoken sentence. The result is passed back to the extension so it knows which sequence of actions to perform. Our project also has a system to generalize "entities", i.e. the variables in an instruction (e.g. "guitar songs").
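The replay side of this pipeline can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the `RecordedAction` shape, the `{query}` placeholder syntax, the `bindEntities` helper, and the `input#search` selector are all hypothetical. The idea is that a recording stores a placeholder where the entity was typed, and the entity values returned by the language step are substituted in before the actions are replayed.

```typescript
// Hypothetical shape for one recorded user action.
type RecordedAction =
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "key"; key: string };

// Replace {name} placeholders in typed text with the matching entity value.
// Unknown placeholders are left untouched; the input array is not mutated.
function bindEntities(
  actions: RecordedAction[],
  entities: Record<string, string>
): RecordedAction[] {
  return actions.map((action) => {
    if (action.kind !== "type") {
      return action;
    }
    const text = action.text.replace(/\{(\w+)\}/g, (whole: string, name: string) =>
      Object.prototype.hasOwnProperty.call(entities, name) ? entities[name] : whole
    );
    return { ...action, text };
  });
}

// Example recording for a site search (selector is made up for illustration).
const searchRecording: RecordedAction[] = [
  { kind: "click", selector: "input#search" },
  { kind: "type", selector: "input#search", text: "{query}" },
  { kind: "key", key: "Enter" },
];

// Entities as they might come back from the language-understanding step.
const boundActions = bindEntities(searchRecording, { query: "piano music" });
```

Keeping the recording as plain data with placeholders means the same sequence can be reused for any phrase of the same structure, and the extension only needs a small interpreter to dispatch each action to the page.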

Challenges we ran into

  • Several UX/UI elements had no native APIs, so we were forced to create workarounds and hack bits of code together.
  • Making the project functions easy for users to follow and understand.

Accomplishments that we're proud of

Our team learned to use an unfamiliar system with an entirely different paradigm from traditional web hosting, and learned to manage its advantages and disadvantages on the fly while integrating it with several other complex systems.

What we learned

It is a better strategy to iterate outwards from the simplest core of your system than to aim big from the start. We had to cut features, which meant we had sunk unnecessary time into their development early on. We also learned a lot about FaaS and serverless hosting, and about natural language processing.

What's next for Quack - Voice Controlled Action Automation for Chrome
