Inspiration

We wanted to help people with disabilities, as well as anyone facing a situational disability (carrying groceries, cooking, an injury), who struggles to use technology. Our goal was hands-free technology that boosts independence.

What it does

Voice Navigator lets the user control the browser by voice. Example core commands:

  • Search: “search messi” -> searches the keyword "messi"
  • Direct nav: “go to wikipedia” -> opens wikipedia.com directly
  • History: “go back”, “move forward”
  • Scrolling: “scroll down”, “scroll top/bottom”
  • Results paging: “next page”
  • YouTube: "play "
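
To make the command list concrete, here is a minimal sketch of how a spoken transcript could be turned into an action. It is not the project's actual parser: the `Action` shape, the `parseIntent` and `stripPrefix` names, and the rule that a bare site name becomes `https://<name>.com` are all assumptions made for illustration. YouTube playback is left out of the sketch.

```typescript
// Hypothetical action shape; the project's real message format may differ.
type Action =
  | { kind: "search"; query: string }
  | { kind: "navigate"; url: string }
  | { kind: "history"; direction: "back" | "forward" }
  | { kind: "scroll"; target: "down" | "top" | "bottom" }
  | { kind: "nextPage" }
  | { kind: "unknown"; transcript: string };

// Returns the text after `prefix`, or null when `text` does not start with it.
function stripPrefix(text: string, prefix: string): string | null {
  return text.slice(0, prefix.length) === prefix ? text.slice(prefix.length) : null;
}

function parseIntent(transcript: string): Action {
  // Normalize case and spacing, since speech results vary in both.
  const text = transcript.trim().toLowerCase().replace(/\s+/g, " ");

  // Fixed phrases are matched exactly, before any prefix commands.
  switch (text) {
    case "go back":
      return { kind: "history", direction: "back" };
    case "move forward":
      return { kind: "history", direction: "forward" };
    case "scroll down":
      return { kind: "scroll", target: "down" };
    case "scroll top":
      return { kind: "scroll", target: "top" };
    case "scroll bottom":
      return { kind: "scroll", target: "bottom" };
    case "next page":
      return { kind: "nextPage" };
  }

  const query = stripPrefix(text, "search ");
  if (query !== null) {
    return { kind: "search", query: query };
  }

  const site = stripPrefix(text, "go to ");
  if (site !== null) {
    // Assumption: a spoken site name maps to "<name>.com" with spaces removed.
    return { kind: "navigate", url: "https://" + site.replace(/ /g, "") + ".com" };
  }

  // Anything else is passed through so the caller can ignore or report it.
  return { kind: "unknown", transcript: transcript };
}
```

For example, `parseIntent("search messi")` yields a `search` action with the query `messi`, and `parseIntent("go to wikipedia")` yields a `navigate` action for `https://wikipedia.com`. Returning an explicit `unknown` action keeps misheard phrases from triggering anything by accident.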

How we built it

  • Built as a Chrome extension on Manifest V3 (MV3), using message passing and content-script reinjection.
  • The content script renders a floating mic button, listens via SpeechRecognition, parses intents, and sends actions to the service worker.
  • Persistence: chrome.storage.session remembers which tabs are “listening”, so listening auto-resumes after page loads and back/forward navigation.
  • Collab: GitHub for version control, issues, and PRs.
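
The persistence step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `SessionStore` is a stand-in interface shaped like the callback form of chrome.storage.session, and the `listeningTabs` key and the function names are hypothetical.

```typescript
// Stand-in for the slice of chrome.storage.session this sketch relies on
// (callback form). Any object with this shape works, including a test fake.
interface SessionStore {
  get(key: string, callback: (items: { [key: string]: unknown }) => void): void;
  set(items: { [key: string]: unknown }, callback?: () => void): void;
}

// Hypothetical storage key; not taken from the project.
const LISTENING_KEY = "listeningTabs";

// Read the stored tab ids, tolerating a missing or malformed value.
function readListeningTabs(store: SessionStore, done: (ids: number[]) => void): void {
  store.get(LISTENING_KEY, (items) => {
    const value = items[LISTENING_KEY];
    const ids: number[] = [];
    if (Array.isArray(value)) {
      for (const v of value) {
        if (typeof v === "number") ids.push(v);
      }
    }
    done(ids);
  });
}

// Mark a tab as listening (on = true) or stopped (on = false).
function setListening(store: SessionStore, tabId: number, on: boolean, done?: () => void): void {
  readListeningTabs(store, (ids) => {
    // Drop the id first so turning a tab on twice never stores duplicates.
    const others = ids.filter((id) => id !== tabId);
    const items: { [key: string]: unknown } = {};
    items[LISTENING_KEY] = on ? others.concat(tabId) : others;
    store.set(items, done);
  });
}

// Called after a page load: should the mic restart in this tab?
function shouldResume(store: SessionStore, tabId: number, done: (resume: boolean) => void): void {
  readListeningTabs(store, (ids) => done(ids.indexOf(tabId) !== -1));
}
```

In the extension, the service worker would presumably pass chrome.storage.session as the store and call `shouldResume` when a tab finishes loading, reinjecting the content script if the answer is true. Taking the store as a parameter also lets the logic be exercised with an in-memory fake outside the browser.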

Challenges we ran into

We hit many edge cases because of the broad target audience, had to learn to work as a group, and were too ambitious about the features we could implement.

Accomplishments that we're proud of

We got most of the basics of voice-driven browser operation working as intended. The user can even play a song on YouTube without moving a finger.

What we learned

We learned how to use browser APIs, build a Chrome extension, and use the SpeechRecognition API in a project.

What's next for Voice Navigator

More features, fewer bugs, and a fun game in which the user sings songs for an LLM to identify.
