Inspiration
We wanted to help people with disabilities, and anyone experiencing a situational disability (carrying groceries, cooking, an injury), who struggle to use technology. Our goal was hands-free technology that boosts independence.
What it does
Lets the user control the browser by voice. Core commands (examples):
- Search: “search messi” -> searches the keyword "messi"
- Direct nav: “go to wikipedia” -> opens wikipedia.com directly
- History: “go back”, “move forward”
- Scrolling: “scroll down”, “scroll top/bottom”
- Results paging: “next page”
- YouTube: "play "
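As a rough illustration of how spoken phrases like these could be turned into actions, here is a minimal sketch of an intent parser. The function name `parseIntent`, the action object shapes, and the rule that a bare site name becomes `<name>.com` are all assumptions made for this example, not the extension's actual code. The YouTube command is left out.

```javascript
// Hypothetical lookup table for commands that take no argument.
// Action names and shapes are illustrative only.
const FIXED = new Map([
  ["go back", { type: "history", direction: "back" }],
  ["move forward", { type: "history", direction: "forward" }],
  ["scroll down", { type: "scroll", target: "down" }],
  ["scroll top", { type: "scroll", target: "top" }],
  ["scroll bottom", { type: "scroll", target: "bottom" }],
  ["next page", { type: "paging", direction: "next" }],
]);

// Map a recognized transcript to an action object.
function parseIntent(transcript) {
  const text = transcript.trim().toLowerCase();

  // Fixed phrases need no argument, so an exact lookup is enough.
  const fixed = FIXED.get(text);
  if (fixed) return fixed;

  // "search <query>" keeps everything after the keyword as the query.
  const search = text.match(/^search\s+(.+)$/);
  if (search) return { type: "search", query: search[1] };

  // "go to <site>": assumption, a bare name is expanded to "<name>.com".
  const nav = text.match(/^go to\s+(.+)$/);
  if (nav) {
    const site = nav[1].replace(/\s+/g, "");
    const host = site.includes(".") ? site : `${site}.com`;
    return { type: "navigate", url: `https://${host}` };
  }

  // Anything else is reported back unchanged so the caller can ignore it.
  return { type: "unknown", transcript: text };
}
```

Calling `parseIntent("go to wikipedia")` yields a `navigate` action for `https://wikipedia.com`, and an unrecognized phrase comes back as an `unknown` action instead of throwing.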
How we built it
- Built as a Chrome extension on Manifest V3 (MV3), using message passing and content-script reinjection.
- The content script renders a floating mic button, listens via SpeechRecognition, parses intents, and sends actions to the background service worker.
- Persistence: uses chrome.storage.session to remember which tabs are “listening”, so listening auto-resumes after page loads and back/forward navigation.
- Collab: GitHub for version control, issues, and PRs.
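The persistence idea above can be sketched roughly as follows. This is not the project's actual code: the key name `listeningTabs` and the helper names are hypothetical, and the storage area is passed in as a parameter (in the extension it would be `chrome.storage.session`) so the logic can be tried outside the browser.

```javascript
// Hypothetical key under which the list of listening tab IDs is stored.
const LISTENING_KEY = "listeningTabs";

// Read the stored tab IDs. `storage` is assumed to behave like
// chrome.storage.session: get(key) resolves to an object keyed by `key`.
async function getListeningTabs(storage) {
  const result = await storage.get(LISTENING_KEY);
  return result[LISTENING_KEY] ?? [];
}

// Mark a tab as listening (or not) and persist the updated list.
async function setListening(storage, tabId, listening) {
  const tabs = new Set(await getListeningTabs(storage));
  if (listening) tabs.add(tabId);
  else tabs.delete(tabId);
  await storage.set({ [LISTENING_KEY]: [...tabs] });
}

// After a page load, decide whether listening should resume in this tab.
async function shouldResume(storage, tabId) {
  return (await getListeningTabs(storage)).includes(tabId);
}
```

In the service worker, one plausible wiring is to call `setListening(chrome.storage.session, tabId, true)` when the mic is switched on, then check `shouldResume` in a `chrome.tabs.onUpdated` listener and reinject the content script with `chrome.scripting.executeScript` once the page has finished loading. Because `chrome.storage.session` is not exposed to content scripts by default, keeping this logic in the service worker avoids an extra `setAccessLevel` call.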
Challenges we ran into
We hit lots of edge cases because of the broad audience, had to learn to work as a group, and were too ambitious about the functionality we could implement.
Accomplishments that we're proud of
We got most of the basics of voice-driven browser operation working as intended. The user can even play a song on YouTube without moving a finger.
What we learned
We learned how to use browser APIs, build a Chrome extension, and use the SpeechRecognition API in our project.
What's next for Voice Navigator
More features, fewer bugs, and a fun game in which the user sings songs for an LLM to identify.
Built With
- javascript
- mv3