✨ Inspiration
This project grew out of my interest in voice-based automation and LLM-powered web browsing. I’ve always been fascinated by how natural language interfaces can simplify complex user interactions. I wanted to build something that helps users navigate the web, search, and interact with content by voice, making browsing more accessible, intelligent, and effortless.
📚 What I Learned
Throughout the journey of this project, I learned:
How to work with Chrome Extensions, Manifest V3, and Web Speech API
How to integrate Groq (running the gemma2-9b-it model) for natural language understanding
Better problem-solving and debugging techniques, especially with asynchronous APIs
How to design a user-friendly voice UI
Real-world practices for browser scripting, content scripts, and LLM prompt engineering
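To make the speech-to-text step concrete, here is a minimal sketch of capturing one spoken command with the Web Speech API. It is not the project's actual code: `startListening` and `onCommand` are hypothetical names, the `en-US` language is an assumption, and the recognition constructor is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Sketch: wrap the Web Speech API so a final transcript reaches a callback.
// `startListening` and `onCommand` are illustrative names, not from the project.
function startListening(RecognitionCtor, onCommand) {
  const recognition = new RecognitionCtor();
  recognition.lang = "en-US";         // assumed language
  recognition.interimResults = false; // deliver final results only
  recognition.continuous = false;     // stop after one utterance

  recognition.onresult = (event) => {
    // Take the top alternative of the most recent result.
    const latest = event.results[event.results.length - 1];
    const transcript = latest[0].transcript.trim();
    if (transcript) onCommand(transcript);
  };

  recognition.onerror = (event) => {
    console.warn("Speech recognition error:", event.error);
  };

  recognition.start();
  return recognition;
}

// In an extension page, Chrome exposes the prefixed constructor:
// const Ctor = window.SpeechRecognition || window.webkitSpeechRecognition;
// startListening(Ctor, (text) => console.log("Heard:", text));
```

Passing the constructor in keeps the function free of direct `window` access, which makes it easier to test with a stub.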
🛠️ How I Built It
I used the following tools and technologies to bring the project to life:
Frontend/UI: HTML, CSS, Vanilla JavaScript
LLM Integration: Groq via the Completions API
Browser APIs: Web Speech API for speech-to-text, Chrome Tabs/Scripting API
Voice Command Engine: Custom prompt-to-action mapping using Groq
Deployment: Chrome Extension (packed and loaded locally for now)
Development Process: I started by defining the core interaction: voice-to-command. I built a minimal UI to capture speech using the Web Speech API, then used Groq to analyze the command and decide whether to search, navigate to a URL, summarize the page, or answer a question about the current page. The final step was to execute these actions using browser scripting and dynamically injected scripts.
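The flow described above (transcript in, one of four actions out) could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the prompt wording, the `{action, value}` shape, the function names, and the use of Google as the search target are all hypothetical. The endpoint is Groq's OpenAI-compatible chat completions URL.

```javascript
// Hypothetical system prompt; the project's real prompt is not shown in the writeup.
const SYSTEM_PROMPT =
  "You turn a spoken browser command into JSON of the form " +
  '{"action": "search" | "navigate" | "summarize" | "answer", "value": string}. ' +
  "Reply with JSON only.";

// Build the arguments for fetch() against Groq's OpenAI-compatible endpoint.
function buildGroqRequest(apiKey, transcript) {
  return {
    url: "https://api.groq.com/openai/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "gemma2-9b-it",
        temperature: 0, // keep the output format as stable as possible
        messages: [
          { role: "system", content: SYSTEM_PROMPT },
          { role: "user", content: transcript },
        ],
      }),
    },
  };
}

// Turn a parsed intent into a plain description of the step to execute.
function toBrowserStep(intent) {
  switch (intent.action) {
    case "search":
      // Assumed search engine; any search URL would work the same way.
      return {
        kind: "openUrl",
        url: "https://www.google.com/search?q=" + encodeURIComponent(intent.value),
      };
    case "navigate": {
      const hasScheme = /^https?:\/\//i.test(intent.value);
      return { kind: "openUrl", url: hasScheme ? intent.value : "https://" + intent.value };
    }
    case "summarize":
    case "answer":
      // Both need the current page's text before a second LLM call.
      return { kind: "readPage", task: intent.action, question: intent.value || "" };
    default:
      throw new Error("Unknown action: " + intent.action);
  }
}
```

In the extension itself, an `openUrl` step could be carried out with `chrome.tabs.update({ url })`, and a `readPage` step with `chrome.scripting.executeScript` on the active tab. Keeping the mapping as a pure function separates the decision from the browser calls that execute it.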
🧗 Challenges I Faced
Some of the key challenges I encountered were:
Debugging asynchronous API calls and managing fetch requests with proper headers
Getting content scripts to load dynamically and ensuring they have access across all pages
Parsing and handling structured LLM outputs reliably (JSON format from Groq)
Extracting the full page innerHTML safely and summarizing it efficiently
Designing a flexible prompt that handles open-ended user speech while keeping response format consistent
Navigating the restrictions of Manifest V3, especially regarding background scripts and permissions
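Two of the challenges above, handling structured LLM output and keeping page content small enough to summarize, lend themselves to small defensive helpers. The sketch below is one possible approach rather than the project's own code: the function names, the allowed action list, the `{action, value}` shape, and the 6000-character cap are all assumptions.

```javascript
// Assumed set of actions the model is allowed to return.
const ALLOWED_ACTIONS = new Set(["search", "navigate", "summarize", "answer"]);

// Pull a JSON object out of a model reply that may carry code fences or extra prose.
// Returns null instead of throwing so the caller can retry or re-prompt.
function parseIntent(reply) {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let parsed;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null; // malformed JSON
  }

  if (typeof parsed.action !== "string" || !ALLOWED_ACTIONS.has(parsed.action)) {
    return null; // reject actions outside the allowed set
  }
  const value = typeof parsed.value === "string" ? parsed.value.trim() : "";
  return { action: parsed.action, value };
}

// Collapse whitespace and cap the length so page text fits in a single prompt.
// The default cap is an arbitrary illustrative number, not a Groq limit.
function clipPageText(text, maxChars = 6000) {
  const compact = text.replace(/\s+/g, " ").trim();
  return compact.length <= maxChars ? compact : compact.slice(0, maxChars);
}
```

Validating the action against a fixed set means a malformed or unexpected reply fails early instead of reaching the browser APIs. For the page content, passing visible text (for example `document.body.innerText`) through `clipPageText` is typically much smaller than sending raw `innerHTML`, though that substitution is a suggestion and not something the writeup describes.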
Built With
- chromeextension
- css
- groq
- html5
- javascript
- llm
- promptengineering
- tabs/scripting
- webspeechapi
