Inspiration

The inspiration for SurfEase came from Google and Apple’s Gemini video app, where users can perform tasks entirely using voice-first interactions. We wanted to bring that same hands-free, intelligent experience to the browser, letting users control and automate web tasks without relying on external APIs or exposing sensitive actions outside their device.

What it does

SurfEase is a voice-first browser automation assistant. It lets users speak commands to their browser and have them executed automatically, including tasks like navigating websites, filling forms or logging into accounts, searching for content and extracting information, and speaking back results or confirmations. All browser actions happen locally on the user’s device, giving both privacy and seamless interactivity.

How we built it

Browser Integration: A Chrome extension uses the Chrome DevTools Protocol (CDP), through the extension's debugger API, to control and automate the browser without any external software.

AgentCore Runtime: Built with the Strands SDK and AWS AgentCore Runtime, with AgentCore Memory modules for context-aware automation.

Bridge: A WebSocket MCP bridge connects the agent runtime to the browser in real time.

Voice Interface: The Web Speech API handles speech-to-text for commands and text-to-speech for responses.
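To make the bridge-to-browser path more concrete, here is a minimal sketch of how a message arriving over the WebSocket bridge might be turned into a CDP command. The `ToolCall` message shape, the tool names, and the `handleBridgeMessage` helper are assumptions for illustration, not SurfEase's actual protocol. `Page.navigate` and `Runtime.evaluate` are real CDP methods, and in an extension the `send` function would typically wrap `chrome.debugger.sendCommand`.

```typescript
// Hypothetical tool-call shape sent by the agent over the bridge (an assumption).
type ToolCall =
  | { id: number; tool: "navigate"; url: string }
  | { id: number; tool: "evaluate"; expression: string };

// A CDP command: method name plus parameters.
interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

// Map a tool call onto an existing CDP method.
function toCdpCommand(call: ToolCall): CdpCommand {
  switch (call.tool) {
    case "navigate":
      return { method: "Page.navigate", params: { url: call.url } };
    case "evaluate":
      return {
        method: "Runtime.evaluate",
        params: { expression: call.expression, returnByValue: true },
      };
    default:
      throw new Error("unsupported tool");
  }
}

// Injected sender, so the logic can run outside a browser. In the extension it
// could be: (m, p) => chrome.debugger.sendCommand({ tabId }, m, p)
type SendCommand = (
  method: string,
  params: Record<string, unknown>,
) => Promise<unknown>;

// Handle one raw bridge message and build the JSON reply for the agent.
async function handleBridgeMessage(
  raw: string,
  send: SendCommand,
): Promise<string> {
  const call = JSON.parse(raw) as ToolCall;
  try {
    const cmd = toCdpCommand(call);
    const result = await send(cmd.method, cmd.params);
    return JSON.stringify({ id: call.id, ok: true, result });
  } catch (err) {
    return JSON.stringify({ id: call.id, ok: false, error: String(err) });
  }
}
```

Echoing the `id` back in every reply lets the agent match responses to requests when several commands are in flight on the same socket.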

Challenges we ran into

Handling dynamic web pages where DOM elements change frequently. We also hit a few issues with the Strands SDK, where its behavior did not match the documentation.
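One common way to cope with frequently changing DOM elements is to poll for a target before acting on it. The sketch below illustrates that general technique and is not necessarily what SurfEase does. `waitFor` and `existsExpression` are hypothetical helpers, and the probe is assumed to wrap a CDP `Runtime.evaluate` call.

```typescript
// Poll an async probe until it yields a non-null value or attempts run out.
async function waitFor<T>(
  probe: () => Promise<T | null>,
  attempts = 5,
  delayMs = 200,
): Promise<T | null> {
  for (let i = 0; i < attempts; i++) {
    const value = await probe();
    if (value !== null) return value;
    // Sleep between attempts, but not after the last one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null;
}

// Build the page-side expression a Runtime.evaluate call could run.
// JSON.stringify quotes the selector so it is embedded safely.
function existsExpression(selector: string): string {
  return `document.querySelector(${JSON.stringify(selector)}) !== null`;
}
```

A probe would evaluate `existsExpression("#login")` in the page and return `null` until the result is true, so the agent only clicks or types once the element is actually present.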

Accomplishments that we're proud of

Built a fully voice-first, browser-based agent that can understand and execute natural language tasks. Developed a secure, local-first system that integrates automation directly with the user's device. Successfully automated multi-step tasks across modern web applications.

What we learned

How to combine browser debugging protocols with agent runtimes for autonomous task execution.

The importance of context and memory for meaningful voice interactions.

Best practices for real-time bidirectional communication between the browser and agent.

How to make voice-first applications responsive and robust in a browser environment.

What's next for SurfEase

Add support for other browsers such as Firefox and Edge. Expand voice-first capabilities to include richer multimedia interactions.
