Inspiration

Over 2 billion people worldwide live with visual or motor impairments that can make it hard to use the internet independently. Existing accessibility tools are often complex or expensive, or they require technical setup. I wanted to build something simple, where your voice is all you need.

What it does

WebWhisper is a voice-first AI browser assistant. Speak or type a command such as "open YouTube and search for chicken recipe", and WebWhisper opens the browser, navigates to the site, types the search, and reads the results aloud, completely hands-free.

Key features:

  • 🎤 Voice-controlled browser automation
  • 🔊 Reads page results aloud using text-to-speech
  • 🖱️ Scrolls down and clicks the first result automatically
  • 🧠 Remembers conversation context across commands
  • ⌨️ Text input alternative for accessibility
  • 🌐 Supports Google, YouTube, Wikipedia, Amazon, Reddit and more

How I built it

  • Amazon Nova 2 Lite on Amazon Bedrock acts as the AI brain: it interprets natural-language commands and generates structured browser action plans
  • Playwright executes those plans in a real Chromium browser
  • SpeechRecognition captures and transcribes voice input
  • pyttsx3 reads page results aloud to the user
  • Flask serves the web UI
  • Python connects everything together
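
The plan-then-execute split described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the action names (goto, type, click, read), the plan shape, the CSS selectors, and the RecordingBrowser stand-in are all assumptions. In the real project the handlers would presumably call Playwright and pyttsx3 instead of logging.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingBrowser:
    """Hypothetical stand-in for a real browser driver; it only logs calls."""
    log: list = field(default_factory=list)

    def goto(self, url):
        self.log.append(("goto", url))

    def type(self, selector, text):
        self.log.append(("type", selector, text))

    def click(self, selector):
        self.log.append(("click", selector))

    def read(self, selector):
        self.log.append(("read", selector))
        return f"(text of {selector})"


def execute_plan(plan, browser):
    """Run each step of an action plan in order and collect text to speak."""
    spoken = []
    for step in plan:
        action = step["action"]
        if action == "goto":
            browser.goto(step["url"])
        elif action == "type":
            browser.type(step["selector"], step["text"])
        elif action == "click":
            browser.click(step["selector"])
        elif action == "read":
            spoken.append(browser.read(step["selector"]))
        else:
            # Reject anything the executor does not know how to perform.
            raise ValueError(f"unknown action: {action!r}")
    return spoken


# Assumed plan shape for "open YouTube and search for chicken recipe".
# The selectors are illustrative and not taken from the project.
plan = [
    {"action": "goto", "url": "https://www.youtube.com"},
    {"action": "type", "selector": "input[name=search_query]", "text": "chicken recipe"},
    {"action": "click", "selector": "button[aria-label=Search]"},
    {"action": "read", "selector": "#contents"},
]

browser = RecordingBrowser()
spoken = execute_plan(plan, browser)
```

Keeping the plan as plain data means the model never drives the browser directly: the executor only performs actions it recognizes, and the same plan can be replayed against a fake browser for testing.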

Challenges I faced

  • Getting Nova 2 Lite to consistently return structured JSON action plans
  • Handling timeouts caused by slow networks during browser automation
  • Making voice recognition reliable across different accents and environments
  • Synchronizing the browser automation with the Flask web UI in real time
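
For the first two challenges, one common approach is to parse the model's reply defensively and to retry steps that time out. The sketch below shows how that could look under stated assumptions. It is not the project's code: extract_plan and with_retries are hypothetical helper names, and the expected plan shape (a non-empty JSON list of objects with an "action" key) is assumed.

```python
import json
import time


def extract_plan(reply):
    """Pull a JSON list of action steps out of a model reply.

    Models sometimes wrap JSON in extra prose, so this scans for the first
    '[' that starts a decodable JSON array of steps and ignores the rest.
    """
    decoder = json.JSONDecoder()
    start = reply.find("[")
    while start != -1:
        try:
            plan, _ = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            plan = None
        if (
            isinstance(plan, list)
            and plan
            and all(isinstance(step, dict) and "action" in step for step in plan)
        ):
            return plan
        start = reply.find("[", start + 1)
    raise ValueError("no valid action plan found in model reply")


def with_retries(func, attempts=3, base_delay=0.5,
                 retry_on=(TimeoutError,), sleep=time.sleep):
    """Call func(), retrying with exponential backoff on the given exceptions.

    retry_on should be whatever timeout exception the caller's browser
    library raises; the built-in TimeoutError is only a placeholder here.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# A reply with chatter around the JSON still yields a usable plan.
reply = 'Sure, here is the plan: [{"action": "goto", "url": "https://en.wikipedia.org"}] Anything else?'
parsed = extract_plan(reply)
```

If extract_plan raises, the caller can re-prompt the model rather than act on a half-parsed plan, and with_retries can wrap any single browser step that is prone to slow page loads.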

What I learned

  • How to use Amazon Nova 2 Lite via Amazon Bedrock for real-world AI tasks
  • How to build reliable browser automation with Playwright
  • How to create accessible AI applications for visually impaired users
  • How to combine voice, AI, and browser automation into one seamless pipeline

What's next

  • Add support for form filling and login automation
  • Integrate Amazon Nova 2 Sonic for real-time speech-to-speech interaction
  • Add support for more languages for global accessibility
  • Build a mobile version for smartphone users
