Inspiration
Browsing the web is challenging for blind users. In the early internet era, I wasn't aware of any effective way to help them understand web content. With the advent of AI, I began exploring potential solutions, which inspired me to create this product.
What it does
With the extension active, blind users can simply move their mouse over any element and have its content instantly read back to them via synthesized speech, making web content accessible.
- 🤖 AI-Powered Content Description: Uses Chrome's built-in Gemini Nano model via Prompt API to intelligently describe web elements
- 🎯 Real-time Mouse Tracking: Follows mouse movement and provides instant descriptions of hovered elements
- 🔊 Voice Synthesis: Integrated Web Speech API for clear audio feedback
- 🌍 Multi-language Support: Available in Chinese and English
- 🎨 Visual Feedback: Highlights currently described elements for sighted assistants
- ⚙️ Customizable Settings: Adjustable speech rate, volume, and language preferences
- 🔒 Privacy-First: All AI processing happens locally on-device
- 🌐 Offline Capable: Works without internet connection thanks to local AI models
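The hover-and-speak flow described above could look roughly like the following. This is a minimal sketch, not the extension's actual source: the helper names `labelFor`, `clampSettings`, and `speak` are hypothetical, and the example settings are illustrative. The only browser APIs it relies on are the `mouseover` event and the Web Speech API (`SpeechSynthesisUtterance`, `speechSynthesis`).

```javascript
// Hypothetical helper: pick the most useful readable text for an element.
function labelFor(el) {
  const aria = el.getAttribute("aria-label");
  const alt = el.getAttribute("alt");
  const text = (el.textContent || "").trim().replace(/\s+/g, " ");
  const label = aria || alt || text || "no readable text";
  // Cap the length so long containers are not read out in full.
  return `${el.tagName.toLowerCase()}: ${label.slice(0, 200)}`;
}

// Keep user settings inside the ranges the Web Speech API accepts
// (rate: 0.1 to 10, volume: 0 to 1).
function clampSettings({ rate = 1, volume = 1, lang = "en-US" } = {}) {
  const clamp = (v, lo, hi) => Math.min(hi, Math.max(lo, v));
  return { rate: clamp(rate, 0.1, 10), volume: clamp(volume, 0, 1), lang };
}

// Speak the text, interrupting whatever is currently being read.
function speak(text, settings) {
  const { rate, volume, lang } = clampSettings(settings);
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = rate;
  utterance.volume = volume;
  utterance.lang = lang;
  speechSynthesis.cancel();
  speechSynthesis.speak(utterance);
}

// Wire up the listener only when running in a browser page.
if (typeof document !== "undefined" && typeof speechSynthesis !== "undefined") {
  let last = null;
  document.addEventListener("mouseover", (event) => {
    const target = event.target;
    // Skip non-elements and avoid repeating the same element.
    if (!(target instanceof Element) || target === last) return;
    last = target;
    speak(labelFor(target), { rate: 1.2, volume: 1, lang: "en-US" });
  });
}
```

Calling `speechSynthesis.cancel()` before each utterance is one way to keep the audio in step with the cursor; queuing utterances instead would make speech lag behind fast mouse movement.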
How we built it
- Chrome AI Prompt API: Core functionality for generating intelligent content descriptions
- Web Speech API: Voice synthesis for audio feedback
- Chrome Extensions API: Extension framework and browser integration
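As a rough illustration of how the Prompt API piece might fit in, the sketch below asks the on-device model for a one-sentence description and falls back to the raw text when the model is not ready. It assumes the `LanguageModel` global with `availability()`, `create()`, and `session.prompt()` as exposed by recent Chrome versions; older builds exposed the API under different names, so treat the exact shape as an assumption. The function name `describeElement` and the prompt wording are hypothetical.

```javascript
// Ask the on-device model for a short description of an element summary.
// `lm` defaults to the global LanguageModel (assumed API shape) and can be
// swapped for a stub in tests.
async function describeElement(summary, lm = globalThis.LanguageModel) {
  // Fall back to the raw summary if the API is missing or the model
  // has not been downloaded yet.
  if (!lm || (await lm.availability()) !== "available") {
    return summary;
  }
  const session = await lm.create();
  try {
    return await session.prompt(
      "Describe this web page element in one short sentence " +
        `for a blind user: ${summary}`
    );
  } finally {
    // Release the session's resources if the implementation supports it.
    session.destroy?.();
  }
}
```

Creating a session per call keeps the sketch simple; a real extension would probably reuse one session across hovers to avoid repeated setup cost.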
What's next for AI Vision Assistant
- Use the Prompt API to analyze the code structure under the cursor, so that elements such as buttons and videos are detected more accurately.
- Expand language support.
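One possible shape for the planned structure analysis is to send the model a compact description of where the hovered element sits in the page, rather than only its text. The sketch below is purely illustrative: `structuralContext` and `buildStructurePrompt` are hypothetical names, and the choice of three ancestors and the prompt wording are assumptions.

```javascript
// Hypothetical: serialize the element plus up to `depth` ancestors
// as a compact path such as "body > div[role=button] > span".
function structuralContext(el, depth = 3) {
  const parts = [];
  for (let node = el; node && parts.length <= depth; node = node.parentElement) {
    const role = node.getAttribute("role");
    parts.unshift(node.tagName.toLowerCase() + (role ? `[role=${role}]` : ""));
  }
  return parts.join(" > ");
}

// Build a prompt asking the model what kind of control this is.
function buildStructurePrompt(el) {
  return (
    `Element path: ${structuralContext(el)}. ` +
    "In one word, what kind of control is this (button, link, video, image, text)?"
  );
}
```

A styled `span` inside a `div` with `role="button"` would then reach the model with that role visible in its path, which is the kind of signal plain text content does not carry.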
Built With
- css
- gemini-nano
- google-web-speech-api
- javascript