AI Vision Assistant


Inspiration

Browsing the internet is challenging for blind people. In the early days of the internet, I wasn't aware of any effective way to help them understand web content. With the advent of AI, I began exploring potential solutions, which led me to build this extension.

What it does

With the extension active, blind users simply move the mouse over any element and its content is read back to them immediately via synthesized speech, making web content accessible.

🤖 AI-Powered Content Description: Uses Chrome's built-in Gemini Nano model via Prompt API to intelligently describe web elements
🎯 Real-time Mouse Tracking: Follows mouse movement and provides instant descriptions of hovered elements
🔊 Voice Synthesis: Integrated Web Speech API for clear audio feedback
🌍 Multi-language Support: Available in Chinese and English
🎨 Visual Feedback: Highlights currently described elements for sighted assistants
⚙️ Customizable Settings: Adjustable speech rate, volume, and language preferences
🔒 Privacy-First: All AI processing happens locally on-device
🌐 Offline Capable: Works without internet connection thanks to local AI models
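To make the description step more concrete, here is a minimal sketch of how the hovered element could be turned into a text prompt for the on-device model. It is not the extension's actual code: the `HoveredElement` shape, the `bestLabel` and `buildPrompt` helpers, the prompt wording, and the 500-character cap are all assumptions made for illustration.

```typescript
// Hypothetical shape of what a content script might collect from the
// element under the cursor. None of these names come from the project.
interface HoveredElement {
  tag: string;        // e.g. "button", "img", "a"
  text: string;       // visible text content
  ariaLabel?: string; // aria-label attribute, if present
  alt?: string;       // alt text for images, if present
}

type Language = "en" | "zh";

// Pick the most informative label available, collapsing whitespace.
// Preference order (an assumption): aria-label, then alt, then visible text.
function bestLabel(el: HoveredElement): string {
  const candidates = [el.ariaLabel, el.alt, el.text];
  for (const candidate of candidates) {
    const cleaned = (candidate ?? "").replace(/\s+/g, " ").trim();
    if (cleaned.length > 0) {
      return cleaned;
    }
  }
  return "";
}

// Build a short instruction for the language model in the user's language.
// maxChars caps the content so long paragraphs do not bloat the prompt.
function buildPrompt(
  el: HoveredElement,
  lang: Language,
  maxChars: number = 500
): string {
  const label = bestLabel(el).slice(0, maxChars);
  const languageName = lang === "zh" ? "Chinese" : "English";
  return [
    `Describe this web page element in one short sentence, in ${languageName}.`,
    `Element type: ${el.tag.toLowerCase()}`,
    `Content: ${label || "(no text)"}`,
  ].join("\n");
}
```

In a content script, the `HoveredElement` fields would presumably be filled from the target of a `mouseover` event before the prompt is sent to the model.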

How we built it

Chrome AI Prompt API: Core functionality for generating intelligent content descriptions
Web Speech API: Voice synthesis for audio feedback
Chrome Extensions API: Extension framework and browser integration
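The three pieces above could be wired together roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `PromptSession`, `SpeechSettings`, `Speak`, and `HoverDescriber` are hypothetical names, and the model and speech engine are injected so the logic can run outside a browser. The sketch also shows one plausible way to handle fast mouse movement: only the description for the most recently hovered element is spoken.

```typescript
// Minimal view of a model session: anything with an async prompt() method.
// (Hypothetical interface; a Prompt API session would be adapted to it.)
interface PromptSession {
  prompt(input: string): Promise<string>;
}

// User-adjustable speech settings mentioned in the feature list.
interface SpeechSettings {
  rate: number;   // speech rate multiplier
  volume: number; // 0.0 to 1.0
  lang: string;   // BCP 47 tag, e.g. "en-US" or "zh-CN"
}

// Function that actually produces audio; injected so it can be faked.
type Speak = (text: string, settings: SpeechSettings) => void;

class HoverDescriber {
  private session: PromptSession;
  private speak: Speak;
  private settings: SpeechSettings;
  private latestRequest = 0;

  constructor(session: PromptSession, speak: Speak, settings: SpeechSettings) {
    this.session = session;
    this.speak = speak;
    this.settings = settings;
  }

  // Ask the model for a description and speak it, unless a newer hover
  // started while this one was still waiting. Returns null when dropped.
  async describe(promptText: string): Promise<string | null> {
    const requestId = ++this.latestRequest;
    const description = await this.session.prompt(promptText);
    if (requestId !== this.latestRequest) {
      return null; // stale: the mouse has already moved to another element
    }
    this.speak(description, this.settings);
    return description;
  }
}
```

In the extension itself, the session would presumably be created through Chrome's Prompt API, and `speak` would wrap the Web Speech API by building a `SpeechSynthesisUtterance`, setting its `rate`, `volume`, and `lang`, and passing it to `speechSynthesis.speak()`. The exact Prompt API surface depends on the Chrome version, so check the current documentation before relying on specific method names.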

What's next for AI Vision Assistant

Use the Prompt API to analyze the code structure under the cursor, for more accurate detection of elements such as buttons and videos.
Expanded language support.

Updates

Donny D started this project — Oct 29, 2025 10:07 AM EDT
