Inspiration

Traditional screen readers often fall back to literal filenames like “image_45.jpg” when alt text is missing or minimal, stripping out the emotion, humor, and cultural meaning of an image. We wanted to make the internet more inclusive by helping visually impaired users truly experience digital content.

What it does

Resonance is a Chrome extension that analyzes images using AI and converts them into expressive audio descriptions. It captures the “vibe,” humor, and context of content, not just basic visual details.

How we built it

We built Resonance as a Chrome extension using HTML, CSS, and JavaScript. A content script extracts image URLs from the current webpage and sends each image to the Gemini Multimodal API for analysis. The generated description is then read aloud using the browser’s Speech Synthesis API.
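In rough outline, the pipeline looks something like the sketch below. It is illustrative rather than our exact shipped code: the model name, prompt wording, the base64-encoding step, and the API-key placeholder are all assumptions.

```js
// content.js — an illustrative sketch of the Resonance pipeline.
const GEMINI_API_KEY = 'YOUR_API_KEY'; // placeholder, not a real key
const GEMINI_URL =
  'https://generativelanguage.googleapis.com/v1beta/models/' +
  'gemini-1.5-flash:generateContent?key=' + GEMINI_API_KEY; // model name assumed

// Fetch an image and base64-encode it so it can be inlined in the request.
async function fetchImageData(url) {
  const blob = await (await fetch(url)).blob();
  const base64 = await new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(blob);
  });
  return { mimeType: blob.type || 'image/jpeg', base64 };
}

// Ask Gemini for an expressive, context-aware description of one image.
async function describeImage(url) {
  const { mimeType, base64 } = await fetchImageData(url);
  const body = {
    contents: [{
      parts: [
        { text: 'Describe this image expressively: capture its mood, humor, and cultural context.' },
        { inline_data: { mime_type: mimeType, data: base64 } },
      ],
    }],
  };
  const res = await fetch(GEMINI_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  const json = await res.json();
  return json.candidates[0].content.parts[0].text;
}

// Read a description aloud with the Speech Synthesis API; utterances
// queue automatically, so multiple images play back in order.
function speak(text) {
  speechSynthesis.speak(new SpeechSynthesisUtterance(text));
}

// Describe every image currently on the page.
document.querySelectorAll('img[src]').forEach((img) => {
  describeImage(img.src).then(speak);
});
```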

Challenges we ran into

We faced challenges in accurately extracting relevant images from dynamic webpages, generating meaningful emotional context using AI, and ensuring smooth real-time text-to-speech output.
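For the dynamic-webpage problem, one approach is a MutationObserver that catches images inserted after page load. This is a minimal, self-contained sketch of that idea (the write-up doesn’t specify the exact mechanism we used, and `watchImages` is a hypothetical helper):

```js
// Watch for <img> elements added after page load and hand each new one
// to a callback exactly once.
function watchImages(onImage) {
  const seen = new WeakSet();
  const handle = (img) => {
    if (img.src && !seen.has(img)) {
      seen.add(img);
      onImage(img);
    }
  };
  // Process images already on the page.
  document.querySelectorAll('img').forEach(handle);
  // Then watch the DOM for images added later by scripts.
  new MutationObserver((mutations) => {
    for (const m of mutations) {
      for (const node of m.addedNodes) {
        if (node.nodeName === 'IMG') handle(node);
        else if (node.querySelectorAll) node.querySelectorAll('img').forEach(handle);
      }
    }
  }).observe(document.body, { childList: true, subtree: true });
}

// Usage: plug in the describe-and-speak pipeline from the earlier sketch.
watchImages((img) => console.log('new image:', img.src));
```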

Accomplishments that we're proud of

We successfully created a working prototype that goes beyond traditional screen readers by delivering emotional and contextual understanding of images, making digital content more accessible.

What we learned

We learned how to integrate AI APIs effectively, work with browser extensions and DOM manipulation, and design solutions focused on accessibility and real-world impact.

What's next for Resonance

We plan to add multi-language support, customizable voice options, and deeper integration with assistive technologies to make Resonance even more accessible.
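Since we already use the Speech Synthesis API, voice and language customization could build directly on it. A hypothetical helper might look like this (`speakWith` and the example language are illustrative, not implemented features):

```js
// Speak text in a chosen language, preferring a named voice if installed.
function speakWith(text, lang = 'en-US', voiceName) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  // Note: getVoices() can be empty until the 'voiceschanged' event fires.
  const voice = speechSynthesis.getVoices()
    .find((v) => v.name === voiceName || v.lang === lang);
  if (voice) utterance.voice = voice;
  speechSynthesis.speak(utterance);
}

// Example: read a description in Hindi if a matching voice is installed.
speakWith('यह एक मज़ेदार मीम है।', 'hi-IN');
```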

Built With

HTML, CSS, JavaScript, Chrome Extensions API, Gemini Multimodal API, Speech Synthesis API