INCLUSIGHT

Making the invisible understandable.


Inspiration

This project started with something personal.

We recently got to know a friend who is blind. Through them, we began to notice something we had never questioned before:

A world that feels effortless to us can be extremely challenging for someone without sight.

Simple actions, like receiving an image in a chat, become complex, multi-step processes.

As we explored further, we saw that current digital infrastructure still has many gaps in accessibility.
One of the most overlooked gaps lies in visual communication:

  • images
  • memes
  • stickers
  • emojis

These are not just visuals; they carry nuance, emotion, humor, and cultural meaning.

Yet today, blind users are largely excluded from this layer of communication.


The Problem

When a blind user receives an image in a conversation, the current process is fragmented and disruptive:

  • Download the image
  • Open another app (ChatGPT, Gemini, etc.)
  • Upload the image
  • Ask for a description
  • Wait for a response
  • Return to the original conversation

This breaks the natural flow of communication. More importantly, existing tools focus on describing what is in the image, not explaining what it means in context.


Solution

INCLUSIGHT is a real-time visual message interpreter designed for blind users.

We don’t just describe images. We interpret them.

Images → Meaning
Visuals → Understanding

Our system explains:

  • what the image shows
  • what it means in the conversation
  • the tone, emotion, or intention behind it

All delivered through fast, natural audio.


How Users Interact with INCLUSIGHT

Designed with accessibility-first interaction:

  • Trigger instantly from the chat interface
  • No typing required
  • Minimal steps
  • Immediate audio feedback

The system:

  1. Detects the image in conversation
  2. Interprets content and context
  3. Generates a concise explanation
  4. Reads it aloud instantly

The Technology Behind INCLUSIGHT

Our system integrates:

  • Vision-language models for context-aware interpretation
  • Prompt engineering to prioritize meaning over literal description
  • Text-to-speech (Blaze.vn) for natural Vietnamese audio output

We focus on reliability and real-world usability, not just accuracy.
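
"Meaning over literal description" is largely a prompting decision. A minimal sketch of what such a prompt might look like is below; the exact wording and structure are our assumptions, not the prompt the project actually ships with.

```python
# Illustrative meaning-first prompt builder. The instruction text and the
# five-message context window are assumptions for this sketch.

def build_prompt(conversation: list[str]) -> str:
    """Compose a VLM prompt that asks for interpretation, not inventory."""
    context = "\n".join(conversation[-5:])  # keep only recent context
    return (
        "You are describing an image to a blind chat user.\n"
        "Prioritize MEANING over literal description:\n"
        "1. One sentence on what the image shows.\n"
        "2. One sentence on what it means in this conversation.\n"
        "3. One sentence on the tone or intent (humor, sarcasm, affection).\n"
        "Keep the whole answer under 50 words.\n\n"
        f"Recent conversation:\n{context}"
    )
```

Structuring the prompt as a fixed three-part answer with a hard word budget is also what keeps responses short enough for fast audio playback.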


Challenges We Faced

1. Speed vs. Level of Detail

There is a fundamental trade-off:

$$ \text{Speed} \uparrow \Rightarrow \text{Detail} \downarrow $$

$$ \text{Detail} \uparrow \Rightarrow \text{Latency} \uparrow $$

We solved this by:

  • Using lighter models
  • Designing more structured and precise prompts

This allowed us to maintain:

  • Fast response
  • High-quality, meaningful output
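
Beyond lighter models and tighter prompts, one complementary guard (our suggestion, not something the writeup describes) is a post-processing cap on verbosity: trim the model's reply to its first few sentences before text-to-speech, trading detail for faster playback.

```python
import re

def cap_detail(reply: str, max_sentences: int = 3) -> str:
    """Keep only the first max_sentences sentences of a model reply,
    so audio output starts and finishes quickly. Sentence boundaries
    are approximated by ., !, or ? followed by whitespace."""
    sentences = re.split(r"(?<=[.!?])\s+", reply.strip())
    return " ".join(sentences[:max_sentences])
```

For instance, `cap_detail("A cat. It is a joke. About Mondays. Extra detail here.", 2)` keeps only the first two sentences.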


2. Meaning vs. Description

Most systems optimize for:

$$ \text{Accuracy} = f(\text{Objects}) $$

But real communication requires:

$$ \text{Understanding} = f(\text{Context}, \text{Tone}, \text{Intent}) $$

We shifted the system toward interpretation, not just recognition.


Opportunities & Gaps

Through building INCLUSIGHT, we identified broader accessibility gaps:

  • Poor Vietnamese speech-to-text (missing commas and other punctuation)
  • Limited support for Vietnamese context and culture
  • Robotic, unnatural Vietnamese text-to-speech
  • Translation gaps from English → Vietnamese

These are not edge cases—they affect millions of users.


Vision

This is just the beginning.

We envision:

A digital world that is fully inclusive and accessible for blind users—especially in Vietnamese contexts.

INCLUSIGHT can evolve into:

  • An accessibility API for messaging platforms
  • A real-time interpretation layer across apps
  • A standard for inclusive visual communication

Closing

INCLUSIGHT — Making the invisible understandable.

Because connection today lives in:

  • images
  • memes
  • stickers
  • emojis

And:

Everyone deserves to understand it.

Built With

  • blaze
  • codex
  • openai
  • trae