VisionTTS

VisionTTS mobile app

Inspiration

According to the Fifth Rwanda Population and Housing Census (2022), more than 158,000 people in Rwanda live with visual disabilities (blindness or low vision).

Their main challenge is accessing everyday visual information, for example:

Reading signs and books
Navigating unfamiliar environments

What it does

VisionTTS is a mobile app and smart glasses designed to help people with visual disabilities access everyday visual information.

Scene Describing: describes indoor and outdoor environments through audio feedback in Kinyarwanda, helping people with visual disabilities understand their surroundings and navigate.

Text Reading: reads printed text, such as documents and books, through audio feedback in Kinyarwanda.

How we built it

Capture the Scene: The mobile camera takes a picture of what is in front of the user, such as a sign, a book, or the surrounding environment.

Understand What’s Seen: The backend runs an AI vision model (qwen3-vl:2b-instruct-q4_K_M) that analyzes the image and describes what is happening in the scene.

Translate into Kinyarwanda: Since the AI model describes images in English, the system automatically translates the description into Kinyarwanda using a translation model (Quantized_Nllb_Finetuned_Health_En_Kin_8bit_v2) and Claude AI.

Convert to Speech: The translated description is turned into voice audio using a Kinyarwanda speech model (KinyarwandaTTS_female_voice). The user hears the description through the phone's speakers.
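
The four steps above can be sketched as a small pipeline. This is only an illustration of how the stages chain together, not the project's actual backend code: the function describe_and_speak, the PipelineResult type, and the three stage callables are hypothetical names. The write-up does not say how each model is loaded or served, so each stage is passed in as a plain function and the example uses stand-in stubs instead of real model calls.

```python
from dataclasses import dataclass
from typing import Callable

# Model names come from the write-up. How they are loaded or served is not
# stated there, so they appear here only as labels (assumption).
VISION_MODEL = "qwen3-vl:2b-instruct-q4_K_M"
TRANSLATION_MODEL = "Quantized_Nllb_Finetuned_Health_En_Kin_8bit_v2"
TTS_MODEL = "KinyarwandaTTS_female_voice"


@dataclass
class PipelineResult:
    """Hypothetical container for the output of each stage."""
    english_description: str
    kinyarwanda_text: str
    audio: bytes


def describe_and_speak(
    image: bytes,
    describe: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Run capture -> describe -> translate -> speak on a single image."""
    if not image:
        raise ValueError("empty image")
    english = describe(image).strip()    # step 2: vision model, English output
    kinyarwanda = translate(english)     # step 3: English -> Kinyarwanda
    audio = synthesize(kinyarwanda)      # step 4: Kinyarwanda text -> audio
    return PipelineResult(english, kinyarwanda, audio)


if __name__ == "__main__":
    # Stub stages stand in for the real models so the sketch runs offline.
    result = describe_and_speak(
        b"fake-image-bytes",
        describe=lambda img: "A stop sign at a street corner",
        translate=lambda text: f"[rw] {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(result.kinyarwanda_text)
```

Keeping each stage behind a plain callable means a model can be swapped, or replaced with a stub for testing, without touching the rest of the chain.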