About the Project

Inspiration

Braille remains one of the most important tools for literacy and communication among visually impaired individuals, yet only a small portion of the population can read it. Teachers, caregivers, volunteers, and family members often struggle to interpret Braille documents, which creates a communication barrier. We built DotVision to bridge this gap by transforming Braille into instantly accessible text and speech using AI and computer vision. Our goal was a low-cost, real-time solution that works with everyday cameras and can be deployed anywhere.

What it does

DotVision is an AI-powered Braille recognition platform that converts Braille documents into readable text and spoken audio in real time. Users can scan Braille using a webcam or upload an image, and the system automatically detects Braille cells, reconstructs the grid, identifies individual dots, and translates the content into English. Optional text-to-speech functionality makes the output immediately accessible to a wider audience.
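The final translation step can be illustrated with a minimal sketch. It assumes uncontracted (Grade 1) English Braille, lowercase letters only, and cells that have already been reduced to sets of raised dot numbers. The names `build_letter_table`, `decode_cells`, and `to_unicode` are hypothetical and are not taken from the DotVision code.

```python
# Dots are numbered 1-3 down the left column and 4-6 down the right column.
# Letters a-j use only the top four dots; the rest of the alphabet is derived.
FIRST_DECADE = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4}, "j": {2, 4, 5},
}


def build_letter_table():
    """Build a dot-set -> letter table for the 26 letters plus the space."""
    table = {frozenset(): " "}  # an empty cell is treated as a space
    for index, letter in enumerate("abcdefghij"):
        dots = FIRST_DECADE[letter]
        table[frozenset(dots)] = letter
        # k-t repeat a-j with dot 3 added.
        table[frozenset(dots | {3})] = "klmnopqrst"[index]
    # u, v, x, y, z repeat a-e with dots 3 and 6 added.
    for letter, base in zip("uvxyz", "abcde"):
        table[frozenset(FIRST_DECADE[base] | {3, 6})] = letter
    table[frozenset({2, 4, 5, 6})] = "w"  # w does not follow the pattern
    return table


LETTERS = build_letter_table()


def decode_cells(cells, unknown="?"):
    """Translate a sequence of dot sets into text; unmapped cells become `unknown`."""
    return "".join(LETTERS.get(frozenset(cell), unknown) for cell in cells)


def to_unicode(cell):
    """Return the Unicode Braille pattern for a cell (dot n sets bit n-1 above U+2800)."""
    return chr(0x2800 + sum(1 << (dot - 1) for dot in cell))


hello = [{1, 2, 5}, {1, 5}, {1, 2, 3}, {1, 2, 3}, {1, 3, 5}]
print(decode_cells(hello))  # hello
```

Returning a placeholder for unmapped cells, instead of the nearest known letter, keeps the decoder from inventing characters when the upstream detection is wrong.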

How we built it

Instead of relying entirely on end-to-end deep learning, we designed a Geometry-First + CNN Refinement architecture. The system first uses OpenCV-based computer vision techniques for perspective correction, illumination normalization, blob detection, and Braille grid reconstruction. Once potential dots are identified, a lightweight multi-label CNN validates each of the six Braille dots independently. This hybrid approach combines the reliability of deterministic geometry with the adaptability of machine learning.
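The grid reconstruction idea can be sketched as follows. This is an illustration under stated assumptions, not the DotVision implementation: the image is assumed to be already perspective-corrected, blob centroids are assumed to be available as `(x, y)` pixel pairs, and the grid origin and pitches are assumed to be known. The function name `assign_dots_to_cells` and all parameters, including the tolerance `tol`, are hypothetical.

```python
def assign_dots_to_cells(centroids, origin, dot_pitch, cell_pitch, line_pitch, tol=0.35):
    """Snap blob centroids onto a 2x3 Braille grid.

    origin     -- (x, y) of dot 1 in the first cell of the first line
    dot_pitch  -- distance between neighbouring dots inside a cell
    cell_pitch -- horizontal distance between dot 1 of adjacent cells
    line_pitch -- vertical distance between dot 1 of adjacent lines
    tol        -- maximum allowed deviation from a grid node, in dot pitches

    Returns {(line, cell_column): set of dot numbers 1-6}.
    """
    x0, y0 = origin
    cells = {}
    for x, y in centroids:
        # Locate the cell, allowing half a dot pitch of slack on the top/left edge.
        cell_col = int((x - x0 + dot_pitch / 2) // cell_pitch)
        line = int((y - y0 + dot_pitch / 2) // line_pitch)
        if cell_col < 0 or line < 0:
            continue
        # Offset inside the cell, measured in dot pitches.
        u = (x - x0 - cell_col * cell_pitch) / dot_pitch
        v = (y - y0 - line * line_pitch) / dot_pitch
        col, row = round(u), round(v)
        # Discard blobs that land between cells or too far from any grid node.
        if col not in (0, 1) or row not in (0, 1, 2):
            continue
        if abs(u - col) > tol or abs(v - row) > tol:
            continue
        # Dots 1-3 run down the left column, 4-6 down the right column.
        cells.setdefault((line, cell_col), set()).add(row + 1 + 3 * col)
    return cells


# Five slightly jittered dots plus one stray blob in the gap between two cells.
blobs = [(10.8, 9.5), (9.6, 20.9), (20.4, 19.2), (35.5, 20.3), (44.1, 10.6), (28.0, 15.0)]
grid = assign_dots_to_cells(blobs, origin=(10, 10), dot_pitch=10, cell_pitch=25, line_pitch=50)
print(grid)
```

Because each blob must fall near a grid node to be accepted, stray specks are dropped by geometry alone, which leaves the CNN to confirm or reject only plausible dot positions.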

To improve robustness, we trained the model using a combination of real-world datasets and synthetic augmented samples generated under varying lighting and noise conditions. We also integrated temporal smoothing and confidence gating to ensure stable real-time performance.

Challenges we ran into

Braille recognition is extremely sensitive to blur, lighting conditions, and camera angles. One of the biggest challenges was accurately reconstructing Braille grids from imperfect images without introducing hallucinated characters. We also faced difficulties balancing speed and accuracy for real-time performance while ensuring the system could gracefully reject low-quality frames instead of producing incorrect results.
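One common way to reject blurred frames is to threshold the variance of the image Laplacian, since sharp edges produce a wide spread of responses and blurred frames do not. The sketch below is an illustration of that heuristic, not a description of DotVision's actual check. It uses plain Python lists so it runs without dependencies; a real pipeline would more likely compute the same quantity with OpenCV. The function names and the `threshold` value are hypothetical and would need tuning on real frames.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior of a grayscale image.

    `gray` is a list of equal-length rows of pixel intensities.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0  # image too small to have interior pixels
    return pvariance(responses)


def frame_is_usable(gray, threshold=100.0):
    """Accept a frame only if its sharpness score clears the threshold."""
    return laplacian_variance(gray) >= threshold


# A smooth horizontal ramp has no edges; a pixel checkerboard is all edges.
ramp = [[10 * x for x in range(6)] for _ in range(6)]
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
print(frame_is_usable(ramp), frame_is_usable(checker))  # False True
```

Frames that fail the check can simply be skipped, so the recognizer sees only input on which its geometry and dot classification are likely to hold.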

Accomplishments that we're proud of

We developed a real-time Braille OCR system that processes approximately 20 frames per second (FPS) on standard hardware. Our hybrid architecture delivers high accuracy while remaining explainable and computationally efficient. We are especially proud of the system's confidence-based rejection mechanism, which prioritizes reliability over guessing: when confidence is low, the system withholds output rather than emitting a character that is likely to be wrong.

What we learned

Through this project, we gained valuable experience in computer vision, OCR pipelines, machine learning model calibration, accessibility-focused design, and real-time AI deployment. We also learned that combining classical algorithms with AI often produces more reliable systems than relying solely on deep learning.

What's next for DotVision

Our next goals are support for Grade 2 (contracted) Braille, multilingual translation, mobile deployment, and edge-device optimization. We also plan to expand accessibility features through offline processing, cloud synchronization, and educational tools that help users learn Braille interactively. Ultimately, we envision DotVision becoming a universal accessibility platform that makes Braille information instantly understandable for everyone.

Built With

3.1
8b
angelina
api
apis
braille
character
computer
convolutional
css3
dataset
dotenv
dsbi
eventlet
flask
flask-limiter
flask-socketio
getusermedia
git
github
groq
html5
javascript
json
jupyter
llama
networks
neural
notebook
numpy
onnx
opencv
optical
python
pytorch
recognition
redis
rest
sciencedb
socket.io
speech
text-to-speech
vision
webrtc
xml

Updates

Ayush Kumar started this project — Jun 01, 2026 03:20 AM EDT
