About the Project
Inspiration
Braille remains one of the most important tools for literacy and communication among visually impaired individuals, yet the ability to read Braille is limited to a small portion of the population. Teachers, caregivers, volunteers, and family members often struggle to interpret Braille documents, creating a communication barrier. We were inspired to build DotVision to bridge this gap by transforming Braille into instantly accessible text and speech using AI and computer vision. Our goal was to create a low-cost, real-time solution that works with everyday cameras and can be deployed anywhere.
What it does
DotVision is an AI-powered Braille recognition platform that converts Braille documents into readable text and spoken audio in real time. Users can scan Braille using a webcam or upload an image, and the system automatically detects Braille cells, reconstructs the grid, identifies individual dots, and translates the content into English. Optional text-to-speech functionality makes the output immediately accessible to a wider audience.
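To illustrate the final translation step, here is a minimal sketch of how detected dot patterns could be mapped to English letters. It assumes uncontracted (Grade 1) Braille and lowercase letters only; the names `FIRST_DECADE`, `build_table`, and `decode` are hypothetical and not taken from the DotVision codebase.

```python
# Dot numbering follows the standard Braille cell:
# dots 1-2-3 run down the left column, dots 4-5-6 down the right.
FIRST_DECADE = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4}, "j": {2, 4, 5},
}


def build_table():
    """Build a lookup from a frozenset of raised dots to a lowercase letter."""
    table = {frozenset(): " "}  # an empty cell is treated as a space
    for offset, (letter, dots) in enumerate(FIRST_DECADE.items()):
        table[frozenset(dots)] = letter
        # Letters k-t reuse the a-j patterns with dot 3 added.
        table[frozenset(dots | {3})] = chr(ord("k") + offset)
    # Letters u, v, x, y, z reuse the a-e patterns with dots 3 and 6 added.
    for base, letter in zip("abcde", "uvxyz"):
        table[frozenset(FIRST_DECADE[base] | {3, 6})] = letter
    table[frozenset({2, 4, 5, 6})] = "w"  # w does not follow the decade pattern
    return table


TABLE = build_table()


def decode(cells):
    """Translate a sequence of dot sets into text; unknown cells become '?'."""
    return "".join(TABLE.get(frozenset(cell), "?") for cell in cells)


print(decode([{1, 2, 5}, {1, 5}, {1, 2, 3}, {1, 2, 3}, {1, 3, 5}]))
```

Emitting `?` for unknown cells, rather than guessing the nearest letter, keeps unrecognized patterns visible to the user.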
How we built it
Instead of relying entirely on end-to-end deep learning, we designed a Geometry-First + CNN Refinement architecture. The system first uses OpenCV-based computer vision techniques for perspective correction, illumination normalization, blob detection, and Braille grid reconstruction. Once potential dots are identified, a lightweight multi-label CNN validates each of the six Braille dots independently. This hybrid approach combines the reliability of deterministic geometry with the adaptability of machine learning.
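The per-dot validation step can be sketched as follows. This is only an illustration of turning six independent probabilities into a dot pattern; the function name, the 0.5 threshold, and the margin-based confidence score are assumptions rather than details from the actual model.

```python
def dots_from_probabilities(probs, threshold=0.5):
    """Convert six per-dot probabilities into a set of raised dots.

    `probs[i]` is the (assumed) sigmoid output for dot i + 1. Each dot is
    decided independently, which is what makes the task multi-label rather
    than a single 64-way classification.

    Returns the set of raised dots and a cell-level margin: the distance
    from the threshold of the least certain dot.
    """
    if len(probs) != 6:
        raise ValueError("a Braille cell has exactly six dots")
    dots = {i + 1 for i, p in enumerate(probs) if p >= threshold}
    margin = min(abs(p - threshold) for p in probs)
    return dots, margin


dots, margin = dots_from_probabilities([0.9, 0.1, 0.8, 0.2, 0.95, 0.05])
print(sorted(dots), round(margin, 2))
```

Using the weakest dot as the cell's confidence means a single ambiguous dot is enough to flag the whole cell as uncertain.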
To improve robustness, we trained the model on a combination of real-world datasets and synthetically augmented samples generated under varying lighting and noise conditions. We also added temporal smoothing and confidence gating to keep the real-time output stable.
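One simple way to combine temporal smoothing with confidence gating is a majority vote over recent frames, sketched below. The class name, window size, confidence threshold, and vote count are all illustrative assumptions, not the values DotVision uses.

```python
from collections import Counter, deque


class TemporalSmoother:
    """Majority vote over recent frames, ignoring low-confidence results."""

    def __init__(self, window=5, min_confidence=0.6, min_votes=3):
        self.history = deque(maxlen=window)  # oldest results drop off automatically
        self.min_confidence = min_confidence
        self.min_votes = min_votes

    def update(self, text, confidence):
        """Record one frame's result and return the stable text, or None."""
        # Confidence gating: low-confidence frames never enter the vote.
        if confidence >= self.min_confidence:
            self.history.append(text)
        if not self.history:
            return None
        candidate, votes = Counter(self.history).most_common(1)[0]
        # Temporal smoothing: only report text seen in enough recent frames.
        return candidate if votes >= self.min_votes else None


smoother = TemporalSmoother()
for text, conf in [("hello", 0.9), ("hello", 0.8), ("hcllo", 0.3), ("hello", 0.7)]:
    print(smoother.update(text, conf))
```

Returning `None` until enough frames agree trades a short delay for output that does not flicker between readings.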
Challenges we ran into
Braille recognition is extremely sensitive to blur, lighting conditions, and camera angles. One of the biggest challenges was accurately reconstructing Braille grids from imperfect images without introducing hallucinated characters. We also faced difficulties balancing speed and accuracy for real-time performance while ensuring the system could gracefully reject low-quality frames instead of producing incorrect results.
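As an example of rejecting low-quality frames, the sketch below uses the variance of the Laplacian, a common focus measure for blur detection. The write-up does not state which quality check DotVision uses, so this technique, the function names, and the threshold of 100 are assumptions. It is written in pure Python over a 2D list of grayscale values to stay self-contained.

```python
def laplacian_variance(gray):
    """Return the variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of grayscale intensities. Blurry images have weak
    edges, so their Laplacian responses cluster near zero and the variance
    is low.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            responses.append(
                gray[r - 1][c] + gray[r + 1][c]
                + gray[r][c - 1] + gray[r][c + 1]
                - 4 * gray[r][c]
            )
    mean = sum(responses) / len(responses)
    return sum((x - mean) ** 2 for x in responses) / len(responses)


def is_sharp_enough(gray, min_variance=100.0):
    """Gate a frame: return False when it looks too blurry to decode."""
    return laplacian_variance(gray) >= min_variance


flat = [[128] * 5 for _ in range(5)]  # no edges at all
checker = [[255 if (r + c) % 2 == 0 else 0 for c in range(5)] for r in range(5)]
print(is_sharp_enough(flat), is_sharp_enough(checker))
```

A frame that fails the check can be skipped entirely, so the decoder never has to guess at cells it cannot see clearly.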
Accomplishments that we're proud of
We developed a real-time Braille OCR system that runs at approximately 20 FPS on standard hardware. Our hybrid architecture delivers high accuracy while remaining explainable and computationally efficient. We are especially proud of the system's confidence-based rejection mechanism, which prioritizes reliability over guessing.
What we learned
Through this project, we gained valuable experience in computer vision, OCR pipelines, machine learning model calibration, accessibility-focused design, and real-time AI deployment. We also learned that combining classical algorithms with AI often produces more reliable systems than relying solely on deep learning.
What's next for DotVision
Our next goals are to support Grade 2 (contracted) Braille, multilingual translation, mobile deployment, and edge-device optimization. We also plan to expand accessibility features through offline processing, cloud synchronization, and educational tools that help users learn Braille interactively. Ultimately, we envision DotVision becoming a universal accessibility platform that makes Braille information instantly understandable for everyone.
Built With
- 3.1
- 8b
- angelina
- api
- apis
- braille
- character
- computer
- convolutional
- css3
- dataset
- dotenv
- dsbi
- eventlet
- flask
- flask-limiter
- flask-socketio
- getusermedia
- git
- github
- groq
- html5
- javascript
- json
- jupyter
- llama
- networks
- neural
- notebook
- numpy
- onnx
- opencv
- optical
- python
- pytorch
- recognition
- redis
- rest
- sciencedb
- socket.io
- speech
- text-to-speech
- vision
- webrtc
- xml