Visually impaired individuals have trouble reading small-print books and other visual media. To make these materials more accessible, it would be very useful to have a service that automatically reads text in any standard printed form aloud until the user tells it to stop.
What it does
This robot takes a picture of text in a print medium, uses optical character recognition (OCR) to convert the image into a text file of the recognized characters, converts the text file to a .mp3 audio file that is played back automatically, and physically turns the page to repeat the process until the user indicates termination.
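The loop described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function name `read_aloud_loop` and its five callable parameters are hypothetical stand-ins for the camera, OCR, audio, and page-turning steps, so the control flow can be shown without any hardware attached.

```python
from typing import Callable


def read_aloud_loop(
    capture_page: Callable[[], bytes],
    recognize_text: Callable[[bytes], str],
    speak: Callable[[str], None],
    turn_page: Callable[[], None],
    should_stop: Callable[[], bool],
) -> int:
    """Repeat capture -> OCR -> speech -> page turn until the user stops.

    Every argument is a hypothetical placeholder for one stage of the
    robot. Returns the number of pages that were read aloud.
    """
    pages_read = 0
    while not should_stop():
        image = capture_page()        # e.g. a frame grabbed from the webcam
        text = recognize_text(image)  # OCR: image bytes -> recognized text
        speak(text)                   # text-to-speech and playback
        pages_read += 1
        turn_page()                   # ask the arm to advance one page
    return pages_read


if __name__ == "__main__":
    # Dry run with fake stages: the user "stops" after two pages.
    answers = iter([False, False, True])
    count = read_aloud_loop(
        capture_page=lambda: b"fake image",
        recognize_text=lambda image: "some recognized text",
        speak=print,
        turn_page=lambda: None,
        should_stop=lambda: next(answers),
    )
    print("pages read:", count)
```

Passing the stages in as functions keeps the loop testable on a laptop alone, which matters when the hardware is the fragile part.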
How we built it
A webcam was used as the camera source, three programs were written from scratch to process the data in a pipeline, and a robot arm was built to physically turn the page via a wheel mechanism. The first program takes a .jpg image as input and uses OCR to output a .txt file. The second program takes the .txt file as input and outputs a .mp3 audio file that is automatically played back. The third program interfaces with the Arduino: it takes a signal to turn the page as input and drives the rotation of the wheel, among other physical changes.
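As a rough illustration of the third program's job, the sketch below sends a page-turn request to a serial-like port. The command bytes `b"T\n"` and the function name `request_page_turn` are assumptions made for this example; the actual protocol between the laptop and the Arduino is not described here. The function only needs an object with `write` and `flush`, so an in-memory buffer stands in for the real serial connection.

```python
import io
from typing import BinaryIO

# Hypothetical command: a single letter plus newline that the Arduino
# sketch would interpret as "rotate the wheel to turn one page".
TURN_PAGE_COMMAND = b"T\n"


def request_page_turn(port: BinaryIO) -> int:
    """Write the page-turn command to an open port.

    `port` can be any binary stream with write() and flush(), such as a
    serial connection opened elsewhere. Returns the number of bytes written.
    """
    written = port.write(TURN_PAGE_COMMAND)
    port.flush()  # make sure the command leaves the buffer right away
    return written


if __name__ == "__main__":
    # Dry run against an in-memory buffer instead of real hardware.
    fake_port = io.BytesIO()
    request_page_turn(fake_port)
    print(fake_port.getvalue())
```

Testing against a fake port first is a cheap way to check the software side before any motor is powered.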
Challenges we ran into
As always, arranging multiple programs in a pipeline is a challenge. Constructing the robot out of hard foam and arranging motors to act as the joints of the arm and hand were not straightforward either. A laptop was fried in the process due to a lack of care with high-torque motors.
Accomplishments that we're proud of
Successfully integrating real-world inputs, processing the data in a three-step pipeline, and driving an accurate page-turning mechanism, all in less than 24 hours.
What we learned
How to use optical character recognition to convert an image to a text file. We're all computer science students, so hardware doesn't come easily to us, and our inexperience led to painful mistakes such as a destroyed laptop. We will make sure to check the physical ratings of hardware components before plugging them in. Learning Arduino with Python integration was also new and fun.
What's next for Readable
This is a prototype, so a lot of things are quite janky and unpolished. The webcam should be upgraded to a higher-resolution camera with better low-light processing so the OCR gets a sharper image. The robot arm should be upgraded to have a more precise page-turning mechanism and support for double-sided pages. The software side is generally adequate, but extra polish would make it more robust.