Inspiration

I started testing out Microsoft Azure's Computer Vision API, and the results were both accurate and fast. I thought: if this is so close to how I would describe an image, why not use this technology to help blind people? From there, I brainstormed what other functions would be useful for someone who is blind, and PySight was born.

What it does

PySight has three main functions:

  1. Image description (see the sketch after this list)
  2. Text translation
  3. Motion detection to determine whether there are moving cars in a video feed
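
Below is a minimal sketch of what the image-description call might look like, assuming the Azure Computer Vision "describe" REST endpoint via the `requests` library; the region, API version, and subscription key are placeholders, not PySight's actual values.

```python
import requests

# Placeholders: substitute your own Azure region and subscription key.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/vision/v2.0/describe"
SUBSCRIPTION_KEY = "YOUR_AZURE_KEY"

def describe_image(image_bytes):
    """Send raw image bytes to Azure Computer Vision and return a caption."""
    headers = {
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Content-Type": "application/octet-stream",
    }
    response = requests.post(ENDPOINT, headers=headers, data=image_bytes)
    response.raise_for_status()
    result = response.json()
    # The response contains ranked captions; take the most confident one.
    return result["description"]["captions"][0]["text"]

with open("photo.jpg", "rb") as f:
    print(describe_image(f.read()))
```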

How I built it

Detailed in the video demo. Special thanks to the following post for the overall scheme of the text-extraction algorithm: http://www.danvk.org/2015/01/07/finding-blocks-of-text-in-an-image-using-python-opencv-and-numpy.html, and thank you to Microsoft and Google for making effective, easy-to-use APIs.
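
For context, here is a rough sketch of the text-block detection scheme that post describes (edge detection, dilation, then contour bounding boxes). It's my paraphrase of the general idea, written against OpenCV 4's `findContours` signature, not PySight's exact code.

```python
import cv2
import numpy as np

def find_text_blocks(path):
    """Roughly locate blocks of text: edges -> dilate -> contour boxes."""
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    # Dilate so nearby letters merge into contiguous blobs.
    kernel = np.ones((5, 5), np.uint8)
    dilated = cv2.dilate(edges, kernel, iterations=3)
    contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Return bounding boxes, largest first; big blobs are likely text regions.
    boxes = [cv2.boundingRect(c) for c in contours]
    return sorted(boxes, key=lambda b: b[2] * b[3], reverse=True)

for x, y, w, h in find_text_blocks("sign.jpg")[:5]:
    print(f"text block at ({x}, {y}), size {w}x{h}")
```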

Challenges I ran into

I originally wanted this to run on a Raspberry Pi (it was originally "PiSight"; I just got lucky that Python starts with the same sound) with buttons that blind users could easily distinguish and a little camera on the back. I had it mostly functional and almost everything worked, but after I updated the firmware with rpi-update and rebooted, it never turned back on, so it was a bummer not to be able to show a complete product. Still, I was able to demo all of the features the Raspberry Pi would actually run, so I was happy overall.

Accomplishments that I'm proud of

The image classifier achieved 100% accuracy, which I thought was really cool. I'm also happy with the speed of everything, especially considering how much data is being passed around to so many places.

What I learned

Lots of JavaScript and Flask, how to use Amazon S3, and the basics of motion estimation.
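
As a small illustration of the motion-estimation piece, here is a generic frame-differencing sketch with OpenCV; the threshold and `min_area` values are arbitrary assumptions, and PySight's actual moving-car detection may differ.

```python
import cv2

def detect_motion(video_path, min_area=500):
    """Flag frames with motion by differencing consecutive grayscale frames."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame_idx += 1
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Large pixel differences between consecutive frames suggest movement.
        diff = cv2.absdiff(prev_gray, gray)
        _, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        if cv2.countNonZero(thresh) > min_area:
            print(f"motion detected in frame {frame_idx}")
        prev_gray = gray
    cap.release()

detect_motion("street.mp4")
```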

What's next for PySight

Hopefully I can fix the Raspberry Pi so that PySight can become a little $60 device we can distribute to the visually impaired.
