TV is part of our lives and part of our culture. How can we make TV more inclusive for visually impaired people? Trying to grasp the narrative by listening to the dialogue alone doesn’t tell you the whole story. What about the context? What about the actions, the emotions? If you have impaired hearing, you can turn up the volume or use closed captions. If you can’t see well, or can’t see at all, how do you turn up the image?

What it does

Using a standard, run-of-the-mill media player like VLC, you can now pause playback and send a snapshot to software that automatically describes the scene: What’s happening? Who is on the screen? What are they doing? Today we are using NeuralTalk, a neural-network image-captioning model that derives its effectiveness from the way it mimics key aspects of human visual processing.

How it works

A snapshot of the scene is processed by a visual algorithm that generates a text description of it. That description is then conveyed back to the user through text-to-speech software.
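The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`build_espeak_command`, `speak`, `describe_snapshot`) are hypothetical, the captioning step is passed in as a plain callable standing in for NeuralTalk, and the snapshot is assumed to already exist on disk (for example, saved by VLC's snapshot feature). Only the `espeak` flags `-v` (voice) and `-s` (speed in words per minute) are taken from the real tool.

```python
import shutil
import subprocess
from typing import Callable, List


def build_espeak_command(text: str, voice: str = "en", speed_wpm: int = 150) -> List[str]:
    """Build the espeak command line that reads `text` aloud.

    -v selects the voice and -s the speed in words per minute.
    The default values here are illustrative assumptions.
    """
    return ["espeak", "-v", voice, "-s", str(speed_wpm), text]


def speak(text: str) -> bool:
    """Read `text` aloud with espeak; return False if espeak is not installed."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(build_espeak_command(text), check=True)
    return True


def describe_snapshot(
    image_path: str,
    captioner: Callable[[str], str],
    speaker: Callable[[str], object] = speak,
) -> str:
    """Caption a snapshot and pass the caption to the speaker.

    `captioner` is a stand-in for the image-captioning model: any callable
    that takes an image path and returns a text description.
    """
    caption = captioner(image_path).strip()
    if caption:
        # Only speak when the captioner actually produced a description.
        speaker(caption)
    return caption
```

Calling `describe_snapshot("snapshot.png", my_captioner)` with a real captioning function would return the caption and read it aloud when `espeak` is available. Passing the captioner and speaker as arguments keeps the sketch independent of any particular model or speech engine, so either part can be swapped out or tested on its own.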


- Screenshot and video snippet used in this demo: (c) copyright 2008, Blender Foundation / licensed under CC
- NeuralTalk: (c) Andrej Karpathy, licensed under BSD

Built With

  • python
  • vlc
  • espeak
  • mbrola
  • neuraltalk
  • caffe