Twitter is for everyone, but uncaptioned media shuts out people who rely on image descriptions, captions, and transcripts, excluding them from much of the platform.

What it does

Simply tag @owl_access_bot on a Tweet with media, and the bot replies in real time with AI-generated information about that media: it labels images, extracts text from photos of documents or handwriting, and captions audio and video.
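The mention-handling step boils down to pulling direct media URLs out of the tweet that tagged the bot. A minimal sketch of that extraction, assuming the Twitter API v1.1 `extended_entities` JSON shape (the helper name is ours, not from the bot's code):

```python
def media_urls_from_tweet(tweet_json: dict) -> list:
    """Collect direct media URLs from a mention's extended entities.

    Photos expose a static URL; videos and GIFs list several variants,
    so we pick the highest-bitrate MP4 for downstream processing.
    """
    media = tweet_json.get("extended_entities", {}).get("media", [])
    urls = []
    for m in media:
        if m.get("type") == "photo":
            urls.append(m["media_url_https"])
        elif m.get("type") in ("video", "animated_gif"):
            variants = [v for v in m["video_info"]["variants"]
                        if v.get("content_type") == "video/mp4"]
            if variants:
                urls.append(max(variants, key=lambda v: v.get("bitrate", 0))["url"])
    return urls
```

In the real bot, a Twitter client library (e.g. Tweepy) would fetch mentions and post the reply; the parsing above is the part that stays the same regardless of client.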
For images that are primarily text or handwriting, the bot switches from describing the image to extracting the text with OCR.
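The describe-vs-OCR switch can be driven by the image tags Azure returns alongside the caption. A sketch of that decision, assuming the Azure Computer Vision v3.2 Analyze Image REST endpoint; the endpoint/key placeholders and the confidence threshold are illustrative, not the bot's actual values:

```python
import json
import urllib.request

# Placeholders: a real Azure Computer Vision resource supplies these.
AZURE_ENDPOINT = "https://<resource>.cognitiveservices.azure.com"
AZURE_KEY = "<subscription-key>"

def analyze_image(image_url: str) -> dict:
    """Call Azure's Analyze Image endpoint, requesting a caption and tags."""
    req = urllib.request.Request(
        f"{AZURE_ENDPOINT}/vision/v3.2/analyze?visualFeatures=Description,Tags",
        data=json.dumps({"url": image_url}).encode(),
        headers={
            "Ocp-Apim-Subscription-Key": AZURE_KEY,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def should_use_ocr(analysis: dict, threshold: float = 0.9) -> bool:
    """Heuristic: if the tagger is confident the image is mostly text
    or handwriting, reply with extracted text instead of a caption."""
    for tag in analysis.get("tags", []):
        if tag["name"] in ("text", "handwriting") and tag["confidence"] >= threshold:
            return True
    return False
```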

How we built it

We built it in Python by combining OCR, speech-to-text, and computer vision from Microsoft Azure's Cognitive Services with video and audio processing via ffmpeg and urllib, all wired together through the Twitter API.
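For audio and video, the pipeline is: download the media with urllib, strip the audio track with ffmpeg, then hand the audio to the speech-to-text service. A minimal sketch of the download and extraction steps (function names are ours; the 16 kHz mono WAV settings reflect what speech services typically expect):

```python
import subprocess
import urllib.request

def download_media(url: str, dest: str) -> None:
    """Fetch a media file from its direct URL to a local path."""
    urllib.request.urlretrieve(url, dest)

def ffmpeg_extract_audio_cmd(video_path: str, wav_path: str) -> list:
    """Build the ffmpeg command: drop the video stream (-vn) and
    resample to 16 kHz mono PCM WAV for speech-to-text."""
    return ["ffmpeg", "-y", "-i", video_path,
            "-vn", "-ac", "1", "-ar", "16000", wav_path]

def extract_audio(video_path: str, wav_path: str) -> None:
    """Run ffmpeg, raising if the conversion fails."""
    subprocess.run(ffmpeg_extract_audio_cmd(video_path, wav_path), check=True)
```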

Challenges we ran into

The usual hackathon challenges: setting up cloud services, wrangling Python dependencies, troubleshooting bugs, and interacting correctly with the Twitter API.

Accomplishments that we are proud of

A fully functional system that actually does everything it claims. No smoke and mirrors here, everything just works!

What we learned

We learned more about accessibility on the web, and gained familiarity with the Twitter API and platform.

What's next for Owl Accessibility Bot

We plan to deploy the bot to a cloud instance and keep it running; with student subscriptions, the cost is minimal. In the long term, though, accessibility is best addressed by the platform itself, either by auto-generating captions and descriptions for unlabeled media or by incentivizing users to add them.

Built With

ffmpeg, microsoft-azure, python, twitter-api, urllib
