Inspiration

Our inspiration is my brother, a big fan of Apple products who badly wanted to watch Apple Keynote livestreams but couldn't understand a thing because of his limited English. He, our other family members who are non-native English speakers, and people who are hearing impaired or differently abled face this problem day in and day out when they try to consume content whose audio they cannot follow.

What it does

Our hack generates closed captions in real time in the speaker's language (live translations into other languages are a planned extension, currently a work in progress). This helps the hearing impaired, and it also lifts the language barrier for a worldwide audience, opening up any useful media they can benefit from.

How we built it

We source our captions from multiple speech-to-text systems (some of them our own) and stitch them together with an algorithm that selects the most accurate transcription.
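The stitching step can be sketched roughly as follows. This is a simplified illustration, not our production algorithm: it assumes the providers' transcripts have already been aligned segment by segment (the hard part in practice), and then picks each segment by majority vote across providers. The function name and data shapes are hypothetical.

```python
from collections import Counter

def stitch_transcripts(candidates: list[list[str]]) -> list[str]:
    """Merge per-segment transcripts from multiple STT providers by
    majority vote. Assumes each candidate list is already aligned
    segment-by-segment (a simplification of the real alignment step)."""
    stitched = []
    for segment_versions in zip(*candidates):
        # Keep the transcription that most providers agree on.
        winner, _ = Counter(segment_versions).most_common(1)[0]
        stitched.append(winner)
    return stitched

captions = stitch_transcripts([
    ["hello world", "live captions rock"],   # provider A
    ["hello word",  "live captions rock"],   # provider B
    ["hello world", "live caption rocks"],   # provider C
])
# captions == ["hello world", "live captions rock"]
```

A real stitcher would also weight providers by historical accuracy and align segments using timestamps, which this sketch omits.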

Challenges we ran into

Streaming video, integrating and normalizing multiple third-party speech-to-text APIs, and stitching the results accurately with only limited metadata about the stream.

Accomplishments that we're proud of

The accuracy of the stitching algorithm we came up with in a very short time, our benchmarking utility that measures caption accuracy for any stream, the end-to-end solution we built, and our creative presentation with captions instead of the usual audio.
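A benchmarking utility like the one mentioned above typically scores captions by word error rate (WER) against a reference transcript. A minimal sketch, assuming plain whitespace-tokenized text (our actual utility's interface is not shown here):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length,
    computed via Levenshtein distance over word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("turn on the captions", "turn on captions"))  # 0.25
```

Lower is better; a WER of 0.25 means one word in four was wrong relative to the reference.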

What we learned

We learned a bunch of the latest technologies and frameworks, and we learned to build fast, fail fast, and iterate. The accuracy levels we achieved were far higher than we imagined.

What's next for Live Captions

YC W2018

NOTE: Please turn on the CAPTIONS on the demo video. It intentionally lacks audio, to demonstrate the power of captions.
