The idea came thanks to a Twitch streamer, Snugibun, who started the whole movement to make captions available for everyone on Twitch. While the intention was great, I noticed two main problems. First, the captions were burned into the video, which meant the original, uncaptioned video could never be recovered, whereas adding captions to a clean video later is pretty doable. Second, because they were burned into the stream, viewers could not enlarge or move them, and they lacked any form of transparency. So I decided to try to find a solution to these problems.

What it does

The objective of the project is to let streamers add accurate closed captioning to their streams without much hassle and without burning the captions into the stream, while giving viewers the option to customize the font size and pop the captions up as they wish.

How I built it

The first matter to decide was which framework I was going to use. The decision was not hard: I picked Ruby on Rails for two main reasons. I already have plenty of experience working with it, and it is the framework I use to host my other Twitch Extension. Once that was done, I needed to decide how to handle the speech recognition. Many services offer speech-to-text (STT), but most of them are meant for recognizing text from recordings, not from a stream of audio. So I went with the Web Speech API, which allows real-time STT conversion in the streamer's browser. The recognized text is then sent to the Rails server, which redistributes it to all other connected clients, so every client receives the same content in real time.
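As a rough illustration of the recognition step, here is a minimal TypeScript sketch, not the extension's actual code. It assumes a browser that exposes `SpeechRecognition` (or the prefixed `webkitSpeechRecognition`); the names `collectCaptions`, `startCaptioning`, `CaptionUpdate`, and the `send` callback are hypothetical and stand in for whatever forwards text to the server.

```typescript
// Minimal shapes mirroring the parts of a Web Speech API result event used here.
interface Alternative { transcript: string; confidence: number }
interface ResultLike { isFinal: boolean; 0: Alternative; length: number }
interface ResultEventLike { resultIndex: number; results: ArrayLike<ResultLike> }

// Hypothetical payload: finalized text plus the still-changing interim text.
interface CaptionUpdate { finalText: string; interimText: string }

// Walk the results that changed in this event (from resultIndex onward)
// and split them into final and interim text.
function collectCaptions(event: ResultEventLike): CaptionUpdate {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const text = result[0].transcript; // top-ranked alternative
    if (result.isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Browser-only wiring. `send` is an assumed callback that forwards the
// update to the server; it is not part of the Web Speech API.
function startCaptioning(send: (update: CaptionUpdate) => void): void {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) {
    return; // Web Speech API not available in this environment
  }
  const recognition = new Recognition();
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // report partial phrases as they form
  recognition.onresult = (event: ResultEventLike) => send(collectCaptions(event));
  recognition.start();
}
```

Keeping the result handling in a pure function separate from the browser wiring makes it possible to check the text-splitting logic without a microphone.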

What I learned

While making this extension, I learned a lot about both the client and server sides of WebSockets and about the Web Speech API, which I found rather interesting. Most of what I learned was about how WebSocket subscriptions work and how Ruby on Rails' Action Cable handles them.
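To make the subscription mechanics a little more concrete, here is a hedged TypeScript sketch of the JSON frames an Action Cable client exchanges over the WebSocket. The channel name `CaptionsChannel`, the `streamer_id` parameter, and all helper names are assumptions made for illustration; the project's real channel and parameters may differ.

```typescript
// Hypothetical channel name, used only for illustration.
const CHANNEL = "CaptionsChannel";

// Action Cable identifies a subscription by a JSON-encoded string
// containing the channel name and any parameters.
function identifierFor(channel: string, streamerId: string): string {
  return JSON.stringify({ channel, streamer_id: streamerId });
}

// Frame a client sends to subscribe to a channel.
function subscribeFrame(identifier: string): string {
  return JSON.stringify({ command: "subscribe", identifier });
}

// Frame a client sends to pass data to the channel; the data itself
// is JSON-encoded a second time inside the frame.
function messageFrame(identifier: string, data: object): string {
  return JSON.stringify({ command: "message", identifier, data: JSON.stringify(data) });
}

type Incoming =
  | { kind: "control"; type: string }        // welcome, ping, confirm_subscription, ...
  | { kind: "broadcast"; message: unknown }; // payload broadcast by the server

// Server frames either carry a `type` (control) or an `identifier`
// plus `message` (a broadcast for one subscription).
function parseIncoming(raw: string): Incoming {
  const frame = JSON.parse(raw);
  if (typeof frame.type === "string") {
    return { kind: "control", type: frame.type };
  }
  return { kind: "broadcast", message: frame.message };
}
```

In a viewer's browser, the subscribe frame would be sent once the socket opens, and each broadcast frame would carry the caption text to render.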

What's next for Closed Captions for Twitch Streams

Next up on the list: An option to use multiple speakers (not confirmed yet, needs research). Support for automatically adding Moderators and VIPs from chat. Default text color options on a per-streamer basis. Color, font, and placement options for text on a per-viewer basis.
