Inspiration
Reading YouTube transcripts is hard: most of them lack proper punctuation and paragraphs.
What it does
Readable uses the local Gemini Nano LLM to restore punctuation and add paragraphs to raw YouTube transcripts.
How we built it
We used the Gemini API and split long transcripts into chunks that fit into the model's context window.
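A minimal sketch of the chunking step, under stated assumptions: the helper name `chunkTranscript` and the `maxWords` parameter are hypothetical, and a plain word count stands in for the model's real token limit, which this write-up does not specify.

```javascript
// Hypothetical helper: split a raw transcript into chunks of at most
// `maxWords` words. A word budget is only a rough proxy for the token
// budget of the context window (assumption, not the project's actual code).
function chunkTranscript(text, maxWords) {
  if (!Number.isInteger(maxWords) || maxWords <= 0) {
    throw new RangeError("maxWords must be a positive integer");
  }
  // Raw transcripts have no reliable punctuation, so split on whitespace.
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}
```

Each chunk would then be sent to the model separately; splitting on word boundaries means a sentence can be cut in two, which is why a merge step is needed afterwards.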
Challenges we ran into
Once we had results for each chunk, we needed to merge them. We chose to run yet another LLM prompt on the last sentence of the previous chunk and the first sentence of the next chunk.
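The merge step could look roughly like the sketch below. All names are hypothetical: `fixBoundary` stands in for the extra LLM prompt (its wording is not given here), and the sentence splitting is a simple regex that assumes the chunks are already punctuated. Paragraph breaks right at a seam are collapsed to a single space in this simplified version.

```javascript
// Return [head, lastSentence] for a punctuated chunk (hypothetical helper).
function splitLastSentence(chunk) {
  const text = chunk.trim();
  // Greedy match: everything up to the last terminator followed by whitespace.
  const match = text.match(/^([\s\S]*[.!?])\s+(\S[\s\S]*)$/);
  return match ? [match[1], match[2]] : ["", text];
}

// Return [firstSentence, tail] for a punctuated chunk (hypothetical helper).
function splitFirstSentence(chunk) {
  const text = chunk.trim();
  // Lazy match: stop at the first terminator followed by whitespace.
  const match = text.match(/^([\s\S]*?[.!?])\s+(\S[\s\S]*)$/);
  return match ? [match[1], match[2]] : [text, ""];
}

// Merge punctuated chunks, repairing each seam with `fixBoundary`,
// an async function (lastSentence, firstSentence) => repaired text.
// In the project this role is played by an LLM prompt (assumption: any
// function with this shape can be plugged in).
async function mergeChunks(chunks, fixBoundary) {
  if (chunks.length === 0) return "";
  let merged = chunks[0];
  for (const next of chunks.slice(1)) {
    const [head, lastSentence] = splitLastSentence(merged);
    const [firstSentence, tail] = splitFirstSentence(next);
    const seam = await fixBoundary(lastSentence, firstSentence);
    merged = [head, seam, tail].filter(Boolean).join(" ");
  }
  return merged;
}
```

Passing the boundary fixer in as a parameter keeps the glue code independent of any particular model API, and only two sentences per seam need to fit in the context window.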
Accomplishments that we're proud of
Thanks to the new Gemini API, we achieved an entirely local solution without downloading additional models. This project could potentially be reused to punctuate raw speech-to-text output, which is not necessarily punctuated either. Local processing would be even more useful there.
What we learned
The limits of LLM context windows have to be taken into account, and some glue is required to achieve the end result. We also learned that the LLM can "fix" mistakes in the raw transcript, which is an advantage over more traditional ways of restoring punctuation (e.g. a BERT token classifier where each class is a punctuation mark or the absence of one).
What's next for Readable YouTube Transcripts
It would be nice to provide this as a service to YouTube itself, so that when people click the "Transcript" button they see the fully punctuated text.
Built With
- flash8b
- gemini
- javascript
- llm
