Many of us have wanted to watch a video in a public space but couldn’t because we didn’t have headphones with us. Another downside of videos is that they are not very searchable: it takes ten minutes to figure out what’s in a ten-minute video, whereas in a piece of text you can find what interests you in a matter of seconds. I built an app that turns any YouTube video into text, making it concise, searchable and readable.

⚒What it does⚒

What TL;DW does is extremely simple: it is a web application that takes a YouTube video and transcribes it asynchronously, powered by AssemblyAI. What’s really cool is that it takes just a few minutes to transcribe a video of any length, and thanks to AssemblyAI’s Audio Intelligence, it transcribes English spoken in any accent very accurately.

🏗How we built it🏗

I used the AssemblyAI API and Streamlit to build the web app. I was inspired by an AssemblyAI tutorial on using its asynchronous transcription feature for videos, and built a tool to solve the day-to-day problems caused by the downsides of video content mentioned above.
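
The asynchronous flow can be sketched roughly as follows: submit a job, then poll until it finishes. This is a minimal illustration and not the app’s actual code. It assumes the audio is already available at a URL that AssemblyAI can fetch (extracting audio from YouTube is left out), and the helper names (`submit_transcript`, `fetch_transcript`, `wait_for_transcript`) are hypothetical. The endpoint paths and status values follow AssemblyAI’s v2 REST API as I understand it, so check them against the current documentation.

```python
import json
import time
import urllib.request

# Assumed base URL of AssemblyAI's v2 REST API.
API_BASE = "https://api.assemblyai.com/v2"


def _request(method, url, api_key, payload=None):
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_transcript(audio_url, api_key):
    """Queue a transcription job and return its id (hypothetical helper)."""
    job = _request("POST", f"{API_BASE}/transcript", api_key, {"audio_url": audio_url})
    return job["id"]


def fetch_transcript(transcript_id, api_key):
    """Fetch the current state of a transcription job (hypothetical helper)."""
    return _request("GET", f"{API_BASE}/transcript/{transcript_id}", api_key)


def wait_for_transcript(fetch, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch() repeatedly until the job completes, fails, or times out.

    `fetch` is any zero-argument callable returning the job as a dict,
    which keeps the polling logic testable without network access.
    """
    for _ in range(max_polls):
        job = fetch()
        if job["status"] == "completed":
            return job["text"]
        if job["status"] == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        sleep(interval_s)  # still queued or processing; wait before retrying
    raise TimeoutError("transcript was not ready in time")
```

In a Streamlit page, the URL could come from `st.text_input`, and the returned text could be shown with `st.write` once `wait_for_transcript(lambda: fetch_transcript(job_id, api_key))` returns.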

🚧Challenges we ran into🚧

  • I had never used Streamlit before. Thanks to Streamlit’s documentation and various resources on the internet, I was able to work through the challenges that my lack of experience brought about.
  • As the only team member, I had to work around the clock to build both the frontend and the backend.
  • I was also new to the AssemblyAI API; AssemblyAI’s documentation and tutorials helped me immensely this weekend. In the end, I tackled every challenge I faced and completed a fully functional hack.

🏆Accomplishments that we're proud of🏆

  • Building my first speech-to-text app.
  • Successfully submitting a hack despite hacking alone and facing many challenges.

📚What we learned📚

  • How to use AssemblyAI’s API
  • How to build web apps with Python using Streamlit
  • How Audio Intelligence works

📈What's next for TL;DW📈

  • Make use of more of the AssemblyAI API’s capabilities, such as sentiment analysis, speaker labels, topic detection, and redaction of PII and profanity, to make TL;DW better.
  • Use AssemblyAI’s summarization feature to make the transcripts shorter, something I couldn’t do this weekend due to lack of time.
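
As a rough sketch of how these features might be switched on, the transcription request body could carry one extra boolean flag per feature. The function `build_transcript_request` and the short feature names are hypothetical. The JSON field names are my understanding of AssemblyAI’s request parameters and should be verified against its API reference. PII redaction is left out because, as far as I know, it also needs a list of redaction policies.

```python
# Maps a short, made-up feature name to the request field it is assumed
# to enable in AssemblyAI's transcript API (verify against the docs).
FEATURE_FIELDS = {
    "sentiment": "sentiment_analysis",
    "speakers": "speaker_labels",
    "topics": "iab_categories",
    "no_profanity": "filter_profanity",
    "summary": "summarization",
}


def build_transcript_request(audio_url, features=()):
    """Return the JSON body for a transcription job with optional features."""
    unknown = set(features) - set(FEATURE_FIELDS)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    payload = {"audio_url": audio_url}
    for name in features:
        payload[FEATURE_FIELDS[name]] = True  # each feature is a boolean flag
    return payload


if __name__ == "__main__":
    # Example: ask for speaker labels and a summary alongside the transcript.
    print(build_transcript_request("https://example.com/talk.mp3", ["speakers", "summary"]))
```

Keeping the mapping in one dictionary means a new feature is a one-line change, and a misspelled feature name fails early instead of being silently ignored.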
