Inspiration
A few of my friends recently started podcasts and wanted an easy way to create captions. I was messing around with AI tools at the time and came upon OpenAI Whisper 😀
What it does
WavScribe makes audio transcription super accessible. You can get a written version of any audio you upload (restricted to around 1 minute for now) sent straight to your inbox thanks to Courier 👐
How we built it
For the front end, I used Next.js with Tailwind and daisyUI. For audio transcription, I used the HuggingFace Transformers pipeline with the openai/whisper-base model. I also used RabbitMQ as a queuing system. Everything is hosted on a machine on Oracle Cloud Infrastructure (OCI), and I used Linode Object Storage to temporarily store audio files.
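Here is a minimal sketch of how the queue and the transcription step could fit together. It is not the actual WavScribe code: the `TranscriptionJob` fields, the JSON message format, and the helper names are all assumptions made for illustration, and the standard-library `queue.Queue` stands in for RabbitMQ so the example runs without a broker. The only real library call is the HuggingFace `pipeline("automatic-speech-recognition", model="openai/whisper-base")`, which is imported lazily because it downloads the model on first use.

```python
import json
import queue
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class TranscriptionJob:
    # Both fields are hypothetical; the real message format is not documented here.
    audio_key: str  # path or object-storage key of the uploaded audio
    email: str      # address that should receive the transcript


def encode_job(job: TranscriptionJob) -> bytes:
    """Serialize a job to the bytes that would be published to the queue."""
    return json.dumps(asdict(job)).encode("utf-8")


def decode_job(body: bytes) -> TranscriptionJob:
    """Rebuild a job from a message body taken off the queue."""
    return TranscriptionJob(**json.loads(body.decode("utf-8")))


def load_whisper_transcriber() -> Callable[[str], str]:
    """Return a function that maps a local audio path to its transcript."""
    # Imported here so the rest of the sketch runs without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")
    return lambda audio_path: asr(audio_path)["text"]


def drain(
    jobs: "queue.Queue[bytes]",
    transcribe: Callable[[str], str],
    deliver: Callable[[str, str], None],
) -> int:
    """Process every queued job once and return how many were handled."""
    handled = 0
    while True:
        try:
            body = jobs.get_nowait()
        except queue.Empty:
            return handled
        job = decode_job(body)
        # deliver() is a placeholder for whatever sends the email.
        deliver(job.email, transcribe(job.audio_key))
        handled += 1
```

In a worker process you would pass `load_whisper_transcriber()` as `transcribe`, and the audio would first need to be downloaded from object storage to a local path. Keeping the transcriber and the delivery step as plain callables also makes the worker easy to test with fakes.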
Challenges we ran into
I originally wanted to use OCI Object Storage and spent hours trying to configure it, but I couldn't get it working, so I went with Linode's solution instead. I also tried out a few other queuing solutions before settling on RabbitMQ.
Accomplishments that we're proud of
I'm proud of taking this from an idea to an actual product!
What we learned
I learned a lot about queuing systems and how to fit one into an overall architecture. I also learned how to use HuggingFace models, and I improved my Python skills.
What's next for WavScribe
I want to expand the features by allowing users to:
- create user profiles to store transcriptions
- share a link that allows anybody to view the clip along with timestamped transcriptions
- pay for longer transcriptions (compute power isn't cheap!)
