Inspiration

A few of my friends recently started podcasts and wanted an easy way to create captions. I was messing around with AI tools at the time and came upon OpenAI Whisper 😀

What it does

WavScribe makes audio transcription super accessible. You can get a written version of any audio you upload (restricted to around 1 minute for now) sent straight to your inbox thanks to Courier 👐

How we built it

For the front end, I used Next.js with Tailwind and daisyUI. For audio transcription, I used the HuggingFace Transformers pipeline with the openai/whisper-base model. I also used RabbitMQ as a queuing system. Everything is hosted on a machine on OCI, and I used Linode Object Storage to temporarily store audio files.
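Here is a rough sketch of how the queue and the transcription worker could fit together. It is not the actual WavScribe code: the queue name, the job fields (`audio_key`, `email`), and the function names are all made up for illustration, and it assumes the `pika` client for RabbitMQ and the `transformers` package are installed, with ffmpeg available for decoding audio files.

```python
import json
from dataclasses import dataclass, asdict

MODEL_ID = "openai/whisper-base"  # Whisper base checkpoint on the Hugging Face Hub
QUEUE_NAME = "transcriptions"     # hypothetical queue name


@dataclass
class TranscriptionJob:
    """Hypothetical job message: where the audio lives and who gets the result."""
    audio_key: str  # object-storage key of the uploaded clip (assumed field)
    email: str      # address the transcript is sent to (assumed field)


def encode_job(job: TranscriptionJob) -> bytes:
    """Serialize a job to JSON bytes for the message body."""
    return json.dumps(asdict(job)).encode("utf-8")


def decode_job(body: bytes) -> TranscriptionJob:
    """Rebuild a job from a message body."""
    return TranscriptionJob(**json.loads(body.decode("utf-8")))


def publish_job(job: TranscriptionJob, host: str = "localhost") -> None:
    """Put a job on the queue so the web request can return right away."""
    import pika  # imported lazily so the helpers above work without RabbitMQ

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_NAME, durable=True)
    channel.basic_publish(
        exchange="",
        routing_key=QUEUE_NAME,
        body=encode_job(job),
        properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    )
    connection.close()


def transcribe(audio_path: str) -> str:
    """Run Whisper on a local audio file and return the text."""
    from transformers import pipeline  # heavy import, only needed by the worker

    # A real worker would build the pipeline once and reuse it across jobs.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]
```

In a setup like this, the web app would call `publish_job` after saving the upload to object storage, and a separate worker process would read messages, call `decode_job` and `transcribe`, then email the result. Keeping the slow model call behind the queue means the upload request does not have to wait for it.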

Challenges we ran into

I originally wanted to use OCI Object Storage and spent hours trying to configure it, but I couldn't get it working, so I went with Linode's solution instead. I also tried out a few other queuing solutions before settling on RabbitMQ.

Accomplishments that we're proud of

I'm proud of taking this from an idea to an actual product!

What we learned

I learned a lot about queuing systems and how to fit one into the overall architecture. I also learned how to use HuggingFace models, and I improved my Python skills.

What's next for WavScribe

I want to expand the features by allowing users to:

  • create user profiles to store their transcriptions
  • share a link that lets anybody view the clip along with timestamped transcriptions
  • pay for longer transcriptions (compute power isn't cheap!)
