Inspiration
I was looking for an easy-to-use website that would let me analyze the transcripts of podcasts I love on YouTube. I wanted to do word-frequency analysis and visualize the output.
What it does
It downloads a YouTube video, converts the audio to WAV PCM, sends it to the Speech Transcriber endpoint, and displays the results in real time. It can also work from an uploaded audio file of a conversation, together with a short audio sample of each person present. Those samples are used to create a signature of each voice, which is then used to identify the speakers during transcription.
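The conversion step could look roughly like the sketch below. It is written in Python rather than the project's .NET code and assumes the ffmpeg command-line tool is installed; the function name, file names, and the 16 kHz default are illustrative assumptions, not taken from the project. The target format (mono, 16-bit PCM WAV) is chosen because that is, as far as I know, the default input format the Azure Speech SDK expects.

```python
import subprocess


def build_ffmpeg_command(source: str, target: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argument list that converts `source` to mono 16-bit PCM WAV.

    Hypothetical helper for illustration; only standard ffmpeg options are used.
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", source,              # input file in any format ffmpeg can decode
        "-vn",                     # ignore any video stream
        "-ac", "1",                # downmix to a single (mono) channel
        "-ar", str(sample_rate),   # resample to the requested rate
        "-c:a", "pcm_s16le",       # signed 16-bit little-endian PCM
        target,
    ]


def convert_to_wav(source: str, target: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(source, target), check=True)
```

Calling `convert_to_wav("podcast.m4a", "podcast.wav")` would then produce a file ready to send for transcription, assuming the download step has already saved `podcast.m4a` locally.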
How we built it
It was an iterative process. I built it in small chunks when I had time. Initially I thought it would be completed within weeks, but that was just me being naive.
Challenges we ran into
The Speech SDK was difficult to understand. I had trouble managing my time and remembering all the new technologies and terminology. Thankfully, Microsoft's documentation and quickstarts are very intuitive.
Accomplishments that we're proud of
I learned a LOT about Azure. I did not know anything about Azure or .NET before this project. It has taught me a lot about cloud services, SQL, RESTful APIs, ASP.NET Core, the MVC architecture, Entity Framework, design patterns, C# events, asynchronous programming, Kubernetes, GitHub Actions, unit tests, and the list goes on. I literally taught myself .NET by doing this project, which I'm very proud of. At the end of the day, I know that my project is very trivial, but after so many headaches it was so fulfilling to say I completed it.
What we learned
Lossy compression in audio file formats discards information, so good-quality audio cannot be recovered from a compressed file. YouTube serves stereo audio, and during transcription it came out unrecognizable, which was a bummer.
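One thing that might help with the stereo problem is downmixing to mono before transcription. This is an assumption on my part and was not tested in this project. The sketch below, in Python, averages the left and right samples of interleaved 16-bit PCM data; the function name is hypothetical, and it assumes a little-endian machine so that the native sample layout matches the WAV data.

```python
from array import array


def downmix_to_mono(stereo_pcm: bytes) -> bytes:
    """Average interleaved 16-bit left/right samples into a single mono channel."""
    samples = array("h")           # "h" = signed 16-bit integers, native byte order
    samples.frombytes(stereo_pcm)  # raises ValueError on a partial trailing sample
    if len(samples) % 2:
        raise ValueError("expected an even number of interleaved samples")
    # Each stereo frame is (left, right); the mean of the two stays within 16 bits.
    mono = array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )
    return mono.tobytes()
```

The result has half as many samples as the input and can be written out with a channel count of 1.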
What's next for Transcriber
A feature I'm excited to add is running the transcription in a background daemon, storing the transcription result as a JSON file on Azure Storage, and querying that data.
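A rough sketch of what storing and querying could look like, in Python and against a local file rather than Azure Storage. The segment layout (`speaker` and `text` keys) and both function names are assumptions made for illustration, not the project's actual schema.

```python
import json
import re
from collections import Counter
from typing import Optional


def save_transcript(segments: list, path: str) -> None:
    """Write transcript segments (dicts with 'speaker' and 'text') to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(segments, handle, ensure_ascii=False, indent=2)


def word_frequencies(path: str, speaker: Optional[str] = None) -> Counter:
    """Count words in a stored transcript, optionally for one speaker only."""
    with open(path, encoding="utf-8") as handle:
        segments = json.load(handle)
    counts = Counter()
    for segment in segments:
        if speaker is None or segment["speaker"] == speaker:
            # Lowercase the text and keep runs of letters and apostrophes as words.
            counts.update(re.findall(r"[a-z']+", segment["text"].lower()))
    return counts
```

`word_frequencies(path).most_common(10)` would then give the ten most frequent words, which is the kind of data the frequency visualization needs.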
