Inspiration

We realized that education is generally not designed with students who have hearing disabilities, or foreign students, in mind. We tried to kill two birds with one stone by implementing audio transcription and translation. We hope that this application will both break barriers and build bridges.

What it does

This application transcribes your audio and translates it into a language of your choice. If you are deaf or hard of hearing, or the speaker is using a language you don't understand, this application will take care of your needs.

How I built it

We used the google-cloud-speech library (the Google Cloud Speech-to-Text API) to transcribe audio to text, and then used the googletrans library to translate that text. We developed our GUI with the Gooey library.
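The transcribe-then-translate flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function names, the LINEAR16 encoding, the 16 kHz sample rate, and the default language codes are all assumptions made for the example. The googletrans call assumes a release with a synchronous `translate` method; some newer releases expose an async API instead.

```python
from typing import Callable, Tuple


def transcribe_with_google(audio_bytes: bytes, language_code: str = "en-US") -> str:
    """Hypothetical helper: send raw audio to Google Cloud Speech-to-Text."""
    # Imported lazily so the rest of the sketch runs without the library.
    from google.cloud import speech

    client = speech.SpeechClient()  # needs Google Cloud credentials
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,  # assumed format
        sample_rate_hertz=16000,  # assumed sample rate
        language_code=language_code,
    )
    audio = speech.RecognitionAudio(content=audio_bytes)
    response = client.recognize(config=config, audio=audio)
    # Keep the top alternative of each recognized segment.
    return " ".join(r.alternatives[0].transcript for r in response.results)


def translate_with_googletrans(text: str, dest: str) -> str:
    """Hypothetical helper: translate text with googletrans (sync API assumed)."""
    from googletrans import Translator

    return Translator().translate(text, dest=dest).text


def transcribe_and_translate(
    audio_bytes: bytes,
    dest: str,
    transcribe: Callable[[bytes], str] = transcribe_with_google,
    translate: Callable[[str, str], str] = translate_with_googletrans,
) -> Tuple[str, str]:
    """Return (transcript, translation); skip translation for empty transcripts."""
    text = transcribe(audio_bytes)
    if not text.strip():
        return "", ""
    return text, translate(text, dest)
```

Passing the two steps in as parameters keeps the pipeline easy to try out with stand-in functions before any cloud credentials are set up.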

Challenges I ran into

We got stuck trying to authenticate with Google Cloud, which took us a long time. Furthermore, for some of our members, it was their first time programming in Python. Another issue was that we could not figure out how to implement real-time translation in a GUI made with tkinter (we are fairly sure it required threading). So we switched to Gooey, a library that took care of that for us.
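For reference, the threading approach we suspected tkinter needed might look something like the sketch below. We did not build this; it is only an assumption about how it could work. A background thread does the slow work and puts results on a queue, and the GUI polls the queue with `after` so the window stays responsive. The names `start_worker`, `drain`, and `run_gui` are hypothetical.

```python
import queue
import threading


def start_worker(chunks, process, results):
    """Run process(chunk) for each chunk on a background thread."""
    def run():
        for chunk in chunks:
            results.put(process(chunk))
        results.put(None)  # sentinel: no more results are coming

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def drain(results):
    """Return everything currently on the queue without blocking."""
    items = []
    while True:
        try:
            items.append(results.get_nowait())
        except queue.Empty:
            return items


def run_gui(chunks, process):
    """Show each processed chunk in a label as it becomes available."""
    import tkinter as tk  # imported here so the helpers above work headless

    root = tk.Tk()
    label = tk.Label(root, text="", wraplength=400)
    label.pack()
    results = queue.Queue()
    start_worker(chunks, process, results)

    def poll():
        for item in drain(results):
            if item is None:
                return  # worker finished, so stop polling
            label.config(text=item)
        root.after(100, poll)  # check the queue again in 100 ms

    poll()
    root.mainloop()
```

Only the main thread touches the widgets here; the worker communicates through the queue alone.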

Accomplishments that I'm proud of

I am proud that our team finished the project. Going into it, we didn't even know where to start, and everything seemed so daunting. We looked at all sorts of libraries and ran into all sorts of challenges, and we began to get frustrated. However, we pulled through and were able to put together a nice-looking, functional project that met all of our specifications.

What I learned

Software development is more efficient when you plan it out and know what you need and how to get it. Furthermore, stand on the shoulders of giants: feel free to use libraries that do most of the "dirty" work for you. There's a reason they were made.

What's next for AudioTranslate

  • We want it to be on the Zoom Marketplace, and hopefully we can figure out a way to get output audio without needing to install a driver and change the sound settings.
  • Implement speaker diarization so we can separate our transcriptions when multiple people are talking.
  • We want to figure out how to make it available to everyone, without requiring a gcloud account. This will probably be done through OAuth authentication.
  • We want to see if we can output text directly in a Zoom video.
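As a rough idea of what the speaker diarization item above could involve: we assume the speech service would return each word together with a speaker tag (Google Cloud Speech-to-Text exposes a per-word `speaker_tag` when diarization is enabled). The sketch below only covers the grouping step, turning tagged words into per-speaker turns. The function name and the `(speaker, word)` input format are hypothetical.

```python
from typing import Iterable, List, Tuple


def group_by_speaker(words: Iterable[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Merge consecutive words from the same speaker into one turn."""
    turns: List[Tuple[int, str]] = []
    for speaker, word in words:
        if turns and turns[-1][0] == speaker:
            # Same speaker as the previous word: extend the current turn.
            turns[-1] = (speaker, turns[-1][1] + " " + word)
        else:
            # Speaker changed: start a new turn.
            turns.append((speaker, word))
    return turns
```

Each turn could then be translated and displayed on its own line, labeled with its speaker.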
