We live in a sound-driven tech era, and people who are deaf or hard of hearing (the Hard of Hearing community) face constant challenges in following what others are saying, whether during phone calls, Zoom/Teams/Slack meetings, YouTube videos, movies, or audio platforms like Spotify.
What it does
- To solve this problem for the Hard of Hearing community, we built a Windows application (Windows being the OS most of our target users run) that adds live subtitles in a bar near the taskbar for the sound of any app, an effective aid for users with hearing problems.
- It converts the audio of any application playing on Windows (YouTube, Netflix, Prime Video, Spotify, Zoom/Teams calls, any video or audio, etc.) into subtitles.
- Without disturbing the user's other apps, it displays subtitles near the Windows taskbar for any sound being played.
- It can display those subtitles in any language the user prefers.
How we built it
- We leverage the Azure Cognitive Services Speech-to-Text API to convert the audio of anything playing on the user's Windows machine (videos, music, meetings) into a written transcript.
- With DeepL Translate, we convert that transcript into the user's preferred language.
- We used Python to build the GUI app for Windows.
- With the GUI libraries PyQt5 and Tkinter, we built a user-friendly Windows app.
- The PyAudio library provides the functionality to listen to what is playing on the user's computer.
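The capture-and-transcribe pipeline above could be sketched roughly as follows. This is an illustrative sketch, not our actual code: it assumes the `azure-cognitiveservices-speech` and `pyaudio` packages, and note that plain PyAudio records the microphone, so capturing speaker output instead needs a WASAPI-loopback device (e.g. via the PyAudioWPatch fork).

```python
def run_transcription(azure_key, azure_region, on_text):
    """Stream PCM audio from the default input device into Azure
    Speech-to-Text and call on_text(text) for each finalized phrase."""
    import azure.cognitiveservices.speech as speechsdk
    import pyaudio

    speech_config = speechsdk.SpeechConfig(subscription=azure_key,
                                           region=azure_region)
    # Push raw audio into the SDK instead of letting it open a mic itself.
    push_stream = speechsdk.audio.PushAudioInputStream()
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                            audio_config=audio_config)
    recognizer.recognized.connect(lambda evt: on_text(evt.result.text))
    recognizer.start_continuous_recognition()

    pa = pyaudio.PyAudio()
    # 16 kHz, 16-bit mono PCM matches the SDK's default expected format.
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
                     input=True, frames_per_buffer=1600)
    try:
        while True:  # feed ~100 ms chunks until the caller interrupts
            push_stream.write(stream.read(1600, exception_on_overflow=False))
    finally:
        recognizer.stop_continuous_recognition()
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

Pushing audio through `PushAudioInputStream` is what lets the same recognizer handle any source, whether a meeting, a video, or music.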
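The translation step can be sketched with the official `deepl` Python client; the auth key and target-language code below are placeholders, not values from our project.

```python
def translate_transcript(text, auth_key, target_lang="ES"):
    """Translate one transcript line into the user's preferred language.
    Requires the `deepl` package and a valid DeepL API key."""
    import deepl
    translator = deepl.Translator(auth_key)
    result = translator.translate_text(text, target_lang=target_lang)
    return result.text
```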
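A minimal Tkinter sketch of the always-on-top subtitle bar near the taskbar might look like this (window size, font, and offsets are illustrative guesses, not the app's real layout):

```python
def make_subtitle_window(width=900, height=60, taskbar_offset=60):
    """Create a frameless, always-on-top window just above the taskbar.
    Returns (root, text_var); set text_var to change the subtitle shown."""
    import tkinter as tk
    root = tk.Tk()
    root.overrideredirect(True)        # no title bar, so other apps stay undisturbed
    root.attributes("-topmost", True)  # keep the bar above other windows
    x = (root.winfo_screenwidth() - width) // 2
    y = root.winfo_screenheight() - height - taskbar_offset
    root.geometry(f"{width}x{height}+{x}+{y}")
    text_var = tk.StringVar(value="Subtitles will appear here")
    tk.Label(root, textvariable=text_var, bg="black", fg="white",
             font=("Segoe UI", 16), wraplength=width - 20
             ).pack(fill="both", expand=True)
    return root, text_var
```

Because Tk is not thread-safe, a recognizer callback running on another thread should hand text to the GUI thread (e.g. via `root.after`) rather than calling `text_var.set` directly.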
How to use application
- Currently, as a proof of concept (POC), the app is started by running the app.py Python script.
- This opens a window with options such as saving the transcription, translating into another language of choice, and the number of words to display at once.
- Once the user starts subtitles, anything a sound app plays in the background is automatically subtitled near the taskbar.
- Clone the Git repo.
- Create a Python venv and install the packages from requirements.txt.
- Open configure.py and fill in your Azure Speech API key and DeepL API key.
- Run app.py.
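The "number of words to display at once" option described above could be implemented with a small pure helper like this; the function name and exact behavior are our illustrative sketch, not the actual code in the repo.

```python
def chunk_subtitle(text, max_words):
    """Split a transcript line into subtitle chunks of at most
    max_words words each, preserving word order."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# e.g. chunk_subtitle("closed captions help everyone follow along", 3)
#  -> ["closed captions help", "everyone follow along"]
```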
Challenges we ran into
- Initially, researching cloud Speech-to-Text services gave our team some trouble.
- We tried multiple cloud services for our idea and tested which suited our app's functionality best.
- Azure gave us the best Speech-to-Text accuracy; however, we could not find adequate documentation for making real-time asynchronous requests that listen to the microphone or to audio being played.
- Building the Tkinter and PyQt5 GUIs was a little difficult, as we were new to these libraries.
Accomplishments that we're proud of
- Impacting the Hard of Hearing community with a general-purpose solution they can use to the fullest.
- Learning about different cloud services and how to leverage them.
What we learned
- While researching cloud services, we learned about and tried Google's speech and translation services and AWS's speech and translation services.
- GUI libraries such as Tkinter and PyQt5.
- Audio capture with PyAudio.
- WebSocket programming.
What's next for SubstituteMe.ai - A Savior for Hard of Hearing Community
- We are currently running the application from source as a POC; next, we want to package it as an executable (binary) file that users can download and install on a Windows machine.
- Expand the same solution to macOS, Linux, and Android users.
- Make it scalable to handle many concurrent user requests.