Inspiration
Inspired by iOS's Translate app, we wanted to let people connect not just in person but over the phone as well. For this proof of concept we implemented the live-translation feature in a browser; the application could later be expanded into a mobile app.
In health care, this app could improve communication between providers and patients. It allows near real-time translation between users who do not share a spoken language, greatly enhancing a patient's or provider's ability to communicate their medical needs.
What it does
When a T-Mobile customer connects their T-Mobile ID to the YNA API, the browser can make and receive calls without a SIM. During an active call, Translate with T-Mobile displays a live transcript of the other person's voice, translated into your language of choice.
How we built it
We built it on top of the T-Mobile YNA API for phone calls and OpenAI for transcription. In the browser, we access the remote stream (the other end of the call) and record it in 100 ms intervals. We send these chunks to OpenAI's whisper-1 model for transcription and translation, and get back text in the desired language.
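The capture loop described above can be sketched as follows. This is a minimal illustration, not our actual project code: it assumes a browser environment with `MediaRecorder`, a remote `MediaStream` obtained from the YNA SDK, and a hypothetical `apiKey` variable; function names like `makeChunkBuffer` and `startTranscription` are ours for the example.

```javascript
// Accumulate 100 ms recorder chunks until enough audio is buffered to send.
// Pure helper so the batching logic is easy to reason about and test.
function makeChunkBuffer(flushMs, chunkMs) {
  const chunks = [];
  return {
    push(blobPart) {
      chunks.push(blobPart);
    },
    // True once the buffered chunks cover at least `flushMs` of audio.
    ready() {
      return chunks.length * chunkMs >= flushMs;
    },
    // Return the buffered chunks and reset the buffer.
    flush() {
      const out = chunks.slice();
      chunks.length = 0;
      return out;
    },
  };
}

// Browser-only sketch: record the remote stream and send batches to Whisper.
// Caveat: MediaRecorder chunks after the first lack the container header, so
// a real implementation must keep the header (e.g. restart the recorder per
// batch) before the blob is a valid standalone audio file.
function startTranscription(remoteStream, apiKey) {
  const buffer = makeChunkBuffer(20000, 100); // ~20 s batches of 100 ms chunks
  const recorder = new MediaRecorder(remoteStream, { mimeType: "audio/webm" });
  recorder.ondataavailable = async (event) => {
    buffer.push(event.data);
    if (!buffer.ready()) return;
    const blob = new Blob(buffer.flush(), { type: "audio/webm" });
    const form = new FormData();
    form.append("file", blob, "audio.webm");
    form.append("model", "whisper-1");
    const res = await fetch("https://api.openai.com/v1/audio/translations", {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    });
    const { text } = await res.json();
    console.log(text); // a real app would render this into the transcript UI
  };
  recorder.start(100); // fire a dataavailable event every 100 ms
}
```

`recorder.start(100)` is what produces the 100 ms intervals mentioned above; the buffer then decides when enough audio has accumulated to justify one API call.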
Challenges we ran into
Everyone on our team was learning JavaScript on the fly. This made it especially difficult to figure out how to use the MediaStream objects provided by the YNA JavaScript SDK that T-Mobile prepared for use with the API.
Another challenge we ran into was live transcription. Unfortunately, OpenAI allows us only 3 requests per minute, so our transcription updates every 20 seconds instead of more often.
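Staying under a 3-requests-per-minute cap amounts to never having more than 3 sends inside any 60-second window. Here is a minimal sketch of that bookkeeping; `makeRateLimiter` is an illustrative name, not part of our codebase or any SDK.

```javascript
// Sliding-window request spacer: given the timestamps of recent requests,
// compute how long to wait before the next one may be sent.
function makeRateLimiter(maxRequests, windowMs) {
  const sent = []; // timestamps (ms) of requests inside the current window
  return {
    // Milliseconds to wait at time `now` before sending is allowed (0 = go).
    delayUntilNext(now) {
      // Drop timestamps that have aged out of the window.
      while (sent.length > 0 && now - sent[0] >= windowMs) sent.shift();
      if (sent.length < maxRequests) return 0;
      return sent[0] + windowMs - now;
    },
    // Record that a request was sent at time `now`.
    record(now) {
      sent.push(now);
    },
  };
}
```

A caller would check `delayUntilNext(Date.now())` before each Whisper upload and, if it is nonzero, hold the audio batch in the buffer for that long; this is why our transcript can only refresh roughly every 20 seconds.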
Accomplishments that we're proud of
The hardest part was setting up the YNA API, and we are extremely proud of being able to make and receive calls over the internet. Working in real time is super cool. Understanding how to use the API and access the audio data was also rewarding; speaking into a phone and watching it translate into text was a first-time experience for all of us.
What we learned
We gained practical knowledge of JavaScript and what is possible with it. We also gained hands-on experience tinkering with a black box to see what works and what doesn't, as well as experience setting up and configuring APIs.
What's next for Translate with T-Mobile
The next step for Translate with T-Mobile would be to research more open solutions for transcription, since OpenAI's limits are currently very restrictive. Another option is running a model alongside the client code, which would improve latency but also increase the app's size.
Duplicating the functionality of the app on mobile devices is another direction we would like to explore.
We would also like to better understand how to process media in JavaScript so we can further optimize audio collection.
Built With
- css
- html
- javascript
- python
- t-mobile
- yna