Inspiration
The idea of PyLingo began with trying to help people with hearing disabilities communicate in the real world through video chats. The main initial purpose of PyLingo was to let people speak and have their speech converted to readable text for the hard of hearing. While developing the application, we decided to add multi-language support, which opened up a bigger and better use for PyLingo: businesses can use it to conduct international video conferences. With multi-language support, attendees can speak in different languages and PyLingo converts the captions into the language each individual reads. PyLingo will help businesses expand by making it easy to hire and communicate with people who have hearing disabilities or face language barriers.
What it does
PyLingo allows a user to communicate via any platform (FaceTime, Skype, etc.) and get real-time closed captioning on their conversations. PyLingo also lets users set the language they want their captions in. The reason for this is that, although English is a widely understood language, we want to be all-encompassing and able to support users who read in other languages.
How we built it
We used Python, Flask, the speech_recognition module (which wraps Google Speech along with multiple other speech-to-text APIs), the Google Cloud Translation API, Raspberry Pis, PyAudio, and more.
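To show how those pieces fit together, here is a hedged sketch of the captioning flow: transcribe a recorded WAV chunk, then translate the caption for each attendee. This is our reconstruction, not the exact hackathon code; function names like `transcribe_chunk` and `caption_for` are ours, and the third-party libraries are imported lazily since they need separate installation and Google Cloud credentials.

```python
def transcribe_chunk(wav_path: str) -> str:
    """Speech-to-text on one recorded WAV chunk via the Google Speech backend."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole chunk
    return recognizer.recognize_google(audio)

def translate_caption(text: str, target_language: str) -> str:
    """Translate a caption with the Google Cloud Translation API."""
    # Requires GOOGLE_APPLICATION_CREDENTIALS to be configured.
    from google.cloud import translate_v2 as translate  # third-party
    client = translate.Client()
    return client.translate(text, target_language=target_language)["translatedText"]

def caption_for(text: str, source_language: str, target_language: str) -> str:
    """Caption a transcript for one viewer, skipping the API call when the
    viewer already reads the speaker's language (saves quota and latency)."""
    if target_language == source_language:
        return text
    return translate_caption(text, target_language)
```

Skipping the translation call when source and target match is a small design choice we'd recommend for any per-attendee fan-out, since most attendees in a single-language meeting need no translation at all.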
Challenges we ran into
- Figuring out how best to parse text for speech-to-text recognition
- Delimiting WAV files with the Raspberry Pi and PyAudio (really hard)
- Getting information from the server to the client, as opposed to client to server
- Live / real-time updating
- Using Flask for the first time
- Figuring out what exactly counts as a conversation
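The WAV-delimiting problem comes down to the fact that a raw audio stream has no natural boundaries, so to feed a recognizer each cut must be a valid, self-contained WAV file with its own header. Below is a simplified, stdlib-only sketch of that idea using Python's `wave` module (the real pipeline used PyAudio on the Pi; the function name and the 16 kHz / 16-bit mono parameters are our assumptions).

```python
import io
import wave

RATE = 16000      # samples per second (assumed Pi microphone setting)
SAMPLE_WIDTH = 2  # 16-bit mono PCM

def split_into_wav_chunks(pcm: bytes, seconds_per_chunk: float = 2.0) -> list:
    """Cut raw PCM audio into stand-alone WAV files of fixed duration.

    Each chunk gets its own WAV header, so it can be handed to a
    speech-to-text API independently of the others.
    """
    frames_per_chunk = int(RATE * seconds_per_chunk)
    bytes_per_chunk = frames_per_chunk * SAMPLE_WIDTH
    chunks = []
    for start in range(0, len(pcm), bytes_per_chunk):
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(SAMPLE_WIDTH)
            wav.setframerate(RATE)
            wav.writeframes(pcm[start:start + bytes_per_chunk])
        chunks.append(buf.getvalue())
    return chunks

# Example: 5 seconds of silence becomes three chunks (2s + 2s + 1s).
silence = b"\x00" * (RATE * SAMPLE_WIDTH * 5)
chunks = split_into_wav_chunks(silence)
```

Fixed-duration cuts are the simplest policy; cutting on silence instead (so words aren't split mid-syllable) is what makes the real problem hard.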
Accomplishments that we're proud of
For two of us, it was our first hackathon, and we learned A LOT. We also survived on a laptop without a laptop charger.
What we learned
- Flask
- AJAX
- Bring a laptop charger next time
- Don't underestimate caffeine in candy form
- The Google Speech API
- The Google Cloud Translation API
- JSON
- Audio parsing
- Dealing with WAV files
What's next for PyLingo
- A mobile app with built-in video support, instead of relying on existing clients (or shipping as a plugin to those clients)
- Adding multi-language support on the speech-to-text side
- Possibly text-to-speech as well
- A formal algorithm for getting text from the Pi to the application, possibly using Markov models for even more accurate speech-to-text (due to less room for error)