Public speaking is one of the most valuable skills a person can have. Whether you are pitching at a hackathon or simply talking with friends, speaking clearly, showing passion, and modulating your voice are key to any great speech. To help people become better public speakers, we created Talky.
What it does
Talky helps you improve your speaking skills by giving you suggestions based on what you say to your phone. Once you finish presenting your speech to the app, an audio file of the speech is sent to a Flask server running on Heroku. The server analyzes the audio file by examining pauses, loudness, accuracy, and how fast you spoke. It also compares the results with your past data stored in Firebase, then returns an assessment of the speech. The app also has a community feature that lets users check out speeches shared by other people.
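To make the "how fast you spoke" measure concrete, here is a minimal sketch of one way to turn a transcript and an audio duration into a speaking rate. It is an illustration, not Talky's actual server code: the function names `words_per_minute` and `describe_pace` and the pace thresholds are assumptions made up for this example.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Return the speaking rate in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Splitting on whitespace is a rough but serviceable word count.
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds


def describe_pace(wpm: float, slow_below: float = 110.0, fast_above: float = 170.0) -> str:
    """Map a speaking rate to a short suggestion.

    The default thresholds are assumed values for illustration only.
    """
    if wpm < slow_below:
        return "try speaking a little faster"
    if wpm > fast_above:
        return "try slowing down"
    return "good pace"


if __name__ == "__main__":
    rate = words_per_minute("one two three four five six", 3.0)
    print(rate, describe_pace(rate))
```

In this sketch the transcript would come from the speech-to-text step and the duration from the audio file's length.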
How we built it
We used Firebase to store users’ speech data. Having past data lets the server run a comparative analysis and tell users whether they have improved.
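The comparative analysis could be as simple as comparing each metric of the new speech with the average of the user's earlier sessions. The sketch below assumes each session is stored as a dictionary of numeric metrics. The metric names (`wpm`, `pause_count`) and the function `compare_with_history` are hypothetical, and the Firebase read is left out so the example stays self-contained.

```python
from statistics import mean
from typing import Dict, List, Optional


def compare_with_history(
    current: Dict[str, float], history: List[Dict[str, float]]
) -> Dict[str, Optional[float]]:
    """Return, per metric, the current value minus the mean of past values.

    A metric with no past values maps to None, meaning there is nothing
    to compare against yet.
    """
    deltas: Dict[str, Optional[float]] = {}
    for name, value in current.items():
        past_values = [session[name] for session in history if name in session]
        deltas[name] = value - mean(past_values) if past_values else None
    return deltas


if __name__ == "__main__":
    # Hypothetical past sessions, as they might be loaded from storage.
    past = [{"wpm": 100.0, "pause_count": 8}, {"wpm": 120.0, "pause_count": 6}]
    latest = {"wpm": 130.0, "pause_count": 5}
    print(compare_with_history(latest, past))
```

A positive delta means the metric went up relative to the user's average. Whether that counts as an improvement depends on the metric, for example fewer long pauses is usually better while a higher speaking rate is not always.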
The Flask server uses several Python audio libraries to extract meaningful patterns: the SpeechRecognition library to extract the words, Pydub to detect silences, and SoundFile to find the length of the audio file.
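In the project, silence detection is handled by Pydub. To show the idea behind it, here is a plain-Python stand-in that does not use Pydub's API: it splits the signal into short windows, measures the loudness (RMS) of each window, and reports stretches of quiet windows that last long enough to count as a pause. The function name `find_pauses`, the window size, and the thresholds are all assumptions chosen for this sketch, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math
from typing import List, Sequence, Tuple


def find_pauses(
    samples: Sequence[float],
    sample_rate: int,
    window_ms: int = 50,        # assumed analysis window
    silence_rms: float = 0.02,  # assumed loudness threshold for "quiet"
    min_pause_ms: int = 300,    # assumed minimum length of a pause
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of each detected pause."""
    window = max(1, sample_rate * window_ms // 1000)
    pauses: List[Tuple[float, float]] = []
    pause_start = None  # sample index where the current quiet run began

    def close_run(end_index: int) -> None:
        # Keep the quiet run only if it is at least min_pause_ms long.
        if (end_index - pause_start) * 1000 >= min_pause_ms * sample_rate:
            pauses.append((pause_start / sample_rate, end_index / sample_rate))

    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        if rms < silence_rms:
            if pause_start is None:
                pause_start = start
        elif pause_start is not None:
            close_run(start)
            pause_start = None

    # Handle a quiet run that lasts until the end of the recording.
    if pause_start is not None:
        close_run(len(samples))
    return pauses


if __name__ == "__main__":
    rate = 1000  # a tiny synthetic signal: 1 s loud, 0.5 s silent, 1 s loud
    signal = [0.5] * 1000 + [0.0] * 500 + [0.5] * 1000
    print(find_pauses(signal, rate))
```

The number and total length of the pauses found this way can then feed into the feedback alongside the transcript and the audio length.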
On the iOS side, we used Alamofire to make the HTTP requests that send data to our server and retrieve the response.
Challenges we ran into
Everyone on our team was unfamiliar with the properties of audio, so discovering the nuances of wavelengths in particular, and the information they provide, was a challenging and integral part of our project.
Accomplishments that we're proud of
We successfully recognized speeches and extracted parameters from the sound file to perform the analysis. We gave users an interactive, bot-like UI. We connected the iOS app to the Flask server and kept the connection efficient.
What we learned
We learned how to upload audio files properly and process them using Python libraries. We learned to use Azure voice recognition to convert speech to text. We learned how to design a fluent UI using dynamic table views. We learned how to analyze audio files from different perspectives and give an overall judgment of the speech's performance.
What's next for Talky
We added the community functionality, but it is still basic. In the future, we can expand it and add more social features to the app. The current version also focuses only on audio files. A promising next step is to accept video files, which would enrich the library of posts and let us add video analysis.