We enjoy having wacky conversations over the phone with our friends, and there are times when they say something so hilarious that we want to capture it for posterity. We decided it would also be something that people of all ages could have fun with. This is how we came up with the idea for Friendboard.
What it does
Friendboard records a phone conversation between you and your friend(s) and splits it into smaller .wav files, each representing a sentence in the conversation. We assign each .wav file to its own button, labeled with the text of the speech, to create a soundboard that can be replayed as many times as you like!
How we built it
We developed it in Android Studio for smartphones running Android. We created a simple UI where the user enters a phone number and presses the "Call" button, which initiates the call using Android's default call application. In the background, we record the conversation into one big .wav file, which is then split into individual .wav files containing sentences from the call. With the help of Microsoft's Project Oxford Speech API, we were able to convert those .wav files to text, which would then be used as labels for the soundboard buttons.
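To illustrate the step of turning one slice of the big recording into its own .wav file, here is a minimal sketch. It assumes the recording is already available as 16-bit mono PCM samples; the class name `WavClipWriter` and the method `toWav` are hypothetical and only show the standard 44-byte RIFF/WAVE header, not necessarily how Friendboard writes its files.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: wraps a slice of PCM samples in a WAV container. */
final class WavClipWriter {

    /** Returns samples[from, to) as a complete 16-bit mono PCM .wav file image. */
    static byte[] toWav(short[] samples, int from, int to, int sampleRate) {
        int dataBytes = (to - from) * 2; // 2 bytes per 16-bit sample
        ByteBuffer buf = ByteBuffer.allocate(44 + dataBytes).order(ByteOrder.LITTLE_ENDIAN);

        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataBytes);       // size of everything after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));

        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                   // fmt chunk size for plain PCM
        buf.putShort((short) 1);          // audio format 1 = PCM
        buf.putShort((short) 1);          // one channel (mono, an assumption)
        buf.putInt(sampleRate);
        buf.putInt(sampleRate * 2);       // byte rate = rate * channels * bytes per sample
        buf.putShort((short) 2);          // block align = channels * bytes per sample
        buf.putShort((short) 16);         // bits per sample

        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataBytes);
        for (int i = from; i < to; i++) {
            buf.putShort(samples[i]);
        }
        return buf.array();
    }
}
```

The returned bytes can be written straight to a file, one per soundboard button. Stereo or 8-bit recordings would need different channel, block-align, and bit-depth values in the header.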
Challenges we ran into
One challenge we faced was finding an API that would take an audio file or sound clip and convert it to text. Two members of our team spent the majority of Friday night looking for a suitable speech-to-text API. Eventually we stumbled across Microsoft's Project Oxford Speech API, which was able to convert a sound clip to text.
Another challenge was finding a way to split the recorded conversation into individual sound clips. The Project Oxford Speech API only converts sound clips into text, and it enforces a maximum length of 2 minutes per sound clip. Because we also needed to store individual sound clips to play back to the user, we had to find a way to mark activity in the conversation and split it into separate .wav files. One or two members researched how to analyze sound clips and, with the help of the Java musicg library, implemented a custom function that analyzes a recorded conversation and outputs the spliced clips.
There was also a bug where the soundboard buttons, although they played the appropriate sound clips, showed generic labels (#1, #2, #3, etc.) instead of the text generated from the Project Oxford Speech API.
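The idea of marking activity and cutting at pauses can be sketched as follows. This is not the musicg-based function described above. It is a simpler stand-in that uses a root-mean-square amplitude threshold on 16-bit mono PCM instead of FFT analysis. The class `ActivitySplitter` and the parameters `frameSize`, `threshold`, and `minSilentFrames` are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: finds bursts of speech by per-frame loudness. */
final class ActivitySplitter {

    /** Returns {startSample, endSample} pairs (end exclusive) for each burst of activity. */
    static List<int[]> findSegments(short[] samples, int frameSize,
                                    double threshold, int minSilentFrames) {
        List<int[]> segments = new ArrayList<>();
        int start = -1;        // start sample of the open segment, or -1 if none is open
        int lastActiveEnd = 0; // end sample of the most recent loud frame
        int silentRun = 0;     // consecutive quiet frames since the last loud frame

        for (int pos = 0; pos < samples.length; pos += frameSize) {
            int end = Math.min(pos + frameSize, samples.length);
            if (rms(samples, pos, end) >= threshold) {
                if (start < 0) {
                    start = pos;   // a new burst of activity begins here
                }
                lastActiveEnd = end;
                silentRun = 0;
            } else if (start >= 0 && ++silentRun >= minSilentFrames) {
                // The pause is long enough: close the segment at the last loud frame.
                segments.add(new int[] {start, lastActiveEnd});
                start = -1;
            }
        }
        if (start >= 0) {
            segments.add(new int[] {start, lastActiveEnd}); // recording ended mid-speech
        }
        return segments;
    }

    /** Root-mean-square amplitude of samples[from, to). */
    static double rms(short[] samples, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / (to - from));
    }
}
```

Each returned pair marks one candidate clip. A segment longer than 120 times the sample rate would exceed the 2-minute limit and would need to be cut further before being sent for transcription.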
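The intended labeling can be sketched as below. This does not diagnose the bug. It only makes the fallback explicit, assuming the transcripts arrive as a list aligned with the clips and that a missing or blank transcript should produce the generic "#n" label. The class `ButtonLabels` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: chooses the text shown on each soundboard button. */
final class ButtonLabels {

    /** Uses each clip's transcript as its label, or "#n" (1-based) when none is available. */
    static List<String> build(List<String> transcripts) {
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < transcripts.size(); i++) {
            String text = transcripts.get(i);
            boolean missing = text == null || text.trim().isEmpty();
            labels.add(missing ? "#" + (i + 1) : text.trim());
        }
        return labels;
    }
}
```

With this shape, a button shows a generic label only when no transcript exists for its clip.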
Accomplishments that we're proud of
- We were able to record a phone conversation between friends into a .wav file.
- We were also able to splice it into individual sound clips containing mostly complete sentences by analyzing and marking the activity in the conversation using a Fast Fourier Transform library (musicg).
- We were able to send individual sound clips to Microsoft's Project Oxford Speech to Text API.
What we learned
- We learned a lot about Android Studio and would like to continue developing apps in the IDE.
- We also learned how to manipulate .wav files using external libraries and cut them into smaller .wav files, each carrying a sentence or phrase of the phone conversation.
- We learned how to work with complex APIs such as Microsoft's Project Oxford Speech to Text API and how to integrate their results into our app.
What's next for Friendboard
Friendboard still has some bugs that prevent it from reaching its full potential. 36 hours was not enough for the complexities that Friendboard involves, but we will continue to work on it after HackUCI.