Remembering the important conversations we have throughout the day with different people often becomes a challenge when building a social network. We realized that documenting and summarizing these discussions is extremely helpful for people who want to maintain the connections they've made. Over the years, we have found ourselves meeting people again only to forget what we talked about the last time. When we first devised this idea, we limited it to summarizing phone calls, video calls, and conferences. Later, we realized that an app like AutoNote would also be useful in the classroom or another academic setting, where it could condense lecture notes and other discussions into something we can quickly read over.
What it does
AutoNote takes notes on your conversations with others so that you can review the key points of an exchange later. It can also take notes on talks and speeches you attend. The result is a concise overview of what took place: instead of spending time working out the key points of a passage or conversation, you can review them quickly and be better prepared for the next one.
How we built it
AutoNote uses the IBM Watson Speech-to-Text API via Bluemix to convert what someone is saying into raw text that we can then analyze. Using Natural Language Understanding from Bluemix, we extract the different segments of each sentence and turn them into bullet-point notes with unnecessary or unimportant words removed. Finally, by grouping the attributes of each subject in a passage, we provide a cleaner and more organized summary for our app's users.
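The last two steps (condensing sentences into bullet points and grouping them by subject) can be sketched in plain Python. This is only an illustration under stated assumptions, not AutoNote's actual code: it assumes the transcript is already punctuated text, uses a small hand-written stopword list as a stand-in for the output of Natural Language Understanding, and crudely treats the first remaining word of each sentence as its subject. The names `STOPWORDS`, `to_bullet`, and `summarize` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stopword list: a stand-in for the "unimportant words"
# that a real NLU service would help identify.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "will",
             "to", "of", "and", "that", "it", "very", "really"}

def to_bullet(sentence):
    """Lowercase a sentence and return its non-stopword words in order."""
    words = re.findall(r"[A-Za-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]

def summarize(transcript):
    """Group condensed sentences by their first keyword (a crude 'subject')."""
    groups = defaultdict(list)
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", transcript.strip()):
        words = to_bullet(sentence)
        if len(words) < 2:
            continue  # nothing left to attribute to a subject
        subject, attributes = words[0], words[1:]
        groups[subject].append(" ".join(attributes))
    return dict(groups)

notes = summarize(
    "The deadline is Friday. The budget was approved. "
    "The deadline will move to Monday."
)
for subject, points in notes.items():
    print(subject)
    for point in points:
        print("  - " + point)
```

In this sketch both sentences about the deadline end up under one heading, which is the kind of grouping described above; a real implementation would rely on the subjects and keywords returned by the language service instead of word position.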
Challenges we ran into
We ran into a few challenges while using IBM's Speech-to-Text API, but eventually found other ways to get closer to our goal. One such challenge was punctuation, or rather the lack of it, in the converted text. Because the API did not provide periods or other punctuation, it was much harder to break up a transcript and search for the parts of a sentence: IBM Watson would analyze the whole paragraph at once. Furthermore, the speech-to-text API was sometimes slightly inaccurate in transcribing what people said in a dialogue.
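One possible way to work around missing punctuation, offered here only as an illustrative sketch, is to split the transcript wherever the speaker pauses. The example assumes that each recognized word comes with start and end times in seconds, represented as `(word, start, end)` tuples. The function name `split_on_pauses` and the 0.6-second threshold are hypothetical choices, not values taken from the project.

```python
def split_on_pauses(timed_words, min_pause=0.6):
    """Split (word, start, end) tuples into sentences at long silences."""
    sentences, current = [], []
    previous_end = None
    for word, start, end in timed_words:
        # A gap of at least min_pause seconds closes the current sentence.
        long_gap = previous_end is not None and start - previous_end >= min_pause
        if long_gap and current:
            sentences.append(" ".join(current))
            current = []
        current.append(word)
        previous_end = end
    if current:
        sentences.append(" ".join(current))
    return sentences

# Hypothetical word timings with one long pause after "friday".
timed = [("see", 0.0, 0.2), ("you", 0.25, 0.4), ("friday", 0.45, 0.9),
         ("bring", 1.8, 2.1), ("the", 2.15, 2.25), ("slides", 2.3, 2.8)]
print(split_on_pauses(timed))  # ['see you friday', 'bring the slides']
```

Each resulting chunk can then be analyzed separately instead of as one long paragraph. The threshold would need tuning per speaker, since people pause mid-sentence as well.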
Accomplishments that we're proud of
Still, we are happy with the results we achieved at MenloHacks: AutoNote can summarize and take notes on lectures and other first-person speeches. In a classroom, at a university, or in an office meeting, users can rely on their phone as a notetaker and scribe for what they are thinking or discussing at the time. Overall, we have come very close to our initial objective of providing users with recaps of events in their lives.
What we learned
We learned that IBM's Watson APIs are very helpful, but are not as accurate as training and using a neural network that takes full advantage of deep learning to summarize text. We also learned that we could create our own algorithm for summarizing texts and offer it as an open-source platform or a paid online service. Lastly, we were able to compare speech-to-text services and find the one that best fit our needs.
What's next for AutoNote
Currently, AutoNote is somewhat limited in how well it processes speech from people with different voices. Over time, we can improve the accuracy of its listening and of its overall performance. Moreover, we can distinguish between multiple voices and recognize who is speaking at any given time. This would make our platform even more useful for conference calls, since the app could keep a record of which responsibilities fall on which people in a meeting.
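As a rough sketch of what per-speaker record keeping could look like, the example below merges consecutive words from the same speaker into turns. It assumes that some speaker-identification step has already tagged each word with a speaker ID, given here as `(speaker, word)` pairs; the integer IDs and the function name `speaker_turns` are hypothetical.

```python
from itertools import groupby

def speaker_turns(labeled_words):
    """Merge consecutive (speaker, word) pairs into (speaker, utterance) turns."""
    return [
        (speaker, " ".join(word for _, word in run))
        for speaker, run in groupby(labeled_words, key=lambda pair: pair[0])
    ]

# Hypothetical labeled words from a two-person call.
words = [(0, "I'll"), (0, "send"), (0, "notes"), (1, "thanks"), (0, "sure")]
for speaker, utterance in speaker_turns(words):
    print(f"Speaker {speaker}: {utterance}")
```

With turns separated this way, each summary point could be attributed to the person who said it.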