Knowledge is about accumulation: not only learning new things, but also repeatedly revisiting what we have learned before. Time, however, is limited, while information is effectively infinite. How can we allocate our scarce time to learn actively and efficiently, especially when the source is a recording rather than text?
What it does
Recorsum turns a voice recording, either uploaded by the user or captured directly in the app, into a summary using Google Cloud products, and presents the categories, sentiment, and keywords of the voice message.
How we built it
We built Recorsum’s view controllers in Xcode using Swift, stored users’ information and recordings in the Firebase Database and Firebase Storage respectively, and implemented its core functions with Google Cloud APIs. We first convert the audio to text using Google’s Cloud Speech-to-Text API, then run the transcript through Google’s Natural Language API to detect its category, keywords, and sentiment. Because these two steps are written in Python rather than Swift, we deployed the scripts on a Google Cloud Platform Compute Engine VM instance and created a listener to invoke them whenever needed.
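As a rough illustration of the two cloud calls above, here is a standard-library-only sketch of the request bodies our Python scripts send to the Speech-to-Text and Natural Language REST endpoints. The `API_KEY` value and the exact fields are placeholders; our actual scripts may differ in detail.

```python
# Hedged sketch of the two Google Cloud REST calls, stdlib only.
# API_KEY is a placeholder; credentials are configured on the VM.
import base64
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder, not a real key

def speech_request_body(wav_bytes: bytes) -> dict:
    """JSON body for POST https://speech.googleapis.com/v1/speech:recognize."""
    return {
        "config": {"encoding": "LINEAR16", "languageCode": "en-US"},
        "audio": {"content": base64.b64encode(wav_bytes).decode("ascii")},
    }

def language_request_body(text: str) -> dict:
    """JSON body for POST https://language.googleapis.com/v1/documents:analyzeEntities
    (the analyzeSentiment and classifyText endpoints take the same document shape)."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }

def post(url: str, body: dict) -> dict:
    """POST a JSON body to a Google Cloud endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        url + "?key=" + API_KEY,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Chaining the two (`post(speech_url, speech_request_body(...))`, then feeding the transcript into `language_request_body`) is the essence of the pipeline.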
Challenges we ran into
We ran into plenty of problems along the way, since this is our first hackathon and we are all programming beginners. Two critical problems stood out. First, we had trouble converting recordings from M4A to WAV: Google Speech-to-Text only takes WAV input, while iPhone recordings are all saved as M4A. Simply renaming the file’s extension does not change its underlying format, so we tried many approaches before finding one that worked. Second, it is extremely hard to call Python from Swift. Our two main scripts are written in Python and cannot be invoked directly from our main Swift code; after many attempts we settled on a rather involved workaround, which we will walk through during our demo.
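The rename trap in the first problem is worth illustrating: changing `clip.m4a` to `clip.wav` only changes the name, not the container, so a real transcode is needed. One common way to do it, sketched below under the assumption that `ffmpeg` is installed, is to shell out to ffmpeg and produce the 16-bit linear PCM WAV that Speech-to-Text expects (this shows the general technique, not necessarily the exact method we shipped).

```python
# Sketch: transcode an iPhone M4A recording to a 16 kHz mono PCM WAV
# using ffmpeg (assumed to be installed on the machine running this).
import subprocess

def ffmpeg_command(src, dst):
    """Build the ffmpeg invocation for an M4A -> 16 kHz mono PCM WAV transcode."""
    return [
        "ffmpeg", "-y",          # overwrite an existing output file
        "-i", src,               # input M4A recording
        "-ar", "16000",          # resample to 16 kHz
        "-ac", "1",              # downmix to mono
        "-acodec", "pcm_s16le",  # 16-bit linear PCM (LINEAR16)
        dst,
    ]

def convert(src, dst):
    """Run the transcode, raising if ffmpeg exits with an error."""
    subprocess.run(ffmpeg_command(src, dst), check=True)
```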
Accomplishments that we're proud of
We are proudest of figuring out an alternative way of calling a Python script from Swift: we placed a listener in the cloud, and whenever the app needs the scripts, it calls them through that listener.
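The listener idea can be sketched as a tiny HTTP server on the Compute Engine VM that the Swift app POSTs a job to; the server runs the Python scripts and returns their result as JSON. Everything here is a hedged, minimal sketch: `run_pipeline` is a hypothetical stand-in for our real scripts, and port 8080 is an assumption.

```python
# Minimal sketch of the cloud "listener": a stdlib HTTP server the Swift
# app can call (e.g. via URLSession) instead of invoking Python directly.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_pipeline(payload):
    """Hypothetical stand-in for the real Speech-to-Text + Natural Language scripts."""
    return {"status": "ok", "received": payload.get("recording", "")}

class ListenerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON job body posted by the app.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = run_pipeline(payload)
        body = json.dumps(result).encode("utf-8")
        # Reply with the scripts' output as JSON.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ListenerHandler).serve_forever()
```

The design choice here is that the Swift code never embeds Python at all; it only speaks HTTP, which both platforms support natively.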
What we learned
We learned about a variety of Google Cloud products and how best to assemble and coordinate them to achieve what we want. We also learned how to work as a team and support each other.
What's next for Recorsum
Because of the limited time at the hackathon, we were only able to finish the recording-summarization part. In our ideal model, Recorsum will also accept a video and return its summary: users will simply paste a URL and receive the summaries. Furthermore, as we continue to develop the app, we want to add a feature that compares several sources and, based on their summaries, returns the most relevant resource according to the user’s input filters, to better serve our community.