During my early years studying abroad, I often carried an audio recorder to lectures so that I could review the content afterwards, as my English wasn't up to the challenge. Even so, I still had difficulty making out some of the less clear words in the recordings. With AI, perhaps we can transcribe an entire lecture into plain text, which is much easier to translate, annotate with key takeaways, and work with in general.

What it does

The goal is to use Azure's pre-trained speech-to-text models (and potentially others) to build a program that takes a meeting or lecture audio/video file as input and generates a TXT/DOCS/PDF file containing the key content of the meeting or lecture, along with extra references for the keywords from Wikipedia and other wiki-like sources.

How we built it

We combined keyword extraction methods such as RAKE, TopicRank, and YAKE! with Azure's Cognitive Services. The resulting program takes in audio files, converts them into text, extracts keywords from the transcript, and automatically scrapes a brief definition of each keyword from sources like Wikipedia.
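
To illustrate the keyword extraction step, here is a minimal sketch of RAKE-style scoring in plain Python. It is not the code from our project, which relied on existing library implementations. The tiny stopword list and the function names `candidate_phrases` and `rake_keywords` are assumptions made for this example. The idea is to split the transcript into candidate phrases at punctuation and stopwords, score each word by its degree divided by its frequency, and sum the word scores for each phrase.

```python
import re
from collections import defaultdict

# A deliberately tiny stopword list for illustration (an assumption);
# real RAKE implementations ship a much larger list.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "we",
    "with",
}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    # Punctuation always ends a candidate phrase.
    for fragment in re.split(r"[^\w\s'-]+", text.lower()):
        current = []
        for word in re.findall(r"\w+(?:['-]\w+)*", fragment):
            if word in STOPWORDS:
                # A stopword also ends the current candidate phrase.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_keywords(text, top_n=5):
    """Return up to top_n (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # total length of the phrases the word occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    # Phrase score = sum of degree/frequency over its words.
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    transcript = (
        "Gradient descent is an optimization algorithm. "
        "Gradient descent minimizes the loss function."
    )
    for phrase, score in rake_keywords(transcript):
        print(f"{score:5.1f}  {phrase}")
```

Each extracted phrase could then be used as the search term for the definition lookup. Because this scoring favors longer phrases, a real pipeline would likely cap phrase length or filter the candidates before looking them up.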

Challenges we ran into

Many of the libraries we used in this project had very immature documentation, so we spent a lot of time reading their source code to figure out how to use them. In addition, many of the dependencies of the NLP libraries we imported conflicted with Azure's default libraries. Resolving these conflicts was the single most time-consuming obstacle we had to deal with in this project.

Accomplishments that we're proud of

We came into this challenge with less than a month left before the deadline and without knowing how to use Azure at all. Even so, we learned to use Azure Machine Learning Studio and combined it with Azure's other services to enhance our project. To us, this was an accomplishment because it gave us a very general, yet important, insight into how cloud computing services work in practice.

What we learned

We sharpened our understanding of NLP, picked up a couple of heuristics for improving our results, and learned how to use cloud computing platforms such as Azure.

What's next for Meeting Note Generator - Minimal Viable Product

Below is a list of what we have planned for this project:

  • Try out more keyword extraction methods to improve the keyword highlight functionality.
  • Process video files.
  • Capture information from both video (such as lecture figures) and audio.
  • Identify the lecturer's or meeting holder's opinion on, or definition of, the subject (keyword).
  • Export the notes in a variety of formats: PDF, MS DOCS.
  • Build a web app to support the program.

Built With

  • azure
  • azure-cognitive-service
  • python
  • rake
  • speech-to-text
  • topicrank
  • yake