Summarai

Inspiration

COVID-19 affected everyone, and few things are harder than preparing for an exam from online materials, with long hours of lecture videos and no summary of the notes. It can be difficult to digest so much content in a limited amount of time, especially during midterms or the exam period.

Introducing Summarai! Summarai leverages machine learning and natural language processing to let students capture quick summaries of lecture notes from learning videos, improving their productivity and learning experience. We wanted to build something that supports student productivity, celebrates learning diversity, and helps students destroy their exams like an unbowed samurai!

What it does

Summarai is a tool that forges a magnificent katana from a YouTube video so that you can fight like a samurai and destroy your exam. With the power of computer vision and natural language processing, Summarai forges a katana (a PDF file) filled with fire and blood (a summary of the video's contents alongside screenshots from the video). It also generates a knowledge tree graph that gives a quick overview of the entire lecture video, and adds a list of recommended video links to the PDF.

How we built it

For Summarai, we separated the front end from the back end. Our front end was wireframed in Figma and developed with HTML, CSS, and JavaScript. We also built our own animations to give users a smoother, more comfortable experience. Our back end was written in Python with the FastAPI framework, which provides a RESTful API as the infrastructure for the app. User information and processed documents are stored in Firebase because it is resilient, convenient, and durable. The project runs on Kubernetes, and we use various open-source APIs to implement most of our features.

Summarai uses OpenCV to split each video into keyframes. We then use youtube-transcript-api to extract the video's subtitles and divide them into sections based on the starting time of each keyframe. For each summary, we use Google Cloud's Natural Language API to extract keywords from the audio transcript through entity sentiment analysis, and the Google Cloud Vision API to extract keywords and relevant text from each keyframe. Furthermore, we use Keras to build our own convolutional neural network, with multiple ReLU-activated layers, to label specific extracted keywords and highlight them in the summary of each section.
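The step that divides subtitles by keyframe start time could look roughly like the sketch below. It is an illustration, not our actual code: the function name `group_transcript_by_keyframe` is hypothetical, and it assumes each subtitle entry is a dict with `"text"` and `"start"` (seconds) keys, along with a list of keyframe start times in seconds.

```python
from bisect import bisect_right
from typing import Dict, List, Union

# Assumed entry shape: {"text": "...", "start": 12.3}
TranscriptEntry = Dict[str, Union[str, float]]


def group_transcript_by_keyframe(
    transcript: List[TranscriptEntry],
    keyframe_starts: List[float],
) -> List[str]:
    """Return one block of subtitle text per keyframe.

    Each subtitle is assigned to the latest keyframe that starts at or
    before the subtitle's own start time.
    """
    if not keyframe_starts:
        raise ValueError("at least one keyframe start time is required")

    starts = sorted(keyframe_starts)
    sections: List[List[str]] = [[] for _ in starts]

    for entry in transcript:
        # bisect_right counts the keyframes starting at or before this subtitle.
        index = bisect_right(starts, float(entry["start"])) - 1
        # Subtitles that precede the first keyframe fall into the first section.
        index = max(index, 0)
        sections[index].append(str(entry["text"]))

    return [" ".join(lines) for lines in sections]


if __name__ == "__main__":
    demo_transcript = [
        {"text": "Welcome to the lecture.", "start": 0.0},
        {"text": "Today we cover sorting.", "start": 4.5},
        {"text": "First, bubble sort.", "start": 10.0},
        {"text": "Next, merge sort.", "start": 25.2},
    ]
    for section in group_transcript_by_keyframe(demo_transcript, [0.0, 10.0, 25.0]):
        print(section)
```

Using a binary search keeps the lookup cheap even for long lectures with many keyframes, and each resulting text block can then be passed on to the keyword extraction step for that section.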

Challenges we ran into

While using Keras and a CNN model on each image to perform text recognition worked well, we struggled to get a good summary of each keyframe of the lecture video. We spent extra time working out how to combine entity sentiment analysis from the Google NLP API, for a better summary of the audio, with text recognition from the Google Cloud Vision API, to improve the summary of each lecture slide.

Accomplishments that we're proud of

We are extremely proud of our overall architecture and of how our machine learning components produce results that improve students' productivity. We are also proud of our front end, which is user-friendly and colorful, giving users a better learning experience while preparing for exams or studying in general.

What we learned

Our team learned how to use FastAPI and various open-source APIs. We also learned more about neural networks and Keras as a framework, and a lot about how to handle image processing and text recognition with the Cloud Vision API and OpenCV.

What's next for Summarai

We can add features such as a link for each keyword to give users a better understanding of the context. We also need to improve the performance of our neural network model so that we can reduce our dependence on the Google Cloud Vision API. In addition, we can add more components, using other cloud computing services, to improve overall throughput and reduce the latency of the training process.

Built With

cloud-vision-api
css3
fastapi
figma
firebase
google-cloud
html5
javascript
keras
kubernetes
neural-network
opencv
python

Submitted to

DeltaHacks 7

Created by

I worked on the machine learning and NLP components of this project, which included building the NN model and using the Google NLP APIs. I also did some setup work for the database and the container environment.

Yu Ang Zhang
I built the back end with FastAPI to handle both synchronous and asynchronous requests in Python. Various other open-source APIs, like OpenCV, pyPDF, and youtube-download, are also used to implement the back-end features.

Shawn Pang
I worked on the UI/UX and the front end.

Carol Chen