Inspiration

With the flurry of new technology and media tools, people have access to an enormous amount of news and information about the world. To help users keep up with current events, we built Squeeze, a web app that takes top stories from major news outlets, summarizes each article, and highlights its keywords, so users can scan more headlines and find what interests them.

What it does

Squeeze summarizes and highlights stories from top media outlets, producing a highlighted summary while keeping a link to the original source. It takes in RSS feeds from major news websites and scrapes the contents of their articles. This data is fed into machine learning algorithms that clean and summarize it. Keywords are chosen using the Google Cloud Natural Language API. On choosing a headline of interest, the user is presented with a highlighted text summary and easy-access links to the original source for further reading.
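To make the summarize-and-highlight step concrete, here is a minimal sketch in Python. It is not the Squeeze implementation: it swaps in a simple word-frequency extractive summarizer for the ML model, and picks keywords by frequency instead of calling the Google Cloud Natural Language API. The function names, the stopword list, and the `<mark>` highlighting are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger ones).
STOPWORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "for", "is", "are",
    "was", "were", "it", "that", "this", "with", "as", "by", "at", "from",
    "be", "has", "have",
}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = split_sentences(text)
    freqs = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)


def top_keywords(text, n=3):
    """Frequency-based stand-in for keyword selection."""
    return [word for word, _ in Counter(tokenize(text)).most_common(n)]


def highlight(summary, keywords):
    """Wrap each keyword occurrence in <mark> tags for display in the browser."""
    if not keywords:
        return summary
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", summary)
```

A caller would chain the three steps, for example `highlight(summarize(article), top_keywords(article))`, and send the resulting HTML fragment to the front end alongside the link to the original article.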

How we built it

We used the Python package Feedparser to parse the RSS feeds of news websites, and the package BeautifulSoup to scrape data from these feeds. Machine learning algorithms were used to clean and summarize the scraped HTML data. We chose keywords in each article using the Google Cloud Natural Language API, and built the custom search engine using the Google Custom Search API. Front-end work was done in JavaScript and HTML.

Challenges we ran into

We ran into a couple of challenges: 1) how to deploy the summarized text on Google Cloud, and 2) how to connect and visualize the scraped data through the search engine.

Accomplishments that we're proud of

We are proud that we were able to finish the ML algorithm for text summarization and work out how to find keywords in the text.

What we learned

We learned how to use ML to clean the HTML data, as well as how to find keywords in the text.

What's next for Squeeze

To be continued
