Have you ever felt overwhelmed by the millions of news stories published across news platforms every day? The world moves fast, and people want a quick, straightforward way to see the most important stories. Our project offers a fast way to find out what is happening around the world: a word cloud. This data visualization technique conveys the importance of news keywords by displaying them in different sizes. Instead of spending hours reading newspapers, users can see what is going on around the world at a glance. To get more value out of each article, we use the Google Cloud Natural Language API to categorize articles and analyze their sentiment. We also store this data in a Google Cloud SQL database so that users can set preferences, for example choosing which news topics they would like to see.

Key Features

In a Rush? A Word Cloud

The word cloud shows you the most important news at a glance. The larger a word appears, the more it is trending in the news.

Interested? Search or See Related News

Users can look up any word from the word cloud on Google News. A list of all the words in the word cloud is also shown, and each word leads to a compiled list of related articles.

Only Interested in Some News? Edit Preference

Users can set preferences to view only certain news sources or certain types of news. There are 11 news sources to choose from, and more than one can be selected (we will add more!). Sorting can be based on Keywords, High-Frequency Words, or Sentiment. Users can limit results to news from the last 24, 48, or 72 hours. The number of words listed can range from 0 to 60. There are also 25 news categories to choose from.
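As a rough illustration of how such preferences might be applied, the sketch below filters a list of article records by source, category, and age. The `Article` and `Preferences` types and all of their field names are hypothetical; they are not taken from the project's code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Article:
    # Hypothetical record shape; the field names are assumptions.
    title: str
    source: str
    category: str
    published: datetime


@dataclass
class Preferences:
    sources: set = field(default_factory=set)     # empty set means "all sources"
    categories: set = field(default_factory=set)  # empty set means "all categories"
    max_age_hours: int = 24                       # 24, 48, or 72


def filter_articles(articles, prefs, now):
    """Return the articles that match the user's preferences."""
    cutoff = now - timedelta(hours=prefs.max_age_hours)
    return [
        a for a in articles
        if (not prefs.sources or a.source in prefs.sources)
        and (not prefs.categories or a.category in prefs.categories)
        and a.published >= cutoff
    ]
```

Treating an empty selection as "no restriction" is one possible design choice; it keeps the default view unfiltered until the user picks something.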

Set Up and Run

Clone this repo and open mainPage.html in a web browser (Google Chrome is recommended).

Feel free to edit your preferences to select a subset of the news. Then click "See What Happened" to view a word cloud and a keyword list generated from recent trending news and your preferences. We also provide a list of news articles related to each keyword.

Technical Details

Web Scraper

The web scraper depends on the Python library Newspaper3k.
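A minimal sketch of scraping one article with Newspaper3k might look like the following. The `scrape_article` and `to_record` helpers and the record's field names are illustrative assumptions, not the project's actual scraper; only `Article`, `download()`, `parse()`, and the `title`, `text`, and `publish_date` attributes come from Newspaper3k.

```python
from datetime import datetime
from typing import Optional


def to_record(title: str, text: str, url: str, published: Optional[datetime]) -> dict:
    """Build a plain dict for one article (hypothetical field names)."""
    return {
        "title": title.strip(),
        "content": text.strip(),
        "url": url,
        # publish_date can be None when the page does not expose one.
        "date": published.isoformat() if published else None,
    }


def scrape_article(url: str) -> dict:
    """Download and parse one article with Newspaper3k."""
    # Imported here so to_record() works without the library installed.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetch the HTML
    article.parse()     # extract title, body text, and publish date
    return to_record(article.title, article.text, url, article.publish_date)
```

Calling `scrape_article` needs network access and `pip install newspaper3k`.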

Data Storage

We save the scraped data, such as news title, main content, URL, date, categories, and sentiment, to Google Cloud SQL.
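The project's actual schema is not shown here. As a sketch only, the block below stores the fields listed above in a single table, using Python's built-in `sqlite3` as a local stand-in for Cloud SQL. The table name, column names, and types are assumptions, and `INSERT OR IGNORE` is SQLite syntax that would need adapting for the database engine behind Cloud SQL.

```python
import sqlite3

# Hypothetical schema; one row per scraped article.
SCHEMA = """
CREATE TABLE IF NOT EXISTS news (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    content   TEXT NOT NULL,
    url       TEXT NOT NULL UNIQUE,  -- avoid storing the same article twice
    date      TEXT,                  -- ISO 8601 string
    category  TEXT,
    sentiment REAL                   -- sentiment score
)
"""


def save_article(conn, record):
    """Insert one article; a row whose URL is already stored is skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO news (title, content, url, date, category, sentiment) "
        "VALUES (:title, :content, :url, :date, :category, :sentiment)",
        record,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(SCHEMA)
```

The named placeholders (`:title`, `:url`, and so on) let the driver handle quoting, so scraped text is never spliced into the SQL string.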

Natural Language Processing

All natural language processing features in this project are implemented using the Google Natural Language API. We implemented word extraction and sentiment analysis.
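For illustration, a call to the Natural Language API through the `google-cloud-language` Python client could look roughly like this. It assumes that "word extraction" corresponds to the API's entity analysis, which the project does not confirm. The `analyze` and `label_sentiment` helpers and the 0.25 threshold are hypothetical.

```python
def label_sentiment(score: float, threshold: float = 0.25) -> str:
    """Map a score in [-1.0, 1.0] to a coarse label (threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def analyze(text: str) -> dict:
    """Return the sentiment and candidate keywords for one piece of text."""
    # Requires the google-cloud-language package and Google Cloud credentials.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = client.analyze_sentiment(request={"document": document}).document_sentiment
    entities = client.analyze_entities(request={"document": document}).entities
    return {
        "sentiment": sentiment.score,
        "label": label_sentiment(sentiment.score),
        # Salience estimates how central each entity is to the text.
        "keywords": [(e.name, e.salience) for e in entities],
    }
```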

Word Cloud

The word cloud is generated with the tag cloud chart of the AnyChart JavaScript library.
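As a sketch of how the back end might prepare data for such a chart, the block below counts words across a few made-up titles and emits JSON objects with `x` (the word) and `value` (its weight) keys, the shape used in AnyChart's tag cloud examples. The `word_cloud_data` function, the stop-word list, and the sample titles are all hypothetical, and plain word frequency is only one possible weighting.

```python
import json
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is"}


def word_cloud_data(titles, max_words=60):
    """Count words across titles and return the most frequent ones."""
    counts = Counter(
        word
        for title in titles
        for word in re.findall(r"[a-z']+", title.lower())
        if word not in STOP_WORDS
    )
    return [{"x": word, "value": n} for word, n in counts.most_common(max_words)]


# Made-up titles, only to show the output shape.
titles = ["Markets rally on trade deal", "Trade talks resume", "Markets slip"]
print(json.dumps(word_cloud_data(titles, max_words=2)))
```

Words with a larger `value` are drawn larger, which is how the chart signals what is trending.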

What We Learned

This was the first hackathon for all of us. Although we are new to coding, we enjoyed building a project that solves an everyday problem. In terms of coding skills, we learned how to use APIs, scrape the web, build a website with HTML and JavaScript, and store data in a database. It was a very comprehensive learning experience, and we got the chance to explore many coding skills that we had never studied before.

What We Didn't Get to Work

  1. Exchanging data between the front end and the back end was difficult because we were not familiar with JSON and jQuery.
  2. Some of the features mentioned above are only partially implemented.

