With so much news out there, it has become much harder to filter it and get a good overview of the content you're consuming. We built Newstream with the events surrounding President Trump in particular in mind: we wanted a better way to visualize the data from news articles and how trends around those topics change over time. We wanted the information to be displayed clearly and concisely, unlike many other news aggregators, which tend to overwhelm users with information.
What it does
Newstream aggregates news articles and performs sentiment analysis on them, displaying a sentiment score, consensus (negative to positive), sentiment spread, and sentiment changes over time in a user-friendly, minimalistic interface. Each individual article is also given its own sentiment value.
How we built it
For the front end, we used HTML and CSS and kept it clean, functional, and simple. For the back end, we used Python/Flask and built modules that process the news articles and compute sentiment both for each article and across all articles. We used the News API and Twitter API to gather news articles and tweets related to the search query, the Indico API for sentiment analysis, and Plotly to generate graphs of sentiment values over time.
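As a rough illustration of the "across all articles" step, here is a minimal sketch of how an overall score, consensus, and spread could be derived from per-article values. It assumes each sentiment value is a float between 0 (negative) and 1 (positive); the function name `summarize_sentiment`, the 0.4/0.6 consensus thresholds, and the use of standard deviation for spread are illustrative assumptions, not details taken from the project.

```python
from statistics import mean, pstdev


def summarize_sentiment(scores):
    """Summarize per-article sentiment scores.

    Assumes each score is a float in [0, 1], where 0 is most negative
    and 1 is most positive. Thresholds below are hypothetical.
    """
    if not scores:
        return None  # nothing to summarize for an empty result set

    average = mean(scores)

    # Map the average onto a coarse negative-to-positive label.
    if average < 0.4:
        consensus = "negative"
    elif average > 0.6:
        consensus = "positive"
    else:
        consensus = "neutral"

    return {
        "score": average,
        "consensus": consensus,
        # Population standard deviation as one possible measure of spread.
        "spread": pstdev(scores),
    }


if __name__ == "__main__":
    print(summarize_sentiment([0.2, 0.4, 0.9]))
```

Standard deviation is only one way to express spread; the gap between the highest and lowest score would work too and may be easier to explain to users.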
Challenges we ran into
Our main problem was limiting our use of Indico for sentiment analysis: not only were the API calls time-consuming, but we also had a limited number of them, which we exhausted over the course of the hackathon. Another major challenge was getting sentiment values over time for our graph; we struggled to find a method for tracking this metric.
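One common way to stretch a limited API quota is to avoid scoring the same text twice. The sketch below shows that idea with a small in-memory cache. It is not what the project necessarily did: `make_cached_scorer` is a hypothetical helper, and `fake_score` is a stand-in for whatever function actually calls the sentiment service.

```python
def make_cached_scorer(score_fn):
    """Wrap a scoring function so each distinct text is scored only once.

    `score_fn` is any callable that takes a string and returns a score;
    in a real app it would be the (slow, rate-limited) remote call.
    """
    cache = {}

    def cached_score(text):
        if text not in cache:
            cache[text] = score_fn(text)  # only hit the API on a cache miss
        return cache[text]

    return cached_score


if __name__ == "__main__":
    calls = []

    def fake_score(text):
        # Hypothetical stand-in for a remote sentiment call.
        calls.append(text)
        return 0.5

    score = make_cached_scorer(fake_score)
    score("Headline A")
    score("Headline A")  # served from the cache, no second call
    score("Headline B")
    print(len(calls))
```

The cache here lives only as long as the process does, which suits a hackathon demo; anything longer-lived would need persistent storage.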
Accomplishments that we're proud of
We successfully overcame the challenge of getting sentiment values over time. Our initial idea was to save these values in a database, but the main problem was that we would have had to make an entry for every query. We decided that this was too time-consuming, given the conflicting cases we would have had to handle. Instead, we decided to calculate the values on the spot, which reduced loading times and worked accurately.
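Calculating sentiment over time on the spot can be sketched as grouping the fetched articles by publication day and averaging each group. This is only one plausible reading of the approach: the function name `sentiment_over_time` and the `published_at` and `sentiment` field names are assumptions made for this example, and timestamps are assumed to be ISO 8601 strings without a timezone suffix.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def sentiment_over_time(articles):
    """Return (day, average sentiment) pairs sorted by day.

    Each article is assumed to be a dict with hypothetical keys:
      "published_at": ISO 8601 timestamp, e.g. "2017-02-18T09:30:00"
      "sentiment":    float score for that article
    """
    by_day = defaultdict(list)
    for article in articles:
        day = datetime.fromisoformat(article["published_at"]).date()
        by_day[day].append(article["sentiment"])

    # Sorting by date gives points in the order a time-series graph expects.
    return [(day.isoformat(), mean(values)) for day, values in sorted(by_day.items())]


if __name__ == "__main__":
    sample = [
        {"published_at": "2017-02-18T09:30:00", "sentiment": 0.125},
        {"published_at": "2017-02-17T14:00:00", "sentiment": 0.25},
        {"published_at": "2017-02-17T20:15:00", "sentiment": 0.75},
    ]
    for day, value in sentiment_over_time(sample):
        print(day, value)
```

Because everything is derived from the articles returned for the current query, nothing has to be stored per query; the resulting pairs can be passed directly to a plotting library as x and y values.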
What we learned
Although the News API is a quick and easy way to grab news based on a search query, it is not very accurate or thorough: it would only load news articles from roughly today or yesterday. For something like news aggregation, we realized that results would be much more accurate if our program worked directly with the APIs of many news sources.
What's next for Newstream
Since calling Indico every time we needed to calculate sentiment was very time-consuming, we think it would be much better to handle sentiment analysis ourselves, for example by familiarizing ourselves with the nltk library. This would require us to learn about natural language processing, but we think it would be a great learning opportunity.