We were excited by our challenge's clear but broad social good remit. Looking at some of the biggest problems threatening the world right now, we realised they all have one thing in common: effective engagement with the public is a crucial step to solving them.
In our experience, scientists often do not appreciate how negative stories develop and spread, even when those stories end up costing them grants or harming their reputation. We wanted to use this hackathon as an opportunity to address that problem.
What it does
Gorgias combines the Azure Cognitive Services APIs with several web-scraping and news APIs to identify and visualise the most common themes of discussion around a scientific theory, individual, or institution. It also provides insight into which themes are starting to trend now.
How we built it
The project takes a key term or phrase and scrapes a long list of related words and concepts from https://relatedwords.org/relatedto/. These terms are then used to query Google News for relevant articles, which are scraped into memory.
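As a rough illustration of the querying step, the sketch below turns a key term and its related words into news search URLs and pulls titles and links out of an RSS response. It assumes the Google News RSS search endpoint is the entry point, which may not be what the project actually used. The function names `build_news_queries` and `parse_rss_items` and the `max_terms` parameter are hypothetical, and the related words are taken as an already-scraped list.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: articles are discovered through the Google News RSS search feed.
GOOGLE_NEWS_RSS = "https://news.google.com/rss/search"


def build_news_queries(key_term, related_terms, max_terms=5):
    """Build one search URL per related term, each paired with the key term."""
    urls = []
    for term in related_terms[:max_terms]:
        # Quote the key term so it is matched as a phrase.
        query = f'"{key_term}" {term}'
        urls.append(f"{GOOGLE_NEWS_RSS}?{urlencode({'q': query})}")
    return urls


def parse_rss_items(rss_xml):
    """Extract the title and link of every <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]
```

Each URL could then be fetched with any HTTP client and the response body handed to `parse_rss_items`. Capping the number of related terms keeps the number of requests, and therefore the scraping time, bounded.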
We use Microsoft Azure TextAnalytics to identify the following:
- The general sentiment on a topic (negative, positive, or neutral).
- Key entities such as people and locations.
- Key phrases that capture the general gist of a subject. These are then also tracked through Google search trends.
- The articles most relevant to the topic.
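The first three items in the list above map onto three calls in the `azure-ai-textanalytics` Python SDK. The sketch below is one way to aggregate their results across a batch of article texts. It assumes that SDK is the client in use, which the write-up does not state. `make_client` and `summarise` are hypothetical names, and the endpoint and key are placeholders to be supplied by the caller.

```python
from collections import Counter


def make_client(endpoint, key):
    """Create a Text Analytics client (assumes azure-ai-textanalytics is installed)."""
    # Imported lazily so summarise() can be tried without the SDK present.
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def summarise(client, texts):
    """Aggregate sentiment, entities, and key phrases over a list of article texts."""
    sentiments, entities, phrases = Counter(), Counter(), Counter()

    # Document-level sentiment label for each article.
    for doc in client.analyze_sentiment(texts):
        if not doc.is_error:
            sentiments[doc.sentiment] += 1

    # Named entities such as people, locations, and organisations.
    for doc in client.recognize_entities(texts):
        if not doc.is_error:
            entities.update((e.text, e.category) for e in doc.entities)

    # Key phrases, lower-cased so differently capitalised variants are counted together.
    for doc in client.extract_key_phrases(texts):
        if not doc.is_error:
            phrases.update(p.lower() for p in doc.key_phrases)

    return {
        "sentiment": dict(sentiments),
        "entities": entities.most_common(5),
        "key_phrases": phrases.most_common(5),
    }
```

The service caps how many documents a single request may carry, so a real run would need to split `texts` into smaller batches before calling `summarise`.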
Challenges we ran into
Our main challenge was the performance of the application. We are limited by the time it takes to scrape the relevant articles and by the rate limit on requests to Azure. Smarter caching could ease both constraints.
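One possible shape for such caching is sketched below: a small wrapper that remembers earlier results and spaces out the calls it does have to make. It was not part of the hackathon build. The class name `CachedThrottledCall` and the `min_interval` parameter are hypothetical, and the cache is held in memory only, so it would not survive a restart.

```python
import time


class CachedThrottledCall:
    """Wrap a slow or rate-limited function with a cache and a minimum call spacing."""

    def __init__(self, func, min_interval=0.0):
        self.func = func                  # e.g. an article fetcher or an Azure request
        self.min_interval = min_interval  # minimum seconds between real calls
        self._cache = {}
        self._last_call = None

    def __call__(self, key):
        # Serve repeated keys from memory without touching the network.
        if key in self._cache:
            return self._cache[key]

        # Sleep just long enough to respect the minimum spacing.
        if self._last_call is not None:
            wait = self.min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)

        result = self.func(key)
        self._last_call = time.monotonic()
        self._cache[key] = result
        return result
```

Wrapping the article fetcher this way means a URL that several related terms return is only downloaded once, and the same wrapper with a non-zero `min_interval` could keep Azure requests under the service's rate limit.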
Accomplishments that we're proud of
- Getting a basic version of the application up and running
- Managing to build a compelling visual interface
What we learned
- Basic natural language processing