Inspiration
I was trying to think of ways of improving the overall experience of reading news articles online.
What it does
GTFO highlights objective statements on web pages. This allows the user to quickly scan through an article and identify key facts.
How I built it
First, I used a data set of objective and subjective statements, together with natural language processing (NLP), to train a classifier. Second, I built an API with the Python Flask web framework to host the classifier. Finally, I created the Chrome extension, which communicates with the server and highlights the relevant sections of HTML.
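To make the classifier and API steps above more concrete, here is a minimal sketch rather than the project's actual code. It assumes a bag-of-words Naive Bayes model (the write-up does not say which model was used), a tiny made-up training set, and a hypothetical `/classify` route that accepts a JSON list of sentences. Flask is imported lazily, so the classifier part runs without it.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word and number tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up examples standing in for the real objective/subjective data set.
TRAIN = [
    ("The city council approved the budget on Tuesday.", "objective"),
    ("The company reported revenue of 4 billion dollars.", "objective"),
    ("The bridge opened in 1932 and spans 500 meters.", "objective"),
    ("I think this is the worst decision ever.", "subjective"),
    ("The movie was absolutely wonderful and amazing.", "subjective"),
    ("This policy feels terrible and unfair.", "subjective"),
]

model = NaiveBayes().fit(TRAIN)


def create_app(classifier):
    """Wrap the classifier in a Flask app; the route name is an assumption."""
    from flask import Flask, jsonify, request  # requires Flask to be installed

    app = Flask(__name__)

    @app.route("/classify", methods=["POST"])
    def classify():
        sentences = request.get_json(force=True).get("sentences", [])
        return jsonify(labels=[classifier.predict(s) for s in sentences])

    return app
```

With Flask installed, `create_app(model).run()` would start a local development server that the extension could call. A real deployment would need far more training data and a held-out evaluation set.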
Challenges I ran into
I found the Chrome extension a little frustrating because, for valid security reasons, the script injected into the actual web page (the "content script") cannot freely make cross-origin HTTP requests. To solve that, I needed to create a "background script" that communicates with both the web page and the API I created.
Another challenge, which still needs a better solution, is deciding which text on the web page to classify. Right now the extension looks at every 'p' tag on the page. However, web pages vary widely in structure, and this approach does not yield good results on every page. A related issue is that paragraphs often contain other HTML elements such as 'a' and 'strong'. Because a sentence can be split across several elements, it is hard to match it back to the page and determine what to highlight, even when the sentence has been classified correctly.
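One possible way to handle sentences that span inline elements is to flatten each paragraph into its text nodes, search for the sentence in the joined text, and then map the match back to per-node offsets. The sketch below illustrates that idea in Python with the standard `html.parser` module. It is not the extension's code, which would do the equivalent in the browser against the live DOM, and the names `ParagraphCollector` and `locate` are hypothetical.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text nodes of each <p>, ignoring inline tags inside it."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # one list of text-node strings per <p>
        self._current = None   # text nodes of the <p> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.paragraphs.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def locate(chunks, sentence):
    """Return (chunk_index, start, end) spans covering `sentence`.

    The sentence is searched for in the concatenated text, and the match
    is then split along the original text-node boundaries.
    """
    flat = "".join(chunks)
    start = flat.find(sentence)
    if start == -1:
        return []
    end = start + len(sentence)
    spans, offset = [], 0
    for i, chunk in enumerate(chunks):
        lo = max(start, offset)
        hi = min(end, offset + len(chunk))
        if lo < hi:  # this text node overlaps the sentence
            spans.append((i, lo - offset, hi - offset))
        offset += len(chunk)
    return spans


html = ('<p>The bridge opened in <strong>1932</strong>. '
        'It is <a href="#">beautiful</a>.</p>')
collector = ParagraphCollector()
collector.feed(html)
collector.close()

chunks = collector.paragraphs[0]
spans = locate(chunks, "The bridge opened in 1932.")
```

Here the objective sentence touches three text nodes, so highlighting it means wrapping three separate ranges rather than one. This sketch only finds the first occurrence and assumes the classifier returns the sentence text unchanged.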
Accomplishments that I'm proud of
This was my first time creating a sentence classification model for real-world data.
What I learned
Throughout the project I learned many of the nuances of creating Chrome extensions and of implementing NLP models.
What's next for Get The Facts Out (GTFO)
I want to add a new method of identifying important text on the page and to highlight text more reliably. I also want to add a feedback feature that lets the user rate the importance of facts or report that the model was entirely wrong.