Almost all discourse is carried out online: people comment, tweet, retweet, respond, repost, and summarize.

Many of our online communities are made better when people engage in a good discussion - one that is well informed and based on facts.

How many times have you been on a page regarding a topic - reddit, tumblr,

Well Versed is intended for those times online when you want to learn about topics related to what you're currently viewing, and have a more "Well Versed" browsing experience.

What it does

Well Versed is a similarity engine written in JavaScript and packaged as a Chrome extension. It uses the Azure Machine Learning APIs to extract relevant keywords from the current page, and the Bing News API to pull related news articles from around the web. For context on the topic, it parses Wikipedia and returns a brief summary. Each topic in the list then gets an image from the Bing Search API, which is shown as part of the summary as well.
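The flow described above can be sketched roughly as follows. This is only an illustration, not the extension's actual code: `buildTopicCards` and every function on the `sources` object are hypothetical stand-ins for the real keyword, news, Wikipedia, and image API clients, and the limit of three topics is taken from the window described later in this write-up.

```javascript
// Sketch of the lookup pipeline. Every function on `sources` is a
// hypothetical stand-in for a real API client (keywords, news, Wikipedia
// summary, images); none of these names come from the actual extension.
async function buildTopicCards(sources, pageText, maxTopics = 3) {
  // Step 1: extract keywords from the text of the current page.
  const keywords = await sources.extractKeywords(pageText);

  // Step 2: for each of the first few keywords, fetch related news,
  // a short summary, and an image in parallel.
  const topics = keywords.slice(0, maxTopics);
  return Promise.all(
    topics.map(async (topic) => {
      const [articles, summary, imageUrl] = await Promise.all([
        sources.fetchNews(topic),
        sources.fetchSummary(topic),
        sources.fetchImage(topic),
      ]);
      return { topic, articles, summary, imageUrl };
    })
  );
}
```

Passing the API clients in as an object keeps the sketch independent of any particular service and makes it easy to try out with stub functions.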

We hosted the service on Node.js running on Ubuntu in Azure, provisioned with plenty of capacity and memory.

The moment the user goes to a new page, our extension starts prefetching information about the page.
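One way this kind of prefetching could work is to cache one lookup per URL, so the results are already in flight by the time the user opens the window. The sketch below assumes a hypothetical `fetchTopics(url)` function that asks the backend for related topics; `createPrefetcher` is likewise an illustrative name, not part of the extension.

```javascript
// Sketch of per-URL prefetching. `fetchTopics` is a hypothetical async
// function that asks the backend for topics related to a page URL.
function createPrefetcher(fetchTopics) {
  const cache = new Map(); // url -> Promise resolving to the topics

  return {
    // Start the lookup as soon as a page is seen; reuse an in-flight or
    // finished lookup if the same URL comes up again.
    prefetch(url) {
      if (!cache.has(url)) {
        const pending = Promise.resolve(fetchTopics(url));
        // Drop failed lookups so a later visit can retry.
        pending.catch(() => cache.delete(url));
        cache.set(url, pending);
      }
      return cache.get(url);
    },
  };
}
```

In a Chrome extension, `prefetch` could plausibly be called from a `chrome.tabs.onUpdated` listener once a tab finishes loading, and again when the ticker is clicked, at which point the cached promise is usually already resolved.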

The interface comes in the form of a ticker on the right-hand side of the browser, which opens a small window when clicked.

The window contains three relevant topics or articles, as well as an image. We believe a minimal, non-intrusive design is the best way to assist the reader without distracting from the main content.

Challenges we ran into

The first challenge we faced was a lack of data. We initially intended to build an autocomplete system for commenting, based on machine learning trained on the Reddit comment database.

However, we later realized that the Reddit comment database is over 250GB of compressed data and is only available via a torrent magnet link. We could not acquire it without at least a day of downloading.

So we had to shift our project and make it more general, drawing on more tools from around the web rather than just Reddit. We finally settled on a set of news, image, and encyclopedic content APIs to fetch the content that feeds our application.

Accomplishments we are proud of

We are proud of extracting insight from the machine learning APIs to produce information that is tailored to the user. This high-level abstraction saved us from having to do the tokenization ourselves, which freed us to build a much larger set of features.

What we learned

We learned how to integrate the Azure and Indico platforms for natural language processing. We also picked up a range of other skills: one member took his first look at building UIs with Material UI and React.

What's next for Well-Versed

Well-Versed could draw on even more streams of information. Going forward, our application could begin to gauge articles by their bias and present a spectrum of views.
