Well-Versed

Inspiration

Almost all discourse is carried out online: people comment, tweet, retweet, respond, repost, and summarize.

Many of our online communities are made better when people engage in good discussion, the kind that is well informed and based on facts.

How many times have you been on a page regarding a topic - reddit, tumblr,

Well Versed is intended for those times online when you want to learn about topics related to what you're currently viewing, and to have a more "Well Versed" browsing experience.

What it does

Well Versed is a similarity engine written in JavaScript and packaged as a Chrome extension. It uses the Azure Machine Learning APIs to extract relevant keywords from the current page, and the Bing News API to pull related news articles from around the web. For context on the topic, it parses Wikipedia and returns a brief summary. Each entry in the topic list is then given an image from the Bing Search API, which also appears as part of the summary.
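The flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual backend code: the function name `buildTopicCards` and the four injected service functions (`extractKeywords`, `searchNews`, `fetchSummary`, `searchImage`) are hypothetical stand-ins for the Azure, Bing, and Wikipedia calls, and the default of three topics simply mirrors what the popup window shows.

```javascript
// Hypothetical sketch: every service function is injected by the caller,
// so nothing here is the real Azure, Bing, or Wikipedia client.
async function buildTopicCards(pageText, services, maxTopics = 3) {
  const { extractKeywords, searchNews, fetchSummary, searchImage } = services;

  // Step 1: ask the keyword service for topics and keep only the first few.
  const keywords = (await extractKeywords(pageText)).slice(0, maxTopics);

  // Step 2: for each topic, fetch news, summary, and image concurrently.
  return Promise.all(
    keywords.map(async (topic) => {
      const [articles, summary, imageUrl] = await Promise.all([
        searchNews(topic),
        fetchSummary(topic),
        searchImage(topic),
      ]);
      return { topic, articles, summary, imageUrl };
    })
  );
}
```

Passing the services in as plain functions keeps the sketch independent of any particular API client, and makes it easy to try out with stubbed responses.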

We hosted the service as a Node.js application on an Azure Ubuntu server, which we provisioned with generous capacity and memory.

The moment the user goes to a new page, our extension starts prefetching information about the page.
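One way to picture the prefetching is a small per-URL cache, so that each page is looked up only once and the result is ready by the time the ticker is clicked. The sketch below is an assumption about how this could be done rather than the extension's real code: `createPrefetcher` and `fetchPageInfo` are hypothetical names, and the commented-out wiring uses the standard `chrome.tabs.onUpdated` event only as an example of where such a call might live.

```javascript
// Hypothetical helper: keeps one in-flight or finished lookup per page URL.
function createPrefetcher(fetchPageInfo) {
  const cache = new Map(); // url -> Promise resolving to the page info

  return function prefetch(url) {
    if (!cache.has(url)) {
      const pending = Promise.resolve()
        .then(() => fetchPageInfo(url))
        .catch((err) => {
          cache.delete(url); // allow a retry after a failed lookup
          throw err;
        });
      cache.set(url, pending);
    }
    return cache.get(url);
  };
}

// Possible wiring in an extension background script (an assumption, not
// our actual code): start prefetching once a tab finishes loading.
// chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
//   if (changeInfo.status === "complete" && tab.url) {
//     prefetch(tab.url).catch(() => {}); // errors surface on the next call
//   }
// });
```

Repeated calls with the same URL return the same promise, so opening the window after the prefetch has started does not trigger a second request.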

The interface comes in the form of a ticker on the right-hand side of the browser, which opens a small window when clicked.

The window contains three relevant topics or articles, as well as an image. We believe a minimal, non-intrusive design is the best way to assist the reader without distracting from the main content.

Challenges we ran into

The first challenge we faced was a lack of data. Initially we intended to build an autocomplete system for commenting, based on machine learning rooted in the Reddit comment database.

However, we later realized that the Reddit comment dataset is over 250GB of compressed data and is only available via a torrent magnet link. We therefore could not acquire it without at least a day of downloading.

So we had to shift our project and make it more general, drawing on more tools from around the web rather than just Reddit. We finally settled on a set of news, image, and encyclopedic content APIs to fetch the content that feeds our application.

Accomplishment we are proud of

We are proud of extracting insight from the machine learning APIs to produce information that is tailored to the user. This high-level abstraction saved us from having to do the tokenization ourselves, which enabled a large set of features.

What I learned

We learned how to integrate the Azure and Indico platforms for natural language algorithms. We also picked up a range of other skills; one member took his first look at building UIs with Material UI and React.

What's next for Well-Versed

Well-Versed could use even more streams of information. Going forward, our application could begin to gauge articles by their bias and present a spectrum of views.

Submitted to

HackDartmouth II
- Winner 3rd Place

Created by

I helped design the front-end, save our API responses, and integrate that response in the React view.

David Siah
Worked on frontend design and on implementing the view and functionality of the Chrome extension.

Maidi Lin
Worked on the backend and configured Bing's Search API, including News and Images. Worked with the Wikipedia API to extract contextual summaries. Set up and used Azure's keyword extraction service API.
Set up the development server and the domain name.

Eden Zik
Built the backend in NodeJS. Integrated Indico's natural language APIs (keyword extraction + named entity recognition), Azure's natural language APIs (keyword extraction), Bing Search API, and Wikipedia API.

Kahlil Oppenheimer

Updates

David Siah started this project — Oct 04, 2015 08:46 AM EDT
