When searching for news, it's easy to get stuck in a single point of view. Most search engines are optimized for clicks, so the content they favor is whatever each user is most likely to agree with. That creates a bias and keeps the user from seeing and exploring different sides of a story.
What it does
Alius is a web-based search engine for news sites that focuses on producing diverse results without sacrificing relevance. Whenever a user enters a query, the system aims to return results that represent different emotions, producing a set with diverse points of view.
How we built it
First we scraped several popular newspaper sites to build a sizable corpus for the application. This corpus of documents is stored and indexed on an Elasticsearch cluster hosted on IBM's Bluemix. The documents are further processed using IBM Watson's Tone Analyzer to retrieve scores for different emotions. A Python server provides an API for querying the stored documents and implements a submodular probabilistic model that encourages fixed-size result sets whose elements have diverse features. These subsets are returned as the result of the query.
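To make the selection step concrete, here is a minimal sketch of greedy selection under a submodular objective. It is not the model Alius actually runs: the objective (summed relevance plus a square-root coverage term per emotion), the emotion names, the `relevance` and `tones` fields, and the `lam` weight are all assumptions chosen for illustration.

```python
import math

# Assumed emotion labels; the tone scores stored by the real system may differ.
EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness")


def objective(selected, docs, lam):
    """Relevance of the set plus a diversity bonus with diminishing returns."""
    relevance = sum(docs[i]["relevance"] for i in selected)
    # The square root makes each extra document with an already covered
    # emotion worth less than the previous one, which keeps the objective
    # submodular.
    coverage = sum(
        math.sqrt(sum(docs[i]["tones"].get(e, 0.0) for i in selected))
        for e in EMOTIONS
    )
    return relevance + lam * coverage


def select_diverse(docs, k, lam=1.0):
    """Greedily pick up to k document indices with the largest marginal gain."""
    selected = []
    remaining = list(range(len(docs)))
    while remaining and len(selected) < k:
        current = objective(selected, docs, lam)
        best = max(
            remaining,
            key=lambda i: objective(selected + [i], docs, lam) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates: two angry articles and one joyful article.
docs = [
    {"relevance": 1.0, "tones": {"anger": 0.9}},
    {"relevance": 0.9, "tones": {"anger": 0.9}},
    {"relevance": 0.8, "tones": {"joy": 0.9}},
]
print(select_diverse(docs, k=2, lam=0.0))  # relevance only
print(select_diverse(docs, k=2, lam=1.0))  # relevance plus diversity
```

With `lam=0.0` the sketch falls back to plain relevance ranking. Raising `lam` makes a second article with the same dominant emotion less attractive than a slightly less relevant article with a different one.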
The frontend was implemented using AngularJS as the main framework, with Angular Material for the UI elements and a UX that follows the Material Design standards. The web client follows the principles of responsive design and is meant to work on mobile phones, tablets, and desktops.
Challenges we ran into
Scraping and curating the corpus took a significant amount of time and required manual checking to keep issues like duplicates and bad articles to a minimum.
Despite the focus on a responsive design and the use of a reliable UI framework, there are still issues in some browsers and platforms where the user experience is not as intended.
Accomplishments that we're proud of
After sharing the link through Twitter and the event's Slack channel, we got a really positive response, and our analytics showed a great interest in the product.
The search results are pretty good despite the issues with the corpus and our naive approach to searching with Elasticsearch, i.e. we are using the default indexing and text analysis. The diversity is visible, and the tradeoff between relevance and diversity can be easily balanced by adjusting the parameters of the model.
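As a rough illustration of what a naive search with default text analysis can look like, the sketch below builds a plain Elasticsearch `match` query body. The field name `text`, the candidate pool size, and the helper name `build_search_body` are hypothetical; the actual index mapping and request code are not described here.

```python
def build_search_body(user_query, field="text", size=50):
    """Build a basic full-text query body for a candidate pool.

    A `match` query analyzes the query string with the field's analyzer,
    which is the standard analyzer when no custom mapping is defined.
    """
    return {
        "size": size,  # number of candidates to fetch before re-ranking
        "query": {"match": {field: user_query}},
    }


body = build_search_body("climate summit")
print(body)
```

A body like this would be sent to the index's `_search` endpoint, and the returned hits would then serve as the candidate pool from which the diverse subset is chosen.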
What's next for Alius
Learn more about what people like about the product and how they use it, and see whether it can gain enough traction to become an alternative to established news aggregators and search engines.