Researching a new topic can be slow. While there are many informative sources on the internet, it's difficult to get a high-level overview of an unfamiliar concept without reading a lot of text. Often, just knowing a few related terms can help bridge this knowledge gap.
We created Monkeybar as an easy way to understand any concept by building on prior knowledge in the form of related terms. We also wanted it to be useful both for exploring connections between different concepts and for simply browsing the internet to pass the time.
Monkeybar was also inspired by Wikipedia and other similar text visualization tools.
What it does
Monkeybar is a data visualization platform that can gather and graph a collection of relevant keywords for any high-level concept that has a Wikipedia page. It queries text data from the main page as well as several related pages to construct an undirected graph based on connections between individual tokens. Then, the graph data is presented to the user with nodes representing tokens and edges representing the relatedness of the tokens.
The application can also be used to query multiple topics and find relationships between them if there are any. In this case, it will construct separate graphs for each query and join them together using common tokens.
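As a rough illustration of the joining step, merging per-query graphs on shared tokens could look like the sketch below. The adjacency-set representation and the `merge_graphs` name are our assumptions for illustration, not Monkeybar's actual code.

```python
# Hypothetical sketch: join several per-query token graphs.
# Shared tokens naturally become the bridge points between graphs.

def merge_graphs(graphs):
    """Union the adjacency sets of several token graphs."""
    merged = {}
    for graph in graphs:
        for token, neighbors in graph.items():
            merged.setdefault(token, set()).update(neighbors)
    return merged

# Two toy graphs (token -> set of neighbors) sharing the token "network":
g1 = {"graph": {"network"}, "network": {"graph"}}
g2 = {"network": {"internet"}, "internet": {"network"}}

combined = merge_graphs([g1, g2])
print(sorted(combined["network"]))  # "network" now connects both graphs
```

Because the union happens per token, any token appearing in more than one query's graph automatically links the otherwise separate components.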
How we built it
We built Monkeybar's backend with Python as a Flask application. The backend is mainly responsible for constructing the graph that will be rendered to the user. After calling the Wikipedia API to get raw text, our algorithm generates edge weights based on the proximity of words within a sentence, checks for common words, and then prunes the graph using breadth-first search.
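The writeup doesn't include the algorithm itself, but the proximity-weighting and BFS-pruning steps might look roughly like this minimal sketch. The window size, weighting scheme, and function names are our assumptions, not Monkeybar's actual implementation.

```python
from collections import defaultdict, deque

def build_edges(sentences, window=3):
    """Weight an edge between two tokens by how often they appear
    within `window` words of each other in the same sentence."""
    weights = defaultdict(int)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, a in enumerate(tokens):
            for b in tokens[i + 1 : i + 1 + window]:
                if a != b:
                    weights[frozenset((a, b))] += 1
    return weights

def prune(weights, root, max_depth=2):
    """Keep only edges whose endpoints are reachable from `root`
    within `max_depth` hops, found via breadth-first search."""
    adjacency = defaultdict(set)
    for pair in weights:
        a, b = tuple(pair)
        adjacency[a].add(b)
        adjacency[b].add(a)
    keep, queue = {root}, deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in keep:
                keep.add(neighbor)
                queue.append((neighbor, depth + 1))
    return {pair: w for pair, w in weights.items() if set(pair) <= keep}

# Toy demo: the "cats" sentence is disconnected from the "graphs" component,
# so BFS pruning from "graphs" drops its edges entirely.
weights = build_edges(
    ["graphs model networks", "networks connect graphs", "cats chase mice"]
)
pruned = prune(weights, "graphs", max_depth=2)
```

In this sketch, higher weights mean tokens frequently appear close together, and the BFS cut-off keeps the rendered graph focused on the queried concept.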
We took a relatively simple approach to the frontend, using Flask's HTML templates along with CSS and JavaScript for interactivity. We used Bootstrap for convenient styling and Sigma.js with the ForceAtlas2 plugin to render the graph itself in a visually pleasing way.
Challenges we ran into
One challenge we ran into was conflicting visions of how the final product should work. After coming up with the idea, we didn't have much time to flesh it out before the hackathon began, so we had some disagreements about performance, scalability, and time management. However, we eventually ended up with a final product that everyone agreed on.
Another challenge we ran into was the heavy performance bottleneck caused by making network requests sequentially. We quickly realized that multithreading could make this process much faster by issuing the requests in parallel, and we also experimented with parallelizing the graph processing itself.
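The parallel-fetching idea can be sketched with Python's `concurrent.futures`. The `fetch_page` function below is a hypothetical stand-in that simulates a slow Wikipedia API call rather than making a real request:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_page(title):
    """Stand-in for the real Wikipedia API call (simulated latency)."""
    time.sleep(0.1)  # pretend this is network round-trip time
    return f"text of {title}"

titles = ["Graph theory", "Network science", "Natural language processing"]

# Sequentially these would take ~0.3 s total; with a thread pool the
# waits overlap, so the batch finishes in roughly one request's time.
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(fetch_page, titles))
```

Threads work well here because the bottleneck is I/O wait, not CPU time; `pool.map` also preserves the input order, so results line up with their titles.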
Accomplishments that we're proud of
One accomplishment we’re proud of is having pretty good time management this time around. We weren’t scrambling to finish the project nearly as much as last time, meaning that we had much more time to make it polished and presentable. This also meant that the hackathon was less stressful and we had more time to enjoy the events.
We are also proud of our final product, which is simple and elegant but still useful. We feel that our application is something that we would actually use in our daily lives.
What we learned
In working on this project, we learned about common statistical approaches to the analysis and visualization of textual data. We learned a little about natural language processing, the branch of AI concerned with the interaction between computers and human languages. Although our graph construction algorithm is far from optimal, it has piqued our interest in learning more about this field.
We also gained experience in web scraping and crawling, as well as some knowledge of how search engines work behind the scenes.
What's next for Monkeybar
Although the application is in working condition, there are still many things that just couldn’t be done in the limited time frame of a hackathon. In the future, we plan to improve performance and make the product more scalable by delegating some of the work to the user’s browser. We also plan to improve the user interface and overhaul the core graph construction algorithm, possibly using NLP.