I wanted to build a project around real-time Wikipedia edits. I thought visualizing the categories of edited pages in real time would be a neat demonstration of vis.js (a graph library) and real-time web messaging.

What it does

It receives edits to Wikipedia in real time, gets the categories for each edited page, sends them to the web client, and shows the top 20 categories from the past 30 seconds, with node size reflecting how often each category appeared. It links categories that occurred on the same page and groups them into clusters based on where they appeared. This gives the user a real-time view of which types of pages are being edited on Wikipedia right now.
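The windowing and linking described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `CategoryWindow` and its methods are hypothetical, and it assumes edits arrive in timestamp order and that a category counts once per edit.

```python
from collections import Counter, deque
from itertools import combinations

WINDOW_SECONDS = 30  # window length taken from the description
TOP_N = 20           # number of categories shown, from the description


class CategoryWindow:
    """Hypothetical helper: keeps recent edits and ranks their categories."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.edits = deque()  # (timestamp, sorted tuple of categories), oldest first

    def add_edit(self, timestamp, categories):
        # Deduplicate and sort so each category counts once per edit
        # and category pairs always come out in a stable order.
        self.edits.append((timestamp, tuple(sorted(set(categories)))))

    def _prune(self, now):
        # Drop edits that have fallen out of the time window.
        while self.edits and now - self.edits[0][0] > self.window:
            self.edits.popleft()

    def snapshot(self, now, top_n=TOP_N):
        """Return (nodes, links) for the graph at time `now`."""
        self._prune(now)
        counts = Counter()
        for _, cats in self.edits:
            counts.update(cats)
        nodes = dict(counts.most_common(top_n))  # category -> node size
        links = Counter()
        for _, cats in self.edits:
            kept = [c for c in cats if c in nodes]
            # Link every pair of top categories that appeared on the same page.
            links.update(combinations(kept, 2))
        return nodes, links
```

Calling `snapshot` on a timer and sending the result to the browser would give the client everything it needs to size nodes and draw edges; how the clusters are derived from the links is left out of this sketch.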

Note: The app takes 15 seconds to load its initial content.

How I built it

I used Python and the Flask web framework to run the backend server. I used a Python IRC library to connect to the Wikipedia real-time edits IRC server and the Python Wikipedia API to get the list of categories for each edited page, then broadcast these categories to all connected web clients.
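The per-edit flow on the backend can be pictured with a small sketch like the one below. It is only an outline under assumptions: `handle_edit`, `fetch_categories`, and `broadcast` are hypothetical names standing in for the IRC callback, the Wikipedia API lookup, and the broadcast to web clients, and skipping pages whose lookup fails is an assumed policy.

```python
def handle_edit(title, fetch_categories, broadcast):
    """Hypothetical glue for one edit: look up categories, then fan out.

    fetch_categories(title) -> iterable of category names (stand-in for
    the Wikipedia API call); broadcast(message) sends to all web clients.
    """
    try:
        categories = list(fetch_categories(title))
    except Exception:
        # Assumed policy: skip pages whose lookup fails (deleted page, etc.).
        return None
    if not categories:
        return None  # nothing to draw for uncategorized pages
    message = {"title": title, "categories": categories}
    broadcast(message)
    return message


if __name__ == "__main__":
    # Stub usage: a fake lookup and a list standing in for connected clients.
    sent = []
    handle_edit("Example page", lambda t: ["Example category"], sent.append)
    print(sent)
```

Passing the lookup and broadcast in as functions keeps the sketch independent of any particular IRC or web library, and makes the flow easy to test with stubs.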

Challenges I ran into

Because I was running both the web server and an IRC client on the server side, I ran into threading issues when I wanted to broadcast events from the IRC client to the server. To fix this, I used eventlet's monkey_patch function, which patches the standard library to cooperate with eventlet's green threads, so that events could pass from the IRC client to the server.

Accomplishments that I'm proud of

I was able to integrate multiple real-time streams into one smooth web-app experience.

What I learned

I learned how to make dynamic vis.js graphs using real-time data.

What's next for WikiGraph

Smoother category updates (right now the transitions are abrupt).
