I wanted to do a project involving real-time Wikipedia edits. I thought visualizing categories in real time would be a neat demonstration of vis.js (a graph visualization library) and socket.io (real-time messaging).
What it does
It receives edits to Wikipedia in real time, fetches the categories for each edited page, sends them to the web client, and shows the top 20 categories from the past 30 seconds, with node size reflecting how often each category appeared. It links categories that occurred on the same pages and groups them into clusters based on where they appeared. This gives the user a real-time view of what types of pages are being edited on Wikipedia right now.
Note: The app takes 15 seconds to initially load content to display.
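The 30-second window and top-20 ranking described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `CategoryWindow`, its methods, and the choice to do the counting in Python are all assumptions made for the example.

```python
import time
from collections import Counter, deque
from itertools import combinations

WINDOW_SECONDS = 30  # from the description: "past 30 seconds"
TOP_N = 20           # from the description: "top 20 categories"


class CategoryWindow:
    """Hypothetical helper: tracks the categories of recent edits."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.edits = deque()  # (timestamp, frozenset of category names)

    def add_edit(self, categories, now=None):
        """Record the categories of one edited page."""
        now = time.time() if now is None else now
        self.edits.append((now, frozenset(categories)))

    def snapshot(self, now=None, top_n=TOP_N):
        """Return (category counts, co-occurrence link counts) for the window."""
        now = time.time() if now is None else now
        # Drop edits older than the window (assumes edits arrive in time order).
        while self.edits and now - self.edits[0][0] > self.window:
            self.edits.popleft()

        counts = Counter()
        for _, cats in self.edits:
            counts.update(cats)
        top = dict(counts.most_common(top_n))

        # Link two categories whenever they appeared on the same edited page.
        links = Counter()
        for _, cats in self.edits:
            kept = sorted(c for c in cats if c in top)
            links.update(combinations(kept, 2))
        return top, dict(links)
```

The counts would map to node sizes and the link counts to edges in the vis.js graph; how the clusters are derived is not shown here.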
How I built it
I used Python and the Flask web framework to run the backend server. I used a Python IRC library to connect to the Wikipedia real-time edits IRC server, the Python Wikipedia API to get the list of categories for each edited page, and socket.io to broadcast these categories to all connected web clients.
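To give a feel for the category lookup step, here is a rough sketch that queries the public MediaWiki API directly with the standard library. This is a swapped-in approach for illustration: the project used a Python Wikipedia library rather than raw HTTP calls, and the function names and User-Agent string below are made up for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_category_query(title):
    """Build a MediaWiki API URL asking for the categories of one page."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "cllimit": "max",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_categories(response):
    """Extract category names from a decoded API response."""
    names = []
    for page in response.get("query", {}).get("pages", {}).values():
        for cat in page.get("categories", []):
            name = cat["title"]
            if name.startswith("Category:"):
                name = name[len("Category:"):]
            names.append(name)
    return names


def fetch_categories(title):
    """Fetch and parse categories for a page title (performs a network call)."""
    request = urllib.request.Request(
        build_category_query(title),
        # Hypothetical User-Agent; a real client should identify itself.
        headers={"User-Agent": "WikiGraphSketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_categories(json.load(resp))
```

In the backend, something like `fetch_categories(title)` would run for each page title received from the IRC feed, and the result would then be broadcast to the web clients.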
Challenges I ran into
Because I was running both socket.io and an IRC client on the server side, I ran into threading issues when I wanted to broadcast events from the IRC client to the socket.io server. To fix this, I used eventlet's monkey_patch function, which patches the standard library so that blocking calls cooperate with eventlet's green threads, letting the IRC client and the socket.io server run side by side.
Accomplishments that I'm proud of
I was able to integrate multiple real-time streams into one smooth web-app experience.
What I learned
I learned how to make dynamic vis.js graphs using real-time data.
What's next for WikiGraph
Smoother updates of categories (right now they are abrupt).