A popular theory is that everyone in the word is six degrees of separation from each other. We wanted to see if that applied to Wikipedia articles, too.

What it does

The program finds how many clicks it would take to go from one Wikipedia article to another. It shows you the link path so you can click through it yourself.

How I built it

A Java breadth-first-search generates a link tree from the starting article and a backlink tree from the ending article consisting of all articles which are connected to that one. It iterates one level down on the smallest tree until it times out, or it finds an article which is shared by both. The Java program is connected to a Node.js server which also generates the frontend.

Challenges I ran into

The biggest challenge was, and still is making the code faster. Because of the vast number of articles on Wikipedia, one level of the backlink tree would have over 10000 articles. Furthermore, the network requests are several orders of magnitude slower than the rest of the code, and each API request is limited to 500 article requests.

Accomplishments that I'm proud of

This was one of the smoothest experiences I've had with a hackathon before. Because I was already familiar with Java, JSON parsing, and making network requests, I could focus my effort on developing a breadth first search algorithm, something I haven't done before.

What I learned

This project was the first time I had to worry about code performance. Instead of finding bugs, like I normally did, I had to instead find the lines which took the most time and optimize them.

What's next for Wikipedia Path Finder

Optimizing the code so it doesn't take hours to find anything that's more than three degrees of separation apart.

Built With

Share this project: