I wanted to practice programming with graphs, and baseball is the only sport I know about.
What it does
The program takes the names of any two Major League Baseball players in history and outputs the shortest chain of mutual teammates connecting them.
How I built it
I compiled the team history of every MLB player listed on "www.baseball-reference.com". The program only considers players whose baseball-reference profiles have an "Appearances" table with at least one entry.
Then I wrote a program that builds a graph from this data: each vertex is a player, and an edge joins two players who were teammates for at least one season. Because the edges are unweighted, Breadth-First Search finds the shortest path between any two players.
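For anyone curious, here is a minimal sketch of the graph and search idea. It is not the project's actual code. The class name `TeammateGraph` and the methods `addTeammates` and `shortestPath` are made up for illustration, and it assumes players can be identified by a unique name string.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch: an undirected, unweighted graph of players.
class TeammateGraph {
    // Adjacency sets: player name -> names of everyone they shared a team with.
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // Records that two players were teammates for at least one season.
    void addTeammates(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Breadth-first search from start to goal.
    // Returns the players along a shortest chain (inclusive of both ends),
    // or an empty list if either player is unknown or no chain exists.
    List<String> shortestPath(String start, String goal) {
        if (!adjacency.containsKey(start) || !adjacency.containsKey(goal)) {
            return Collections.emptyList();
        }
        // parent doubles as the visited set; the start maps to null.
        Map<String, String> parent = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);

        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                // Walk the parent links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String p = goal; p != null; p = parent.get(p)) {
                    path.add(p);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : adjacency.get(current)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

With edges "Player A"–"Player B" and "Player B"–"Player C" added, `shortestPath("Player A", "Player C")` returns the three names in order. BFS visits players in order of distance from the start, so the first time it reaches the goal it has found a chain with the fewest possible links.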
Challenges I ran into
Much of the time was spent trying to retrieve all the players' data. The final scraping program "MLBListGetter" took about 4 hours to retrieve all the data.
Accomplishments that I'm proud of
- First time using such a large data set
- First time implementing a graph in Java
- First time scraping a website in Java
- First time participating in a Hackathon
What I learned
According to baseball-reference.com, Mark Kiger is the only MLB player who made no appearances. He's also the only player to cause an Exception while reading the player data.
What's next for MLBKevinBacon
Probably nothing. The runtime of the program could be improved, and the data could definitely be retrieved faster. It might be fun to try it with other sports. I know nothing about web/app development, so it might also be fun to design a webpage/app around the program.