Inspiration
We were curious about the predictability (or lack thereof) of the beautiful game. It was our passion for the sport combined with this curiosity that inspired us to take on this challenge.
What it does
The model takes as input a sequence of five matches' worth of statistics for a given English Premier League player and outputs a prediction of that player's statistics in their next match.
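To make the input and output shapes concrete, here is a minimal PyTorch sketch of a model with this interface. It is an illustration under stated assumptions, not our exact architecture: the class name NextMatchRNN, the number of per-match statistics (8) and the hidden size (32) are all hypothetical values chosen for the example.

```python
import torch
from torch import nn

SEQ_LEN = 5     # five past matches, as described above
NUM_STATS = 8   # hypothetical number of statistics per match
HIDDEN = 32     # hypothetical hidden-state size


class NextMatchRNN(nn.Module):
    """Reads five matches of stats and predicts the stats of the next match."""

    def __init__(self, num_stats=NUM_STATS, hidden=HIDDEN):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, seq_len, num_stats)
        self.rnn = nn.RNN(input_size=num_stats, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_stats)

    def forward(self, x):
        # h_n holds the final hidden state, shaped (num_layers, batch, hidden)
        _, h_n = self.rnn(x)
        # Map the last layer's final hidden state to one value per statistic
        return self.head(h_n[-1])


model = NextMatchRNN()
batch = torch.randn(4, SEQ_LEN, NUM_STATS)  # 4 players, random stand-in data
prediction = model(batch)
print(prediction.shape)
```

Each row of the output is one predicted stat line, so the output has the same number of columns as a single match in the input.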
How we built it
We trained an RNN model on data gathered from www.fbref.com, a popular and accurate website for football statistics. We used the Beautiful Soup library in Python to scrape the data for 491 English Premier League players and their sequences of consecutive matches across the 2022-2023 season. Each player had 33 sequences of 5 matches, which gave us 491 × 33 = 16,203 sequences to work with. After formatting and preprocessing the data, we used PyTorch to train both an RNN and an LSTM model. The RNN performed better on this task, so we discarded the LSTM model.
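The sequence count above can be reproduced with a simple sliding window. The sketch below is one plausible reading of how the sequences were built, not our actual preprocessing code: it assumes a 38-match league season, a player with a stat line for every match, and a hypothetical helper named make_windows. Each window pairs five consecutive matches with the match that follows them as the prediction target.

```python
def make_windows(matches, seq_len=5):
    """Pair every run of seq_len consecutive matches with the match after it."""
    windows = []
    # Stop early enough that a target match always exists after the window
    for start in range(len(matches) - seq_len):
        inputs = matches[start:start + seq_len]
        target = matches[start + seq_len]
        windows.append((inputs, target))
    return windows


# Placeholder season: 38 matches with made-up stats (assumption for illustration)
season = [{"match": i, "goals": i % 2} for i in range(1, 39)]

windows = make_windows(season)
print(len(windows))        # sequences per player
print(491 * len(windows))  # sequences across all 491 players
```

Under these assumptions a 38-match season yields 38 - 5 = 33 windows per player, which matches the 33 sequences per player and the 16,203 total reported above.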
Challenges we ran into
One of the biggest challenges was formatting and preprocessing the data. The data we needed isn't readily available in the format and quantity required to train the model, so we had to gather and format it ourselves, which proved harder than expected. Another roadblock was that the model's learning stagnated: it stopped picking up new patterns in the data, so its predictions are nowhere near 100% accurate (turns out, the beautiful game really is unpredictable).
Accomplishments that we're proud of
Predicting anything related to football (or any sport) is very difficult, and no one can expect perfect predictions for a sport that has built its fame and immense following over the last 150 years on being unpredictable. With that in mind, we are proud that we produced a model that can output reasonably educated guesses.
What we learned
As a team, we learned to deal with and overcome the unexpected challenges that inevitably arise as the development of a model progresses, which leaves us better equipped to deliver on future development projects. We reinforced our data manipulation and handling skills and gained practical experience in developing RNN and LSTM models.
What's next for BAller
As it stands, BAller is at a very early stage. There is plenty of room for improvement, and BAller could serve both as a stepping stone toward larger, more robust ideas and as a proof of concept for future projects of a similar style.