Soccer is a strategic game of formation and spacing. A team's formation is its strategy for occupying the field in a way that establishes maximal control of the game. It is one of the most noticed parts of the game, often drawing ire at a 5-at-the-back and curiosity at a 3-at-the-back. However, there is more to the story than the moving pieces on the field. The story is about what the space on the field says, regardless of what formation, possession, or other statistics might say on their own.
Our project breaks the soccer field down into 12 zones, 3 each for defense, defensive midfield, attacking midfield, and offense. The system observes the relevant stats, such as passing accuracy, shots, clearances, and fouls, and rates each zone on a scale of 0 to 100 based on how effectively a team plays in that zone. Teams can use this analytic tool to study their own patterns and improve by focusing training or strategy on a particular zone, or as a means of opposition research to find places on the field where they might press a little harder or drift into more often.
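As a rough illustration of the zoning idea, here is a minimal Python sketch of mapping an event location to one of the 12 zones. It is not the project's actual code. It assumes pitch coordinates normalized to 0-100 along both length (`x`, own goal to opponent goal) and width (`y`), four equally deep bands, and that the 3 zones in each band are left, center, and right lanes. The names `BANDS`, `LANES`, and `zone_of` are hypothetical.

```python
# Hypothetical zone lookup: 4 bands along the pitch x 3 lanes across it = 12 zones.
# Assumption: x and y are normalized to the range 0..100.
BANDS = ("defense", "defensive midfield", "attacking midfield", "offense")
LANES = ("left", "center", "right")


def zone_of(x: float, y: float) -> tuple[str, str]:
    """Return the (band, lane) zone that contains the point (x, y)."""
    if not (0 <= x <= 100 and 0 <= y <= 100):
        raise ValueError("coordinates must lie in the range 0..100")
    # Clamp the index so the far edges (x == 100, y == 100) fall in the last zone.
    band = BANDS[min(int(x // 25), len(BANDS) - 1)]
    lane = LANES[min(int(y * 3 // 100), len(LANES) - 1)]
    return band, lane


if __name__ == "__main__":
    # A pass played from deep in the team's own half, in the middle of the pitch.
    print(zone_of(10, 50))
```

With a lookup like this, every recorded event (a pass, a shot, a clearance) can be credited to the zone where it happened, whichever player made it.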
We chose to perform this analysis on MLS teams, based on their performances from season to season. Although we selected MLS for this project, the approach could be applied to any soccer league in the world, or even to international teams, provided the required statistics are available. Up to 20 individually weighted factors were considered for each zone to produce a performance measure covering every player on the team who contributed in that zone, for example with a pass or a shot, condensed into a scale of 0 to 100. A zone takes information from any player, from a wing back to a winger, as long as he plays in that zone. Naturally, factors like clearances receive a higher weight in the defensive zones of the pitch, whereas shots are assigned a higher weight in the attacking zones.
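To make the weighting idea concrete, here is a small Python sketch of one way a zone rating could be computed. It is not our actual formula. It assumes each factor has already been normalized to the range 0..1 with 1 being best (so a count like fouls would be inverted beforehand), and it uses a weighted average scaled to 0-100. The factor names, the weight values, and the function `zone_score` are all hypothetical.

```python
# Hypothetical weights: clearances matter more in defense, shots more in offense.
DEFENSE_WEIGHTS = {"pass_accuracy": 2.0, "clearances": 3.0, "shots": 0.5, "fouls": 1.0}
OFFENSE_WEIGHTS = {"pass_accuracy": 2.0, "clearances": 0.5, "shots": 3.0, "fouls": 1.0}


def zone_score(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of normalized factors, scaled to 0..100.

    Assumption: every factor value is already normalized to 0..1 (1 = best).
    Factors missing from `factors` count as 0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = 0.0
    for name, weight in weights.items():
        value = min(max(factors.get(name, 0.0), 0.0), 1.0)  # clamp to 0..1
        weighted_sum += weight * value
    return 100.0 * weighted_sum / total_weight


if __name__ == "__main__":
    # Made-up normalized stats for one team in one zone.
    stats = {"pass_accuracy": 0.8, "clearances": 0.9, "shots": 0.1, "fouls": 0.5}
    print(round(zone_score(stats, DEFENSE_WEIGHTS), 1))
    print(round(zone_score(stats, OFFENSE_WEIGHTS), 1))
```

The same stats score higher under the defensive weights than under the attacking ones, which is the effect the zone-dependent weighting is meant to produce.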
Factor selection and weight distribution were among the hardest parts of the project, given the overwhelming amount of raw data provided to us. We eventually came up with a metric system that accepts certain statistics based on the zone of play and accentuates the stronger and weaker zones of the field for each individual MLS team. Another challenge we faced was the risk of over-generalizing the weights and factors so that they would fit every soccer league and team worldwide.
We were happy to build a system that tackles an important but often ignored analysis of team performance in specific parts of the field. Although the zonal statistics might not say it all, they would definitely whisper a piece of advice or two to help a team create better strategies.
This project gave us an opportunity to use our data mining and learning skills in a way that was both informative and extremely fun. Analyzing the sport we grew up watching in such an analytical and academic way was a remarkable experience for us.