Inspiration
This project was inspired by the number of tired, stereotypical football clichés that get trotted out every game without anyone challenging or investigating them. After listing these clichés, it became apparent that the topics behind them could be explored further using data and a graph database, to gain useful insights into patterns and trends within the game for fans, commentators and coaches. The insights would not only test the credibility of these clichés, but also show how teams, players and managers are affected by each other and by weather conditions. The key cliché we picked out was 'can they do it on a cold rainy night against Stoke'. It was well suited to graph analysis, because answering it means finding relationships between players, teams, matches and weather. The same graph can then be used to explore the relationship between weather and performance more generally.
What it does
- links data on various factors involved in the great game that is football
- finds relationships between different factors
How we built it
- First, we took the freely available data and used Python and SQL to transform it so that each CSV contained the necessary fields
- We then built the schema, adding the required attributes to each vertex and edge
- Once this was complete, we uploaded the data and mapped it to the schema
- Finally, we wrote queries to investigate the data and came to some interesting conclusions about whether they really can do it on a cold rainy night against Stoke
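The transformation and query steps above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the column names, the made-up sample rows and the "cold" and "rainy" thresholds are all assumptions chosen for illustration. It joins match rows to weather rows on a shared match id, flags cold rainy matches, and compares the away win rate in each group.

```python
import csv
import io

# Made-up sample rows; the real project CSVs and column names may differ.
MATCHES_CSV = """match_id,home_team,away_team,home_goals,away_goals
1,Stoke City,Team A,1,0
2,Stoke City,Team B,0,2
3,Stoke City,Team C,2,2
"""

WEATHER_CSV = """match_id,temp_c,precip_mm
1,3.5,4.2
2,14.0,0.0
3,4.0,1.1
"""

# Assumed thresholds for "cold" and "rainy"; not taken from the project.
COLD_BELOW_C = 5.0
RAIN_ABOVE_MM = 0.5


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_matches_with_weather(matches, weather):
    """Attach a cold_rainy flag to each match that has a weather record."""
    weather_by_id = {row["match_id"]: row for row in weather}
    joined = []
    for match in matches:
        w = weather_by_id.get(match["match_id"])
        if w is None:
            continue  # skip matches with no weather record
        cold = float(w["temp_c"]) < COLD_BELOW_C
        rainy = float(w["precip_mm"]) > RAIN_ABOVE_MM
        joined.append({**match, "cold_rainy": cold and rainy})
    return joined


def away_win_rate(rows, cold_rainy):
    """Share of matches won by the away side, or None if the group is empty."""
    subset = [r for r in rows if r["cold_rainy"] == cold_rainy]
    if not subset:
        return None
    wins = sum(int(r["away_goals"]) > int(r["home_goals"]) for r in subset)
    return wins / len(subset)


joined = join_matches_with_weather(load_rows(MATCHES_CSV), load_rows(WEATHER_CSV))
print("cold and rainy:", away_win_rate(joined, True))
print("other conditions:", away_win_rate(joined, False))
```

In the graph itself, the same comparison would be expressed as a traversal from match vertices to their weather and team vertices rather than a dictionary join.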
Challenges we ran into
- Finding unified data and linking identifiers such as match id and player id in a uniform way across sources
- Learning about graph technology and applying it to the given problem
- Learning GSQL and how to use it within the time constraints of the project
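One common way to tackle the first challenge, linking records when sources name the same team differently, is to derive a canonical key before joining. The sketch below is an assumption about how this could be done, not the approach the team used; the alias table and the `team_key` and `match_key` helpers are hypothetical.

```python
import re

# Hypothetical alias table mapping short names to one canonical spelling.
ALIASES = {
    "stoke": "stoke city",
    "man utd": "manchester united",
}


def team_key(name):
    """Normalise a team name to a lowercase canonical key."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())  # drop punctuation
    cleaned = re.sub(r"\b(fc|afc)\b", "", cleaned)     # drop club suffixes
    cleaned = " ".join(cleaned.split())                # collapse whitespace
    return ALIASES.get(cleaned, cleaned)


def match_key(date, home, away):
    """Build a match id that is identical across sources for the same fixture."""
    return f"{date}|{team_key(home)}|{team_key(away)}"


print(match_key("2015-12-05", "Stoke City F.C.", "Man Utd"))
```

A key built this way can serve as the primary id of a match vertex, so that rows from the match source and the weather source attach to the same vertex.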
Accomplishments that we're proud of
Our TigerGraph database let us test the various sayings and clichés used frequently by football analysts and pundits. This showed that our graph and schema apply to real-world problems faced by football statisticians and analysts. The schema is clear and concise enough for anyone to understand, and the graph's user-friendly nature means that fans or pundits could run further queries on it for their own analysis in future.
What we learned
- The advantages that graph technology offers, and how much more readily it identifies relationships than relational databases
- The GSQL language and how to write queries with it
- How factors that we didn't think were linked affected match performance
What's next for Tigergraph MDC Football Solution
Given more time to explore the data available to us, we would investigate more clichés in more detail. Linking data on player transfers would let us see which clubs frequently sell to or buy from one another, and so identify feeder clubs (clubs that regularly sell players to another club). Another area we would like to expand on is individual events within matches, such as passes, shots, crosses, tackles and interceptions. Using a graph to analyse these events would let us view relationships between teammates on the field, and how event types and their frequencies change over the course of a game, among many other metrics. For now, we hope that the analyses we've produced help to inform wider audiences of the truths (or lack thereof) behind footballing clichés.
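The feeder-club idea amounts to counting directed seller-to-buyer edges and keeping the pairs that recur. A minimal sketch, assuming transfers are available as (seller, buyer) pairs; the club names are placeholders and the `min_count` threshold is an arbitrary assumption:

```python
from collections import Counter

# Placeholder transfer records as (selling club, buying club) pairs.
TRANSFERS = [
    ("Club A", "Club B"),
    ("Club A", "Club B"),
    ("Club A", "Club B"),
    ("Club C", "Club D"),
    ("Club A", "Club E"),
]


def feeder_pairs(transfers, min_count=2):
    """Return (seller, buyer, count) for pairs with at least min_count transfers."""
    counts = Counter(transfers)
    return [
        (seller, buyer, n)
        for (seller, buyer), n in counts.most_common()
        if n >= min_count
    ]


print(feeder_pairs(TRANSFERS))
```

In a graph schema, each pair would correspond to a directed transfer edge between two club vertices, with the count stored or aggregated at query time.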
Built With
- football-database
- gsql
- https://www.aerisweather.com/
- python
- sql
- tigergraph