Inspiration
To automate the manual process of in-game football/soccer analysis.
What it does
It annotates in-game footage of football/soccer matches to highlight the ball, the players, and the referee. This is highly useful for coaches' and managers' post-game analysis, helping them form strategies and win matches.
How we built it
We used Python-based computer vision libraries and packages such as ultralytics and roboflow to build a robust detection model based on YOLOv8.
Challenges we ran into
Clustering the players into their two teams was tough. We settled on an approach that uses Sigmoid Loss for Language-Image Pre-Training (SigLIP) to form embeddings from individual player crops, applies UMAP to reduce those embeddings from (N, 768) to (N, 3), and finally runs KMeans with two clusters. This worked like magic, yielding a robust team-assignment model.
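The clustering step can be sketched as follows. This is a minimal illustration, not our actual code: it uses synthetic random vectors in place of real SigLIP crop embeddings, and PCA as a dependency-light stand-in for UMAP (the real pipeline uses umap-learn's `UMAP(n_components=3)` at the reduction step).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic "teams": 768-dim embeddings drawn around two different centres,
# standing in for SigLIP embeddings of player crops.
team_a = rng.normal(loc=0.0, scale=0.05, size=(20, 768))
team_b = rng.normal(loc=1.0, scale=0.05, size=(20, 768))
embeddings = np.vstack([team_a, team_b])  # shape (N, 768)

# Reduce (N, 768) -> (N, 3). PCA here; UMAP in the real pipeline.
reduced = PCA(n_components=3).fit_transform(embeddings)

# Two-cluster division: each label is a team assignment.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
```

After this step, every detected player carries a cluster label, which the annotator maps to a team colour.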
Accomplishments that we're proud of
We successfully generated annotated videos with an average precision *over 90%* in distinguishing between the players of the two teams, the ball, and the referee.
What we learned
Innovation is often at the end of long hours of persistent pursuit.
What's next for Vision for Football
Building visualizations for statistical metrics such as players' top speeds, heat maps, etc.
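One of these planned metrics can be sketched directly from tracker output. The function and sample track below are hypothetical, not part of the project: they estimate a player's top speed from per-frame pitch coordinates in metres.

```python
import numpy as np

def top_speed(positions, fps):
    """Estimate top speed in m/s from a (T, 2) array of (x, y) pitch
    coordinates in metres, sampled at `fps` frames per second."""
    positions = np.asarray(positions, dtype=float)
    deltas = np.diff(positions, axis=0)      # per-frame displacement vectors
    dists = np.linalg.norm(deltas, axis=1)   # metres moved per frame
    return float(dists.max() * fps)          # fastest frame -> metres/second

# Illustrative track: a player moving up to 0.3 m per frame at 25 fps.
track = np.array([[0.0, 0.0], [0.3, 0.0], [0.6, 0.0], [0.8, 0.0]])
speed = top_speed(track, fps=25)  # ~7.5 m/s
```

A heat map would follow the same pattern: accumulate the per-frame (x, y) positions into a 2D histogram over the pitch.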
