Inspiration

To automate the manual process of in-game football/soccer analysis.

What it does

It annotates in-game footage of football/soccer matches to highlight the ball, the players, and the referee. This is highly useful for post-match analysis, helping coaches and managers devise strategies and win matches.

How we built it

We used Python computer-vision libraries and packages such as ultralytics and roboflow to build a robust detection model based on YOLOv8.
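The detection step might look like the sketch below using the ultralytics API. This is a hedged illustration, not our exact training script: the dataset config name `football.yaml`, the video filename, and the class names are hypothetical stand-ins (the real dataset was prepared with roboflow).

```python
# Minimal sketch of fine-tuning and running a YOLOv8 detector with ultralytics.
# "football.yaml" and "match_clip.mp4" are hypothetical file names.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                    # start from a pretrained checkpoint
model.train(data="football.yaml", epochs=50)  # fine-tune on player/ball/referee data
results = model.predict("match_clip.mp4", conf=0.3)

for r in results:
    for box in r.boxes:
        name = model.names[int(box.cls)]      # e.g. "player", "ball", "referee"
        x1, y1, x2, y2 = box.xyxy[0].tolist() # bounding-box corners for annotation
```

The predicted boxes and class names can then be drawn onto each frame to produce the annotated video.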

Challenges we ran into

Clustering the players into their two teams was tough. An innovative approach worked like magic: we used Sigmoid Loss for Language-Image Pre-training (SigLIP) to form embeddings from individual player crops, reduced those embeddings from (N, 768) to (N, 3) with UMAP, and finally split them into two clusters with KMeans, yielding a robust team-assignment model.
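The pipeline above can be sketched as follows. Since SigLIP and UMAP pull in heavy dependencies, this illustration substitutes synthetic (N, 768) vectors for the real SigLIP player-crop embeddings and uses PCA as a stand-in for UMAP as the (N, 768) → (N, 3) reducer; the final two-cluster KMeans split is the same as in our pipeline.

```python
# Sketch of the team-clustering step: embed -> reduce to 3-D -> 2-way KMeans.
# Synthetic embeddings and PCA stand in for SigLIP and UMAP respectively.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic "teams": 768-D embedding clouds around different centers,
# mimicking the kit-color separation SigLIP captures from player crops.
team_a = rng.normal(loc=0.0, scale=0.1, size=(20, 768))
team_b = rng.normal(loc=1.0, scale=0.1, size=(20, 768))
embeddings = np.vstack([team_a, team_b])                 # shape (N, 768)

reduced = PCA(n_components=3).fit_transform(embeddings)  # (N, 768) -> (N, 3)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
```

Each detected player crop then carries a team label (0 or 1) that drives the color of its annotation in the output video.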

Accomplishments that we're proud of

We successfully generated annotated videos of the two teams, with an average precision *over 90%* in distinguishing between the players of different teams, the ball, and the referee.

What we learned

Innovation is often at the end of long hours of persistent pursuit.

What's next for Vision for Football

Building visualizations for statistical metrics like top speed of players, heat maps, etc.
