Inspiration

Cricket is followed by billions, yet detailed performance analytics are mostly available only at the international level. Domestic and club players often lack access to structured data insights. With a cricketer on the team, we wanted to bridge this gap by creating a platform that analyzes every level of cricket, giving equal analytical depth to both rising and established players.

What it does

Cricklytics analyzes data from over 11,000 cricket matches collected through custom web scraping. It identifies patterns in player performance to:

• Classify batters and bowlers based on their playing styles and consistency
• Estimate workload levels to understand fatigue and predict potential injuries
• Recommend balanced team combinations based on player form and fitness

It also features a custom Gemini model fine-tuned on our processed cricket dataset. This version outperforms the standard Gemini model on cricket-related analysis and lets users query insights, player comparisons, and workload risks in natural language.

The system computes a workload score that reflects a player’s match intensity, rest periods, and recent activity:

W = f(P, F, S)

where:

• P = performance and match involvement
• F = fitness and recovery time
• S = schedule and travel frequency
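The exact form of f is not spelled out here, so the following is only a minimal sketch of one way such a score could be computed. It assumes each input has already been normalized to the range 0 to 1 and combines them as a weighted sum. The class name, field names, and weights are all hypothetical placeholders, not values from Cricklytics.

```python
from dataclasses import dataclass


@dataclass
class WorkloadInputs:
    """Hypothetical, pre-normalized inputs, each expected in [0, 1]."""
    involvement: float  # P: performance and match involvement (higher = more load)
    recovery: float     # F: fitness and recovery time (higher = better recovered)
    schedule: float     # S: schedule and travel frequency (higher = busier)


def workload_score(x: WorkloadInputs,
                   w_p: float = 0.5,
                   w_f: float = 0.3,
                   w_s: float = 0.2) -> float:
    """Illustrative W = f(P, F, S): a weighted sum clamped to [0, 1].

    The weights are placeholders chosen for the example only.
    Recovery enters as (1 - F) so that better recovery lowers the score.
    """
    for value in (x.involvement, x.recovery, x.schedule):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be normalized to [0, 1]")
    raw = w_p * x.involvement + w_f * (1.0 - x.recovery) + w_s * x.schedule
    return max(0.0, min(1.0, raw))


# Two made-up players: one well rested, one with a packed schedule.
rested = workload_score(WorkloadInputs(involvement=0.4, recovery=0.9, schedule=0.2))
overworked = workload_score(WorkloadInputs(involvement=0.9, recovery=0.2, schedule=0.8))
print(f"rested={rested:.2f} overworked={overworked:.2f}")
```

A linear combination is used here only because it is easy to inspect. A learned model could replace it without changing the three inputs.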

How we built it

We started by scraping and processing match data from multiple leagues and tournaments, cleaning thousands of inconsistent and incomplete records. This raw data was then transformed into a unified, high-quality dataset, something we could not find publicly available for cricket at this scale. From this foundation, we trained neural networks to classify players, analyze performance trends, and estimate injury risk from workload patterns.
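As a rough illustration of the cleaning step, the sketch below normalizes records from sources with different date layouts, drops incomplete rows, and removes duplicates. The field names (`date`, `teams`, `venue`), the date formats, and the sample records are assumptions made for the example and do not reflect the actual Cricklytics schema.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical date layouts that different scraped sources might use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def parse_date(text: str) -> Optional[date]:
    """Try each known layout; return None if none of them match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean_matches(raw_records: list) -> list:
    """Normalize records, drop incomplete ones, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        match_date = parse_date(rec.get("date") or "")
        teams = [str(t).strip().title() for t in (rec.get("teams") or [])
                 if str(t).strip()]
        venue = (rec.get("venue") or "").strip().title()
        if match_date is None or len(teams) != 2 or not venue:
            continue  # incomplete record: skip it
        # Team order is ignored so "A v B" and "B v A" count as one match.
        key = (match_date, frozenset(teams), venue)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({"date": match_date, "teams": sorted(teams), "venue": venue})
    return cleaned


# Made-up sample: the second row duplicates the first, the third has no date.
raw = [
    {"date": "2023-03-12", "teams": ["north zone", "South Zone "], "venue": "city ground"},
    {"date": "12/03/2023", "teams": ["South Zone", "North Zone"], "venue": "City Ground"},
    {"date": "", "teams": ["East Zone", "West Zone"], "venue": "Town Oval"},
]
cleaned = clean_matches(raw)
print(len(cleaned), cleaned[0]["teams"])
```

Keying duplicates on date, team pair, and venue is a simplification. Two matches between the same teams at the same ground on the same day would be merged under this rule.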

Our backend was built using FastAPI for speed and scalability, while Next.js powers the interactive frontend interface. Supabase handles real-time data storage, authentication, and seamless integration across modules.

One of the project’s biggest breakthroughs was fine-tuning a Gemini model on our own dataset. This specialized version goes beyond standard models by understanding cricket-specific context, player roles, and match dynamics, enabling anyone to query deep insights through natural language.

Every component, from model training to deployment, is version-controlled on GitHub. We also published the processed dataset on Kaggle so that developers, analysts, and researchers worldwide can build upon it. To our knowledge, it is one of the first open, cleaned cricket datasets of this scale and depth.

Challenges we ran into

• Collecting consistent data from multiple sources with varying formats
• Cleaning incomplete or duplicate match records
• Ensuring workload metrics made sense in real cricket scenarios
• Translating analytical outputs into actionable insights for selectors and scouts

Accomplishments that we're proud of

• Built one of the first comprehensive cleaned datasets of 11K+ cricket matches and published it on Kaggle
• Fine-tuned a Gemini model specifically for cricket analytics and player insights
• Combined statistical modeling with conversational AI for an intuitive experience
• Made professional-grade analytics accessible to domestic players, coaches, and scouts

What we learned

• How meaningful insights come from clean, reliable data
• The real-world connection between player workload and performance
• How to simplify machine learning results into practical cricket understanding
• The importance of making open datasets accessible to the community

What's next for Cricklytics - AI Powered Cricket Analytics Platform

We plan to:

• Integrate live match data for real-time workload and performance updates
• Extend analytics to women’s cricket
• Build mobile and web dashboards for selectors, coaches, and analysts
• Partner with academies and domestic boards to support data-driven scouting and training

Cricklytics will continue to evolve as a platform that connects cricket passion with data intelligence - empowering every player, at every level, with insights that were once out of reach.
