INSPIRATION:
The inspiration for this project came from the idea of leveraging AI and data analytics to enhance the baseball experience for players, teams, and fans.
KEY MOTIVATIONS:
- Providing real-time game analysis for fans.
- Helping teams and players with data-driven performance evaluation.
- Improving fan engagement through personalized content.
- Exploring AI-powered predictive models to forecast outcomes and player performance.
TOOLS USED:
Throughout this project, we gained hands-on experience with:
- BigQuery for large-scale data handling:
Learned how to clean, store, and query massive MLB datasets efficiently.
- Google Cloud AI/ML tools:
Used Vertex AI for training and deploying models.
- Data Cleaning & Preprocessing:
Ensured that all datasets were free from anomalies.
- Predictive Modeling:
Developed AI models for win probability, play-by-play predictions, and player insights.
- Fan Behavior Analysis:
Learned how user engagement data can personalize the baseball experience.
- Web Integration:
Explored API-based architecture to serve insights dynamically.
HOW WE BUILT THIS PROJECT:
Phase 1: Data Collection & Cleaning
- Uploaded MLB data to Google BigQuery.
- Cleaned datasets by removing anomalies, standardizing formats, and ensuring consistency.
- Created structured tables for game-level, player-level, fan interaction, and video insights.
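The cleaning steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the field names (`game_date`, `player`, `exit_velocity`), the sample rows, and the 0 to 130 mph plausibility range are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical raw rows; field names and values are illustrative only.
RAW_ROWS = [
    {"game_date": "2024-04-01", "player": " Player A ", "exit_velocity": "108.3"},
    {"game_date": "04/02/2024", "player": "Player B", "exit_velocity": "99.1"},
    {"game_date": "2024-04-03", "player": "Player C", "exit_velocity": ""},     # missing value
    {"game_date": "2024-04-04", "player": "Player D", "exit_velocity": "999"},  # anomaly
]

# Date formats we assume may appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return the date as an ISO string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows, min_velocity=0.0, max_velocity=130.0):
    """Standardize formats, drop rows with missing values, and filter anomalies."""
    cleaned = []
    for row in rows:
        date = parse_date(row.get("game_date", ""))
        try:
            velocity = float(row.get("exit_velocity", ""))
        except (TypeError, ValueError):
            velocity = None
        if date is None or velocity is None:
            continue  # missing or unparseable value
        if not (min_velocity <= velocity <= max_velocity):
            continue  # outside the assumed plausible range
        cleaned.append(
            {"game_date": date, "player": row["player"].strip(), "exit_velocity": velocity}
        )
    return cleaned


cleaned = clean_rows(RAW_ROWS)
```

Rows that survive share one date format and one numeric type, which is the property the structured tables rely on.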
Phase 2: Model Development
- Built AI models to predict game outcomes, player performance, and clutch moments.
- Used statistical analysis for non-AI insights like team comparisons and engagement trends.
- Implemented ML pipelines in Vertex AI for automated model training and deployment.
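To illustrate the idea behind a win probability model, here is a small logistic regression trained with batch gradient descent in plain Python. It is a stand-in for the models described above, not the project's Vertex AI or TensorFlow code: the two features (run differential and fraction of the game played) and the toy training data are synthetic and chosen purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def win_probability(weights, bias, x):
    """Estimated probability that the home team wins given features x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Synthetic toy data: [home run differential, inning / 9]; label 1 = home win.
FEATURES = [[3, 0.8], [-2, 0.9], [1, 0.5], [-4, 0.3], [2, 1.0], [-1, 0.7], [0, 0.2], [5, 0.6]]
LABELS = [1, 0, 1, 0, 1, 0, 1, 1]

weights, bias = train_logistic(FEATURES, LABELS)
p_ahead = win_probability(weights, bias, [4, 0.9])
p_behind = win_probability(weights, bias, [-4, 0.9])
```

A production model would use far more features and real play-by-play data, but the output has the same shape: a probability between 0 and 1 that can be updated as the game state changes.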
Phase 3: Web & API Integration
- Designed APIs to fetch real-time analytics from trained models.
- Developed an interactive web dashboard to display insights.
- Connected fan engagement data to personalize highlight reels and content recommendations.
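The API layer can be sketched as a small WSGI application using only the standard library. The project lists Flask in its stack; this sketch swaps in a bare WSGI callable so it stays self-contained. The `/win-probability` route, the `run_diff` query parameter, and the `predict_win_probability` stub (which stands in for a call to a deployed model) are hypothetical.

```python
import json
import math
from urllib.parse import parse_qs


def predict_win_probability(run_diff):
    """Hypothetical stand-in for a request to a deployed model endpoint."""
    return 1.0 / (1.0 + math.exp(-0.6 * run_diff))


def json_response(start_response, status, payload):
    """Send a JSON body with the given HTTP status line."""
    start_response(status, [("Content-Type", "application/json")])
    return [json.dumps(payload).encode("utf-8")]


def app(environ, start_response):
    """WSGI app exposing one analytics route (route name is an assumption)."""
    if environ.get("PATH_INFO") != "/win-probability":
        return json_response(start_response, "404 Not Found", {"error": "not found"})
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        run_diff = float(query["run_diff"][0])
    except (KeyError, ValueError):
        return json_response(
            start_response, "400 Bad Request", {"error": "run_diff must be a number"}
        )
    payload = {
        "run_diff": run_diff,
        "home_win_probability": round(predict_win_probability(run_diff), 3),
    }
    return json_response(start_response, "200 OK", payload)
```

For local experimentation, the app can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`, after which a dashboard could request `/win-probability?run_diff=2`.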
CHALLENGES FACED:
Data Cleaning Complexity:
- Merging different MLB datasets required careful handling of missing values, outliers, and inconsistencies.
Model Performance Optimization:
- Achieving high accuracy in predictive models required fine-tuning hyperparameters and using appropriate feature engineering techniques.
Real-Time Processing:
- Ensuring low-latency predictions for live game tracking was challenging and required efficient cloud-based deployment.
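One common way to cut latency for repeated requests is to cache recent predictions for a short time. The write-up does not say whether caching was used here, so the sketch below is only an illustration of the general technique; the `TTLCache` class, the key format, and the 5-second lifetime are assumptions.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds (illustrative helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or call compute() and cache the result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = compute()
        self._store[key] = (now, value)
        return value


# Example: a fake clock and a counter show when the model is actually called.
fake_now = [0.0]
model_calls = []


def slow_model_call():
    model_calls.append(1)  # stands in for a slow request to a model endpoint
    return 0.62


cache = TTLCache(ttl_seconds=5, clock=lambda: fake_now[0])
first = cache.get_or_compute("game-123", slow_model_call)   # computed
second = cache.get_or_compute("game-123", slow_model_call)  # served from cache
fake_now[0] = 6.0                                           # entry has expired
third = cache.get_or_compute("game-123", slow_model_call)   # computed again
```

The trade-off is staleness: a short lifetime keeps predictions close to the live game state while still absorbing bursts of identical requests.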
Scalability Issues:
- Managing large datasets required optimizing BigQuery queries for performance.
FUTURE ENHANCEMENTS:
- Implement real-time sentiment analysis on fan discussions and commentary.
- Expand AI models to simulate game strategies for coaches and players.
- Improve fan engagement features with customized notifications and social media integration.
This project was an incredible learning journey, blending sports, AI, and data analytics to create an engaging MLB analytics platform. The experience has provided invaluable skills in cloud computing, machine learning, and big data analytics.
Built With
- bigquery
- cloud-functions
- cloud-logging
- cloud-run
- cloud-storage
- dataflow
- docker
- firestore
- flask
- google-cloud
- javascript
- mlb-stats-api
- numpy
- pandas
- python
- react
- sql
- tensorflow
- vertex-ai