Inspiration

For this project, I was inspired by the chance to build a video analysis tool on a real GCP environment with Vertex AI tools and the Gemini LLM. It was a great hands-on experience, as I was building something I had never done before.

What it does

My project is called "Statcast Extractor". Its main goal is to let users find and review video content from past games and extract metrics from it. Users only need to know which game they are looking for: they select the game details (season, team, game) and choose a video. Extraction can then be started with one click. It currently detects the following metrics: Pitch Velocity, Exit Velocity, Projected HR Distance, Launch Angle, Max Height, and Arm Strength.

How we built it

The first step was to build a dataset with a large amount of Statcast-related video content. I found Vertex AI Workbench helpful for developing a Python script. The script walked through the 2019-2024 seasons, collected all related teams, games, and content from statsapi.mlb.com/api/v1, kept only home-run and Statcast videos, and saved the final dataset to BigQuery. For this, I used:

  • /teams?season={season}&active=true - to get the active teams for the selected season
  • /schedule?sportId=1&season={season}&gameType=R - to get games and combine them with the active teams
  • /game/{game_pk}/content - to get game content

For filtering, I used the keywordsALL property returned by the /content API. A video qualifies if it has at least one of the tags below:

{
    "type": "taxonomy",
    "value": "player-tracking",
    "displayName": "Statcast"
}
{
    "type": "taxonomy",
    "value": "home-run",
    "displayName": "home run"
}
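
The endpoint list and tag filter above can be sketched roughly as follows. This is a minimal illustration, not the original script: the https scheme, the helper names, and the "keywordsAll" casing of the keyword field are assumptions, and the sample items are made up to show the filter in action.

```python
from urllib.parse import urlencode

# Base path taken from the write-up; the https scheme is an assumption.
BASE_URL = "https://statsapi.mlb.com/api/v1"

# Taxonomy values of the two tags shown above.
WANTED_TAGS = {"player-tracking", "home-run"}


def teams_url(season):
    """URL listing the active teams for one season."""
    return f"{BASE_URL}/teams?" + urlencode({"season": season, "active": "true"})


def schedule_url(season):
    """URL listing the regular-season games (gameType=R) for one season."""
    params = {"sportId": 1, "season": season, "gameType": "R"}
    return f"{BASE_URL}/schedule?" + urlencode(params)


def content_url(game_pk):
    """URL returning the content (including videos) for one game."""
    return f"{BASE_URL}/game/{game_pk}/content"


def has_wanted_tag(item, field="keywordsAll"):
    """True if the item carries at least one of the wanted taxonomy tags.

    The field name's casing is an assumption; pass a different `field`
    if the response spells it differently.
    """
    return any(
        kw.get("type") == "taxonomy" and kw.get("value") in WANTED_TAGS
        for kw in item.get(field, [])
    )


def filter_videos(items):
    """Keep only the content items tagged as Statcast or home run."""
    return [item for item in items if has_wanted_tag(item)]


if __name__ == "__main__":
    # Made-up items, shaped like the tags shown above.
    sample = [
        {"id": "a", "keywordsAll": [{"type": "taxonomy", "value": "home-run"}]},
        {"id": "b", "keywordsAll": [{"type": "taxonomy", "value": "interview"}]},
    ]
    print(teams_url(2019))
    print([item["id"] for item in filter_videos(sample)])
```

Keeping the filter as a pure function over already-downloaded items makes it easy to check against a saved response before running it over six seasons of games.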

The next step was to implement a UI application that gives users an easy way to find specific game content in the BigQuery dataset. For this I built a simple React/Next.js app (plus a few third-party libraries) for the front end and Cloud Functions for the back end (using functions_framework). With Cloud Run and Cloud Build, I created services and build triggers for them, so that new changes are always deployed easily and available on the 'production' environment. As a result, the user can now find the exact game video and review it.

The final step was implementing the video analysis mechanism. Here I decided to use Cloud Functions with the vertexai library (for Python) to interact with the Gemini model. After the user triggers extraction for a video, the first step is to upload the video to a GCS bucket. The second step is to set up an effective Gemini prompt that describes what should be done and to provide the video as content for the LLM. The last step is to run the analysis and return the results in a predefined JSON schema, so that they are easy to parse on the UI side. To construct and test an effective prompt structure I used Google AI Studio and the Vertex AI Prompt Gallery. They also helped me test different model versions and temperatures, and as a result I found the gemini-2.0-flash-exp model to be the most effective for this task.
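
A rough sketch of the analysis step described above, assuming the video is already in a GCS bucket. The field names in the schema, the prompt text, the default region, and the helper names are all hypothetical; only the model name and the list of metrics come from this write-up. The vertexai import is kept inside the function so the parsing part can be tried without GCP credentials.

```python
import json

MODEL_NAME = "gemini-2.0-flash-exp"  # model named in the write-up

# Hypothetical field names, one per metric listed in "What it does".
METRIC_FIELDS = [
    "pitch_velocity",
    "exit_velocity",
    "projected_hr_distance",
    "launch_angle",
    "max_height",
    "arm_strength",
]

# OpenAPI-style response schema; every metric is nullable because a given
# clip may not show all of them.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        name: {"type": "number", "nullable": True} for name in METRIC_FIELDS
    },
    "required": METRIC_FIELDS,
}

# Placeholder prompt, not the one used in the project.
PROMPT = (
    "Watch the baseball video and report the Statcast metrics shown on "
    "screen. Use null for any metric that does not appear."
)


def parse_metrics(raw_text):
    """Turn the model's JSON text into a dict with one entry per metric.

    Missing or non-numeric values become None; invalid JSON raises ValueError.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError("model response is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model response is not a JSON object")
    metrics = {}
    for name in METRIC_FIELDS:
        value = data.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        metrics[name] = float(value) if is_number else None
    return metrics


def analyze_video(gcs_uri, project, location="us-central1", temperature=0.0):
    """Send a video stored in GCS to Gemini and return the parsed metrics.

    Requires the vertexai SDK and GCP credentials; location and temperature
    defaults are assumptions.
    """
    import vertexai
    from vertexai.generative_models import GenerationConfig, GenerativeModel, Part

    vertexai.init(project=project, location=location)
    model = GenerativeModel(MODEL_NAME)
    video = Part.from_uri(gcs_uri, mime_type="video/mp4")
    config = GenerationConfig(
        temperature=temperature,
        response_mime_type="application/json",
        response_schema=RESPONSE_SCHEMA,
    )
    response = model.generate_content([video, PROMPT], generation_config=config)
    return parse_metrics(response.text)
```

Validating the response in `parse_metrics` rather than trusting the schema alone means the UI always receives the same set of keys, even when the model omits a metric or returns a non-numeric value.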

Challenges we ran into

The first challenge was data acquisition, which I addressed with a Python script that retrieves video content from statsapi.mlb.com/api and organizes it by team, season, and game. The data was filtered down to Statcast and home-run videos and saved to BigQuery.

The second challenge was UI development. I built a user interface with Next.js/React and Cloud Functions so that game video content can be viewed in a more user-friendly and organized way.

The third challenge was extracting metrics from video content using a cloud function and the Gemini LLM. The most time-consuming part was finding an effective prompt structure and model version, but the handy tools in the Vertex AI Prompt Gallery helped a lot.

Accomplishments that we're proud of

I successfully built an end-to-end video analysis tool using GCP, Vertex AI, and the Gemini LLM that helps users extract Statcast metrics.

What we learned

I learned a lot, but my top 3 takeaways are:

  • Building all components of the app on GCP is not as complicated as I thought
  • Integrating and configuring the Gemini LLM is now much clearer to me
  • Gemini has good video analysis capabilities

What's next for Statcast Extractor

As next steps for this project, I would develop additional features such as:

  • the ability to select multiple videos
  • the ability to specify an output destination (for example, getting the output in CSV format)
  • support for more metrics
  • improved detection accuracy
