Inspiration

This project aims to make baseball more accessible to everyone, particularly those who are visually impaired and currently unable to enjoy the game's traditional visual experience. Cutting-edge technology is being used to create new ways for these fans to experience the excitement of baseball through sound.

Here's how it works:

  • Immersive Audio: Detailed audio descriptions of the MLB season, leagues, teams, players, and games are provided. This includes information beyond just the game action, offering a richer understanding of the broader baseball landscape.
  • Sensory Translation: Technology enables game details to be translated into rich auditory experiences, so fans can "hear" the action unfold. This focuses on capturing the nuances of the game and conveying them through sound.
  • Real-Time Narratives: Comprehensive game narratives are tailored specifically for visually impaired audiences, providing a dynamic and engaging listening experience that replicates the excitement of watching a game.

Goal:

  • Increase inclusivity in sports: Everyone should have the opportunity to enjoy baseball, regardless of their visual abilities.
  • Enrich entertainment: Everyone deserves access to engaging and meaningful entertainment experiences.
  • Showcase the power of technology: This project demonstrates how technology can be used to break down barriers and create a more inclusive world.
  • Commitment to inclusion: This project aligns with Major League Baseball's ongoing commitment to diversity and inclusion, ensuring that the thrill of baseball can be shared by all.

What it does

This project gives fans access to MLB information through an audio interface:

Audio search:

Users issue voice commands to retrieve up-to-date MLB data. Gemini 2.0 Flash's audio processing capabilities interpret these commands and return real-time MLB data to fans as audio.
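As a rough illustration of how spoken input might travel between the browser and the backend, the sketch below splits raw audio into chunks and wraps each one in a JSON text frame that could be sent over a WebSocket. The message layout (`type`, `mime_type`, `data`), the chunk size, and the function names are assumptions made for this example and are not taken from the project's code.

```python
import base64
import json


def split_into_chunks(pcm: bytes, chunk_size: int = 4096):
    """Yield fixed-size slices of raw PCM audio; the last one may be shorter."""
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]


def encode_audio_chunk(pcm: bytes, sample_rate: int = 16000) -> str:
    """Wrap raw PCM bytes in a JSON text frame (hypothetical message layout)."""
    return json.dumps({
        "type": "audio",
        "mime_type": f"audio/pcm;rate={sample_rate}",
        # Binary audio is base64-encoded so it survives a text frame.
        "data": base64.b64encode(pcm).decode("ascii"),
    })


def decode_audio_chunk(frame: str) -> bytes:
    """Recover the raw PCM bytes from a frame built by encode_audio_chunk."""
    message = json.loads(frame)
    if message.get("type") != "audio":
        raise ValueError("not an audio frame")
    return base64.b64decode(message["data"])
```

The same framing can carry the model's spoken reply back to the browser, which decodes each frame and queues the audio for playback.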

Audio search gets its data from the MLB Stats API, which supplies real-time data on MLB leagues, seasons, teams, rosters, and games. Gemini 2.0 Flash's function calling capabilities integrate this data and present it to users in a clear, accessible format.
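A minimal sketch of how function calling could connect the model to the MLB Stats API is shown below. The tool name `get_standings`, its parameters, the `dispatch` helper, and the injected `fetch` callable are hypothetical and chosen only for illustration; the endpoint path and league IDs are assumptions about the public MLB Stats API and should be checked against its documentation.

```python
from urllib.parse import urlencode

# Assumed root of the public MLB Stats API.
BASE_URL = "https://statsapi.mlb.com/api/v1"

# Hypothetical tool declaration in the JSON-schema style used for function calling.
GET_STANDINGS_DECLARATION = {
    "name": "get_standings",
    "description": "Return the standings for one MLB league in a given season.",
    "parameters": {
        "type": "object",
        "properties": {
            "league_id": {
                "type": "integer",
                "description": "103 = American League, 104 = National League",
            },
            "season": {"type": "integer", "description": "Four-digit season year"},
        },
        "required": ["league_id", "season"],
    },
}


def standings_url(league_id: int, season: int) -> str:
    """Build the request URL for the standings endpoint."""
    query = urlencode({"leagueId": league_id, "season": season})
    return f"{BASE_URL}/standings?{query}"


def get_standings(league_id: int, season: int, fetch):
    """Fetch standings; `fetch` is any callable that maps a URL to parsed JSON."""
    return fetch(standings_url(league_id, season))


# Maps the function name chosen by the model to the Python handler.
HANDLERS = {"get_standings": get_standings}


def dispatch(function_call: dict, fetch) -> dict:
    """Run the handler named in a model-issued function call."""
    name = function_call["name"]
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown function: {name}"}
    return handler(fetch=fetch, **function_call.get("args", {}))
```

Passing `fetch` in as an argument keeps the handler testable without network access; in a deployment it could be a thin wrapper around an HTTP client that returns the decoded JSON body.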

The system provides:

  • Live game data and Clutch Plays
  • Comprehensive player and team data
  • MLB League and Season data
  • Historical data for in-depth analysis
  • Team Standings

This approach makes MLB information readily available to all fans, including those with visual impairments, enhancing the baseball experience for everyone.

How we built it

Technologies used

  • Gemini Multimodal Live API
  • Gemini 2.0 Flash Exp
  • Vertex Generative AI SDK
  • Google Gen AI SDKs
  • React
  • Google Material Design
  • Python
  • Flask
  • WebSockets

Technology Architecture

Note - Web and Video capabilities have been added in the 2nd iteration.

Frontend Architecture

  • Frameworks -
    • React - The frontend of the MLB Clutch Moments application is built with React.
    • Google Material - The frontend design uses Google Material to provide a rich user experience.
  • Deployment - Google Cloud Run is used for deployment of the React application

Backend Architecture

Challenges we ran into

  • Size of the datasets - Because the MLB datasets are large, model quotas were often exceeded. Adding filters reduced the size of the datasets.
  • WebSockets deployment - Deploying WebSockets was challenging, but I found a server-based solution.
  • Function calling chains - After several iterations on chained function calls, I learned to solve this problem by providing clear function instructions.
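The dataset-size filtering mentioned above might look something like the following sketch, which keeps only a few fields per player before the data is handed to the model. The response shape (`roster`, `person.fullName`, `jerseyNumber`, `position.abbreviation`) is an assumption about an MLB Stats API roster payload, and `trim_roster` is a hypothetical helper, not the project's actual filter.

```python
def trim_roster(payload: dict, max_players: int = 40) -> list:
    """Reduce a roster response to the fields a spoken answer needs."""
    trimmed = []
    # Cap the number of entries so one call cannot flood the model's context.
    for entry in payload.get("roster", [])[:max_players]:
        person = entry.get("person", {})
        position = entry.get("position", {})
        trimmed.append({
            "name": person.get("fullName"),
            "number": entry.get("jerseyNumber"),
            "position": position.get("abbreviation"),
        })
    return trimmed
```

Dropping links, status blocks, and other nested metadata shrinks each function response, which is one way to stay within model quotas.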

Accomplishments that we're proud of

  • Building the audio experience - Using Google's Multimodal Live API, I developed an inclusive audio experience for baseball fans that lets visually impaired users, and everyone else, follow the game through real-time audio description. I hope this app helps as many people as possible.

What we learned

  • Google Gemini 2.0 Flash - Learned about the power of the Multimodal Live API, which helped in finalizing the MLB audio search use case.
  • Google Cloud - Learned Google Cloud from scratch.
  • MLB API - Learned MLB terminology and studied different APIs to present the data.
  • WebSockets - Learned a lot about WebSockets while building the audio API and interface.

What's next for MLB Clutch Moments

  • Identified MLB web search use cases that could be enabled with Gemini 2.0 and started building those features into the app, along with updating the architecture and design to support them.

Web search:

Features an auto-complete search function for quick and easy information retrieval.

  • Started working on video summarization of game videos.
  • Gather feedback and refine the solution.
  • Add video capabilities to the app to provide an in-stadium experience.
