About Us:

Welcome to Baseball Analytics, a project created for the Google Cloud MLB Hackathon. Our team is dedicated to using Google Cloud technologies and advanced machine learning techniques to change the way baseball teams, players, and fans experience the game. The goal of our project is to create a data-driven platform that gives fans, coaches, and teams actionable insights. By using machine learning algorithms, real-time data analysis, and personalized player metrics, we aim to improve decision-making, enhance player performance, and provide deeper insights into the game of baseball.

Google Cloud’s suite of tools provides the foundation for our project. Because it can handle large datasets, run advanced machine learning models, and deliver results in real time, Google Cloud lets us build a robust solution that can scale with the evolving needs of the baseball industry. Our platform is powered by Google Cloud’s AI and ML services, which allow us to process game footage, player statistics, and live events. By integrating Google Cloud’s analytics tools, we can offer predictive analytics and personalized insights that support both player development and fan engagement.

Our team is made up of passionate engineers, data scientists, and baseball enthusiasts who came together to participate in the Google Cloud MLB Hackathon. We are a team of college students. Each member brings a unique skill set to the table, including machine learning expertise, data engineering, and a deep understanding of the game of baseball. We are united by a shared vision: to create solutions that not only enhance the game but also make baseball analytics more accessible to fans and professionals alike.
We believe that technology can make baseball a more dynamic and interactive sport for fans and teams alike. Our vision is to use data and machine learning to provide deeper insights, improve player performance, and create more engaging experiences for fans.

Through our website and the tools we are developing for the Google Cloud MLB Hackathon, you can expect to find a range of features including personalized player statistics, predictive analysis, and detailed game insights. Our goal is to make the power of data accessible to everyone, from casual fans to professional coaches.

As part of our hackathon project, we are working hard to bring this vision to life using Google Cloud’s machine learning and data analytics capabilities. Stay tuned for more updates as we continue to enhance and refine our platform.

At Baseball Analytics, our mission is to improve the performance of baseball teams and players with data-driven insights built on advanced machine learning techniques.

Tech Description:

The project uses Automatic Speech Recognition (ASR) to produce JSON-formatted output that records the timing of spoken content. The ASR output is an array in which each element is a JSON object. One key, "text," holds the spoken words, while another key, "time," holds a timestamp in seconds indicating when the speech begins. Many ASR pipelines return only a transcript, without phrase-level timestamps. Google’s Gemini AI not only recognizes the speech content but also identifies when each phrase starts, which solves a critical problem: keeping the translated audio synchronized with the original video.

To maintain authenticity and naturalness across different languages, the project uses GPT-SoVITS, a voice cloning and synthesis model, to extract and replicate the original commentator's voice characteristics. This ensures that the generated Japanese and Spanish audio closely matches the tone and expressiveness of the original English broadcast, rather than sounding robotic or monotonous. The goal is to recreate the immersive baseball atmosphere in multiple languages while preserving the original speaker’s vocal nuances.

Personalized Fan Highlights Workflow:

Audio-Video Separation:

The original video is processed to extract audio and visuals separately. Vertex AI and Spleeter (an open-source project) are used to separate the commentator's voice from background noise such as crowd cheers and stadium sounds. The entire pipeline is deployed on Google Cloud servers for efficient processing.

Speech Recognition and Translation:

Gemini AI is used to generate text transcriptions (what was said) and timestamps (when it was said). The transcribed text is then translated into Japanese and Spanish using Gemini AI Translation.

Voice Cloning and Audio Generation:

GPT-SoVITS Text-to-Speech (TTS) converts the translated text into speech. The model is trained using the original commentator’s voice profile, ensuring that the generated Japanese and Spanish commentary retains the same vocal characteristics as the original English broadcast.

Audio-Video Synchronization & Integration:

The processed audio files are aligned with the original timestamp data extracted by Gemini AI. A script inserts the newly generated commentary into the original video while maintaining accurate synchronization.

Final Assembly & Deployment:

The final step merges the video, background crowd sounds, and the generated multilingual commentary audio into a seamless highlight package. For backend processing, general functionality that does not require Vertex AI is implemented using Firebase Functions, Cloud Run, App Engine, and containers for scalable and efficient deployment.

By integrating AI models for speech recognition, translation, and voice cloning, the project lets fans across different languages experience authentic and immersive baseball commentary, personalized to their preferences.
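As a rough illustration of the synchronization step, the sketch below reads ASR output in the format described earlier (an array of objects with "text" and "time" keys) and decides where each translated clip should start. It is not the project's actual script. The names `plan_placements`, `clip_durations`, `video_length`, and `max_speed` are hypothetical, and speeding up a clip that would run into the next phrase is an assumed strategy, not something the project description specifies.

```python
import json
from dataclasses import dataclass


@dataclass
class Placement:
    text: str        # original phrase from the ASR output
    start: float     # seconds into the video where the dubbed clip begins
    duration: float  # clip length in seconds after any speed-up
    speed: float     # playback-rate factor (1.0 means unchanged)


def plan_placements(asr_json, clip_durations, video_length, max_speed=1.25):
    """Place one translated clip at each ASR timestamp.

    asr_json:       JSON array of {"text": ..., "time": ...} objects.
    clip_durations: length in seconds of each translated clip, in the
                    same order as the ASR segments (assumed input).
    video_length:   total video length in seconds.
    max_speed:      assumed cap on how much a clip may be sped up.
    """
    segments = json.loads(asr_json)
    if len(segments) != len(clip_durations):
        raise ValueError("need exactly one translated clip per ASR segment")

    # Pair each segment with its clip before sorting so the pairing survives.
    paired = sorted(zip(segments, clip_durations), key=lambda p: p[0]["time"])

    placements = []
    for i, (seg, dur) in enumerate(paired):
        start = float(seg["time"])
        # The clip's slot ends where the next phrase begins, or at video end.
        if i + 1 < len(paired):
            slot_end = float(paired[i + 1][0]["time"])
        else:
            slot_end = video_length
        slot = slot_end - start

        # Speed the clip up only if it overruns its slot, never past the cap.
        # A clip that hits the cap may still overlap the next phrase.
        speed = 1.0
        if slot > 0 and dur > slot:
            speed = min(dur / slot, max_speed)
        placements.append(Placement(seg["text"], start, dur / speed, speed))
    return placements


if __name__ == "__main__":
    asr = json.dumps([
        {"text": "Here is the pitch", "time": 1.5},
        {"text": "Swing and a miss", "time": 4.0},
    ])
    for p in plan_placements(asr, [2.0, 3.0], video_length=6.0):
        print(p)
```

Each resulting start time and speed factor could then be handed to whatever audio tool performs the actual mixing. Capping the speed-up keeps the cloned voice sounding natural, at the cost of occasional overlap that would need separate handling.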
