Inspiration
The 2024 Summer Olympics are a global event, watched by millions. We wanted to create a tool that would allow fans, broadcasters, and content creators to relive the most thrilling Olympic moments quickly and in a personalized way. Traditional highlight generation is time-consuming and limited by manual editing. Our goal was to help producers create better content by generating engaging commentary on the moments each viewer actually cares about.
What it does
Our AI-powered system takes a natural-language prompt from the user, generates a script, creates a voice-over from that script, and assembles the visuals by combining noteworthy clips from a provided index of Olympic footage. It delivers personalized 90-second highlight reels that pair the generated script with the best-matching footage, offering a streamlined way to create professional-quality sports videos with ease.
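One way to fit noteworthy clips into a fixed 90-second reel is a simple greedy pass over relevance-ranked clips, keeping each one that still fits the remaining time budget. This is a minimal sketch of that idea, not the project's actual selection algorithm:

```python
def select_clips(ranked_clips, budget=90.0):
    """Greedily pick clips for a fixed-length reel.

    ranked_clips: (start, end) pairs in seconds, ordered by relevance
    (most relevant first). Keeps each clip that fits the remaining
    budget; a hypothetical sketch of the selection step.
    """
    chosen, total = [], 0.0
    for start, end in ranked_clips:
        duration = end - start
        if total + duration <= budget:
            chosen.append((start, end))
            total += duration
    return chosen, total
```

A knapsack-style optimizer would squeeze more value out of the budget, but a greedy pass is fast and predictable, which matters in a hackathon pipeline.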
How we built it
We developed a workflow that chains several AI models through API calls. First, we take the user's prompt and use TwelveLabs' Marengo model to retrieve relevant clips from the curated footage database. Next, we pass the merged video to TwelveLabs' Pegasus model, which generates engaging commentary in the context of the clips. Then, we use ElevenLabs' text-to-speech API to convert the script into a realistic voice-over. Finally, a custom algorithm automatically edits the Olympic footage to match the narrative of the script. These components are integrated into a web-based platform for seamless user interaction.
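The four stages above can be sketched as a small orchestrator. Each stage is injected as a callable so the real API clients (TwelveLabs Marengo/Pegasus, ElevenLabs) can be swapped in; the function names and the `Clip` type here are hypothetical, not the project's actual code:

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """A retrieved moment within an indexed video (times in seconds)."""
    video_id: str
    start: float
    end: float


def build_highlight(prompt, search_clips, write_commentary, synthesize_voice, render):
    """Run the four-stage pipeline: retrieve -> script -> voice -> edit.

    Each stage is passed in as a function, so real API clients can be
    replaced with fakes for testing. This mirrors the described flow,
    not the exact production implementation.
    """
    clips = search_clips(prompt)        # Marengo: prompt -> ranked clips
    script = write_commentary(clips)    # Pegasus: clips -> commentary text
    audio = synthesize_voice(script)    # ElevenLabs: text -> voice-over audio
    return render(clips, audio)         # custom editor: align visuals to narration
```

Injecting the stages keeps the orchestration testable without API keys and makes it easy to swap a model out later.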
Challenges we ran into
One major challenge was ensuring that the AI-generated script aligned perfectly with the available Olympic footage. Balancing the timing of visuals with voice-over narration required fine-tuning the model to understand both the context of the footage and the nuances of the script.
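The timing problem described above can be approximated by laying narration sentences end-to-end on the clip timeline and pairing each sentence with whichever clip is on screen when it begins. A hypothetical helper, assuming per-sentence audio durations are known (e.g., measured from the TTS output), and not the project's production aligner:

```python
def align(sentences, clips):
    """Pair each narration sentence with the clip playing when it starts.

    sentences: (text, audio_duration_seconds) pairs, in narration order.
    clips: (start, end) pairs, in playback order.
    Returns one timeline entry per sentence with its clip index and the
    time it begins, so every spoken beat has footage under it.
    """
    timeline, t, i = [], 0.0, 0
    clip_end = clips[0][1] - clips[0][0] if clips else 0.0
    for text, duration in sentences:
        # Advance to the next clip once the current one is exhausted;
        # the last clip is simply held if narration runs over.
        while i + 1 < len(clips) and t >= clip_end:
            i += 1
            clip_end += clips[i][1] - clips[i][0]
        timeline.append({"sentence": text, "clip": i, "starts_at": round(t, 2)})
        t += duration
    return timeline
```

In practice the hard part is what this sketch glosses over: the model must also pick footage whose content matches each sentence, not just its timing.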
Accomplishments that we're proud of
We are proud of successfully creating a system that can generate dynamic highlight reels entirely from user prompts, automating what is typically a labor-intensive process. The quality of the voice-over and the synchronization with the footage exceeded our expectations, and we were able to create a product that is user-friendly and scalable.
What we learned
Throughout this project, we learned a great deal about the intricacies of video editing and synchronizing it with AI-generated content. Additionally, working with Olympic footage presented unique challenges in orchestrating the various API calls and integrating them into one seamless pipeline.
What's next for Olympic Highlight Generator
Moving forward, we plan to expand the system to handle live events, allowing for real-time highlight generation during Olympic broadcasts. We also aim to incorporate more advanced personalization features, such as adjusting the tone of the script or allowing users to highlight specific athletes or events. Expanding our video index to include more sports footage and making the system adaptable for other global sporting events is also on our roadmap.
Built With
- angular.js
- api
- firebase
- flask
- python
- twelvelabs