Inspiration
We were fascinated by how film scores deepen an audience's emotional connection to a story, and we wondered: what if we could do the same for any written content? We wanted to let people 'listen' to their favorite texts and unlock a whole new sensory dimension of storytelling.
What it does
Score AI takes any piece of text, such as a screenplay or novel, and translates it into a uniquely composed musical score. It analyzes the emotional sentiment in the text, then uses a state-of-the-art text-to-music model to generate a score that embodies the text's emotional content. This allows users to experience their narratives in a truly multimodal way, transforming written words into an emotional, musical journey.
How we built it
The core of Score AI is a two-model pipeline. First, we use the GPT-4 language model to break the text down into emotional components and translate those emotions into musical attributes such as tempo, key, mode, and dynamics. These attributes are then fed as a prompt into MusicGen, which generates 30 seconds of music. By generating each new segment as a continuation of the previous one, the model can create cohesive scores for long texts, letting the music evolve naturally with the story.
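To make the hand-off between the two models concrete, here is a minimal sketch of how musical attributes might be turned into a text prompt for the music model. It assumes the language model is asked to reply with a small JSON object; the field names, the `MusicalAttributes` class, the `parse_attributes` and `to_prompt` helpers, and the prompt wording are all hypothetical illustrations, not the format Score AI actually uses.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicalAttributes:
    """Hypothetical container for the attributes named above."""
    tempo_bpm: int
    key: str
    mode: str
    dynamics: str
    mood: str


def parse_attributes(json_text: str) -> MusicalAttributes:
    """Parse a JSON reply (assumed format) into MusicalAttributes."""
    data = json.loads(json_text)
    return MusicalAttributes(
        tempo_bpm=int(data["tempo_bpm"]),
        key=data["key"],
        mode=data["mode"],
        dynamics=data["dynamics"],
        mood=data["mood"],
    )


def to_prompt(attrs: MusicalAttributes) -> str:
    """Render the attributes as a short text prompt for a text-to-music model."""
    return (
        f"{attrs.mood} film score, {attrs.tempo_bpm} BPM, "
        f"{attrs.key} {attrs.mode}, {attrs.dynamics} dynamics"
    )


# Example reply a language model might produce for a somber scene (made up).
reply = '{"tempo_bpm": 72, "key": "D", "mode": "minor", "dynamics": "soft", "mood": "melancholic"}'
print(to_prompt(parse_attributes(reply)))
```

Keeping the attributes in a structured form before rendering them as text makes it easier to inspect or adjust what the language model decided before any audio is generated.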
Challenges we ran into
Creating a translation layer between emotional analysis and musical generation was a significant challenge. Understanding and categorizing the emotions in a text is complex, and translating those emotions into musical attributes is not straightforward. We had to test, iterate, and fine-tune our model many times to ensure it generated emotionally coherent and aesthetically pleasing music. Another challenge was maintaining continuity across long texts so that the result is a seamless, cohesive score. To address this, we pass a chunk of the previously generated music back in as context, so the music flows smoothly while still introducing new motifs.
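The continuation idea can be sketched as a simple loop: each new segment is generated from the next prompt plus the tail of the audio produced so far. This is a minimal sketch under stated assumptions, not our exact implementation. The `generate` callable is a hypothetical stand-in for the music model and is assumed to return only the newly generated samples; a real backend (for example, one built around audiocraft's `MusicGen.generate_continuation`) would need a thin adapter to match that signature.

```python
from typing import Callable, List, Sequence

Samples = List[float]
# Hypothetical signature: (text prompt, context samples) -> newly generated samples.
Generator = Callable[[str, Sequence[float]], Samples]


def score_long_text(
    prompts: Sequence[str],
    generate: Generator,
    context_len: int,
) -> Samples:
    """Build one continuous score from a sequence of per-section prompts.

    Each call to `generate` receives the last `context_len` samples of the
    score so far, so the new segment can pick up where the music left off.
    """
    score: Samples = []
    for prompt in prompts:
        # Guard against context_len == 0, since score[-0:] would be the whole list.
        context = score[-context_len:] if context_len > 0 else []
        score.extend(generate(prompt, context))
    return score


def fake_generate(prompt: str, context: Sequence[float]) -> Samples:
    """Stand-in generator for demonstration: emits 4 samples per segment."""
    return [float(len(context))] * 4


demo = score_long_text(["tense opening", "quiet grief", "hopeful finale"], fake_generate, context_len=2)
print(len(demo))
```

The size of the context window is a trade-off: a longer window gives the model more to stay consistent with, while a shorter one leaves more room for new motifs and reduces inference time.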
Accomplishments that we're proud of
We're thrilled with the emotional depth and complexity that Score AI can handle. When we read our screenplays (or watched films) alongside the music it created, the music enhanced the experience surprisingly well. It's one thing to create a simple happy or sad score, but our system can navigate the nuanced emotional landscape of the complex narratives we presented to it. We're also proud of how we integrated GPT-4 and MusicGen into an efficient pipeline from text input to musical output.
What we learned
We learned a great deal about the intersection of AI, sentiment analysis, and music. One key takeaway was that machine learning models, specifically transformer-based ones, are incredible tools for creative applications when used in innovative ways. We also learned about the limits of computing hardware, particularly the long inference times.
What's next for Score AI
We envision Score AI becoming a leading tool in the consumer entertainment industry, used by creatives to add an audio dimension to their written work. In the short term, we believe creators such as filmmakers, authors, and producers will use it to get instant inspiration for their projects. We're also excited to explore other forms of media, like video games and virtual reality, where Score AI could provide real-time, emotionally responsive music.