Inspiration

My name is Aurelien. Through my company, Bleue, we create music compositions and audio solutions for films and advertising. You might have heard some of our work on HBO or while watching a commercial for Apple or Google.

Back in April, I was working with one of my main clients, Lancôme, the luxury skincare company. We were creating music and audio for one of their commercials, featuring Julia Roberts. They were at an advanced internal mockup stage and needed to start thinking about voice-over. Unfortunately, Julia wasn’t available to just pop into the vocal booth and record a few takes for them. So they turned to us, having heard about the capabilities of AI.

We started using ElevenLabs’ powerful voice cloning tool, feeding it interviews and snippets of dialogue from Julia Roberts. Very quickly, we ended up with a synthesized version of her voice, available for internal mockup purposes.

After this experience, more of this type of work started coming our way, always with the same kind of use case: a high-profile celebrity partnering with a brand. But a problem started to emerge. While very powerful, ElevenLabs’ VoiceLab and Speech Synthesis offer limited customization and a user interface that is not suited to rapidly iterating on and refining content.

We decided to prototype a user interface that turns the work of generating and combining takes for a long script into one fun creative process. To do this, we used prompting behind the scenes to give fine-grained control over the emotional tone of each word in the final product.

That’s when we decided to build FINAL TAKE on top of ElevenLabs’ API, to help us style a voice and address this issue.
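To make the prompting idea concrete, here is a minimal Python sketch of how a styled line could be turned into a text-to-speech request. It is an illustration under assumptions, not our actual implementation: `StyledLine`, `build_prompt`, `build_request`, the narrative cue format ("she said warmly") and the `voice_settings` values are all hypothetical. The endpoint path and the `xi-api-key` header follow ElevenLabs' public API as we understand it. The request is only assembled, not sent.

```python
import json
from dataclasses import dataclass

# ElevenLabs text-to-speech endpoint; the voice ID goes in the path.
TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


@dataclass
class StyledLine:
    text: str     # the words that should end up in the final take
    emotion: str  # hypothetical free-text style label, e.g. "warmly"


def build_prompt(line: StyledLine) -> str:
    """Wrap the line in a narrative cue meant to nudge the delivery.

    Assumption: the cue is spoken as well, so it has to be trimmed
    from the generated audio afterwards.
    """
    return f'"{line.text}" she said {line.emotion}.'


def build_request(voice_id: str, line: StyledLine, api_key: str) -> dict:
    """Assemble the HTTP request as a plain dict without sending it."""
    return {
        "url": TTS_URL.format(voice_id=voice_id),
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps({
            "text": build_prompt(line),
            # Illustrative values only, not tuned settings.
            "voice_settings": {"stability": 0.4, "similarity_boost": 0.8},
        }),
    }
```

The resulting dict could be passed to any HTTP client as a POST request. Keeping the cue separate from the script text means the user only edits the emotion label, and the prompt is rebuilt behind the scenes.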

What it does

  • Voice cloning and styling application
  • Powerful app to elevate voice-overs for internal mockups

How we built it

  • AI pipeline: ElevenLabs (voice cloning + API)
  • Emotions: ElevenLabs prompting
  • Text-based audio editing built on top of Whisper timestamps
  • Philosophy: interactive UX makes AI into a creative tool (vue3 + supabase + vercel)
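Text-based audio editing on top of word timestamps can be sketched as follows. This is a simplified illustration, not the app's code: it assumes word-level timestamps are already available (for example from Whisper's word timestamp output), and the `Word` type and `keep_ranges` function are hypothetical names. Deleting words in the transcript yields the audio spans to keep.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def keep_ranges(words: list[Word], deleted: set[int]) -> list[tuple[float, float]]:
    """Return the (start, end) audio spans left after deleting words by index.

    Consecutive kept words are merged into one span, so the natural
    pauses between them are preserved.
    """
    ranges: list[tuple[float, float]] = []
    prev_kept = False
    for i, word in enumerate(words):
        if i in deleted:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current span to the end of this word.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            # Start a new span after a deletion (or at the first kept word).
            ranges.append((word.start, word.end))
        prev_kept = True
    return ranges


# Made-up timestamps for illustration.
take = [
    Word('"Life', 0.0, 0.3),
    Word("is", 0.3, 0.5),
    Word('beautiful"', 0.5, 1.1),
    Word("she", 1.3, 1.5),
    Word("said", 1.5, 1.8),
    Word("warmly.", 1.8, 2.4),
]
spans = keep_ranges(take, deleted={3, 4, 5})
```

Here the last three words are removed in the text view, so `spans` holds a single range covering the first three words. An audio library can then cut the take to those ranges.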

Challenges we ran into

  • Building an intuitive interface for tools that have never existed before
  • Using the voice of a celebrity without their consent could create liability for our app (e.g. copyright infringement)

Accomplishments that we're proud of

Emotional voice styling

What we learned

  • Lots of prototyping and iteration, grounded in real-life workflows
  • We have developed initial governance controls to address the copyright risk (e.g. disclaimers, usage policy, checklist, sign-off)

What's next for Final Take

  • Leverage the next generation of models to further customize the voice styling feature
  • Build a governance framework and controls to widen the scope of use cases

Built With

  • elevenlabs
  • supabase
  • vercel
  • vue
  • whisper