Inspiration

Imagine you attend an Ed Sheeran concert. He takes a few selfies with lucky fans, but you're not one of them. You're too far away, he's too busy, and better seats are too expensive.

What would be cool: giving everyone a selfie with Ed Sheeran in their own style. What would be cooler: letting everyone talk to Ed Sheeran.

Ethan, our teammate, has 1.4 million followers, and 70% of his income comes from brand deals. Agents and social media managers take 40% of his income. Working through these middlemen takes a lot of effort.

What it does

  • Democratise brand deals through AI
  • Multiplication of presence
  • Technology that augments the talent, allowing their likeness and brand to scale
  • Voice-to-voice communication with an AI personality.
  • AI generation of selfies and photos of you with a famous figure in the location, pose and style you want.

How we built it

  • Beam: For model training and webhooks
  • Python
  • Vana: text-to-image, using Stable Diffusion for faces
  • OpenAI: GPT-3
  • Uberduck.ai: text-to-speech
  • Whisper: speech-to-text
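
The voice-to-voice flow implied by this list can be sketched as three stages chained together: speech-to-text, a language model reply, then text-to-speech. This is a minimal sketch under assumptions, not our production code: `VoicePipeline` and the three `fake_*` functions are hypothetical stand-ins, and no real Whisper, OpenAI, or Uberduck.ai calls are made.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three stages; each stage is any callable with the right shape."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage (Whisper's role in the list above)
    reply: Callable[[str], str]          # language model stage (GPT-3's role)
    synthesize: Callable[[str], bytes]   # text-to-speech stage (Uberduck.ai's role)

    def respond(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)   # fan's audio -> text
        answer = self.reply(text)          # text -> reply in the personality's voice
        return self.synthesize(answer)     # reply text -> audio


# Hypothetical stand-ins so the sketch runs without any network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_reply(text: str) -> str:
    return f"Thanks for saying: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_reply, fake_synthesize)
print(pipeline.respond(b"hello"))
```

Passing the stages in as callables keeps each service swappable, so a real client can replace a stand-in without touching the rest of the chain.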

Challenges we ran into

  • Speed and fidelity of vocal output
  • Stability of Stable Diffusion output
  • Latency of multiple services
  • Orchestration of API calls
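
One common way to reduce the combined latency of several services is to run calls that do not depend on each other at the same time. The sketch below assumes, purely for illustration, that speech synthesis and selfie generation are independent; `fake_speech`, `fake_selfie`, and `respond` are hypothetical names, and `asyncio.sleep` stands in for a network round trip.

```python
import asyncio
import time


async def fake_speech(text: str) -> str:
    await asyncio.sleep(0.2)  # stand-in for a text-to-speech request
    return f"audio({text})"


async def fake_selfie(prompt: str) -> str:
    await asyncio.sleep(0.2)  # stand-in for an image generation request
    return f"image({prompt})"


async def respond(text: str, prompt: str):
    start = time.perf_counter()
    # gather() runs both coroutines concurrently and returns results in call order.
    audio, image = await asyncio.gather(fake_speech(text), fake_selfie(prompt))
    elapsed = time.perf_counter() - start
    return audio, image, elapsed


if __name__ == "__main__":
    audio, image, elapsed = asyncio.run(respond("hi", "beach at sunset"))
    print(audio, image, f"{elapsed:.2f}s")
```

With both stand-ins waiting 0.2 seconds, the total wait is close to the slower call rather than the sum of the two. Calls that do depend on each other, such as transcription before the language model, still have to run in sequence.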

Accomplishments that we're proud of

  • End-to-end system
  • Live product

What we learned

  • Beam hosting, training, development flow
  • Audio engineering
  • Prompt engineering

What's next for Doppelganger

  • Further training and tuning of audio and visual models.

Built With

  • beam
  • openai
  • python
  • uberduck.ai
  • vana
  • whisper