Inspiration
Imagine you attend an Ed Sheeran concert. He takes a few selfies with lucky fans, but you're not one of them. You're too far away, he's too busy, and better seats are too expensive.
What would be cool: giving everyone a selfie with Ed Sheeran in their own style. What would be cooler: letting everyone talk to Ed Sheeran.
Ethan, our teammate, has 1.4 million followers, and 70% of his income comes from brand deals. Agents and social media managers take 40% of his income. Working through these middlemen takes a lot of effort.
What it does
- Democratise brand deals through AI
- Multiplies a talent's presence, so they can engage with many fans at once
- Technology that augments the talent, allowing their likeness and brand to scale
- Voice-to-voice communication with an AI personality
- AI generation of selfies and photos of you with a famous figure in the location, pose, and style you want
How we built it
- Beam: For model training and webhooks
- Python
- Vana: text-to-image generation of faces with Stable Diffusion
- OpenAI: GPT-3
- Uberduck.ai: text-to-speech
- Whisper: speech-to-text
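
The voice-to-voice flow implied by the list above (speech-to-text, then a language model, then text-to-speech) can be sketched as three chained stages. This is only a minimal illustration, not our actual code: `VoicePipeline`, `transcribe`, `generate_reply`, and `synthesize` are hypothetical names, and the toy lambdas stand in for the real Whisper, GPT-3, and Uberduck.ai calls, whose client APIs are not shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    # Each stage is an injected callable. These are hypothetical stand-ins,
    # not the real Whisper, OpenAI, or Uberduck client APIs.
    transcribe: Callable[[bytes], str]      # speech-to-text stage
    generate_reply: Callable[[str], str]    # language-model stage
    synthesize: Callable[[str], bytes]      # text-to-speech stage

    def respond(self, audio_in: bytes) -> bytes:
        """Run one voice-to-voice turn: audio in, audio out."""
        text = self.transcribe(audio_in)
        reply = self.generate_reply(text)
        return self.synthesize(reply)


# Toy stand-ins so the sketch runs without any network access.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate_reply=lambda text: f"Thanks for saying: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

audio_out = pipeline.respond(b"hello")
```

Keeping each stage behind a plain callable means a provider can be swapped or mocked without touching the rest of the flow.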
Challenges we ran into
- Speed and fidelity of vocal output
- Stability of Stable Diffusion output
- Latency of multiple services
- Orchestration of API calls
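
One common way to reduce the combined latency of several services is to run independent calls concurrently rather than one after another. The sketch below assumes, purely for illustration, that the voice reply and the selfie generation do not depend on each other. `fake_service` and `handle_request` are hypothetical names, and the sleeps simulate network latency instead of calling any real API.

```python
import asyncio
import time


async def fake_service(name: str, seconds: float) -> str:
    # Stand-in for a remote call; the sleep simulates network latency.
    await asyncio.sleep(seconds)
    return f"{name} done"


async def handle_request() -> list[str]:
    # Assumption: the two calls are independent, so they can be awaited
    # together instead of sequentially.
    return await asyncio.gather(
        fake_service("voice_reply", 0.2),
        fake_service("selfie", 0.3),
    )


start = time.perf_counter()
results = asyncio.run(handle_request())
elapsed = time.perf_counter() - start
```

With this approach the total wait is roughly that of the slowest call, not the sum of both. Stages that do depend on each other, such as speech-to-text feeding the language model, still have to run in order.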
Accomplishments that we're proud of
- End-to-end system
- Live product
What we learned
- Beam hosting, training, development flow
- Audio engineering
- Prompt engineering
What's next for Doppelganger
- Further training and tuning of audio and visual models.
Built With
- beam
- openai
- python
- uberduck.ai
- vana
- whisper
