Clancy: A Conversation With Your Selves


Inspiration

I’ve always been fascinated by inner dialogue—the “committee in your head” that debates, cheers, and questions every decision. Science fiction has long played with the idea of splitting the self: doppelgängers, simulated personas, or mirrors that talk back. When I saw how far real-time AI voice cloning has come (thanks to ElevenLabs), I wondered: what if you could literally talk to yourself? And what if the voices arguing inside your mind… actually spoke?

That’s how Clancy was born: a tool that lets you hear your supportive and critical selves in your own cloned voice, having a debate about whatever’s on your mind. It’s part uncanny, part playful, and all about exploring the edge between self-reflection and AI.


What I Learned

  • Voice is powerful. Text is one thing, but hearing your own voice come back at you with encouragement or skepticism feels totally different—and a bit unsettling.
  • Timing and orchestration are everything. Creating a “turn-based” dialogue with AI means handling context, state, and human interaction so it feels natural.
  • Audio in the browser is tricky! Autoplay restrictions, blob URLs, and async user actions all require care to get right.
  • Sometimes the weirdest ideas are the most memorable. Most people would never think to do this, and that’s what made it worth pursuing.
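The autoplay/blob-URL lesson above can be sketched in a few lines of TypeScript. This is a sketch that assumes the voice endpoint returns raw MP3 bytes — an assumption, not a detail confirmed here:

```typescript
// Sketch: turn a synthesized-speech HTTP response into a playable blob URL.
// Assumes the voice endpoint returns raw MP3 bytes (an assumption, not a
// confirmed detail of this project).
async function speechToBlobUrl(res: Response): Promise<string> {
  const bytes = await res.arrayBuffer();
  const blob = new Blob([bytes], { type: "audio/mpeg" });
  return URL.createObjectURL(blob); // short-lived URL an <audio> tag can play
}

// Autoplay policies block audio that isn't triggered by a user gesture, so
// playback has to start inside a click/tap handler, e.g.:
//   button.addEventListener("click", async () => {
//     const res = await fetch("/api/voice", { method: "POST", body: "..." });
//     new Audio(await speechToBlobUrl(res)).play();
//   });
```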

How I Built It

  • Frontend: Next.js (App Router), TailwindCSS for layout, and the Web Speech API for capturing user speech.
  • Backend:
    • A /api/converse endpoint orchestrates the turn-by-turn conversation between the two agents (supportive/critical), powered by OpenAI’s GPT models.
    • A /api/voice endpoint calls ElevenLabs to synthesize each agent’s response as a cloned voice.
  • Flow:
    1. User speaks a prompt into the app (browser mic, transcribed for confirmation).
    2. “Supportive You” replies (in your own voice), then “Critical You” replies (also your voice), each via audio only.
    3. You can then speak again—looping the dialogue as many times as you want.
    4. All agent output is heard, not seen. The effect is like eavesdropping on your own mind.
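The turn-by-turn orchestration inside /api/converse can be approximated with a couple of pure helpers. This is a sketch assuming OpenAI's chat-message format; the persona prompts and function names are illustrative, not the project's actual code:

```typescript
// Illustrative sketch of alternating supportive/critical turns over a shared
// transcript, using OpenAI-style chat messages.
type Role = "system" | "user" | "assistant";
interface Message { role: Role; content: string; }

// Hypothetical persona prompts (not the project's real prompts).
const PERSONAS = {
  supportive: "You are the user's encouraging inner voice. Be warm and brief.",
  critical: "You are the user's skeptical inner voice. Question assumptions, briefly.",
} as const;

type Persona = keyof typeof PERSONAS;

// Build the message list for the next agent turn from the shared transcript.
function buildTurn(transcript: Message[], speaker: Persona): Message[] {
  return [{ role: "system", content: PERSONAS[speaker] }, ...transcript];
}

// After each agent speaks, its reply joins the transcript and the turn flips.
function advance(
  transcript: Message[],
  speaker: Persona,
  reply: string,
): { transcript: Message[]; next: Persona } {
  return {
    transcript: [...transcript, { role: "assistant", content: reply }],
    next: speaker === "supportive" ? "critical" : "supportive",
  };
}
```

Keeping these helpers pure makes the "who speaks next" logic trivial to test, which matters when the whole demo is a loop.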
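The /api/voice step maps naturally onto ElevenLabs' text-to-speech REST endpoint. A hedged sketch — the VOICE_ID environment variable, model choice, and error handling are my assumptions, not details from the project:

```typescript
// Sketch of synthesizing an agent reply in the user's cloned voice via
// ElevenLabs' text-to-speech endpoint. VOICE_ID and ELEVENLABS_API_KEY are
// assumed environment variables.
async function synthesize(text: string): Promise<ArrayBuffer> {
  const res = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${process.env.VOICE_ID}`,
    {
      method: "POST",
      headers: {
        "xi-api-key": process.env.ELEVENLABS_API_KEY ?? "",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text, model_id: "eleven_multilingual_v2" }),
    },
  );
  if (!res.ok) throw new Error(`ElevenLabs error: ${res.status}`);
  return res.arrayBuffer(); // raw audio bytes, ready to stream to the browser
}
```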

Challenges

  • Audio playback: Ensuring the agents’ voices played reliably in all browsers, especially with autoplay restrictions.
  • Speed: Hackathons are fast! There wasn’t time for a fancy UI—my priority was getting the voice loop working, end-to-end.
  • State management: Managing who should speak next, keeping conversation context in sync, and making sure the system didn’t get confused or stall.
  • Uncanny valley: The project walks a line between delightful and weirdly unsettling. That’s by design.
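One way to tame the state-management challenge above is an explicit turn manager that refuses to double-fire while a request is in flight. A minimal sketch with illustrative names, not the project's actual code:

```typescript
// Sketch: a guard that keeps the dialogue loop from stalling or firing twice.
// Only one turn may be in flight, and each finished turn deterministically
// rotates the speaker: user -> supportive -> critical -> user.
type Speaker = "user" | "supportive" | "critical";

class TurnManager {
  private current: Speaker = "user";
  private busy = false;

  // Returns the speaker to run next, or null if a turn is already in flight.
  begin(): Speaker | null {
    if (this.busy) return null;
    this.busy = true;
    return this.current;
  }

  // Mark the turn finished and rotate to the next speaker.
  end(): void {
    const order: Speaker[] = ["user", "supportive", "critical"];
    this.current = order[(order.indexOf(this.current) + 1) % order.length];
    this.busy = false;
  }
}
```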

Conclusion

Clancy is proof that, with today’s AI, you can make your imagination audible. It’s rough, but it’s real—and it actually talks back. Sometimes that’s all you need.

