Inspiration

THIS PROJECT WAS MADE COMPLETELY ON MY PHONE. NO LAPTOPS USED.

We’ve all looked at an object and thought, "What would you say if you could talk?" Whimsy was born from that curiosity. Point your camera at anything and bring it to life as a character you can talk to. It’s like a Pixar-style imagination layer on the real world.

Googly eyes were the key insight. Two wobbly eyes can make anything feel alive. We built a pipeline: scan an object, give it eyes and a personality, then talk to it through a retro CRT interface.

What it does

Whimsy turns any object into a talking NPC.

  • Identifies the object with Gemini and generates a name, personality, backstory, and voice
  • Transforms the image with googly eyes, a mouth, and cartoon arms
  • Enables voice conversations through a walkie-talkie-style interface with an AI-generated voice
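
The first step above produces a name, personality, backstory, and voice for each object. Here is a minimal sketch of how that output could be checked before it is saved, assuming the model is asked to reply with a JSON object. The field names and the `parseCharacterProfile` helper are hypothetical and not taken from the project.

```typescript
// Hypothetical shape for a generated character; field names are assumptions.
interface CharacterProfile {
  name: string;
  personality: string;
  backstory: string;
  voice: string;
}

// Parses the model's JSON text and rejects replies with missing or empty fields.
function parseCharacterProfile(raw: string): CharacterProfile {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("profile must be a JSON object");
  }
  const record = data as Record<string, unknown>;

  // Reads one required string field and trims surrounding whitespace.
  const read = (key: keyof CharacterProfile): string => {
    const value = record[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`missing or empty field: ${key}`);
    }
    return value.trim();
  };

  return {
    name: read("name"),
    personality: read("personality"),
    backstory: read("backstory"),
    voice: read("voice"),
  };
}
```

Validating up front means a malformed reply fails at scan time rather than later in the gallery or the voice loop.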

Modes:

  • Character Mode: goofy personalities with googly eyes
  • Photo Mode: reflective, existential voices for places and memories

All characters are saved to a persistent gallery.

How we built it

  • Gemini 3.1 Flash Image: object detection, personality generation, image transformation, transcription, responses
  • ElevenLabs TTS: voice synthesis with 50+ voices
  • SpaceTimeDB: character storage and state management
  • Cloudflare R2: image storage and CDN

Frontend:

  • Next.js 15, React 19, Tailwind CSS
  • CSS-only CRT UI with scanlines, glow, and effects

Voice loop: Audio → Gemini (transcribe + respond) → ElevenLabs (speak) → playback
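
One turn of the voice loop can be sketched as three awaited steps. This is only an illustration of the flow, not the project's code. The `VoiceServices` interface and its method names are hypothetical stand-ins for the real Gemini and ElevenLabs calls, which are not shown here.

```typescript
// Hypothetical service boundary; the actual API calls would sit behind these methods.
interface VoiceServices {
  // One call that both transcribes the user's audio and writes the character's reply.
  transcribeAndRespond(
    audio: Uint8Array,
    persona: string
  ): Promise<{ transcript: string; reply: string }>;
  // Text-to-speech for the reply, using the voice assigned to the character.
  synthesize(text: string, voiceId: string): Promise<Uint8Array>;
  // Plays the synthesized audio to the user.
  play(audio: Uint8Array): Promise<void>;
}

interface VoiceTurn {
  transcript: string;
  reply: string;
}

// Runs one turn: audio in, transcribe + respond, speak, play back.
async function runVoiceTurn(
  services: VoiceServices,
  audio: Uint8Array,
  persona: string,
  voiceId: string
): Promise<VoiceTurn> {
  const { transcript, reply } = await services.transcribeAndRespond(audio, persona);
  const speech = await services.synthesize(reply, voiceId);
  await services.play(speech);
  return { transcript, reply };
}
```

Passing the services in as an argument keeps the loop easy to test with fakes. It also leaves room to swap the buffered `synthesize` step for a streaming one if latency becomes a problem.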

Challenges

  • Consistent googly-eye image generation required heavy prompt tuning
  • Voice latency across multiple APIs needed optimization and streaming
  • CRT effects in pure CSS required careful layering and performance tuning
  • Browser audio formats differed between Safari and Chrome
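
For the audio format difference, one common approach is to ask the browser which recording format it supports instead of hard-coding one. The sketch below assumes recording is done with `MediaRecorder`. The candidate list and the `pickRecordingMimeType` helper are illustrative assumptions, not the project's actual code.

```typescript
// Candidate formats in order of preference. This list is an assumption.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Returns the first candidate the support check accepts, or undefined if none match.
// The check is injected so the function can run outside a browser.
function pickRecordingMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES
): string | undefined {
  return candidates.find((mimeType) => isSupported(mimeType));
}

// In a browser, the check would be MediaRecorder.isTypeSupported:
//   const mimeType = pickRecordingMimeType((t) => MediaRecorder.isTypeSupported(t));
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

The chosen MIME type can then be sent along with the audio so the server knows what it is receiving.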

What we learned

  • Gemini is strong at natural-language-driven image transformations
  • SpaceTimeDB reducers are elegant but have JSON quirks
  • CSS alone can create convincing retro visuals
  • ElevenLabs voices are diverse enough for dynamic character assignment
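
Dynamic voice assignment could look something like the sketch below: filter a voice catalog by the character's traits, then pick deterministically so the same character always gets the same voice. The `VoiceOption` shape, the tags, and the hashing scheme are all assumptions made for illustration. The real ElevenLabs catalog and the project's selection logic may differ.

```typescript
// Hypothetical catalog entry; tags are free-form labels such as "grumpy" or "cheerful".
interface VoiceOption {
  voiceId: string;
  tags: string[];
}

// Small deterministic string hash, kept within unsigned 32-bit range.
function hashString(text: string): number {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 31 + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Picks a voice whose tags overlap the character's traits, falling back to the
// whole catalog when nothing matches. The same name always maps to the same voice.
function assignVoice(
  characterName: string,
  traits: string[],
  catalog: VoiceOption[]
): VoiceOption {
  const matching = catalog.filter((voice) =>
    voice.tags.some((tag) => traits.includes(tag))
  );
  const pool = matching.length > 0 ? matching : catalog;
  const choice = pool[hashString(characterName) % pool.length];
  if (!choice) {
    throw new Error("voice catalog is empty");
  }
  return choice;
}
```

Hashing the name rather than picking at random keeps a saved character's voice stable across sessions without storing extra state.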
