Inspiration

We noticed that digital comic creation has a steep learning curve: expensive software, complex tools, and hours of work just to produce a single panel. At the same time, we saw how AI is transforming creative workflows, and we asked ourselves: What if anyone could create professional comics just by waving their hands?

The idea of gesture-controlled drawing excited us. We wanted comic creation to feel like magic: no stylus, no mouse, just your hands in the air bringing stories to life. Combine that with AI that transforms rough sketches into polished artwork and animates static panels into video, and Waaah-Comics was born.

We were also inspired by the accessibility angle. For creators with mobility challenges or those who simply want a more intuitive experience, touch-free drawing opens up new possibilities.

What it does

Waaah-Comics is an AI-powered comic creation platform that lets you:

  • Draw touch-free using hand gestures tracked by your webcam: point to draw, make a fist to erase, open your palm to stop
  • Transform rough sketches into polished comic art using Google Gemini 2.0 Flash, complete with bold ink outlines, cel-shading, and vibrant colors
  • Bring your comics to life with Google Veo 2 video generation and watch your static panels become animated sequences
  • Build complete comic pages using 22+ professional templates (Manga, Webtoon, Superhero, Comic Strip, and more)
  • Save and sync projects to the cloud with user authentication

From a simple hand gesture to an animated comic story, all in one seamless workflow.

How we achieved it

We combined three cutting-edge AI/CV technologies into a unified platform:

1. Computer Vision (MediaPipe Hand Landmarker)

  • Real-time hand tracking at 30fps using the webcam
  • Recognizes 5 distinct gestures: Point (draw), Fist (erase), Palm (stop), Peace (drag toolbar), Pinch (alt draw)
  • Implemented a One Euro Filter for jitter reduction and smooth cursor movement
  • Configurable "draw zone" mapping for comfortable gesture range
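
The gesture-to-action pairing listed above can be written down as a small lookup table. This is only an illustrative sketch in Python, not the project's code: the names `Gesture`, `Action`, and `action_for` are hypothetical, and treating "no hand detected" as a stop is an assumption.

```python
from enum import Enum


class Gesture(Enum):
    POINT = "point"
    FIST = "fist"
    PALM = "palm"
    PEACE = "peace"
    PINCH = "pinch"


class Action(Enum):
    DRAW = "draw"
    ERASE = "erase"
    STOP = "stop"
    DRAG_TOOLBAR = "drag_toolbar"
    ALT_DRAW = "alt_draw"


# One action per recognized gesture, as listed in the writeup.
GESTURE_ACTIONS = {
    Gesture.POINT: Action.DRAW,
    Gesture.FIST: Action.ERASE,
    Gesture.PALM: Action.STOP,
    Gesture.PEACE: Action.DRAG_TOOLBAR,
    Gesture.PINCH: Action.ALT_DRAW,
}


def action_for(gesture):
    """Return the canvas action for a gesture.

    Assumption: when no gesture is recognized (None), we stop drawing.
    """
    if gesture is None:
        return Action.STOP
    return GESTURE_ACTIONS[gesture]
```

Keeping the mapping in one table means a gesture can be reassigned without touching the classifier or the canvas code.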

2. AI Image Generation (Google Gemini 2.0 Flash)

  • Text-to-image generation for comic assets
  • Sketch-to-image transformation that preserves composition while adding professional comic styling
  • Custom prompts ensure consistent comic book aesthetic (bold outlines, cel-shading, no speech bubbles)

3. AI Video Generation (Google Veo 2)

  • Transforms static comic panels into animated video sequences
  • Brings characters and scenes to life with motion

Tech Stack:

  • Frontend: Next.js 14, TypeScript, Tailwind CSS, Konva.js
  • Backend: FastAPI (Python)
  • Database & Storage: Supabase (PostgreSQL + Storage)
  • Auth: Clerk

Challenges we ran into

1. Hand Tracking Stability

Raw MediaPipe coordinates were jittery, making drawing nearly impossible. We solved this by implementing a One Euro Filter, an adaptive low-pass filter that smooths slow movements while preserving quick strokes.
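
To show what "adaptive low-pass" means here, the following is a minimal Python sketch of a One Euro Filter for a single coordinate. It follows the commonly published form of the algorithm; the parameter defaults (`min_cutoff`, `beta`, `d_cutoff`) are placeholder assumptions, not the values we tuned, and the class is an illustration rather than the project's code.

```python
import math


def _alpha(cutoff_hz, dt):
    """Smoothing factor of a first-order low-pass filter for a given cutoff."""
    r = 2.0 * math.pi * cutoff_hz * dt
    return r / (r + 1.0)


class OneEuroFilter:
    """Smooths one coordinate; the cutoff frequency rises with movement speed."""

    def __init__(self, min_cutoff=1.0, beta=0.01, d_cutoff=1.0):
        self.min_cutoff = min_cutoff  # Hz; lower means smoother but laggier at rest
        self.beta = beta              # how strongly speed raises the cutoff
        self.d_cutoff = d_cutoff      # Hz; cutoff used to smooth the speed estimate
        self._t = None                # timestamp of the previous sample
        self._x = 0.0                 # previous filtered value
        self._dx = 0.0                # previous filtered speed

    def __call__(self, t, x):
        if self._t is None:           # first sample: nothing to smooth against
            self._t, self._x = t, x
            return x
        dt = t - self._t
        if dt <= 0.0:                 # duplicate or out-of-order timestamp
            return self._x
        # Estimate the speed and low-pass it.
        a_d = _alpha(self.d_cutoff, dt)
        dx = (x - self._x) / dt
        self._dx = a_d * dx + (1.0 - a_d) * self._dx
        # Slow hand: low cutoff, heavy smoothing. Fast hand: high cutoff, little lag.
        cutoff = self.min_cutoff + self.beta * abs(self._dx)
        a = _alpha(cutoff, dt)
        self._x = a * x + (1.0 - a) * self._x
        self._t = t
        return self._x
```

In use, one filter instance per axis would be fed the fingertip coordinate and its timestamp each frame, for example `x_smooth = fx(t, x_raw)` and `y_smooth = fy(t, y_raw)`.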

2. Gesture Recognition Accuracy

Distinguishing between gestures (especially Pinch vs Point) was tricky. We normalized measurements using hand scale (wrist-to-MCP distance) so gestures work regardless of hand distance from the camera.
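
A rough Python sketch of the normalization idea: divide the thumb-to-index distance by a hand-scale measurement so the same threshold works near and far from the camera. The landmark indices follow MediaPipe's 21-point hand model; using the middle-finger MCP for the scale and a threshold of 0.35 are assumptions made for illustration, and the function names are hypothetical.

```python
import math

# Indices in MediaPipe's 21-landmark hand model.
WRIST = 0
THUMB_TIP = 4
INDEX_TIP = 8
MIDDLE_MCP = 9  # assumption: the writeup does not say which MCP joint was used


def hand_scale(landmarks):
    """Wrist-to-MCP distance, used as a size reference for the hand."""
    return math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])


def pinch_ratio(landmarks):
    """Thumb-to-index distance expressed in units of hand scale."""
    scale = hand_scale(landmarks)
    if scale == 0.0:
        return math.inf  # degenerate detection: never report a pinch
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) / scale


def is_pinch(landmarks, threshold=0.35):
    """True when thumb and index tips are close relative to hand size.

    The threshold is a placeholder value, not a tuned one.
    """
    return pinch_ratio(landmarks) < threshold
```

Because both distances shrink by the same factor as the hand moves away from the camera, the ratio stays the same, which is what makes a single threshold usable.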

3. Coordinate Mapping

Mapping hand position in camera space to canvas coordinates required careful calibration. We created a "gesture box" system where a defined region of the camera view maps to the full canvas, making drawing more comfortable.
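
The gesture box boils down to a linear remap with clamping. Here is a minimal Python sketch, assuming normalized camera coordinates in the 0 to 1 range; the `GestureBox` type, the `camera_to_canvas` function, and the optional horizontal mirroring for a selfie-style view are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureBox:
    """Region of the camera frame, in normalized coordinates, that maps to the canvas."""
    left: float
    top: float
    right: float
    bottom: float


def _clamp01(value):
    return min(max(value, 0.0), 1.0)


def camera_to_canvas(x, y, box, canvas_w, canvas_h, mirror=True):
    """Map a normalized camera point to canvas pixels.

    Points outside the box are clamped to the canvas edge. `mirror` flips the
    horizontal axis so that moving the hand right moves the cursor right when
    the webcam image is not already mirrored (an assumption about the setup).
    """
    u = _clamp01((x - box.left) / (box.right - box.left))
    v = _clamp01((y - box.top) / (box.bottom - box.top))
    if mirror:
        u = 1.0 - u
    return u * canvas_w, v * canvas_h
```

With a box covering the middle half of the frame, for example `GestureBox(0.25, 0.25, 0.75, 0.75)`, the hand only has to travel half the camera view to cross the whole canvas.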

4. AI Prompt Engineering

Getting consistent comic-style outputs from Gemini required extensive prompt tuning. We had to explicitly instruct it to avoid adding speech bubbles, maintain composition from sketches, and apply specific artistic styles.
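
One way to keep those instructions consistent is to assemble every prompt from the same fixed rules. The Python sketch below shows the shape of that idea only: the wording is invented for illustration and is not the prompt we shipped, `build_sketch_prompt` is a hypothetical helper, and no model API is called.

```python
# Style traits named in the writeup; the sentence wording around them is illustrative.
STYLE_RULES = (
    "bold ink outlines",
    "cel-shading",
    "vibrant colors",
)


def build_sketch_prompt(style="comic book", description=""):
    """Build a sketch-to-image prompt with fixed constraints.

    `style` is a free-form label such as "manga"; `description` is optional
    scene text supplied by the user.
    """
    parts = [
        f"Redraw the attached sketch as polished {style} art.",
        "Keep the composition of the sketch: same poses, layout, and framing.",
        "Use " + ", ".join(STYLE_RULES) + ".",
        "Do not add speech bubbles, captions, or any other text.",
    ]
    if description.strip():
        # Put the scene description right after the opening instruction.
        parts.insert(1, f"Scene: {description.strip()}")
    return "\n".join(parts)
```

Centralizing the constraints means a wording change made during tuning applies to every request at once.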

5. Real-time Performance

Running MediaPipe hand tracking, canvas rendering, and AI generation without lag required careful optimization: GPU delegation, frame rate limiting, and efficient state management.
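
Frame rate limiting is the simplest of these to illustrate. The sketch below, in Python, skips frames that arrive sooner than the target interval so the tracker never runs more often than needed. The `FrameLimiter` name and the 30 fps default are assumptions for illustration (30 fps matches the tracking rate mentioned earlier), and this is not the project's code.

```python
class FrameLimiter:
    """Lets a frame through only if enough time has passed since the last one."""

    def __init__(self, target_fps=30.0):
        self.min_interval = 1.0 / target_fps  # seconds between processed frames
        self._last = None                     # timestamp of the last processed frame

    def should_process(self, now):
        """Return True if the frame at time `now` (in seconds) should be processed."""
        if self._last is not None and now - self._last < self.min_interval:
            return False  # too soon: drop this frame
        self._last = now
        return True
```

In a capture loop, `now` would come from a monotonic clock such as `time.monotonic()`, and hand tracking would run only when `should_process(now)` returns True.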

Accomplishments that we're proud of

  • Touch-free drawing actually works — and it feels magical. The gesture controls are responsive and intuitive enough for real creative work.

  • The AI transformation is stunning — watching a rough sketch become polished comic art in seconds never gets old.

  • 22+ professional templates — we didn't just build a tool, we built a complete comic creation studio.

  • Seamless three-way integration — MediaPipe, Gemini, and Veo working together in one fluid workflow from gesture → sketch → art → video.

  • The One Euro Filter implementation — this single addition transformed unusable jittery input into smooth, precise strokes.

  • Full cloud persistence — projects save automatically and sync across devices.

What we learned

  • Computer vision is hard, but MediaPipe makes it accessible — the hand landmarker model is incredibly powerful out of the box, but real-world usage requires significant post-processing (filtering, normalization, gesture classification).

  • AI prompt engineering is an art — small changes in prompts lead to dramatically different outputs. We iterated dozens of times to get consistent comic styling.

  • User experience matters more than features — the gesture box, visual feedback (cursor states, draw zone overlay), and smooth animations made the difference between a demo and a usable product.

  • Full-stack AI apps have many moving pieces — coordinating frontend canvas state, backend storage, AI APIs, and real-time webcam input taught us a lot about system design.

  • Filtering algorithms are underrated — the One Euro Filter was a game-changer. Sometimes the solution isn't more AI, it's classic signal processing.

What's next for Waaah-Comics

  • Multi-panel video generation — stitch animated panels into complete comic videos with transitions and sound effects

  • Collaborative mode — real-time multiplayer comic creation

  • Custom AI style training — let users define their own comic art styles

  • Voice-to-comic: narrate your story and watch AI generate the panels automatically

We believe Waaah-Comics is just the beginning of gesture-controlled, AI-powered creative tools. Comics today, animations tomorrow, full interactive stories next. 🚀
