Inspiration

We wanted to democratize manga creation: anyone with a story to tell should be able to turn their ideas into visually consistent manga pages without needing artistic skills.

What it does

The application converts text prompts into complete manga pages with multiple panels, maintaining character consistency across scenes, automatically generating dialogue with Google's Gemini AI, and offering various artistic styles from Studio Ghibli to classic manga aesthetics.

How we built it

We built it on a full-stack TypeScript architecture with a React frontend, an Express.js backend, and a PostgreSQL database. It integrates multiple AI models: FLUX (run through Replicate) for image generation, PixArt-Sigma for character consistency, and Google Gemini for story and dialogue generation.

Challenges we ran into

The biggest challenge was achieving consistent character appearance across multiple panels and pages, which required implementing a complex reference sheet system and seed-based generation to maintain visual identity throughout the manga.
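As a rough illustration of the seed-based idea, the sketch below derives a stable seed from a character's ID and reuses one fixed appearance description in every panel prompt. It is a minimal sketch, not the project's actual code: the `CharacterSheet` and `PanelRequest` shapes, the field names, and the choice of an FNV-1a-style hash are all assumptions made for the example.

```typescript
// Hypothetical shape of a character reference sheet; field names are illustrative.
interface CharacterSheet {
  id: string;
  name: string;
  appearance: string; // fixed visual description reused in every prompt
}

// Hypothetical request passed on to an image model that accepts a numeric seed.
interface PanelRequest {
  prompt: string;
  seed: number;
}

// FNV-1a-style 32-bit hash over UTF-16 code units.
// The same key always maps to the same unsigned 32-bit integer.
function stableSeed(key: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Every panel featuring this character gets the same seed and the same
// appearance text; only the scene description changes.
function buildPanelRequest(sheet: CharacterSheet, scene: string): PanelRequest {
  return {
    prompt: `${sheet.name}: ${sheet.appearance}. Scene: ${scene}`,
    seed: stableSeed(sheet.id),
  };
}
```

Keying the seed on the character ID rather than on the prompt text means a new scene description does not change the seed, which is the property a seed-based consistency scheme relies on.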

Accomplishments that we're proud of

We successfully created a system that maintains character consistency across panels, integrated multiple AI services seamlessly, and built an intuitive interface that lets users generate complete manga stories from simple text ideas in minutes.

What we learned

We learned how to orchestrate multiple AI models to work together cohesively, the importance of prompt engineering for consistent visual output, and how to balance creative freedom with technical constraints in generative AI applications.

What's next for Prompt2Manga

Next steps include implementing object consistency for non-character elements like dragons and vehicles, adding collaborative features for team manga creation, expanding style options with more artistic presets, and potentially integrating voice-to-manga capabilities for even more accessible storytelling.

How We Use the Gemini API

The application integrates Google's Gemini API in multiple ways to enhance manga generation:

1. Nano Banana (Gemini 2.5 Flash Image) for Image Generation

  • Purpose: Alternative to FLUX models for manga image generation with superior character consistency
  • Key Feature: Uses previous panels as visual references to maintain character appearance
  • How it works:
    • When enabled, Nano Banana automatically fetches the most recent panel from your project
    • Converts each fetched panel to base64 and includes it as a reference image in the generation request
    • The AI uses these references to keep characters looking consistent across all panels
    • Particularly effective for maintaining unique character designs, outfits, and art style
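
The reference-image steps above can be sketched as follows. This is a hedged illustration that assumes a Node.js runtime (for `Buffer`): the `Panel` type and the helper names `latestPanels` and `buildReferenceParts` are hypothetical, and only the `inlineData` / `text` part shape follows the convention the Gemini API uses for inline image data. The actual API call is left out.

```typescript
// Hypothetical stored panel; field names are assumptions for this sketch.
interface Panel {
  index: number;     // position of the panel within the project
  mimeType: string;  // e.g. "image/png"
  bytes: Uint8Array; // raw image data
}

// Request parts: base64-encoded inline images followed by a text prompt.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Return the most recently generated panels, newest first.
function latestPanels(panels: Panel[], count: number): Panel[] {
  return [...panels].sort((a, b) => b.index - a.index).slice(0, count);
}

// Encode the reference panels as base64 and place them ahead of the prompt,
// so the model sees the earlier artwork before the new instruction.
function buildReferenceParts(panels: Panel[], prompt: string, count = 1): Part[] {
  const refs: Part[] = latestPanels(panels, count).map((p) => ({
    inlineData: {
      mimeType: p.mimeType,
      data: Buffer.from(p.bytes).toString("base64"),
    },
  }));
  return [...refs, { text: prompt }];
}
```

The resulting array would be passed as the parts of a single user message. The default `count` of 1 matches the "most recent panel" behavior described above, and raising it sends more of the earlier panels as references.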
