Inspiration
When I was a kid, I hated reading long blocks of text and always preferred pictures. Classic literature sounded boring to me, and history books were far too long. I wished the classics existed as storybooks made specifically for young kids: not boring, stripped-down summaries, but genuine picture books that keep the original soul, characters, and vibe intact.
What it does
It takes dense, complex classic novels that young readers find unreadable (like The Great Gatsby) and automatically transforms them into gorgeous, context-aware children's picture books.
How we built it
The 4-Agent Pipeline
I use Google ADK’s SequentialAgent to run four distinct roles back-to-back on Cloud Run:
- Analyzer: Pulls the book from Project Gutenberg, maps out who the main characters are, and uses the TextTiling algorithm to find the story's emotional highs and turning points.
- Writer: Takes those scenes and rewrites them into simple, engaging prose for kids.
- Artist: Uses Gemini 3 Flash Image to create the illustrations and blends the story text naturally into the art (like inside clouds or speech bubbles).
- Vision QA: Powered by Gemini 3.5 Flash (Vision), this agent checks each image against the original character design. If a page doesn't look right, it triggers a self-correction loop that redraws the page using the QA feedback until it passes the check.
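The four roles and the self-correction loop can be sketched in plain Python. This is a minimal illustration, not the project's ADK code: `Page`, `draw_with_qa`, `run_pipeline`, and the `max_attempts` cap are hypothetical names I introduce here, and the agents are passed in as ordinary functions instead of Gemini-backed ADK agents.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Page:
    text: str
    image: str      # placeholder for image bytes or a storage URI
    attempts: int   # how many drawings were needed
    passed: bool    # whether the last drawing passed QA


# Hypothetical signatures: the draw step takes the page text plus QA feedback,
# and the QA step returns (passed, feedback).
DrawFn = Callable[[str, str], str]
CheckFn = Callable[[str], Tuple[bool, str]]


def draw_with_qa(text: str, draw: DrawFn, check: CheckFn, max_attempts: int = 3) -> Page:
    """Draw a page, then redraw it with QA feedback until it passes or attempts run out."""
    feedback = ""
    image = ""
    for attempt in range(1, max_attempts + 1):
        image = draw(text, feedback)
        passed, feedback = check(image)
        if passed:
            return Page(text, image, attempt, True)
    # Assumption: cap the retries and keep the last draft for manual review.
    return Page(text, image, max_attempts, False)


def run_pipeline(
    book: str,
    analyze: Callable[[str], List[str]],
    write: Callable[[str], str],
    draw: DrawFn,
    check: CheckFn,
) -> List[Page]:
    """Run Analyzer -> Writer -> Artist -> Vision QA in order, one page per scene."""
    scenes = analyze(book)
    return [draw_with_qa(write(scene), draw, check) for scene in scenes]
```

Capping the retries is my own design choice for the sketch: it keeps a page that never satisfies QA from looping forever, and the `passed` flag marks such pages for a human to review.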
Solving Consistency (MongoDB MCP)
To stop characters from changing appearance, I used the new MongoDB MCP Server:
- During the setup phase, every character gets a locked-in "Visual Identity Sheet" saved in MongoDB as the single source of truth.
- Every time the Artist Agent draws a new page, it pulls this reference sheet through an MCP tool call, so the protagonist looks the same from page 1 to page 40.
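As a rough sketch of the idea, a sheet can be modeled as a write-once record that is prepended to every drawing prompt. Everything named here is an assumption for illustration: the fields of `VisualIdentitySheet`, the in-memory `SheetStore` standing in for the MongoDB collection, and `build_prompt`. The real project reads the sheet through the MongoDB MCP Server, which this example does not call.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VisualIdentitySheet:
    # Hypothetical fields; the actual sheet schema is not given in the write-up.
    name: str
    appearance: str
    outfit: str
    palette: Tuple[str, ...] = ()


class SheetStore:
    """In-memory stand-in for the MongoDB collection. Sheets are write-once."""

    def __init__(self) -> None:
        self._sheets: Dict[str, VisualIdentitySheet] = {}

    def lock_in(self, sheet: VisualIdentitySheet) -> None:
        # Refuse overwrites so the sheet stays the single source of truth.
        if sheet.name in self._sheets:
            raise ValueError(f"sheet for {sheet.name!r} is already locked in")
        self._sheets[sheet.name] = sheet

    def get(self, name: str) -> VisualIdentitySheet:
        return self._sheets[name]


def build_prompt(scene: str, characters: List[str], store: SheetStore) -> str:
    """Attach every character's locked-in sheet to the drawing prompt."""
    lines = [f"Scene: {scene}", "Character references (do not deviate):"]
    for name in characters:
        sheet = store.get(name)
        colors = ", ".join(sheet.palette)
        lines.append(f"- {sheet.name}: {sheet.appearance}; wearing {sheet.outfit}; colors: {colors}")
    return "\n".join(lines)
```

Because the same stored text is attached to the prompt for every page, the image model always sees an identical description of each character, whichever scene is being drawn.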
Google Cloud Infrastructure
- AI Models: Everything runs on Vertex AI. I use Gemini 3.5 Flash for fast text processing and Vision QA, and Gemini 3 Flash Image for drawing.
- Compute & Storage: The backend runs on Cloud Run, and all image assets, character data, and final PDFs are stored in Google Cloud Storage (GCS).
The Final Result: A beautiful, square-format PDF book, plus an interactive web app where users can click and fine-tune any character, scene, or text overlay on the fly.
Challenges I ran into
Building this meant solving three big problems:
- Smart Simplification: Breaking down a massive novel into distinct scenes without losing the core plot, and translating it into 6-year-old friendly language.
- Text-to-Image Alignment: Making sure the illustrations actually match the emotions and details of the story.
- Character Consistency: Keeping the main character looking exactly the same across a 40-page book without their face or clothes shifting.
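For the scene-splitting problem, here is a simplified sketch in the spirit of TextTiling: compare the vocabulary on each side of every gap between sentences, score how deep each similarity "valley" is, and cut at the deepest ones. It is not the full algorithm (no stopword removal or smoothing), and choosing a fixed number of scenes via `n_scenes` is my assumption for a book with a fixed page count, not something taken from the project.

```python
import math
import re
from collections import Counter
from typing import List


def _bag(sentences: List[str]) -> Counter:
    """Word counts for a block of sentences."""
    return Counter(w for s in sentences for w in re.findall(r"[a-z']+", s.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gap_similarities(sentences: List[str], k: int = 2) -> List[float]:
    """Similarity between the k sentences before and after each sentence gap."""
    sims = []
    for gap in range(1, len(sentences)):
        left = _bag(sentences[max(0, gap - k):gap])
        right = _bag(sentences[gap:gap + k])
        sims.append(_cosine(left, right))
    return sims


def depth_scores(sims: List[float]) -> List[float]:
    """How far each gap's similarity sits below the nearest peaks on both sides."""
    depths = []
    for i, s in enumerate(sims):
        left = i
        while left > 0 and sims[left - 1] >= sims[left]:
            left -= 1
        right = i
        while right < len(sims) - 1 and sims[right + 1] >= sims[right]:
            right += 1
        depths.append((sims[left] - s) + (sims[right] - s))
    return depths


def split_scenes(sentences: List[str], n_scenes: int, k: int = 2) -> List[List[str]]:
    """Cut the text at the (n_scenes - 1) deepest similarity valleys."""
    if n_scenes <= 1 or len(sentences) < 2:
        return [sentences]
    depths = depth_scores(gap_similarities(sentences, k))
    deepest = sorted(range(len(depths)), key=lambda i: depths[i], reverse=True)[:n_scenes - 1]
    cuts = sorted(i + 1 for i in deepest)  # gap i lies between sentences i and i + 1
    scenes, start = [], 0
    for cut in cuts:
        scenes.append(sentences[start:cut])
        start = cut
    scenes.append(sentences[start:])
    return scenes
```

A gap where the words before and after barely overlap gets a high depth score, which is a reasonable proxy for a change of scene or topic.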
Accomplishments that I am proud of
- Anyone can produce high-quality, visually consistent stories without a professional design background.
- It slashes the time and budget needed for animation and storyboards, opening doors for anyone with a great idea.
What I learned
- Chaining mini-agents is way better than one giant prompt—it made the whole storytelling pipeline incredibly stable and fast.
- MCP is a total game-changer for database sync—allowing me to lock in a single "visual identity source of truth" without writing endless glue code.
- Closing the loop with automated Vision QA is the future—it turns unpredictable AI drawings into a reliable, high-quality production line.
What's next for StorySprout
I want to build a shared repository where parents and educators can publish their generated children's books, remix other creators' character sheets, and co-create multi-chapter fantasy universes using a unified, crowdsourced visual identity hub.
Built With
- cloud-run
- css3
- gemini-3-flash-image
- gemini-3.5-flash
- google-agent-development-kit(adk)
- google-cloud-storage(gcs)
- html5
- javascript
- model-context-protocol(mcp)-sdk
- mongodb
- mongodb-atlas
- mongodb-mcp-server
- python
- texttiling-algorithm
- typescript