Multi-Tales

🧸What it does

Our project, MultiTales, takes on a task that remains a distinct challenge in the age of AI, writing children’s stories: it requires not only imagination and linguistic coherence but also safety, age appropriateness, and emotional resonance. Traditionally, producing a high-quality children’s story is labor-intensive and involves multiple roles: author, editor, reviewer, illustrator, and test readers. This complexity poses a significant barrier to educators, parents, and small publishers who want to produce quality content quickly.

🤖Agents and Interactions

Writer Agent: Generates and iterates on the story.

Editor Agent: Revises and improves the story based on the draft.

Positive Reviewer Agent: Evaluates strengths and gives praise.

Negative Reviewer Agent: Identifies problems and assigns scores.

The agents operate in a structured sequence (via SequentialAgent), enabling a feedback and revision loop. All interactions are stored and versioned using ADK’s built-in session and artifact tracking features.

⚙️How we built it

The entire workflow is orchestrated by a top-level SequentialAgent, which receives an initial invocation signal and then enters a multi-round iterative process managed by an internal LoopAgent. In the first story generation round, the Writer Agent (powered by Gemini 2.5 Pro) composes an initial draft of the children’s story from the user’s prompt and a set of creative elements (characters, scenes, and conflicts) drawn at random from the “Playing Cards” tool. This draft is then passed to the Editor Agent (powered by Gemini 2.0 Flash), which performs language polishing, logical consistency adjustments, and formatting normalization to ensure the story aligns with the stylistic and cognitive needs of young readers.

Once editing is complete, the refined story is dispatched in parallel to two distinct Reviewer Agents. Reviewer Agent 1 plays the role of a critical reader, focusing on identifying weaknesses in plot coherence, narrative engagement, and linguistic clarity, and provides targeted suggestions for improvement. Reviewer Agent 2, by contrast, acts as a general audience evaluator, offering an overall score and highlighting strengths or particularly engaging moments in the story. The structured feedback from both reviewers, along with the current story version, is aggregated and sent back to the Writer Agent to guide the next iteration. In each subsequent round, the Writer Agent integrates the critique, score, and new creative prompts from the Playing Cards tool to improve the story content iteratively.
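The parallel review and aggregation step can be sketched in plain Python. This is a minimal illustration, not the project's ADK code: the two reviewer functions are hypothetical stand-ins for the LLM-backed Reviewer Agents, and the toy scoring rule and the shape of the merged dictionary are assumptions made only for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-ins for the two LLM-backed reviewers.
def critical_reviewer(story: str) -> Dict[str, object]:
    issues = []
    if len(story.split()) < 20:
        issues.append("The story is very short; develop the middle section.")
    return {"role": "critical", "issues": issues}


def audience_reviewer(story: str) -> Dict[str, object]:
    # Toy score (assumption): longer drafts score higher, capped at 10.
    score = min(10, len(story.split()) // 5)
    return {"role": "audience", "score": score, "highlights": [story[:30]]}


def review_in_parallel(story: str) -> Dict[str, object]:
    """Run both reviewers concurrently and bundle their output with the draft."""
    reviewers: Dict[str, Callable[[str], Dict[str, object]]] = {
        "critical": critical_reviewer,
        "audience": audience_reviewer,
    }
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, story) for name, fn in reviewers.items()}
        results = {name: future.result() for name, future in futures.items()}
    # The writer receives the current story together with both reviews.
    return {"story": story, "reviews": results}
```

Keeping the story and both reviews in one bundle means the next writing round needs only a single input, which is the aggregation behavior described above.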

This process continues under the control of the LoopAgent until a predefined number of iterations is reached or a termination condition is met (e.g., the story receives a high enough score or passes quality checks). At this point, the finalized story proceeds to the visual generation stage. The final manuscript is handed off to the ImageCreatorAgent, which invokes a tool to interface with Google’s Imagen 4 model. This tool automatically generates illustrations that visually align with the narrative content. To enhance usability and presentability, the ImageCreatorAgent also generates a fully formatted PDF file that combines the polished story, generated illustrations, and a custom cover page, resulting in a complete, child-friendly picture book in digital format.
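The stopping rule for the revision loop can be illustrated with a framework-independent sketch. It assumes a numeric score and a fixed pass threshold; the function names, the default values of `max_iterations` and `pass_score`, and the toy writer, editor, and reviewer are all hypothetical and stand in for the real agents.

```python
from typing import Callable, Dict, Tuple

Writer = Callable[[str, str, str], str]      # (prompt, previous_story, feedback) -> draft
Editor = Callable[[str], str]                # draft -> edited story
Reviewer = Callable[[str], Tuple[int, str]]  # story -> (score, feedback)


def revise_until_done(
    write: Writer,
    edit: Editor,
    review: Reviewer,
    prompt: str,
    max_iterations: int = 3,
    pass_score: int = 8,
) -> Dict[str, object]:
    """Run write -> edit -> review rounds until the score passes or rounds run out."""
    story, feedback, score, rounds = "", "", 0, 0
    for rounds in range(1, max_iterations + 1):
        story = edit(write(prompt, story, feedback))
        score, feedback = review(story)
        if score >= pass_score:
            break  # termination condition met before the iteration limit
    return {"story": story, "score": score, "rounds": rounds}


# Toy stand-ins: every round extends the draft, and longer drafts score higher.
def toy_write(prompt: str, previous: str, feedback: str) -> str:
    return (previous or prompt) + " More detail."


def toy_edit(draft: str) -> str:
    return draft.strip()


def toy_review(story: str) -> Tuple[int, str]:
    return 5 + story.count("More detail."), "Add more detail."
```

With these toy functions, a call such as `revise_until_done(toy_write, toy_edit, toy_review, "Once upon a time.", max_iterations=5)` stops early once the score reaches the threshold, while a smaller `max_iterations` ends the loop on the iteration limit instead.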

Through this agentic pipeline—from story ideation to text generation, multi-perspective reviewing, revision, visual synthesis, and final PDF compilation—a closed-loop, collaborative multi-agent system is established. This workflow not only streamlines the production of educational and emotionally resonant children’s literature, but also demonstrates the power of structured, traceable automation in creative content generation.

🚀Accomplishments that we're proud of

The MultiTales system also benefits from game-theoretic dynamics among agents. For example, the Writer Agent’s implicit goal is to maximize creative expression and narrative flow, while the Reviewer Agents are incentivized to find flaws or inconsistencies, whether stylistic or semantic. This divergence creates a natural adversarial collaboration, where agents “compete” with differing objectives but ultimately contribute to a better outcome. By simulating a multi-agent game—where conflicting roles iterate and react to one another—we observed an emergent improvement in story coherence, richness, and editability. This dynamic mirrors real-world editorial tension and proves advantageous in refining AI-generated content.

🛠️Learnings & Changes

Throughout the development of MultiTales, we gained deep hands-on experience with ADK’s multi-agent orchestration model and encountered several unique technical and design challenges:

🧠Designing agent prompts for role specialization: It required careful prompt engineering to ensure each agent (Writer, Editor, Reviewers) maintained distinct behavior and did not drift into overlapping roles, especially under LLM-driven delegation.

🔁 Managing state transitions between review and edit phases: We had to design a structured and reliable mechanism for passing outputs across agents using ADK’s session state. Ensuring reviewers’ feedback was interpreted correctly by the writer agent required explicit output formats and controlled context management.
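As an illustration of what an explicit output format can look like, the sketch below validates a reviewer's JSON output before storing it in a shared state dictionary. The schema, the 0 to 10 score range, and the `review_<name>` key convention are assumptions for this example, and a plain `dict` stands in for ADK's session state.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewFeedback:
    """Hypothetical schema for one reviewer's output."""
    reviewer: str
    score: int
    comments: List[str] = field(default_factory=list)


def parse_feedback(raw: str) -> ReviewFeedback:
    """Parse reviewer JSON and reject malformed or out-of-range output."""
    data = json.loads(raw)
    score = int(data["score"])
    if not 0 <= score <= 10:  # assumed score range
        raise ValueError(f"score out of range: {score}")
    comments = [str(c) for c in data.get("comments", [])]
    return ReviewFeedback(reviewer=str(data["reviewer"]), score=score, comments=comments)


def store_feedback(state: dict, feedback: ReviewFeedback) -> None:
    """Write validated feedback under a predictable key for the writer to read."""
    state[f"review_{feedback.reviewer}"] = {
        "score": feedback.score,
        "comments": feedback.comments,
    }
```

Rejecting malformed feedback at this boundary keeps a badly formatted review from silently steering the next writing round.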

🪐Overcoming the limitation of Gemini 2.5 Pro:

The new Gemini model writes stories with greater variety and better prose, but its output is less stable: it sometimes inserts random foreign words or makes unrequested changes to the story. Overcoming this required a carefully designed prompt and agent structure. This is why we designed the Writer-Editor agent pair: the writer can focus on generating better content, even if the draft has minor issues such as foreign words or comments about how to change the story, while the editor takes the draft and makes the necessary corrections.

🎲Improving the variety of the story:

An LLM tends to generate clichéd content because it is designed to produce the most common response to a request. To counter this, the Playing Cards system is integrated into the Writer Agent as a tool that generates random plots and story elements.
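A card-drawing tool of this kind can be sketched in a few lines. The deck contents, function names, and prompt wording below are hypothetical, since the actual cards are not listed in this write-up; the optional seed is there only to make a draw reproducible.

```python
import random
from typing import Dict, List, Optional

# Hypothetical decks; the real card contents are not given in this write-up.
DECKS: Dict[str, List[str]] = {
    "character": ["a shy dragon", "a curious robot", "a brave snail"],
    "scene": ["a floating library", "a rainy playground", "a moonlit bakery"],
    "conflict": ["a lost map", "a broken promise", "a missing song"],
}


def draw_cards(seed: Optional[int] = None) -> Dict[str, str]:
    """Draw one card from each deck; pass a seed for a reproducible draw."""
    rng = random.Random(seed)
    return {deck: rng.choice(cards) for deck, cards in DECKS.items()}


def cards_to_prompt(cards: Dict[str, str]) -> str:
    """Render the drawn cards as a constraint line for the writer's prompt."""
    return (
        f"Include {cards['character']} in {cards['scene']} "
        f"who must deal with {cards['conflict']}."
    )
```

Drawing the elements outside the model moves the randomness into ordinary code, so variety no longer depends on the LLM's own sampling.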

🖼️Leveraging ADK callbacks to manage iteration limits and finalization triggers:

We used ADK’s before/after callbacks to implement loop control logic.

Integrating a visual generation model as a tool: Calling an external illustration model (based on the final story content) introduced challenges in asynchronous tool usage, response handling, and image artifact versioning within the ADK framework.
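The callback-based loop control mentioned above follows a general hook pattern, sketched here in plain Python rather than with ADK's callback API. The contract is an assumption chosen for the example: a before-hook returns `None` to let the step run or a message to stop, and an after-hook updates a round counter in shared state. All names are hypothetical.

```python
from typing import Callable, Dict, Optional

State = Dict[str, int]
BeforeHook = Callable[[State], Optional[str]]  # None = proceed, str = stop with message
AfterHook = Callable[[State], None]


def limit_iterations(max_rounds: int) -> BeforeHook:
    """Build a before-hook that stops the loop once max_rounds have completed."""
    def hook(state: State) -> Optional[str]:
        if state.get("round", 0) >= max_rounds:
            return f"Stopped after {max_rounds} rounds."
        return None
    return hook


def count_round(state: State) -> None:
    """After-hook: record that one more round has finished."""
    state["round"] = state.get("round", 0) + 1


def run_with_hooks(
    step: Callable[[State], None],
    state: State,
    before: BeforeHook,
    after: AfterHook,
    hard_cap: int = 100,
) -> str:
    """Repeat step until the before-hook asks to stop or the hard cap is hit."""
    for _ in range(hard_cap):
        message = before(state)
        if message is not None:
            return message
        step(state)
        after(state)
    return "Stopped at hard cap."
```

Because the limit lives in the hooks, the step itself needs no knowledge of iteration counts, which keeps the control logic separate from the writing and reviewing logic.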

✅Balancing automation with quality:

Ensuring high-quality, child-safe stories while keeping the entire flow autonomous forced us to refine agent logic and evaluation strategies through multiple iterations.

What's Next for MultiTales: 🎯 A Focus on User Interface 🖥️, Unbiased Review ⚖️, and AI Narration 🤖📖

The next phase of development for our AI program will focus on a three-pronged approach to elevate the user experience and ensure content integrity. First, we will design a user interface that is both fabulous and intuitive, providing a seamless and engaging environment for users to interact with the AI's creative potential. This interface will be the gateway to the program's capabilities, making the process of generating and refining stories both accessible and enjoyable.

Second, to introduce a layer of objective evaluation, we will implement a "reviewer agent" protocol. This agent will assess generated content without any access to previous iterations or contextual information, ensuring that each review is independent and unbiased. This "blind" review process is critical for maintaining a high standard of quality and originality in the final narrative.

Finally, the culmination of this creative process will be the introduction of an AI narrator. This final agent will be tasked with reading the completed story, bringing the text to life through advanced text-to-speech synthesis. This feature will not only provide an immersive way for users to experience the final product but also serve as a final check on the narrative's flow and readability, completing the cycle of AI-powered creation and consumption.