Inspiration

Every kid has imagined their own video game — a world born from a single idea. But turning imagination into something playable traditionally requires artists, designers, musicians, and programmers.

We asked: what if one sentence could create an entire playable game?

Using Gemini’s multimodal capabilities (text, images, audio), we built Mirror Land — an interactive storybook where a single prompt generates a playable world in real time.

Type “A spiderman on a desert island” and an AI Creative Director narrates the story, generates art, composes music, designs levels, and streams everything live. Then you play it.


What it does

Mirror Land turns a single prompt into a fully playable 3-chapter 2D platformer in real time.

AI generates:

  • Story narration — 3-chapter story arc with missions and dialogue
  • Game art — characters, enemies, NPCs, platforms, backgrounds in one art style
  • Ambient music — unique soundtrack per chapter using Lyria on Vertex AI
  • Level design — platform layouts, enemies, physics, weather via structured JSON
  • Gameplay mechanics — combat, lasers, teleporters, bounce pads, spotlight mode, weather

Every prompt produces a completely new game.


How we built it

We built a CreativeDirector pipeline with Google ADK (a SequentialAgent), served through FastAPI and deployed to Cloud Run. It runs three stages in order:

StoryPlanner (gemini-2.5-flash): Generates a structured story plan: title, art style, characters, 3 chapters, and missions.

StoryArchitect (gemini-2.5-flash-image): Generates 5 sprites in parallel (character, enemy, NPC, platform, background). Sprites are cleaned using rembg + alpha cropping.

LevelBuilder (gemini-2.5-flash + gemini-2.5-flash-image + Lyria + Gemini TTS): Creates level JSON, chapter backgrounds, ambient music, and NPC voice in parallel. A 9-rule Level Validator ensures levels are playable.
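
As an illustrative sketch (not the project's actual code), the "in parallel" part of LevelBuilder can be expressed with asyncio.gather. The function fake_generate and the asset names are hypothetical stand-ins for the real model calls, and the sleeps only simulate latency.

```python
import asyncio


async def fake_generate(kind: str, delay: float) -> dict:
    """Hypothetical stand-in for one model call (level, image, music, voice)."""
    await asyncio.sleep(delay)  # simulated latency
    return {"kind": kind, "ok": True}


async def build_chapter_assets() -> dict:
    # Start all four jobs at once; gather returns results in call order.
    results = await asyncio.gather(
        fake_generate("level_json", 0.03),
        fake_generate("background", 0.02),
        fake_generate("music", 0.04),
        fake_generate("npc_voice", 0.01),
        return_exceptions=True,  # one failed asset should not sink the rest
    )
    assets = {}
    for result in results:
        if isinstance(result, Exception):
            continue  # a real pipeline would substitute a fallback asset here
        assets[result["kind"]] = result
    return assets


assets = asyncio.run(build_chapter_assets())
```

With this shape, the total wait is roughly that of the slowest job rather than the sum of all four.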

Streaming Architecture

All outputs stream to the frontend using Server-Sent Events (SSE):

  • narration
  • image URLs
  • audio
  • level JSON
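
For illustration, each of these outputs could be framed as one SSE event on the server. This is a minimal sketch: the event names, payload fields, and URLs below are assumptions, not the project's real schema.

```python
import json


def sse_event(event: str, data: dict) -> str:
    """Frame one Server-Sent Event: an event line, a data line, a blank line."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


# Hypothetical events, one per output type listed above.
frames = [
    sse_event("narration", {"chapter": 1, "text": "The tide pulls back..."}),
    sse_event("image", {"role": "character", "url": "/assets/character.png"}),
    sse_event("audio", {"chapter": 1, "url": "/assets/ch1.wav"}),
    sse_event("level", {"chapter": 1, "platforms": []}),
]
```

The blank line at the end of each frame is what tells the client that an event is complete.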

The frontend runs on a custom 2D engine built with vanilla JS and HTML5 Canvas. It handles physics, combat, weather, particles, dialogue, and 13 procedural sound effects.

Chapters 2 & 3 generate in the background while the player plays Chapter 1.
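
A rough sketch of that background generation, assuming an asyncio-based server: later chapters are scheduled as tasks as soon as Chapter 1 is ready, and awaited only when the player reaches them. generate_chapter is a hypothetical placeholder, and the sleeps stand in for model latency and play time.

```python
import asyncio


async def generate_chapter(n: int) -> dict:
    """Hypothetical placeholder for generating one chapter's assets."""
    await asyncio.sleep(0.01)  # simulated model latency
    return {"chapter": n}


async def play_session() -> list:
    first = await generate_chapter(1)
    # Schedule chapters 2 and 3 right away; they run while chapter 1 is played.
    pending = {n: asyncio.create_task(generate_chapter(n)) for n in (2, 3)}
    played = [first]
    await asyncio.sleep(0.05)  # stand-in for the player playing chapter 1
    for n in (2, 3):
        # If the task already finished, this await returns immediately.
        played.append(await pending[n])
    return played


played = asyncio.run(play_session())
```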


Challenges we ran into

Sprite consistency: Maintaining visual consistency across parallel image generations required strong style constraints in prompts.

Background removal: Watercolor styles broke segmentation. We standardized white backgrounds + clear silhouettes.

Level playability: AI sometimes created unreachable platforms. A 9-check validator auto-fixes gaps, spawns, enemies, and mission items.

SSE streaming bugs: Events split across TCP chunks caused dropped images. Fix: persistent parser state across reads.
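
To make the gap check concrete, here is a minimal sketch of one such rule. It is not the project's validator: the Platform fields, the jump limits, and the repair strategy (pulling a platform back into reach) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    x: float      # left edge, in pixels
    y: float      # top surface; smaller values are higher on screen
    width: float


MAX_JUMP_DX = 180.0  # assumed horizontal jump reach
MAX_JUMP_DY = 120.0  # assumed maximum upward jump


def fix_gaps(platforms):
    """Return a new list where each platform is reachable from the previous one."""
    ordered = sorted(platforms, key=lambda p: p.x)
    if not ordered:
        return []
    fixed = [ordered[0]]
    for p in ordered[1:]:
        prev = fixed[-1]
        gap = p.x - (prev.x + prev.width)
        # Pull a too-distant platform back to the edge of jump range.
        x = p.x if gap <= MAX_JUMP_DX else prev.x + prev.width + MAX_JUMP_DX
        # Clamp an upward step that is taller than one jump.
        rise = prev.y - p.y
        y = p.y if rise <= MAX_JUMP_DY else prev.y - MAX_JUMP_DY
        fixed.append(Platform(x, y, p.width))
    return fixed


level = [Platform(0, 400, 200), Platform(600, 100, 150), Platform(820, 90, 150)]
fixed = fix_gaps(level)
```

Each platform is compared against the already-repaired previous one, so a fix early in the level carries forward to the platforms after it.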
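
The idea behind that fix can be sketched as a parser that keeps a buffer between reads and only emits an event once its terminating blank line has arrived. The actual frontend is JavaScript; this Python version is an illustrative assumption, and it simplifies by taking already-decoded text with "\n" line endings.

```python
import json


class SSEParser:
    """Incremental SSE parser whose buffer survives across feed() calls."""

    def __init__(self):
        self._buffer = ""  # unfinished event text carried over between reads

    def feed(self, chunk: str) -> list:
        self._buffer += chunk
        events = []
        # Only text up to a blank line is a complete event; keep the rest.
        while "\n\n" in self._buffer:
            raw, self._buffer = self._buffer.split("\n\n", 1)
            event, data_lines = "message", []
            for line in raw.split("\n"):
                if line.startswith("event:"):
                    event = line[len("event:"):].strip()
                elif line.startswith("data:"):
                    data_lines.append(line[len("data:"):].lstrip())
            if data_lines:
                events.append((event, json.loads("\n".join(data_lines))))
        return events


parser = SSEParser()
# The image event is split mid-URL across two chunks.
first = parser.feed('event: image\ndata: {"url": "/assets/cha')
second = parser.feed('racter.png"}\n\nevent: level\ndata: {"chapter": 1}\n\n')
```

Here the first call returns no events, and the second returns both the completed image event and the level event. A parser that reset its state on every read would have dropped the image.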

Real-time UX: We could not eliminate generation latency, so we made the creation process itself the experience by streaming a live storyboard.


Accomplishments we're proud of

A real playable game from one sentence: A full 3-chapter platformer with combat, missions, physics, weather, and dialogue, all AI-generated.

Generation as the experience: The live storyboard streams narration, art, and music as the world forms.

Zero-wait chapter transitions: Next chapters generate while you play.

Infinite variety: Whether pirates, ninjas, hackers, or fairies, every prompt generates different worlds, mechanics, and art styles.

Custom game engine: A complete 2D engine with almost zero dependencies, built using vanilla JS + Canvas.


What we learned

Reliability matters as much as AI quality: Validators, fallbacks, and streaming robustness are essential.

Prompt engineering becomes level design: AI constraints directly impact playability and difficulty.

Parallel generation is critical: Running image, music, and level generation concurrently reduces wait time.

Storyboard UX works: Streaming generation transforms loading into discovery.

Multimodal AI needs guardrails: Text, image, audio, and JSON outputs require validators and auto-fixers to stay coherent.


What's next for Mirror Land

  • Story branching — player choices change future chapters
  • Veo integration — AI-generated cinematic intros
  • Live NPC voice — real-time conversations using Gemini audio streaming
  • More genres — RPG, puzzle, racing generated from prompts
  • Persistent worlds — save and share generated games
  • Mobile support — touch controls and responsive canvas

