Overview

The Flimo World is an AI-powered creation tool that turns imagination into interactive stories. In Create mode, you describe your imagined world in natural language and establish its lore, characters, and conflicts. The AI helps generate NPCs, scenes, plot points, and a playable map. You then set the interaction modes and share the world for other users to play. Each user's journey through your world produces its own experience and story ending.

Inspiration

Our world is composed of countless stories. Many people have creative inspiration and the impulse to craft stories, but the barrier to creating from scratch is prohibitively high for most, especially for game-like story experiences. After witnessing Gemini's AI capabilities, I envisioned a new paradigm of creation. I wanted to connect "imagination," "creation," and "experience" so that creators can rapidly build their own story worlds with an extremely low barrier to entry. These story worlds are presented through multimodal content and support real-time interaction. The inherent randomness of AI gives each user a different, unique experience.

What it does

Create Mode

To achieve the seamless connection between "imagination," "creation," and "experience," we've built an AI-assisted conversational creation mode. In this mode, creators can begin their creative journey through natural language or inspiration templates.

When creators input their ideas, the AI analyzes and interprets them, outputting a world story outline, character settings, environmental descriptions, character relationships, and plot development. Gemini's AI capabilities power the generation of both text and images. Creators can continue to edit and refine their story through ongoing AI dialogue.

For users to truly experience the story, the progression mechanism and the visual presentation of interactions are crucial. We've integrated map generation into Create mode: based on the creator's written story, the system automatically generates the world's layout and map. The layout defines the pathways and constrains where characters can walk, while the map provides the visual presentation that users see. Creators can still edit and modify both through AI dialogue.
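
To make the role of the layout concrete, here is a minimal sketch that treats it as a graph of connected locations and finds a walkable route with breadth-first search. The location names, the `LAYOUT` structure, and `find_path` are illustrative assumptions, not the project's actual data model.

```python
from collections import deque

# Hypothetical layout: each location lists the locations it connects to.
LAYOUT = {
    "tavern": ["square"],
    "square": ["tavern", "chapel", "docks"],
    "chapel": ["square"],
    "docks": ["square", "lighthouse"],
    "lighthouse": ["docks"],
}


def find_path(layout, start, goal):
    """Return the shortest list of locations from start to goal, or None."""
    if start not in layout or goal not in layout:
        return None
    previous = {start: None}  # location -> location it was reached from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the back-pointers to rebuild the route.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in layout.get(current, []):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None  # goal is not reachable over the defined pathways
```

Under this assumption, a character can only move along edges the layout defines, so a location that is drawn on the map but not connected in the layout stays out of reach.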

Once everything is ready, the story can be published to Play mode for users to experience.

Play Mode

Users can browse all published interactive stories on The Flimo World homepage and select ones that interest them to play.

Play mode features intelligent NPCs. The creator's characters autonomously decide what to do next, including where to go and what to say to users, based on the story setting and their dialogue with users.

Users must uncover the truth of the story by gathering information from character events and dialogue content.

The AI records the user's entire interactive experience and all character events. At the conclusion of the experience, it generates a personalized story ending based on the user's unique journey. Because each user's dialogue content, experiences, and discovered information differ, the ending is truly unique to each individual.

How we built it

The frontend uses React 18 + Vite, with Tailwind CSS + DaisyUI for UI interactions and layout.

Create Mode

Create mode leverages Gemini AI's capabilities through our custom Flimo Agent, which employs an "intent routing + SKILL modules" architecture to respond to various creative needs:

  • story-world: Generate world setting, background, truth
  • story-npc: Create NPCs with personalities & secrets
  • mystery-world: Generate mystery plot & clues
  • mystery-npc: Create NPC dual versions (creator/player)
  • environment-image: Generate scene images via Gemini
  • npc-image: Generate character portraits
  • worldmap-layout: Plan location layout on map
  • worldmap-image: Generate final world map image
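
The "intent routing + SKILL modules" idea can be sketched roughly as follows. The skill names come from the list above; everything else (the registry, the keyword rules, the handler bodies) is a hypothetical stand-in, and the real Flimo Agent may well route with a model call rather than keywords.

```python
from typing import Callable, Dict

SkillHandler = Callable[[str], dict]

# Registry mapping a skill name to the function that serves it.
SKILLS: Dict[str, SkillHandler] = {}


def skill(name: str):
    """Decorator that registers a handler under a skill name."""
    def register(handler: SkillHandler) -> SkillHandler:
        SKILLS[name] = handler
        return handler
    return register


# Hypothetical handlers: each only builds a focused prompt for its task.
@skill("story-world")
def story_world(request: str) -> dict:
    return {"skill": "story-world", "prompt": f"Write a world setting for: {request}"}


@skill("story-npc")
def story_npc(request: str) -> dict:
    return {"skill": "story-npc", "prompt": f"Create an NPC with a secret for: {request}"}


@skill("worldmap-layout")
def worldmap_layout(request: str) -> dict:
    return {"skill": "worldmap-layout", "prompt": f"Plan map locations for: {request}"}


# Assumed keyword rules, checked in order; the first match wins.
INTENT_KEYWORDS = [
    ("worldmap-layout", ("map", "layout")),
    ("story-npc", ("npc", "character")),
]
DEFAULT_SKILL = "story-world"


def route_intent(request: str) -> str:
    """Pick the skill name whose keywords appear in the request."""
    text = request.lower()
    for name, keywords in INTENT_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return name
    return DEFAULT_SKILL


def handle(request: str) -> dict:
    """Route the request, then run the matching skill."""
    return SKILLS[route_intent(request)](request)
```

Keeping each skill behind its own small handler is what lets one prompt be tuned without touching the others.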

Play Mode

Play mode utilizes Gemini AI's capabilities to manage each character's behavior and user interaction context. We've designed several core mechanisms to drive character action execution and story progression:

  • Think: Each character reflects on their own behavior at appropriate moments, determining subsequent actions and events
  • Chat: User dialogue with characters influences each character's memory and future Think results
  • Navigation: Determines how each character moves and navigates within the world
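
As a rough illustration of how these three mechanisms could fit together, the sketch below keeps a bounded memory per character, lets Chat write into it, lets Think read it to choose an action, and lets Navigation apply a move. All names here are assumptions, `toy_policy` stands in for the Gemini call, and the string-joining "summary" is only a placeholder for real information compression.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Npc:
    name: str
    location: str
    memory: List[str] = field(default_factory=list)
    max_memory: int = 4  # hypothetical cap on stored entries

    def remember(self, event: str) -> None:
        """Append an event; fold the two oldest entries when over the cap."""
        self.memory.append(event)
        if len(self.memory) > self.max_memory:
            oldest, second = self.memory[0], self.memory[1]
            self.memory[:2] = [f"(summary) {oldest} / {second}"]


def chat(npc: Npc, user_message: str) -> None:
    """Chat: user dialogue becomes part of the character's memory."""
    npc.remember(f"user said: {user_message}")


def think(npc: Npc, decide: Callable[[Npc], dict]) -> dict:
    """Think: ask the decision function for the next action and record it."""
    action = decide(npc)
    npc.remember(f"decided: {action['type']}")
    return action


def navigate(npc: Npc, action: dict) -> None:
    """Navigation: apply a move action; other action types leave the NPC in place."""
    if action["type"] == "move":
        npc.location = action["target"]


def toy_policy(npc: Npc) -> dict:
    """Stand-in for the model: head to the lighthouse once it has been mentioned."""
    if any("lighthouse" in entry for entry in npc.memory):
        return {"type": "move", "target": "lighthouse"}
    return {"type": "wait"}
```

A short usage example: a character that has heard nothing waits, and one that has been asked about the lighthouse walks there on its next Think.

```python
mara = Npc(name="Mara", location="tavern")
navigate(mara, think(mara, toy_policy))   # stays in the tavern
chat(mara, "What happened at the lighthouse?")
navigate(mara, think(mara, toy_policy))   # moves to the lighthouse
```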

Challenges we ran into

This tool proved far more challenging than I initially anticipated. While many issues remain unsolved, we managed to get it running successfully. The challenges included:

  1. Maintaining image consistency: How to ensure the generated map image matches the layout exactly. Gemini's image generation capabilities pleasantly surprised me—it maintains excellent consistency. Through prompt optimization alone, I achieved maps that perfectly match the layout's structure, arrangement, and dimensions.

  2. Persistent character memory and story-driven behavior: How to permanently save all user interaction records to character memory and ensure characters act according to the story setting. This was the most difficult challenge. AI often struggles to understand and remember past user inputs after extended interaction periods. I implemented extensive prompt engineering optimization along with information compression to ensure memory effectiveness over limited time spans, but this still cannot support indefinite gameplay. Therefore, in story design, we aim to ensure stories can be completed within approximately 10 minutes.

  3. Stable structured JSON output: How to ensure the model consistently outputs parseable structured JSON. Gemini occasionally fails to follow JSON output specifications, causing errors. I addressed this through prompt optimization and format validation as a fallback mechanism.
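
The format-validation fallback from item 3 might look something like the sketch below: pull the JSON object out of the reply, check the fields, and re-prompt on failure. The field list, the retry wording, and the `call_model` parameter are assumptions for illustration; `call_model` is any function that takes a prompt string and returns the model's text.

```python
import json

# Hypothetical schema for an NPC reply: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "personality": str, "secret": str}


def extract_json(text):
    """Parse the span from the first '{' to the last '}', ignoring surrounding chatter."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    return json.loads(text[start:end + 1])


def validate(data, required):
    """Check that data is an object carrying every required field with the right type."""
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value is not an object")
    for key, expected in required.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    return data


def generate_structured(call_model, prompt, required, max_attempts=3):
    """Call the model until its reply parses and validates, or give up."""
    last_error = None
    for _ in range(max_attempts):
        reply = call_model(prompt)
        try:
            return validate(extract_json(reply), required)
        except ValueError as error:  # json.JSONDecodeError is a ValueError subclass
            last_error = error
            # Feed the failure back so the next attempt can correct it.
            prompt = f"{prompt}\n\nYour previous reply was invalid ({error}). Reply with JSON only."
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts: {last_error}")
```

Bounding the retries matters in Play mode, where a reply that never validates should surface as an error rather than stall the story.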

What we learned

  1. In multimodal products, "consistency" is more important than "single moments of brilliance." In interactive stories, consistency is the foundation for sustained user immersion and experience.

  2. Structured output (schema/JSON) + small, specialized SKILL modules are more stable and easier to iterate on than "monolithic mega-prompts." Effectively leveraging AI's function calling and SKILL capabilities significantly improves Agent output stability.

What's next for The Flimo World

In our initial plans, we intended to use Gemini's video generation to represent each character's behavioral events. However, we found generation too slow for real-time use. While the quality is impressive, it is not suitable for interactive story products in this format. In the future, we're considering introducing video generation at story beginnings and endings:

  1. Creators can produce stunning, attention-grabbing opening videos for their stories
  2. After users complete their experience, generate a personalized story ending video unique to their journey
