Gemini 3 Integration Overview

First, Gemini interprets multimodal human communication in context. It reasons over message content, tone, emotional cues, current game state, and accumulated interaction history to infer player intent. That intent is translated into structured, engine-ready tactical rules that describe priorities, conditions, and responses. These rules are validated and compiled before a match begins, after which the game engine executes them consistently without further AI intervention during play.
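To make the "validated and compiled before a match begins" step concrete, here is a minimal sketch. The rule shape, the metric and action names, and the `validateRule` and `compileRules` helpers are all assumptions made for illustration; the project's real schema is not shown in this write-up.

```typescript
// Hypothetical rule shape: "when <metric> <op> <value>, then <action>".
type Condition = { metric: "allyHealth" | "enemyDistance"; op: "<" | ">"; value: number };
type Action = "retreat" | "attack" | "heal" | "hold";
type TacticalRule = { priority: number; when: Condition; then: Action };

interface GameState { allyHealth: number; enemyDistance: number }

const METRICS = ["allyHealth", "enemyDistance"];
const OPS = ["<", ">"];
const ACTIONS = ["retreat", "attack", "heal", "hold"];

// Validate untrusted JSON (for example, model output) before the engine sees it.
function validateRule(raw: unknown): TacticalRule {
  if (typeof raw !== "object" || raw === null) throw new Error("rule must be an object");
  const r = raw as Record<string, unknown>;
  const w = r.when as Record<string, unknown> | undefined;
  if (typeof r.priority !== "number" || !isFinite(r.priority)) throw new Error("bad priority");
  if (typeof r.then !== "string" || ACTIONS.indexOf(r.then) === -1) throw new Error("unknown action");
  if (!w || typeof w.metric !== "string" || METRICS.indexOf(w.metric) === -1) throw new Error("unknown metric");
  if (typeof w.op !== "string" || OPS.indexOf(w.op) === -1) throw new Error("unknown operator");
  if (typeof w.value !== "number" || !isFinite(w.value)) throw new Error("bad value");
  return raw as TacticalRule;
}

// "Compile" the rules once into a plain function the engine can call every tick.
// No model call happens inside the returned function.
function compileRules(raw: unknown[]): (state: GameState) => Action {
  const rules = raw.map(validateRule).sort((a, b) => b.priority - a.priority);
  return (state) => {
    for (const rule of rules) {
      const actual = state[rule.when.metric];
      const hit = rule.when.op === "<" ? actual < rule.when.value : actual > rule.when.value;
      if (hit) return rule.then; // the highest-priority matching rule wins
    }
    return "hold"; // default when nothing matches
  };
}
```

Under these assumptions, a malformed rule fails loudly before the match starts instead of misbehaving mid-match, and the compiled function returns the same action whenever it is given the same state.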

Second, Gemini maintains persistent internal context across matches. This includes prior communications, active tactics, and evolving relationship dynamics for each squadmate. Because this context is preserved, behavior does not reset between sessions: the same instruction can lead to different outcomes depending on history, trust, and prior decisions.

To support performance and reuse, we leverage Gemini’s context caching to preload and retain the full description of the game engine, rule schema, and world semantics. This cached context can be reused across sessions and players, allowing Gemini to operate with a stable understanding of the system without re-sending static information on every interaction.

Finally, Gemini reasons over gameplay telemetry and accumulated state to generate qualitative post-match reflections and configure future matches, ensuring challenges adapt to how the player is leading rather than following fixed difficulty rules.

🎮 About the Project

From the start, we knew we wanted to build something different.

Before committing to a direction, we explored multiple ideas, mediums, constraints, and sources of inspiration. We gave ourselves one week to independently develop concepts and then present them to each other. From those, we chose the idea that met a few simple criteria:

  • It would be fun for us to build
  • It would be something we would actually want to use
  • It would let us explore the limits of next-generation AI
  • And it would be realistic to build within the time we had

💡 Inspiration

What we chose to build was inspired by everyday interactions with people across different contexts (work, games, collaboration, and shared activities) and by how the same message can be interpreted very differently depending on tone, history, and context.

We wanted to recreate that experience in a playful, engaging environment, while also giving users the opportunity to develop real skills: leadership, communication, and managing automation through AI.

The goal wasn’t to teach players how to issue better commands, but how to lead when agents have autonomy and don’t always behave exactly as expected.

🛠️ How We Built It

To support this idea, we decided to build a game engine from scratch.

The engine is designed so that any meaningful action can be defined deterministically through high-level commands. Instead of relying primarily on direct controller input, players define tactics—what should matter and how priorities should shift—and the engine takes care of executing the details.

This exposes the player to the idea of configuring and managing CPU-controlled AI, without requiring micromanagement or low-level scripting.

That, however, was only half of the picture.

We didn’t want our characters to be perfect automatons. A team where everyone behaves flawlessly is sterile and unrealistic, and it misses much of what makes collaboration interesting. Our AI characters needed to have personalities, strengths, weaknesses, preferences, and history. They needed to have a relationship with the player and with the goals they are working toward.

🧠 Using Gemini as the Interface

This raised a key question: how should players actually set all of this up?

Exposing dozens of controls, switches, and configuration panels would turn the experience into a technical exercise, which is not what we wanted. We wanted players to lead naturally.

This is where Gemini became essential.

On one side, Gemini acts as an orchestrator. It understands the syntax and structure of our game engine and serves as a bridge between player input and a serialized representation of tactics that the engine can consume directly.

On the other side, Gemini’s reasoning and long-context capabilities provide what makes the squad simulation work. With access to accumulated interaction history, Gemini can map tone, preferences, personality traits, and relationship dynamics against the player’s leadership style and communication patterns.

This allows the same instruction to produce different outcomes depending on who receives it and how the relationship has evolved.

🧩 System Responsibilities and Technology Breakdown

To make the division of responsibilities clear, the system is intentionally split between Gemini and a deterministic game engine.

🤖 What Gemini Does

  • Interpret player intent
    Converts natural player input (voice or text) into structured intent, grounded in current game context.

  • Translate intent into tactics
    Produces serialized, engine-ready tactical rules that describe priorities, conditions, and responses.

  • Maintain long-term context
    Tracks accumulated interaction history, squadmate relationships, preferences, and prior decisions across matches.

  • Resolve ambiguity through reasoning
    Handles underspecified or vague instructions by inferring intent based on tone, history, and past behavior.

  • Shape adaptive challenges
    Generates match configurations (enemy composition, pacing, behaviors) based on how the player has been leading over time.

  • Generate post-match reflections
    Produces qualitative reflections based on match telemetry to help the player understand outcomes and team dynamics.

⚙️ What the Game Engine Does

  • Execute tactics deterministically
    Runs the compiled tactical rules exactly as defined, without AI-driven decision-making during the match.

  • Simulate the game world
    Handles movement, navigation, pathfinding, collisions, combat resolution, and timing.

  • Enforce consistency and predictability
    Ensures that identical inputs and tactics produce consistent outcomes.

  • Collect gameplay telemetry
    Records relevant in-match events used for reflection and future match generation.

  • Handle real-time performance constraints
    Keeps simulation performance stable as player input and match complexity vary.
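One common way to get "identical inputs produce consistent outcomes" is to combine a fixed timestep with a seeded random number generator, so that nothing in the simulation depends on the wall clock or on `Math.random()`. The sketch below illustrates that general technique; it is an assumption about how such an engine could work, not a description of ours, and `makeRng`, `simulate`, and the `Unit` shape are hypothetical.

```typescript
// Park-Miller "minimal standard" generator: seedable and exact in double precision.
function makeRng(seed: number): () => number {
  let state = (Math.abs(Math.floor(seed)) % 2147483646) + 1; // keep state in [1, 2147483646]
  return () => {
    state = (state * 48271) % 2147483647;
    return state / 2147483647; // value in (0, 1)
  };
}

interface Unit { x: number; hp: number }

const STEP_MS = 50; // fixed timestep: the simulation never reads the real clock

// Run a toy simulation for a fixed number of ticks from a given seed.
function simulate(seed: number, ticks: number): Unit {
  const rng = makeRng(seed);
  const unit: Unit = { x: 0, hp: 100 };
  for (let t = 0; t < ticks; t++) {
    unit.x += 2 * (STEP_MS / 1000);  // constant speed of 2 units per second
    if (rng() < 0.1) unit.hp -= 5;   // "random" hit drawn from the seeded stream
  }
  return unit;
}
```

Calling `simulate` twice with the same seed and tick count yields the same final state, which is the property that makes replays and telemetry comparable across runs.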

🧱 Technologies Used

  • Web Platform
    Chosen for rapid iteration, accessibility, and easy public deployment.

  • Babylon.js / WebGL
    Used for real-time rendering.

  • IndexedDB
    Used to persist game sessions and player state locally.

  • Web Workers
    All game simulation logic runs off the main thread, isolating rendering from computation.

  • Input Translation Layer
    Converts keyboard, mouse, and gamepad input into high-level tactical intent for compatibility with the engine.

🎯 Creating Meaningful Challenge

Once we had the engine and the interaction model, we ran into another problem: where does the challenge come from?

If we designed fixed levels and enemy behaviors, players could simply adapt to those patterns and eventually brute-force solutions. That would undermine the goal of helping players grow in leadership, communication, and managing automation.

Instead, Gemini takes on the role of shaping each match.

Gemini can observe how the player leads, how they win or lose, how they manage priorities, and how they relate to their squadmates. Because it already understands how to interact with the engine, it can generate each match dynamically: enemy configurations, wave pacing, and even custom behaviors per enemy.
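Because a generated match is ultimately data handed to a deterministic engine, it helps to sanitize it first. The sketch below shows one possible shape for such a configuration and a clamp-and-filter pass over it. The `MatchConfig` shape, the enemy names, and the `MAX_COUNT` limit are assumptions for illustration only.

```typescript
interface EnemySpec { kind: string; count: number }
interface Wave { delaySeconds: number; enemies: EnemySpec[] }
interface MatchConfig { waves: Wave[] }

const KNOWN_ENEMIES = ["grunt", "sniper", "brute"]; // hypothetical enemy types
const MAX_COUNT = 20;                               // hypothetical per-entry cap

// Drop unknown enemy kinds, clamp counts and delays, and remove empty waves.
function sanitizeMatchConfig(cfg: MatchConfig): MatchConfig {
  const waves = cfg.waves
    .map((wave) => ({
      delaySeconds: isFinite(wave.delaySeconds) ? Math.max(0, wave.delaySeconds) : 0,
      enemies: wave.enemies
        .filter((e) => KNOWN_ENEMIES.indexOf(e.kind) !== -1 && Math.floor(e.count) >= 1)
        .map((e) => ({ kind: e.kind, count: Math.min(Math.floor(e.count), MAX_COUNT) })),
    }))
    .filter((wave) => wave.enemies.length > 0);
  return { waves };
}
```

A pass like this keeps the generated match inside what the engine can actually simulate, whatever the model proposes.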

During play, we record telemetry for relevant events. This builds a history of how the squad performs and how the player behaves. From that, Gemini can generate reflections that help the player understand where friction or gaps may exist, and can also shape the next challenge so that it responds to the player’s trajectory rather than following a fixed difficulty curve.
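As a rough illustration of how raw telemetry could be condensed before it is handed to a model for reflection, the sketch below folds a list of events into per-squadmate counts. The event types and the `summarize` helper are hypothetical; the write-up does not specify which events are actually recorded.

```typescript
// Hypothetical telemetry event: what happened, to whom, and on which simulation tick.
interface TelemetryEvent {
  tick: number;
  type: "order_followed" | "order_ignored" | "downed";
  squadmate: string;
}

interface SquadmateSummary { followed: number; ignored: number; downed: number }

// Fold the event log into a small summary object keyed by squadmate name.
function summarize(events: TelemetryEvent[]): Record<string, SquadmateSummary> {
  const out: Record<string, SquadmateSummary> = {};
  for (const e of events) {
    let s = out[e.squadmate];
    if (!s) {
      s = { followed: 0, ignored: 0, downed: 0 };
      out[e.squadmate] = s;
    }
    if (e.type === "order_followed") s.followed++;
    else if (e.type === "order_ignored") s.ignored++;
    else s.downed++;
  }
  return out;
}
```

A compact summary like this is cheaper to include in a prompt than the full event log, and it can be stored alongside the session for later matches.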

The result is an experience that adapts to each player over time.

🧍 Player Participation

While the squad operates autonomously, we also wanted the player to be present on the field. Players can jump in and help directly, but collaboration remains central. Acting alone is rarely enough, and Gemini’s role in shaping challenges reinforces the idea that success comes from teamwork rather than individual heroics.

🌐 Platform and Technical Choices

We chose the web as our platform to maximize speed, flexibility, and ease of deployment.

Building for the web allowed us to iterate quickly and take advantage of existing infrastructure. We use IndexedDB to store game sessions and WebGL (via Babylon.js) for rendering. Thanks to the way the engine is structured, all game logic runs inside a Web Worker, leaving the main thread focused solely on rendering. This helped us maintain smooth performance, even under heavy simulation load and frequent input.
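The split between a rendering thread and a simulation worker usually comes down to a small message protocol. The sketch below shows one way that could look, with the worker's logic written as a plain function so it can be tested outside a browser. The message kinds, the `SimState` shape, and `handleMessage` are assumptions, not our actual protocol.

```typescript
// Messages the main thread could send to the simulation worker.
type ToWorker =
  | { kind: "loadTactics"; rules: string[] }
  | { kind: "tick" };

// Messages the worker could send back for rendering.
type FromWorker = { kind: "snapshot"; tick: number; positions: number[] };

interface SimState { tick: number; positions: number[]; rules: string[] }

// Pure message handler: mutates the simulation state and optionally returns a reply.
function handleMessage(state: SimState, msg: ToWorker): FromWorker | null {
  switch (msg.kind) {
    case "loadTactics":
      state.rules = msg.rules.slice();
      return null;
    case "tick":
      state.tick += 1;
      state.positions = state.positions.map((p) => p + 1); // placeholder movement
      return { kind: "snapshot", tick: state.tick, positions: state.positions };
  }
}

// Inside the worker file (browser only), the wiring would look roughly like:
//   const state: SimState = { tick: 0, positions: [0, 0], rules: [] };
//   self.onmessage = (e: MessageEvent<ToWorker>) => {
//     const reply = handleMessage(state, e.data);
//     if (reply) self.postMessage(reply);
//   };
```

With this shape, the main thread only ever receives snapshots to draw, and the handler itself can be unit-tested without spinning up a worker.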

One challenge we encountered was that an engine based on high-level commands does not naturally align with traditional keyboard, mouse, or gamepad input (all of which we support). To address this, we built a translation layer that converts human input into tactical commands. Ironically, this layer is less efficient than Gemini at producing those commands, but it is necessary for accessibility and testing.
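A translation layer of this kind can be pictured as a function from raw device state to a single high-level command. The sketch below is a simplified illustration: the `TacticalCommand` variants and the key bindings are assumptions, and only the `KeyW`/`KeyA`/`KeyS`/`KeyD` strings correspond to real `KeyboardEvent.code` values.

```typescript
// Hypothetical high-level commands the engine might accept.
type TacticalCommand =
  | { type: "moveToward"; dx: number; dy: number }
  | { type: "engageNearest" }
  | { type: "holdPosition" };

// Snapshot of raw input for one frame.
interface RawInput { keysDown: string[]; primaryButton: boolean }

// Map low-level input to exactly one tactical command per frame.
function translateInput(input: RawInput): TacticalCommand {
  if (input.primaryButton) return { type: "engageNearest" };
  const has = (code: string) => input.keysDown.indexOf(code) !== -1;
  const dx = (has("KeyD") ? 1 : 0) - (has("KeyA") ? 1 : 0);
  const dy = (has("KeyS") ? 1 : 0) - (has("KeyW") ? 1 : 0);
  if (dx === 0 && dy === 0) return { type: "holdPosition" };
  return { type: "moveToward", dx, dy };
}
```

Because the output uses the same command vocabulary as the rest of the engine, directly controlled play and tactic-driven play can share one execution path.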

🧪 Challenges and What We Learned

The most difficult parts of the project were the game engine fundamentals: navigation, pathfinding, collision handling, and ensuring deterministic behavior at scale.

These challenges pushed us outside our comfort zone and made us question our approach more than once. However, working through them was essential. Without a solid, predictable engine, none of the higher-level ideas (autonomous agents, persistent relationships, or AI-driven adaptation) would have been viable.

In the end, building and integrating these systems is what allowed us to create an experience that feels coherent, reactive, and worth engaging with over time.
