- Human-in-the-loop: Review agent proposals and approve tasks using a simulated Pull Request workflow.
- The unified dashboard seamlessly blends the 3D simulation with a powerful node-based Team Visualizer.
- Watch the AI agents walk, sit, and interact in real-time as they execute complex tasks.
- A living 3D workspace where Gemini-powered agents physically navigate to computers and collaborate.
- Agents generate highly detailed, structured markdown deliverables and professional multimodal assets.
Inspiration 💡
As AI shifts from simple chatbots to complex multi-agent ecosystems, it's becoming harder for non-technical users to visualize and trust what goes on "under the hood." We were inspired to bridge this gap. We wanted to take the abstract concept of Agentic AI and ground it in a familiar reality—a bustling, living 3D office. We asked ourselves: "What if you could literally see AI agents walking around, collaborating, and passing tasks to one another just like a real human team?"
What it does 🛠️
The Consensus is a no-code 3D playground that allows anyone to explore, design, and interact with Agentic AI systems.
Instead of writing code, users build custom agent hierarchies in a visual node editor, or start from one of six predefined industry templates such as a PR Agency or Film Studio. Each agent is embodied as a 3D character inside the simulation. When a user assigns a task, they can watch the agents physically navigate a NavMesh, find a computer, "think" (powered by Gemini), and collaborate. The platform generates rich multimodal assets, including highly detailed markdown deliverables, images, music, and video, all through an intuitive, human-in-the-loop "Pull Request" approval workflow.
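To make the "Pull Request" idea concrete, here is a minimal sketch of how such an approval loop could be modeled. The `Proposal` and `Review` types and the function names are hypothetical, chosen for illustration; the project's real data model is not shown in this write-up.

```typescript
// Hypothetical types: assumptions for illustration, not the project's actual code.
type ProposalStatus = "open" | "approved" | "changes_requested";

interface Proposal {
  id: number;
  agent: string;      // which agent produced the deliverable
  markdown: string;   // the proposed deliverable
  status: ProposalStatus;
  feedback: string[]; // reviewer comments, oldest first
}

type Review =
  | { kind: "approve" }
  | { kind: "request_changes"; comment: string };

function openProposal(id: number, agent: string, markdown: string): Proposal {
  return { id, agent, markdown, status: "open", feedback: [] };
}

// Apply a human review. Returns a new object so UI state stays immutable.
function review(p: Proposal, r: Review): Proposal {
  if (p.status === "approved") {
    throw new Error(`Proposal ${p.id} is already approved`);
  }
  if (r.kind === "approve") {
    return { ...p, status: "approved" };
  }
  return { ...p, status: "changes_requested", feedback: [...p.feedback, r.comment] };
}

// The agent revises its work, which reopens the proposal for another review.
function resubmit(p: Proposal, markdown: string): Proposal {
  if (p.status !== "changes_requested") {
    throw new Error(`Proposal ${p.id} has no requested changes`);
  }
  return { ...p, markdown, status: "open" };
}

// Usage: one round of feedback, then approval.
const draft = openProposal(1, "Copywriter", "# Press release v1");
const sentBack = review(draft, { kind: "request_changes", comment: "Add a quote" });
const revised = resubmit(sentBack, "# Press release v2");
const merged = review(revised, { kind: "approve" });
```

Keeping each transition as a pure function means the same logic could drive both the review panel and the agent's on-screen reaction, though that wiring is left out here.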
How we built it 🏗️
We built The Consensus using a high-performance, hybrid GPU/CPU architecture:
- The Brains: We deeply integrated Google Gemini 3.1 Pro (High) for complex reasoning and Gemini 3 Flash for rapid, lower-level agent tasks. Gemini handles the entire autonomous decision-making process, task delegation, and multimodal asset generation.
- The World: The 3D embodied simulation was built from the ground up using Three.js (WebGPU). We rigged custom assets and animations in Blender and implemented intelligent NPC pathfinding.
- The Interface: We used React and React Flow to build the node-based visualizer where users map out agent hierarchies.
- The Glue: We utilized Zustand to maintain a unified, reactive state that bridges the 2D React UI with the 3D Three.js world seamlessly.
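The bridging pattern described in "The Glue" can be sketched without any dependencies. The snippet below is not our Zustand code: `createMiniStore` is a hand-rolled stand-in that mimics the `getState` / `setState` / `subscribe` shape Zustand stores expose, and `AgentState` and `frame` are hypothetical names. The idea it illustrates is that the UI subscribes to changes, while the render loop only reads the latest snapshot each tick.

```typescript
// A tiny dependency-free store, used here as a stand-in for a Zustand store.
type Listener<T> = (state: T, prev: T) => void;

function createMiniStore<T extends object>(initial: T) {
  let state = initial;
  const listeners = new Set<Listener<T>>();
  return {
    getState: (): T => state,
    setState: (patch: Partial<T>): void => {
      const prev = state;
      state = { ...state, ...patch };
      listeners.forEach((l) => l(state, prev));
    },
    subscribe: (l: Listener<T>): (() => void) => {
      listeners.add(l);
      return () => { listeners.delete(l); };
    },
  };
}

// Hypothetical shared state for one agent.
interface AgentState {
  activity: "idle" | "walking" | "typing";
  target: { x: number; z: number } | null; // desk position in world units
}

const agentStore = createMiniStore<AgentState>({ activity: "idle", target: null });

// 2D side: a UI panel subscribes and updates only when notified.
const uiLog: string[] = [];
const unsubscribe = agentStore.subscribe((s) => uiLog.push(`status: ${s.activity}`));

// 3D side: the frame loop reads the latest snapshot each tick without subscribing,
// so ordinary per-frame movement never triggers a UI update.
function frame(position: { x: number; z: number }, step: number) {
  const { activity, target } = agentStore.getState();
  if (activity !== "walking" || target === null) return position;
  const dx = target.x - position.x;
  const dz = target.z - position.z;
  const dist = Math.hypot(dx, dz);
  if (dist <= step) {
    agentStore.setState({ activity: "typing" }); // arrived: notify the UI once
    return { ...target };
  }
  return { x: position.x + (dx / dist) * step, z: position.z + (dz / dist) * step };
}

// Usage: assign a desk, then run a few simulated frames.
agentStore.setState({ activity: "walking", target: { x: 3, z: 4 } });
let pos = { x: 0, z: 0 };
for (let i = 0; i < 10; i++) pos = frame(pos, 1);
unsubscribe();
```

In this sketch the UI hears about exactly two changes (walking, then typing) even though the loop ran ten frames, which is the property that keeps frame rate and UI rendering from interfering with each other.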
Challenges we ran into 🧩
Synchronizing a 2D reactive UI with a continuous 3D game loop was a massive hurdle. Keeping an agent's "thoughts" (the Gemini API stream) in step with its physical 3D state machine (walking to a desk, sitting down, typing, and showing a speech bubble) required building a robust, decoupled state manager. Additionally, tuning the system prompts so the agents would output comprehensive, structured markdown instead of short answers took extensive prompt engineering and testing with the Gemini models.
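One way to think about this synchronization problem is as a small finite state machine that buffers streamed text until the character is physically ready to "type" it. The sketch below is a simplified illustration under that assumption; the phase names, event names, and the `step` reducer are hypothetical and do not mirror our actual implementation.

```typescript
type Phase = "idle" | "walking" | "sitting" | "typing" | "speaking";

type AgentEvent =
  | { type: "task_assigned" }
  | { type: "reached_desk" }
  | { type: "seated" }
  | { type: "chunk"; text: string } // one piece of streamed model output
  | { type: "stream_done" };

interface Agent {
  phase: Phase;
  buffered: string[];   // chunks that arrived before the agent could type
  typed: string;        // text already shown as typed
  streamDone: boolean;  // true once the model has finished responding
  bubble: string | null;
}

const newAgent = (): Agent => ({
  phase: "idle", buffered: [], typed: "", streamDone: false, bubble: null,
});

// Pure reducer: events from the network and from the 3D world both pass through here.
function step(a: Agent, e: AgentEvent): Agent {
  switch (e.type) {
    case "task_assigned":
      return a.phase === "idle" ? { ...a, phase: "walking" } : a;
    case "reached_desk":
      return a.phase === "walking" ? { ...a, phase: "sitting" } : a;
    case "seated": {
      if (a.phase !== "sitting") return a;
      // Flush anything that streamed in while the agent was still moving.
      const typed = a.typed + a.buffered.join("");
      return a.streamDone
        ? { ...a, phase: "speaking", typed, buffered: [], bubble: typed }
        : { ...a, phase: "typing", typed, buffered: [] };
    }
    case "chunk":
      return a.phase === "typing"
        ? { ...a, typed: a.typed + e.text }
        : { ...a, buffered: [...a.buffered, e.text] };
    case "stream_done":
      return a.phase === "typing"
        ? { ...a, phase: "speaking", streamDone: true, bubble: a.typed }
        : { ...a, streamDone: true };
  }
}

// Usage: the stream starts before the agent has even reached the desk.
const events: AgentEvent[] = [
  { type: "task_assigned" },
  { type: "chunk", text: "Hello, " },
  { type: "reached_desk" },
  { type: "chunk", text: "world" },
  { type: "seated" },
  { type: "chunk", text: "!" },
  { type: "stream_done" },
];
const finalAgent = events.reduce(step, newAgent());
```

Because the reducer ignores events that do not apply to the current phase, the animation side and the network side can fire in any order without the character typing before it sits down.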
Accomplishments that we're proud of 🏆
We are incredibly proud of achieving a true "Sims-like" AI experience running entirely in the browser. Successfully routing complex tasks between different instances of Gemini models—while visually representing that data flow as physical interactions between cute 3D mascots—is a technical and design milestone for us. We're also proud of the "BYOK" (Bring Your Own Key) architecture, which keeps the platform accessible and open.
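As a rough illustration of routing between model tiers, the sketch below walks a task tree and decides which tier would handle each node. The `"pro"` and `"flash"` labels are stand-ins for the two Gemini models named earlier, not API model identifiers, and the routing rule itself (leads and tasks with subtasks go to the larger model) is an assumption made for this example.

```typescript
type Tier = "pro" | "flash"; // stand-ins for the two Gemini models, not real model IDs

interface Task {
  description: string;
  assigneeRole: "lead" | "worker"; // hypothetical roles in the agent hierarchy
  subtasks: Task[];
}

interface PlannedCall {
  depth: number;
  description: string;
  tier: Tier;
}

// Assumed rule: anything that needs planning or delegation goes to the larger model.
function pickTier(task: Task): Tier {
  return task.assigneeRole === "lead" || task.subtasks.length > 0 ? "pro" : "flash";
}

// Walk the task tree depth-first and list each model call in order.
function plan(task: Task, depth: number = 0): PlannedCall[] {
  const calls: PlannedCall[] = [
    { depth, description: task.description, tier: pickTier(task) },
  ];
  for (const sub of task.subtasks) {
    calls.push(...plan(sub, depth + 1));
  }
  return calls;
}

// Usage: a lead delegates two leaf tasks to workers.
const campaign: Task = {
  description: "Plan product launch campaign",
  assigneeRole: "lead",
  subtasks: [
    { description: "Draft press release", assigneeRole: "worker", subtasks: [] },
    { description: "Write social posts", assigneeRole: "worker", subtasks: [] },
  ],
};
const calls = plan(campaign);
```

The resulting list doubles as a description of the data flow to visualize: each entry corresponds to one hand-off between characters in the office.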
What we learned 🧠
We learned the incredible power of multimodal LLMs. By leveraging Gemini, we didn't just get text; we got an orchestration engine capable of acting out roles, understanding context, and generating diverse media types. We also learned how to handle complex state management across the React UI and the Three.js render loop without sacrificing frame rates.
What's next for The Consensus 🚀
We want to push the "World Building" aspect further. Next steps include:
- Office Editor: Allowing users to drag-and-drop furniture and customize the 3D workspace.
- Inter-Agent Memory: Giving agents long-term, persistent memory so they can remember past projects and learn over time.
- Direct Spatial Interaction: Allowing users to jump into the 3D world in first-person to physically hand tasks to the AI agents!
Built With
- blender
- google-gemini
- react
- react-flow
- tailwind-css
- three.js
- typescript
- vite
- webgpu
- zustand
