Inspiration

The AI landscape is moving past simple "chat" interfaces toward autonomous orchestration. We noticed that while LLMs can write code, they often struggle with the multi-step reasoning that production-grade software requires: architecture, edge-case auditing, documentation, and performance hardening. Working AGI grew out of the vision of a "Creative Autopilot": a system in which a human provides a spark of intent (a brief or a visual "DNA" reference) and a swarm of specialized agents handles the heavy lifting of synthesis, critique, and self-correction.

What it does

Working AGI is an autonomous research and generation agent designed to create high-fidelity game substrates and interactive simulations from scratch.

- Visual DNA Extraction: Users can upload images to sync "Visual DNA," which the system analyzes to extract aesthetic patterns and gameplay mechanics.
- Autonomous Swarm: It triggers a multi-stage workflow:
  - Research Agent: Uses Google Search Grounding to find industry benchmarks.
  - Architect Agent: Uses Gemini 3 Pro with a 32k thinking budget to plan modular ECS structures.
  - Generator Agent: Synthesizes the full HTML5/JS/CSS substrate.
  - Critic & Refactor Agents: A brutal "senior auditor" loop that identifies logic flaws and self-corrects them.
- Nexus Arena: A real-time testing environment with an integrated Neural Repair Core for on-the-fly bug fixing.
- Neural Memory Bank: Every critique and score is stored in a semantic memory hub, allowing the AGI to learn from past "mistakes" to improve future generations.
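The multi-stage workflow above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: the `Agents` interface, the `Critique` shape, `runSwarm`, and the `maxRepairs` limit are hypothetical names chosen for this example, not the project's actual code. Each agent is modeled as an async function so that real model calls could be plugged in later.

```typescript
// Hypothetical sketch of the Research -> Architect -> Generator -> Critic/Refactor flow.
// None of these names come from the project's source.
type Critique = { approved: boolean; score: number; issues: string[] };

interface Agents {
  research(brief: string): Promise<string>;
  architect(brief: string, benchmarks: string): Promise<string>;
  generate(plan: string): Promise<string>;
  critique(code: string): Promise<Critique>;
  refactor(code: string, issues: string[]): Promise<string>;
}

interface RunResult {
  code: string;
  critiques: Critique[];
  approved: boolean;
}

async function runSwarm(
  agents: Agents,
  brief: string,
  maxRepairs: number = 3 // assumed cap so the loop always terminates
): Promise<RunResult> {
  const benchmarks = await agents.research(brief);
  const plan = await agents.architect(brief, benchmarks);
  let code = await agents.generate(plan);
  const critiques: Critique[] = [];

  // Critic/Refactor loop: audit, and repair only while attempts remain.
  for (let attempt = 0; attempt <= maxRepairs; attempt++) {
    const critique = await agents.critique(code);
    critiques.push(critique);
    if (critique.approved) {
      return { code, critiques, approved: true };
    }
    if (attempt < maxRepairs) {
      code = await agents.refactor(code, critique.issues);
    }
  }
  // Repairs exhausted: return the last build together with its audit trail.
  return { code, critiques, approved: false };
}
```

Returning every critique, not just the last one, keeps the audit trail available for whatever stores scores and lessons afterward.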

How we built it

The application is a showcase of high-performance frontend engineering and the latest Gemini API capabilities:

- Core Reasoning: Powered by gemini-3-pro-preview for complex architecture and gemini-3-flash-preview for high-speed research and documentation.
- Thinking Budget: We used thinkingConfig (up to 32,768 tokens) to let the Architect and Critic agents "deliberate" before outputting code.
- Grounding: We integrated Google Search Grounding so that the Research Agent provides data-backed benchmarks.
- UI/UX: A "Glassmorphism" interface built with React and Tailwind CSS. The background features a custom WebGL shader (FloatingLines) that represents the neural filaments of the AGI.
- Multimodality: Uses Gemini's vision capabilities to analyze uploaded reference assets.
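As a rough illustration of the model, thinking-budget, and grounding choices listed above, the following sketch builds per-agent request settings as plain objects. It deliberately imports no SDK. The field names (`thinkingConfig.thinkingBudget`, `tools: [{ googleSearch: {} }]`) reflect our understanding of the Gemini API request shape and should be checked against the current documentation. `AgentRole`, `buildAgentRequest`, and the choice of the Pro model for the Critic are assumptions made for this example.

```typescript
// Hypothetical helper: maps an agent role to request settings.
// Field names should be verified against the current Gemini API reference.
type AgentRole = "research" | "architect" | "critic" | "docs";

interface AgentRequest {
  model: string;
  config: {
    systemInstruction: string;
    thinkingConfig?: { thinkingBudget: number };
    tools?: Array<{ googleSearch: object }>;
  };
}

const MAX_THINKING_BUDGET = 32768; // upper bound quoted in the write-up

function buildAgentRequest(role: AgentRole, systemInstruction: string): AgentRequest {
  // Assumption: only the Architect and Critic get the large thinking budget.
  const deliberate = role === "architect" || role === "critic";
  return {
    model: deliberate ? "gemini-3-pro-preview" : "gemini-3-flash-preview",
    config: {
      systemInstruction,
      ...(deliberate
        ? { thinkingConfig: { thinkingBudget: MAX_THINKING_BUDGET } }
        : {}),
      // Assumption: only the Research Agent uses Google Search grounding.
      ...(role === "research" ? { tools: [{ googleSearch: {} }] } : {}),
    },
  };
}
```

Keeping these settings in one function makes it easy to see which agents pay the latency cost of a large thinking budget.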

Challenges we ran into

- State Convergence: Ensuring that the Generator Agent actually followed the Architect Agent’s plan required strict system instructions and iterative prompting.
- Latency vs. Fidelity: High-fidelity code generation takes time. We addressed this by implementing a Reasoning Log, which streams real-time "Monologues" and "Decisions" from the agents to the user, turning wait time into an engaging view of the AGI’s thought process.
- The "Shaking" UI: Initially, we had an aggressive "glitch" effect during repairs, but user testing (feedback: "don't shake the whole page!") led us to refactor it into a more sophisticated "Neural Reconstruction" overlay.
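One way to picture the Reasoning Log is as a small append-only event store that UI components subscribe to. The sketch below is an assumption about how such a log could be structured. `ReasoningLog`, `LogEntry`, and the method names are invented for illustration and do not describe the project's real implementation.

```typescript
// Hypothetical append-only log with subscribers, e.g. for a React panel.
type LogKind = "monologue" | "decision";

interface LogEntry {
  agent: string;
  kind: LogKind;
  text: string;
  at: number; // epoch milliseconds
}

type LogListener = (entry: LogEntry) => void;

class ReasoningLog {
  private entries: LogEntry[] = [];
  private listeners: LogListener[] = [];

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: LogListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Stores the entry, then notifies the listeners registered at call time.
  push(agent: string, kind: LogKind, text: string): void {
    const entry: LogEntry = { agent, kind, text, at: Date.now() };
    this.entries.push(entry);
    for (const listener of this.listeners.slice()) {
      listener(entry);
    }
  }

  // Returns a copy so callers cannot mutate the stored history.
  snapshot(): LogEntry[] {
    return this.entries.slice();
  }
}
```

An agent wrapper could call `push` before and after each model call, so the user sees progress even while a long generation is still running.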

Accomplishments that we're proud of

- The Self-Correction Loop: Seeing the Critic Agent successfully reject a build for "performance risks" and the Refactor Agent subsequently fix those issues without human intervention.
- Neural Memory Integration: Creating a system where "Past Experience" (stored in LocalStorage and semantic logs) is injected back into the Architect Agent’s context, effectively allowing the agent to "evolve" with every run.
- The Aesthetic: Achieving a "Commercial Production" feel that makes the AGI feel like a high-end creative tool rather than a toy.
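The memory integration described above can be illustrated with a minimal sketch that persists critiques and prepends past lessons to the Architect's prompt. Everything here is hypothetical: `MemoryRecord`, `MEMORY_KEY`, `remember`, and `buildArchitectContext` are names made up for this example. The sketch keeps only the most recent records and does no semantic retrieval, so it is a simplification of the "semantic memory hub" the project describes. The `KeyValueStore` interface covers the two Web Storage methods used, so `window.localStorage` could be passed in a browser.

```typescript
// Hypothetical memory bank: stores critiques, replays them as "lessons".
interface MemoryRecord {
  brief: string;
  score: number;
  issues: string[];
}

// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class InMemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    const value = this.data[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

const MEMORY_KEY = "memory-bank"; // assumed storage key

function loadMemories(store: KeyValueStore): MemoryRecord[] {
  const raw = store.getItem(MEMORY_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as MemoryRecord[]) : [];
  } catch {
    return []; // corrupted storage is treated as an empty memory
  }
}

// Appends a record and keeps only the most recent `limit` entries.
function remember(store: KeyValueStore, record: MemoryRecord, limit: number = 20): void {
  const next = loadMemories(store).concat([record]).slice(-limit);
  store.setItem(MEMORY_KEY, JSON.stringify(next));
}

// Builds the Architect prompt: the brief plus de-duplicated past issues.
function buildArchitectContext(store: KeyValueStore, brief: string): string {
  const lessons: string[] = [];
  for (const memory of loadMemories(store)) {
    for (const issue of memory.issues) {
      if (lessons.indexOf(issue) === -1) lessons.push(issue);
    }
  }
  if (lessons.length === 0) return brief;
  const bullets = lessons.map((l) => "- " + l).join("\n");
  return brief + "\n\nLessons from past runs:\n" + bullets;
}
```

Capping the number of stored records is one simple way to stay within LocalStorage size limits and keep the injected context short.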

What we learned

- Thinking is Key: In our runs, higher thinking budgets noticeably reduced "hallucinations" in complex physics and state-management code.
- Multimodal Anchoring: Providing an image as a "DNA Reference" produced far more aesthetically aligned outputs than text prompts alone.
- Specialized Agent Roles: Breaking "Write a game" down into Research -> Architect -> Generator -> Critic led to much cleaner, more modular, and more maintainable code than a single-shot prompt.

What's next for Working AI – Creative Autopilot

- Multi-File Orchestration: Moving beyond single-file HTML substrates to full-scale project directories.
- Human-in-the-Loop Swarm: Allowing users to jump into the reasoning process at specific steps (e.g., approving an architecture before code is written).
- Veo Video Integration: Using the Veo model to generate promotional trailers for the substrates created in the Arena.
- Collaborative Memory: A cloud-synced Memory Bank where the AGI can learn from the collective successes and failures of all users.
