Inspiration

The inspiration for Nexus was born from a frustration with the "fragmented intelligence" of the modern web. We have incredible tools for coding, stunning models for image generation, and powerful engines for reasoning—but they all live in isolated tabs.

I asked myself: What would a computer look like if the OS itself was the AI?

I didn't want another chatbot. I wanted a Spatial Computation Environment. I wanted a "Glass Box" system where I could see the AI thinking, where an image generated in one window could be dragged into a code editor to build a website, and where a strategic report could be instantly converted into a podcast. Nexus is the realization of that dream—a unified "Swarm" of specialized agents working in concert.

What it does

Nexus AI Studio is a comprehensive Multi-Agent Operating System powered by the full spectrum of Google's Gemini models. It breaks the AI experience into specialized "Views":

  • The Fabricator (Code): A self-healing IDE that builds functional React apps from prompts or sketches.
  • The Council (Reasoning): A multi-agent debate room where distinct personas (Legal, Security, Finance) argue to reach a consensus.
  • Motion Lab (Video): A cinematic interface for the Veo model, allowing directors to generate and splice video assets.
  • Symphony (Audio): A neural audio workstation that turns text into multi-speaker podcasts using Gemini 2.5 TTS.
  • Overwatch (Vision): A real-time screen intelligence agent that watches your workflow and provides live debugging and critiques.

How we built it

Nexus is built on a custom Orbital Window Manager engine using React 18 and TypeScript, styled with a brutalist, high-fidelity cyberpunk aesthetic via Tailwind CSS.

The core intelligence is powered exclusively by the Google Gemini API, utilizing a dynamic router to select the perfect model for the task:

  • Gemini 2.0 Flash for high-speed UI interaction and tool use.
  • Gemini 3.0 Pro for complex reasoning (The "Reasoning Engine").
  • Veo for cinematic video generation.
  • Gemini 2.5 Flash Image for visual synthesis.

The "Swarm" Protocol Instead of a single LLM context, Nexus utilizes a Multi-Agent Swarm Architecture. When a user queries the "Council" module, the system instantiates multiple specialized personas. The consensus mechanism utilizes a weighted synthesis.

The "Vault" (Semantic Memory) We implemented a client-side Vector Database using IndexedDB. Every asset generated (image, code, or text) is embedded into a high-dimensional vector space. Retrieval isn't done by filename, but by semantic cosine similarity.

Challenges we ran into

  1. The "Stale Closure" Trap in Real-Time Audio: Building the Live Uplink (real-time voice mode) was the hardest technical hurdle. Bridging the Web Audio API with React state caused massive sync issues. We solved this by decoupling the audio analysis loop from the React render cycle using useRef.
  2. Handling XML Thinking Streams: Gemini 3.0's "Thinking" capability outputs raw XML tags mixed with the response. We wrote a custom state-machine parser to strip these tags in real-time, diverting the "thoughts" to a separate Cognitive Log UI component.
  3. The Context Window Juggernaut: Designing the Juggernaut view (2M Token processor) required implementing a client-side ZIP decompression stream to allow users to drop entire codebases into memory without crashing the UI thread.

Accomplishments that we're proud of

  • The "Holo-Vault": A functional 3D force-directed graph visualization of the user's stored assets, built with HTML5 Canvas math (no heavy 3D libraries).
  • Zero-Latency Switching: The "Orbital Window" system keeps agent contexts alive in the background, allowing the user to switch between coding, art, and strategy instantly without losing state.
  • Aesthetic Unity: We didn't just build a tool; we built a vibe. The UI sound design, the animations, and the typography all work together to make the user feel like the commander of a starship.

What we learned

We learned that Interface is Intelligence.

The same model performs 10x better when wrapped in a "Persona". When we gave the "Code Fabricator" a specific UI that looked like an IDE, the model wrote better code. When we gave the "Director" module a film-slate interface, the model generated better video prompts.

We learned that we are moving away from "Prompt Engineering" and toward "Context Engineering".

What's next for Nexus AI Studio

  • Local LLM Fallback: Integrating WebLLM to allow the "Sentinel" agent to run entirely offline for privacy.
  • Electron Desktop App: Moving from the browser to a native app to gain deeper OS-level control for the "Overwatch" agent.
  • Team Sync: Allowing multiple users to enter the same "War Room" and collaborate with the Swarm in real-time.

Built With

  • framer-motion
  • gemini-2.5-flash
  • gemini-3.0-pro
  • google-gemini-api
  • react
  • tailwindcss
  • typescript
  • veo
  • web-audio-api
Share this project:

Updates