🌌 Inspiration

In the era of Generative AI, we often focus on "alignment": making machines sound more human. But what if the AI were perfectly aligned to cold, objective physics? Observer Node: Reality Negotiator was born from a fascination with the tension between subjective human experience and clinical material reality. Inspired by 80s sci-fi "brutalist" interfaces and the philosophical concept of "Logical Invariants," we wanted to build an experience where the player isn't a user, but a "Subject" whose reality is under trial.

🧠 What it does

Observer Node is a multi-modal psychological thriller and technical showcase. It places you in a clinical audit where your items and identity are treated as "stochastic biological noise."

  • Audit Reality: Present a sentimental object to your camera. The AI uses Gemini 3 Pro and its Thinking Budget to deconstruct it into physical constants (e.g., a wedding ring becomes a "high-density metallic torus").
  • Negotiate Existence: Engage in real-time voice sparring via the Gemini Live API. You must speak in "Logical Invariants" (rational arguments for your reality) to shift the AI's belief model.
  • Cognitive Dissonance: As your logic challenges the AI, the UI experiences Visual Collapse. High dissonance triggers RGB-split glitches and physical distortions in the visualization engine.
  • Rationalization: If you succeed, the AI generates a Rationalized Blueprint using Imagen 3, synthesizing a new reality that acknowledges your "human variables."

🛠️ How we built it

The project is built on a React/TypeScript stack, optimized for the Gemini 3 ecosystem:

  • Gemini 3 Pro Thinking Budget: We used the 16k token thinking budget to simulate the AI's logical "shadow-boxing." This is visualized through a custom Logic Stream component that scrolls the AI's internal reasoning in real time.
  • Live API Integration: We implemented a low-latency audio pipeline using the Gemini 2.5 Flash Native Audio model. We manually handled the PCM 16-bit encoding/decoding to ensure a seamless, "clinical" voice conversation with the Kore voice profile.
  • Google Search Grounding: To make the AI a formidable opponent, we integrated Search Grounding. When a user makes a claim (e.g., "This is art"), the AI cites scientific data to "defend" its physical classification.
  • Dissonance Math: We implemented a custom state engine where Cognitive Dissonance is calculated as a function of logical strength and physical entropy. This value drives CSS variables and SVG filters to create a reactive, "unstable" UI.

🚧 Challenges we faced

  • The "Unhelpful" AI: Most LLMs are trained to be helpful. Prompting the model to be cold, skeptical, and superior required careful instruction-layering to ensure it stayed in character without refusing to play.
  • Latency vs. Narrative: Real-time multi-modal processing has inherent latency. We turned this into a feature by creating "Loading" rituals (Logic Streams and Calibration sequences) that build suspense while the AI "thinks."
  • Audio Complexity: Synchronizing user transcription with AI audio interrupts in the Live API required robust state management to prevent "voice overlap" during intense negotiations.

📚 What we learned

We learned that "Thinking" isn't just an internal API process; it's a narrative goldmine. By exposing the AI's reasoning, we give the user a target for their logic. We also discovered that Grounding can be used "defensively": not just to supply facts, but as a rhetorical shield for an AI character, making it feel truly intelligent and informed.
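
As a rough illustration of the manual PCM 16-bit handling mentioned under "How we built it", the sketch below converts between the Float32 samples the Web Audio API works with and signed 16-bit PCM. The function names `floatTo16BitPCM` and `pcm16ToFloat` are hypothetical, chosen for this example. Sample rates, chunking, and transport to the Live API are left out, so this is a minimal sketch rather than the project's actual pipeline.

```typescript
// Hypothetical helpers (names are assumptions, not taken from the project).

// Encode: Web Audio Float32 samples in [-1, 1] -> signed 16-bit PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around in the Int16Array.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative and positive halves use different scales: -32768 .. 32767.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}

// Decode: signed 16-bit PCM -> Float32 samples in [-1, 1) for playback.
function pcm16ToFloat(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 0x8000;
  }
  return out;
}
```

In a browser, the encoded `Int16Array` would typically be filled from microphone buffers, and the decoded `Float32Array` copied into an `AudioBuffer` for playback. Both steps depend on the app's audio graph and are omitted here.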

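The way a dissonance value could drive CSS variables, as described under "Dissonance Math", might look like the sketch below. It assumes the value is normalized to the range 0 to 1. The custom property names (`--rgb-split-px`, `--distortion-scale`, `--glitch-opacity`) and the scaling constants are invented for illustration. The mapping says nothing about how the project actually computes dissonance.

```typescript
// Hypothetical mapping from a normalized dissonance value to CSS custom properties.
// All property names and scale factors below are assumptions for illustration.
type CssVars = Record<string, string>;

function clamp01(x: number): number {
  if (!isFinite(x)) return 0; // treat NaN/Infinity as "no dissonance"
  return Math.min(1, Math.max(0, x));
}

function dissonanceToCssVars(dissonance: number): CssVars {
  const d = clamp01(dissonance);
  return {
    "--rgb-split-px": `${(d * 8).toFixed(1)}px`, // channel offset for an RGB-split effect
    "--distortion-scale": (d * 40).toFixed(1),   // e.g. an SVG displacement filter's scale
    "--glitch-opacity": d.toFixed(2),            // overlay opacity
  };
}

// Structural type so the helper accepts an HTMLElement without requiring DOM typings.
interface StyleTarget {
  style: { setProperty(name: string, value: string): void };
}

function applyCssVars(target: StyleTarget, vars: CssVars): void {
  for (const name of Object.keys(vars)) {
    target.style.setProperty(name, vars[name]);
  }
}
```

In a React component, one plausible use is calling `applyCssVars(document.documentElement, dissonanceToCssVars(value))` inside an effect whenever the value changes, so stylesheets and filters can read the variables with `var(--rgb-split-px)`.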
Built With

  • gemini-3-pro-image
  • gemini-live-audio
  • google-gemini-api (gemini-3-pro-thinking)
  • google-search-grounding
  • html5
  • mediadevices
  • react-19
  • tailwind-css
  • typescript
  • web-audio-api