Inspiration

The project is inspired by Anil Seth's work on controlled hallucination and Karl Friston's active inference framework. These theories explain perception not as passive input processing but as active, top-down prediction error minimization. Current multimodal AI demos rarely show this loop visually and computationally — they stop at captioning or generation. This project bridges that gap by externalizing the inference cycle in a way that feels like watching a brain think.

What it does

Active Inference Engine simulates a closed-loop perceptual system using Gemini 3.

  • User uploads an image or short video
  • Bottom-up stream extracts structured features (luminance, edges, objects, motion)
  • Top-down prior generates expected feature vector and predicted next frame
  • Computes a real prediction error (cosine distance between predicted and observed feature vectors) and a surprise score
  • Updates a persistent belief state via an exponential moving average (EMA)
  • Visualizes a surprise meter, region activation, and belief-vector evolution
  • Outputs a predicted motor readiness action with confidence (e.g., "Approach subject", "Maintain gaze")
  • Delivers a first-person report of the full cycle

The result is a dynamic, visual demonstration of predictive processing, prediction error, belief updating, and action preparation — not just image analysis.
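
A minimal sketch of one cycle of the loop, assuming numpy-style feature vectors (the α value, the 0.2 action threshold, and all names here are illustrative, not the exact prompt-chain internals):

```python
import numpy as np

ALPHA = 0.3  # EMA learning rate (illustrative value)

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Prediction error as 1 - cosine similarity between two feature vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def run_cycle(belief: np.ndarray, observed: np.ndarray) -> tuple[np.ndarray, float]:
    """One perception-action cycle: predict, compare, update."""
    predicted = belief                                 # top-down prior acts as the prediction
    surprise = cosine_distance(predicted, observed)    # prediction error / surprise score
    belief = (1 - ALPHA) * belief + ALPHA * observed   # EMA belief update
    return belief, surprise

# Toy example with a 4-dim feature vector (luminance, edges, objects, motion)
belief = np.array([0.5, 0.5, 0.5, 0.5])
observed = np.array([0.9, 0.4, 0.6, 0.1])  # bottom-up features from Gemini vision
belief, surprise = run_cycle(belief, observed)
action = "Approach subject" if surprise > 0.2 else "Maintain gaze"  # motor readiness
print(f"surprise={surprise:.3f}, action={action}")
```

In the app itself, this math runs inside Gemini's code-execution tool rather than in local Python, but the computation is the same.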

How we built it

Built entirely in Google AI Studio using Gemini 3 Pro.

  • Vibe Code to generate the frontend layout, dark neuroscience theme, and interactive brain diagram
  • Gemini multimodal vision for bottom-up feature extraction
  • High-reasoning mode (network_intelligence) for top-down prediction and motor mapping
  • Code execution tool to compute the cosine distance and EMA belief update, and to render matplotlib surprise plots
  • Image generation & editing for predicted next frame and error heatmaps
  • Structured output for clean feature vectors, surprise score, and conscious report (an illustrative schema follows this list)
  • No external frameworks, cloud services, or databases — 100% Gemini-native
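
For a sense of what the structured-output step returns, here is an illustrative schema (the field names are assumptions for this sketch; the real schema lives in the prompt chain):

```python
from typing import TypedDict

class CycleReport(TypedDict):
    """Illustrative shape of one cycle's structured output (hypothetical field names)."""
    feature_vector: list[float]    # bottom-up features (luminance, edges, objects, motion)
    predicted_vector: list[float]  # top-down expectation
    surprise: float                # cosine-distance prediction error, in [0, 1]
    belief_state: list[float]      # EMA-updated belief vector
    motor_action: str              # e.g., "Approach subject"
    confidence: float              # confidence in the motor action
    conscious_report: str          # first-person narration of the cycle
```

Asking for this shape up front is what keeps the outputs numeric instead of narrated.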

Challenges we ran into

  • AI Studio's Vibe Code struggles with complex animated flows and persistent state — forcing region activation, arrow flows, and surprise computation required multiple strict refinement prompts.
  • Gemini sometimes defaults to descriptive captioning instead of numerical feature vectors — fixed with explicit code-execution instructions (paraphrased in the sketch after this list).
  • Keeping the layout clean (no grids or collages) while adding computational depth was difficult — solved by ruthlessly removing visual clutter across iterations.
  • Motor prediction felt generic at first — improved by tying it directly to the belief-state vector.
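
To give a flavor of the captioning fix, here is a paraphrase of the kind of instruction that pushed Gemini from prose toward numeric vectors (the wording is illustrative, not the exact prompt):

```python
# Paraphrased, illustrative system instruction; the real prompt-chain wording differs.
FEATURE_EXTRACTION_INSTRUCTION = """
Do not describe the image in prose. Use the code execution tool and return
ONLY a JSON object containing a numeric feature_vector of length 4
(luminance, edges, objects, motion), with each value normalized to [0, 1].
"""
```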

Accomplishments that we're proud of

  • Built a working active inference loop inside pure AI Studio (no external code) — surprise score, belief update, and motor output are computed, not faked.
  • Visualized the top-down/bottom-up distinction with flowing signals, region activation, and surprise pulses — creating a genuine "watching a brain think" moment.
  • Heavy, non-trivial Gemini usage: vision + reasoning + code execution + image gen/editing in a single coherent cycle.
  • Delivered neuroscience authenticity without overclaiming consciousness — focused strictly on predictive processing mechanics.

What we learned

  • Gemini 3 Pro is extremely powerful for computational neuroscience prototypes when forced to output vectors and execute math — not just text.
  • Vibe Code can build interactive dashboards, but it needs very strict, iterative prompting to avoid falling back on generic vision patterns.
  • Active inference concepts (precision, surprise, belief updating) become far more intuitive when externalized visually and numerically.
  • Judges value demos that compute something measurable (surprise score, belief shift) over pure aesthetics.

What's next for Active Inference Engine

  • Add user-controlled precision weighting (attention) to modulate surprise influence (one possible mechanism is sketched after this list)
  • Extend to real-time webcam input for continuous loop
  • Integrate voice output for the conscious report (Gemini audio_spark)
  • Open-source the prompt chain to let others build domain-specific inference engines (e.g., emotion, interoception)
  • Explore scaling to multi-modal long-context sessions for emergent self-modeling
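
As a sketch of the first item, user-set precision could scale how strongly a surprising observation moves the belief state (assumed mechanics, not a committed design):

```python
import numpy as np

def precision_weighted_update(belief: np.ndarray, observed: np.ndarray,
                              surprise: float, precision: float,
                              alpha: float = 0.3) -> np.ndarray:
    """EMA belief update with the learning rate scaled by precision and surprise.

    precision in [0, 1] acts like attention: high precision lets surprising
    input move the belief further; low precision damps its influence.
    """
    effective_alpha = alpha * precision * surprise  # precision-weighted step size
    return (1 - effective_alpha) * belief + effective_alpha * observed
```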

Built With

  • activeinference
  • aistudiohackathon
  • gemini3
  • geminivision
  • generativeai
  • googleaistudio
  • imagegeneration
  • multimodalai
  • neurosciencesimulation
  • predictiveprocessing
  • vibecode
  • vision

Updates


Perceptual Inference Model update – v5.3: evolved from captioning + overlays to a real computational loop. It now does:

  • Bottom-up feature extraction (vector)
  • Top-down prediction vector
  • Cosine surprise score (an actual number)
  • Persistent belief state update
  • Motor readiness output (action + confidence)

The brain diagram on the right shows live signal flow, region activation, and surprise pulses.
