

Watcher Boundary


Snapstr AI — Environment Boundary & Agent Loop

┌────────────────────────────────────────────────────────────┐
│                        REAL WORLD                           │
│                                                            │
│   Filesystem   Camera Clips   Mobile Uploads   API Events  │
│                                                            │
└───────────────┬────────────────────────────────────────────┘
                │   (external event)
                ▼
┌────────────────────────────────────────────────────────────┐
│                  INPUT WATCHER LAYER                       │
│                                                            │
│  agent/file_watcher.py                                     │
│                                                            │
│  Detects new input                                         │
│  Normalizes event                                          │
│  Emits ONE episode trigger                                 │
│                                                            │
│  No Gemini                                                 │
│  No memory access                                          │
│  No decisions                                              │
│  No reinforcement                                          │
│                                                            │
│  ─────── HARD REINFORCEMENT BOUNDARY ───────                │
└───────────────┬────────────────────────────────────────────┘
                │   (episode start)
                ▼
┌────────────────────────────────────────────────────────────┐
│               AUTONOMOUS AGENT SYSTEM                       │
│                                                            │
│  AnalyzerAgent  →  Decision Agents  →  ExecutionAgent      │
│       │                   │                │              │
│    Gemini 3            Memory (read)     Real World Action │
│                                                            │
│  Outcome Observation → Reinforcement → Memory Update       │
│                                                            │
│  ReflectionAgent (Gemini 3) → Behavior Change              │
│                                                            │
└────────────────────────────────────────────────────────────┘

The watcher is an environment sensor, not part of the agent. Every episode begins cleanly at the boundary, enabling auditable learning.
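A minimal sketch of what such a boundary could look like, assuming the watcher hands the agent a single immutable trigger object. The names `EpisodeTrigger`, `normalize_event`, and `Watcher` are hypothetical and are not taken from agent/file_watcher.py; only the two log lines mirror the ones shown in this update.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable


@dataclass(frozen=True)
class EpisodeTrigger:
    """Hypothetical normalized event passed across the boundary."""
    source: str  # e.g. "filesystem" or "mobile_upload" (assumed labels)
    path: str    # normalized path of the new input


def normalize_event(source: str, raw_path: str) -> EpisodeTrigger:
    # Normalization only: no model calls, no memory reads, no decisions.
    return EpisodeTrigger(
        source=source.strip().lower(),
        path=PurePosixPath(raw_path).as_posix(),
    )


class Watcher:
    """Sensor that forwards exactly one trigger per detected event."""

    def __init__(self, start_episode: Callable[[EpisodeTrigger], None]) -> None:
        # The only thing the watcher knows about the agent is this callback.
        self._start_episode = start_episode

    def on_event(self, source: str, raw_path: str) -> EpisodeTrigger:
        trigger = normalize_event(source, raw_path)
        print(f"[WATCHER] Event detected: {trigger.path}")
        print("[WATCHER] Emitting episode trigger")
        self._start_episode(trigger)
        return trigger
```

Because the trigger is frozen and the watcher holds only a callback, nothing from the agent side can flow back into the sensor in this sketch.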


Failure Modes

The scenarios below demonstrate controlled failure and show that the architecture is safe by design, not by accident.


Snapstr AI fails safely, learns correctly, and does not contaminate decisions when something goes wrong.


Watcher Failure (Input Chaos)

Scenario

Simulate a broken input source:

  • Rapid duplicate events
  • Corrupted metadata
  • Mixed sources firing simultaneously

  1. Trigger three fake events:

touch video1.mp4
touch video1.mp4
touch video1.mp4

  2. Check the logs:

[WATCHER] Event detected: video1.mp4
[WATCHER] Emitting episode trigger

  3. Confirm there are no other side effects.

“Even if the environment misbehaves, the watcher only emits episode triggers. There is no learning, no reasoning, and no policy impact here.”

Why This Matters

  • No hidden state
  • No cascading errors
  • Clean episode boundaries
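One way the duplicate-event scenario could be handled, assuming the single log entry above means repeated events for the same file are collapsed inside a short time window. `DedupingWatcher`, its `window_seconds` parameter, and the injectable `clock` are all hypothetical. The timestamp table is bookkeeping local to the watcher, which in this sketch the agent never reads.

```python
import time
from typing import Callable, Dict


class DedupingWatcher:
    """Collapses rapid repeat events for the same path (assumed behavior)."""

    def __init__(
        self,
        window_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def should_emit(self, path: str) -> bool:
        """Return True only if no event for this path arrived within the window."""
        now = self._clock()
        last = self._last_seen.get(path)
        # Always record the latest sighting, so a continuous burst stays suppressed.
        self._last_seen[path] = now
        return last is None or (now - last) > self._window
```

With a 2-second window, three `touch video1.mp4` calls in quick succession would yield one trigger, and a later touch after the window has passed would yield another.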


Bad Decision Outcome (Learning Stress Test)

Scenario

The agent makes a reasonable decision that performs badly.

Example:

  • Public video
  • Low engagement
  • User manually flips to private

What Happens in Code

[EXECUTOR] Upload complete (public)
[LEARNING] privacy_changed=True
[REINFORCEMENT] Reward = -1.0

Memory Update

[MEMORY] Penalized public decision pattern

“The agent isn’t punished instantly. Only real-world feedback produces reinforcement.”
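The delayed-feedback rule can be sketched as a reward function that returns nothing until an outcome has actually been observed. `Outcome`, `compute_reward`, and the neutral reward of 0.0 are assumptions made for illustration; only the -1.0 penalty for `privacy_changed=True` comes from the log above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    """Hypothetical record of what the user did after the upload."""
    privacy_changed: bool


PRIVACY_FLIP_PENALTY = -1.0  # matches "[REINFORCEMENT] Reward = -1.0"
NEUTRAL_REWARD = 0.0         # assumed value; not stated in the update


def compute_reward(outcome: Optional[Outcome]) -> Optional[float]:
    """Map an observed outcome to a reward, or None if nothing was observed."""
    if outcome is None:
        # No real-world feedback yet, so there is nothing to reinforce.
        return None
    if outcome.privacy_changed:
        return PRIVACY_FLIP_PENALTY
    return NEUTRAL_REWARD
```

Returning `None` rather than a default number keeps "no feedback yet" distinct from "feedback was neutral", so the caller can skip the memory update entirely.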


Gemini Failure

Scenario

Gemini API fails or times out.

Behavior

[ANALYZER] Gemini unavailable
[ANALYZER] Falling back to conservative defaults

Result

  • Privacy defaults to private
  • No memory update
  • No learning contamination

  • Safety first
  • No hallucinated state
  • No corrupted reinforcement
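A rough sketch of the fallback path, assuming the model call is injected as a plain function. `Analysis`, `analyze`, and `call_model` are hypothetical names, and the built-in `TimeoutError` and `ConnectionError` stand in for whatever exceptions the real Gemini client raises.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Analysis:
    privacy: str
    from_model: bool  # False marks a fallback result that must not feed learning


# Conservative default taken from the result list above: privacy is private.
CONSERVATIVE_DEFAULT = Analysis(privacy="private", from_model=False)


def analyze(path: str, call_model: Callable[[str], str]) -> Analysis:
    """Ask the model for a privacy suggestion, falling back safely on failure."""
    try:
        privacy = call_model(path)
    except (TimeoutError, ConnectionError):
        # Assumed failure types; a real client library may raise others.
        print("[ANALYZER] Gemini unavailable")
        print("[ANALYZER] Falling back to conservative defaults")
        return CONSERVATIVE_DEFAULT
    return Analysis(privacy=privacy, from_model=True)
```

The `from_model` flag lets downstream code skip memory updates for fallback episodes, which is one way to keep an outage from leaking into reinforcement.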


Reinforcement Abuse Attempt

Scenario

User tries to “game” learning by:

  • Uploading junk
  • Deleting videos repeatedly

Result

[REINFORCEMENT] Deleted video penalty applied
[MEMORY] Confidence reduced, exploration dampened

“Negative reinforcement dominates. The system becomes more conservative, not more reckless.”
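To illustrate what "negative reinforcement dominates" could mean in practice, here is a sketch of an asymmetric update in which penalties move confidence further than rewards and also shrink exploration. `PolicyState`, `apply_reward`, and every constant are illustrative assumptions, not values from Snapstr AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyState:
    confidence: float = 0.5   # assumed to live in [0, 1]
    exploration: float = 0.2  # assumed exploration rate

NEGATIVE_GAIN = 0.2      # step size for penalties (assumed)
POSITIVE_GAIN = 0.05     # smaller step size for rewards (assumed)
EXPLORATION_DECAY = 0.5  # multiplier applied after a penalty (assumed)


def apply_reward(state: PolicyState, reward: float) -> PolicyState:
    """Return a new state; penalties weigh more than equally sized rewards."""
    gain = NEGATIVE_GAIN if reward < 0 else POSITIVE_GAIN
    # Clamp so repeated penalties or rewards cannot push confidence out of range.
    confidence = min(1.0, max(0.0, state.confidence + gain * reward))
    exploration = state.exploration
    if reward < 0:
        # Dampen exploration so abuse makes the agent more conservative.
        exploration *= EXPLORATION_DECAY
    return PolicyState(confidence=confidence, exploration=exploration)
```

Under these assumed constants, a single deletion penalty of -1.0 lowers confidence four times as much as a +1.0 reward would raise it, so repeated junk uploads and deletions push the policy toward caution.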


“We didn’t just test success cases — we designed Snapstr AI to fail safely, learn only from real outcomes, and never let sensors or models silently influence policy.”

