Cathy l posted an update — Jan 10, 2026 12:37 PM EST

Watcher Boundary

Snapstr AI — Environment Boundary & Agent Loop

┌────────────────────────────────────────────────────────────┐
│                        REAL WORLD                           │
│                                                            │
│   Filesystem   Camera Clips   Mobile Uploads   API Events  │
│                                                            │
└───────────────┬────────────────────────────────────────────┘
                │   (external event)
                ▼
┌────────────────────────────────────────────────────────────┐
│                  INPUT WATCHER LAYER                       │
│                                                            │
│  agent/file_watcher.py                                     │
│                                                            │
│  Detects new input                                       │
│  Normalizes event                                        │
│  Emits ONE episode trigger                               │
│                                                            │
│  No Gemini                                              │
│  No memory access                                       │
│  No decisions                                           │
│  No reinforcement                                       │
│                                                            │
│  ─────── HARD REINFORCEMENT BOUNDARY ───────                │
└───────────────┬────────────────────────────────────────────┘
                │   (episode start)
                ▼
┌────────────────────────────────────────────────────────────┐
│               AUTONOMOUS AGENT SYSTEM                       │
│                                                            │
│  AnalyzerAgent  →  Decision Agents  →  ExecutionAgent      │
│       │                   │                │              │
│    Gemini 3            Memory (read)     Real World Action │
│                                                            │
│  Outcome Observation → Reinforcement → Memory Update       │
│                                                            │
│  ReflectionAgent (Gemini 3) → Behavior Change              │
│                                                            │
└────────────────────────────────────────────────────────────┘

The watcher is an environment sensor, not part of the agent. Every episode begins cleanly at the boundary, enabling auditable learning.

Failure-Mode

architecture is safe, intentional, and not accidental.

showing controlled failure.

Snapstr AI fails safely, learns correctly, and does not contaminate decisions when something goes wrong.

Watcher Failure (Input Chaos)

Scenario

Simulate a broken input source:

Rapid duplicate events
Corrupted metadata
Mixed sources firing simultaneously

Trigger 3 fake events:

touch video1.mp4
touch video1.mp4
touch video1.mp4

logs:

[WATCHER] Event detected: video1.mp4
[WATCHER] Emitting episode trigger

no other side effects.

“Even if the environment misbehaves, the watcher only emits episode triggers. There is no learning, no reasoning, and no policy impact here.”

Why This Matters

No hidden state No cascading errors Clean episode boundaries

Bad Decision Outcome (Learning Stress Test)

Scenario

The agent makes a reasonable decision that performs badly.

Example:

Public video
Low engagement
User manually flips to private

What Happens in Code

[EXECUTOR] Upload complete (public)
[LEARNING] privacy_changed=True
[REINFORCEMENT] Reward = -1.0

Memory Update

[MEMORY] Penalized public decision pattern

“The agent isn’t punished instantly. Only real-world feedback produces reinforcement.”

Gemini Failure

Scenario

Gemini API fails or times out.

Behavior

[ANALYZER] Gemini unavailable
[ANALYZER] Falling back to conservative defaults

Result

Privacy defaults to private
No memory update
No learning contamination

Safety first No hallucinated state No corrupted reinforcement

Reinforcement Abuse Attempt

Scenario

User tries to “game” learning by:

Uploading junk
Deleting videos repeatedly

Result

[REINFORCEMENT] Deleted video penalty applied
[MEMORY] Confidence reduced, exploration dampened

“Negative reinforcement dominates. The system becomes more conservative, not more reckless.”

“We didn’t just test success cases — we designed Snapstr AI to fail safely, learn only from real outcomes, and never let sensors or models silently influence policy.”

Log in or sign up for Devpost to join the conversation.