Watchdog

Inspiration

AI is scaling fast, and we're not talking about next decade. Fleets of thousands of autonomous agents are running in parallel, each making decisions, generating outputs, and influencing one another. The problem few are solving yet: hallucination contagion. A fully connected network of 1,000 agents has roughly half a million pairwise links, or about a million directed connections. One agent hallucinates. It spreads to two. Those two spread to four. Before any human can intervene, the entire network has poisoned itself, and no team, no matter how large or how vigilant, can monitor that many connections 24/7.

We built Watchdog because the solution can't be more humans. It has to be smarter architecture.

Our thinking was also validated by recent research — the LatentMAS paper (Zou et al., 2025) demonstrated that moving agent collaboration from token space into latent space cuts token usage by 50–80% and speeds up wall-clock time 3–7×. We independently arrived at the same core insight and built anomaly detection and trust scoring on top of it.

What It Does

Watchdog is a real-time hallucination detection and consensus system for large-scale multi-agent AI networks. Instead of agents talking directly to each other — O(n²) connections, contagion risk — every agent publishes its outputs into a shared latent space. Watchdog compresses those outputs into 64-dimensional vectors, runs them through a self-evolving geometric engine, and instantly flags any output that deviates from the learned consensus before it can propagate.

1. Hallucination runaway prevention. Every agent output is embedded and inserted into the latent space as an anchor. The space is engineered so that anomalous vectors stand out geometrically and get flagged in real time. One person can monitor a fleet of thousands.

2. O(n) efficiency. Traditional agent meshes require every agent to talk to every other agent: O(n²) connections. Watchdog collapses this to O(n), because every agent connects only to the shared latent space. At 1,000 agents, that's the difference between roughly a million directed connections and a thousand.

3. Richer, shared context. The latent space is a living semantic memory. Every agent's outputs contribute to a global ground-truth vector that continuously improves. Agents retrieve context from this shared pool, so the whole system gets smarter over time.
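
To make the scaling argument concrete, here is a small sketch of the arithmetic behind the O(n²) versus O(n) comparison. The function names are illustrative and not part of the Watchdog codebase.

```python
def mesh_links(n: int) -> int:
    """Undirected links in a fully connected mesh of n agents: n(n-1)/2."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links when every agent connects only to one shared latent space."""
    return n


for n in (10, 100, 1_000):
    pairs = mesh_links(n)
    print(f"{n:>5} agents: mesh pairs={pairs:,}, directed={2 * pairs:,}, hub={hub_links(n):,}")
```

At 1,000 agents the mesh has 499,500 pairs (999,000 directed connections), while the shared-space layout needs 1,000.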

How We Built It

Watchdog is built around a custom self-evolving latent vector engine with several interlocking mechanisms.

Anchors. Every agent output is compressed by a trained pair-encoder (MiniLM-L6-v2 → 64D) and stored as an anchor carrying a vector, text payload, agent ID, timestamp, impact score, and weight.
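
As a rough sketch of the anchor record described above, assuming a plain dataclass (the field names and defaults are our guesses for illustration, not the actual Watchdog schema):

```python
from dataclasses import dataclass, field
import time

DIM = 64  # latent dimensionality stated in the write-up


@dataclass
class Anchor:
    vector: list[float]        # 64-D encoder output
    text: str                  # original text payload
    agent_id: str              # which agent produced it
    timestamp: float = field(default_factory=time.time)
    impact: float = 1.0        # assumed default, hypothetical
    weight: float = 1.0        # assumed default, hypothetical

    def __post_init__(self) -> None:
        # Reject vectors that did not come out of the 64-D encoder.
        if len(self.vector) != DIM:
            raise ValueError(f"expected {DIM}-D vector, got {len(self.vector)}")


a = Anchor(vector=[0.0] * DIM, text="example output", agent_id="agent-7")
```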

Ground truth vector. A global consensus vector is continuously recomputed as a normalized, impact-weighted aggregation of all anchors. This GT acts as a gravitational attractor for the entire space.
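
One way to read "normalized, impact-weighted aggregation" is a weighted sum followed by L2 normalization. The sketch below assumes exactly that; the real engine may weight or normalize differently.

```python
import math


def normalize(v):
    """Scale v to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def ground_truth(vectors, impacts):
    """Impact-weighted sum of anchor vectors, normalized to unit length."""
    total = [0.0] * len(vectors[0])
    for vec, impact in zip(vectors, impacts):
        for i, x in enumerate(vec):
            total[i] += impact * x
    return normalize(total)


# Two toy 2-D anchors: the higher-impact one dominates the consensus direction.
gt = ground_truth([[1.0, 0.0], [0.0, 1.0]], impacts=[3.0, 1.0])
```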

Gravitational dynamics. Every anchor is iteratively pulled toward GT at a small learning rate η. High-impact anchors pull harder; anomalous anchors are penalized and drift away.
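
A minimal sketch of one pull step, under our own assumptions: the step size scales with the anchor's impact, and a flagged anchor takes a reduced step in the opposite direction. The `penalty` parameter and the exact update rule are hypothetical, not taken from the Watchdog source.

```python
def pull_step(vec, gt, eta=0.05, impact=1.0, anomalous=False, penalty=0.5):
    """Move vec one small step relative to the ground-truth vector gt.

    Clean anchors move toward gt, scaled by impact.
    Flagged anchors move away from gt, scaled by penalty (assumed behaviour).
    """
    scale = -penalty if anomalous else impact
    return [x + eta * scale * (g - x) for x, g in zip(vec, gt)]


gt = [1.0, 0.0]
clean = pull_step([0.0, 0.0], gt, eta=0.1)                   # drifts toward gt
flagged = pull_step([0.0, 0.0], gt, eta=0.1, anomalous=True)  # drifts away
```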

Directional deformation. The key geometric innovation. A consensus direction n̂ is computed as the normalized difference between GT and the base vector. Clean anchors are elongated along n̂ at insertion (stretch factor s₀ > 1). Anomalous anchors are compressed along n̂ (s₀ < 1), pushing them geometrically further from the consensus cluster. Both effects decay back to neutral over time — deformation is a transient amplifier, not a permanent warp.
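
The stretch itself can be sketched as scaling only the component of a vector that lies along n̂ and leaving the orthogonal part alone. This is one standard way to stretch along an axis and is our assumption about the mechanism; `base` stands in for whatever reference vector the engine uses.

```python
import math


def unit(v):
    """Return v scaled to unit length (unchanged copy if v is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def consensus_direction(gt, base):
    """n_hat: normalized difference between ground truth and the base vector."""
    return unit([g - b for g, b in zip(gt, base)])


def deform(vec, n_hat, s):
    """Scale the component of vec along n_hat by s; s > 1 elongates, s < 1 compresses."""
    proj = sum(x * n for x, n in zip(vec, n_hat))
    return [x + (s - 1.0) * proj * n for x, n in zip(vec, n_hat)]


n_hat = consensus_direction(gt=[2.0, 0.0], base=[1.0, 0.0])
stretched = deform([2.0, 3.0], n_hat, s=1.5)   # clean anchor
squeezed = deform([2.0, 3.0], n_hat, s=0.5)    # anomalous anchor
```

With s = 1 the function returns the vector unchanged, which is the neutral state the deformation relaxes back to.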

Anomaly detection. Each new vector is scored: 60% neighbor distance + 40% GT divergence. Vectors above threshold are flagged, downweighted, and compressed. Agents with sustained high anomaly rates are identified as bad actors and removed.
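
The 60/40 blend can be sketched as follows. We assume cosine distance for both terms and a mean over the k nearest neighbors; the distance metric and the value of `k` are our assumptions, and only the 0.6 and 0.4 weights come from the description above.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; treats a zero vector as maximally uncertain (1.0)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def anomaly_score(vec, neighbors, gt, k=3):
    """0.6 * mean distance to the k nearest neighbors + 0.4 * distance to GT."""
    nearest = sorted(cosine_distance(vec, n) for n in neighbors)[:k]
    neighbor_term = sum(nearest) / len(nearest) if nearest else 0.0
    return 0.6 * neighbor_term + 0.4 * cosine_distance(vec, gt)


gt = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05]]
clean_score = anomaly_score([1.0, 0.02], neighbors, gt)
odd_score = anomaly_score([-1.0, 0.2], neighbors, gt)
```

A vector pointing away from both its neighbors and GT scores far higher than one inside the cluster; where to put the flagging threshold is a tuning decision.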

Multi-agent orchestration. The AgentNetwork layer manages a pool of agents, each with a role-derived query vector. Agents retrieve top-K anchors weighted by cosine similarity × decay × impact, generate outputs conditioned on that context, and feed results back into the space.
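
Retrieval by cosine similarity × decay × impact might look like the sketch below. The half-life form of the decay, the dict layout of an anchor, and the `retrieve` name are assumptions made for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, anchors, now, k=2, half_life=3600.0):
    """Return the k anchors with the highest similarity * decay * impact."""
    def score(anchor):
        age = now - anchor["timestamp"]
        decay = 0.5 ** (age / half_life)  # assumed exponential half-life decay
        return cosine_similarity(query, anchor["vector"]) * decay * anchor["impact"]
    return heapq.nlargest(k, anchors, key=score)


now = 10_000.0
anchors = [
    {"id": "fresh", "vector": [1.0, 0.0], "timestamp": now, "impact": 1.0},
    {"id": "stale", "vector": [1.0, 0.0], "timestamp": now - 3600.0, "impact": 1.0},
    {"id": "off-topic", "vector": [0.0, 1.0], "timestamp": now, "impact": 5.0},
]
top = [a["id"] for a in retrieve([1.0, 0.0], anchors, now)]
```

Because the three factors multiply, a high impact score cannot rescue an anchor that is orthogonal to the agent's query vector.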

Auth0

Every operation in Watchdog is gated by Auth0 JWT-verified scopes: read:memory for retrieval, write:memory for insertion, update:latent for modifying the space geometry. Identity flows through every layer — not bolted on at the edge, but threaded through every read, write, and update. This is what makes per-agent accountability possible at scale. You always know exactly which agent touched what, when, and with what permissions. Without identity at the infrastructure level, trust scoring is meaningless.
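
A sketch of how scope gating might look once a token has been verified. It assumes the JWT signature, issuer, and audience were already checked upstream and that the decoded claims carry the standard space-delimited OAuth 2.0 `scope` string. The operation names and the `authorize` helper are hypothetical.

```python
# Mapping from operation to the scope named in the write-up.
REQUIRED_SCOPE = {
    "retrieve": "read:memory",
    "insert": "write:memory",
    "update_geometry": "update:latent",
}


def granted_scopes(claims):
    """Split the space-delimited scope claim into a set."""
    return set(claims.get("scope", "").split())


def authorize(claims, operation):
    """Return the caller's identity if the token grants the needed scope."""
    needed = REQUIRED_SCOPE[operation]
    subject = claims.get("sub", "unknown")
    if needed not in granted_scopes(claims):
        raise PermissionError(f"{subject} lacks scope {needed!r} for {operation}")
    return subject


claims = {"sub": "agent-7", "scope": "read:memory write:memory"}
who = authorize(claims, "insert")  # allowed; returns the agent identity for logging
```

Returning the subject from the check is one way to keep the agent identity attached to every read, write, and update for later trust scoring.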

Kiro

We used Kiro as the coordination layer across the entire stack: Python backend, FastAPI demo, and React + Three.js visualization. The key wasn't raw code generation; it was Kiro's spec-driven workflow keeping our architecture coherent under time pressure. We maintained a written spec for the visualization semantics (3D PCA hull of 64D vectors, 2D K-means anomaly scatter, agent topology with Bézier edges) and implemented against it file by file. Steering docs kept the agent from drifting by defaulting to minimal diffs, stable API contracts, and hackathon scope rather than spinning up unrelated frameworks. For long files with complex lifecycles, like the Three.js scene (dispose paths, orbit camera persistence, bloom composer), Kiro's multi-step guided edits were the difference between clean extraction and spaghetti. The result is a codebase where you can trace intent to UI: the visualization isn't random polish, it's specified behavior.

Macroscope AI

Macroscope AI gave us the observability layer. Rather than reading logs to understand what the latent space was doing, we could see it — clusters forming, the ground truth vector shifting, anomalous anchors getting pushed to the geometric fringe in real time. For a system whose core thesis is that structure in embedding space encodes trust, being able to visualize that structure wasn't just a demo nicety — it was how we validated that the geometry was actually working the way we designed it to.

Challenges We Ran Into

Cold-start instability. With fewer than ~10 anchors, the GT is too noisy — the anomaly detector fires on everything. We solved this with a warm-up gate: anomaly checking is skipped until the space has enough anchors to form a stable consensus.
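
The warm-up gate can be sketched in a few lines. The threshold value, the list-of-dicts space, and the `insert_with_gate` name are illustrative assumptions; only the "skip checks until about 10 anchors exist" behaviour comes from the description above.

```python
WARMUP_MIN_ANCHORS = 10  # matches the "~10 anchors" figure above


def insert_with_gate(space, vector, score_fn, threshold=0.5,
                     min_anchors=WARMUP_MIN_ANCHORS):
    """Insert a vector; only run anomaly scoring once the space has warmed up."""
    warmed_up = len(space) >= min_anchors
    flagged = warmed_up and score_fn(vector) > threshold
    space.append({"vector": vector, "flagged": flagged})
    return flagged


def always_suspicious(vector):
    """Stand-in scorer that would flag everything if the gate were absent."""
    return 1.0


space = []
results = [insert_with_gate(space, [0.0], always_suspicious) for _ in range(12)]
```

Even with a scorer that flags everything, the first ten inserts pass unflagged because the consensus is not yet trusted.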

Deformation without permanent warping. If stretch compounds indefinitely, the space collapses or explodes. The solution was to decay stretch independently of weight, and at a faster rate, so geometric deformation is always transient while semantic weight persists.
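
A sketch of the two decay schedules, assuming simple exponential decay for both. The rate constants are placeholders chosen only to show stretch relaxing to 1 long before weight fades.

```python
import math


def decayed_stretch(s0, age, rate=0.5):
    """Stretch relaxes from s0 toward the neutral value 1.0."""
    return 1.0 + (s0 - 1.0) * math.exp(-rate * age)


def decayed_weight(w0, age, rate=0.01):
    """Semantic weight fades toward 0, but much more slowly (smaller rate)."""
    return w0 * math.exp(-rate * age)


for age in (0, 5, 20):
    print(age, round(decayed_stretch(1.5, age), 4), round(decayed_weight(1.0, age), 4))
```

Because the stretch always converges to 1 rather than compounding, repeated insertions cannot warp the space permanently.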

Generator stubs masking the real system. Placeholder generate functions were just concatenating prompt strings — the latent space was working, but the outputs were garbage. Writing real domain-aware generators (honest, subtle adversarial, fully adversarial) was essential to validating the pipeline.

IQR outlier removal on small pools. Statistical outlier detection via IQR requires at least three agents with data. Tuning the IQR factor, minimum trust threshold, and cycle count before removal was critical to getting the pipeline to converge correctly.
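
A minimal sketch of IQR-based removal on per-agent trust scores, using the standard library. The 1.5 factor, the trust values, and the choice of the "inclusive" quartile method are assumptions. On a pool this small the quartile method matters: with the default "exclusive" method, the outlier itself inflates the IQR enough that nothing is flagged.

```python
import statistics


def iqr_outliers(trust, factor=1.5, min_agents=3):
    """Return agents whose trust falls below Q1 - factor * IQR."""
    if len(trust) < min_agents:
        return set()  # too few agents for a meaningful spread
    q1, _, q3 = statistics.quantiles(trust.values(), n=4, method="inclusive")
    low = q1 - factor * (q3 - q1)
    return {agent for agent, score in trust.items() if score < low}


trust = {"a": 0.90, "b": 0.92, "c": 0.88, "d": 0.91, "e": 0.10}
bad_actors = iqr_outliers(trust)
```

In practice this check would be combined with the minimum trust threshold and a cycle count before an agent is actually removed.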

Accomplishments We're Proud Of

A latent space that geometrically reshapes itself to make anomalies easier to detect over time — the deformation axis is a novel mechanism, not borrowed from any existing retrieval system.
Reducing multi-agent communication from O(n²) to O(n) with no loss of shared context — in fact richer context, because the space aggregates semantic signal from every agent.
A single human operator genuinely able to monitor and control a multi-agent network in real time, with bad actors flagged and removed automatically.
End-to-end integration: trained encoder checkpoints, a ResponseLatentNet fusion model, Auth0 identity, Kiro-coordinated full-stack implementation, and Macroscope AI visualization — running as one coherent system.

What We Learned

Geometric structure in embedding space is underutilized. Most systems treat vector stores as dumb lookup tables: cosine similarity and done. Watchdog showed us that you can engineer the geometry of a latent space to encode trust, amplify consensus, and suppress noise passively, without any explicit classifier. The space itself becomes the immune system.

We also learned that identity has to be infrastructure, not an afterthought. Auth0 scopes flowing through every operation aren't just security; they're what makes per-agent accountability real. And reading the LatentMAS research confirmed we were solving a real problem: their results show latent-space collaboration is faster and leaner than token-space communication. Watchdog adds the safety layer that makes it trustworthy.

What's Next for Watchdog

Scale testing — 50–100 agents to validate O(n) efficiency and GT stability under high insert volume
Live LLM integration — replacing local generators with real model calls so Watchdog operates on actual inference outputs
Visualization dashboard — real-time 2D/3D projection of the latent space so operators see the space, not logs
Continuous trust scoring UI — live agent leaderboard with anomaly history drill-down
Federated latent spaces — multiple Watchdog instances sharing GT signals across organizational boundaries

Built With

auth0
kiro
macroscopeai
python

Updates

Bingyuan Wen started this project — Mar 27, 2026 07:29 PM EDT
