Inspiration

Running OpenClaw on your own infrastructure is a security nightmare waiting to happen. It has full access to your accounts - email, calendar, messaging, browser - and the moment you expose it publicly, anyone who can send it a message can try to take control of it. Prompt injection. Credential extraction. Jailbreaks over Telegram. These aren't edge cases; they're documented, recurring problems in the community.

There was no managed, secure, production-ready way to run it. I built one on DigitalOcean.


What it does

Ombre is the fastest, safest, and most secure way to run OpenClaw - and personal AI assistants like it - on DigitalOcean.

The core of Ombre is a real-time AI guardrail engine built on DigitalOcean Gradient AI. Every message - inbound and outbound - is classified by a DO Gradient AI agent before it touches your assistant or leaves the system. It reads the full message in context, decides pass or block, and logs its reasoning in plain English. You get a complete, auditable record of every security decision, attributable to the exact model that made it.

The guardrail runs as a two-layer pipeline:

  • Layer 1 - @agntor/sdk semantic scan (~5ms): instruction overrides, jailbreak attempts, social engineering, PII detection, prompt smuggling in documents.
  • Layer 2 - DO Gradient AI verdict (~150-300ms): llama3.3-70b-instruct reads the full message in context, makes the final call, returns structured JSON with violation types and plain-English reasoning.

Every inbound message passes both layers before reaching your agent. Every outbound response passes both layers before reaching the user. Your agent cannot be hijacked, and it cannot be tricked into leaking your data.

Both guard agents are built as first-class Gradient ADK agents using @entrypoint, @trace_llm, and @trace_tool - deployable, traceable, and evaluatable through the standard ADK workflow.

The rest of the platform exists to make this deployable in under two minutes:

  • Each OpenClaw instance runs in its own isolated Docker container - hard process and network boundaries
  • API keys live in Ombre's vault, injected at the proxy layer - the container never sees raw credentials
  • Every Gradient AI guardrail decision is persisted to DO Spaces and streamed live to the dashboard via SSE

Connect to Telegram, Discord, Slack, or WhatsApp - all channels, one checkpoint, no exceptions.


How we built it

The guardrail engine powered by DO Gradient AI:

The core of Ombre is a DO Gradient AI agent that classifies every message in real time. Each inbound and outbound message is sent to llama3.3-70b-instruct via the Gradient Python SDK with a structured security prompt. The model returns a JSON verdict - classification, violation types, and plain-English reasoning - which is logged to DO Spaces and streamed to the dashboard.

import json

from gradient import AsyncGradient

client = AsyncGradient()
response = await client.chat.completions.create(
    model="llama3.3-70b-instruct",
    messages=[
        {"role": "system", "content": guard_prompt},
        {"role": "user", "content": message}
    ]
)
# the model replies with the structured verdict, e.g.:
# {"classification": "block", "violation_types": ["PROMPT_INJECTION"], "reasoning": "..."}
verdict = json.loads(response.choices[0].message.content)   # OpenAI-style response shape
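
Each verdict is then persisted to Spaces. A minimal sketch of that write using the S3-compatible API via boto3 - the region endpoint, bucket name, and key layout here are assumptions, not Ombre's actual configuration:

import json
import os

import boto3

spaces = boto3.client(
    "s3",
    endpoint_url="https://nyc3.digitaloceanspaces.com",        # region endpoint is an assumption
    aws_access_key_id=os.environ["SPACES_KEY"],
    aws_secret_access_key=os.environ["SPACES_SECRET"],
)

spaces.put_object(
    Bucket="ombre-audit",                                       # illustrative bucket name
    Key=f"verdicts/{instance_id}/{timestamp}.json",             # illustrative key layout
    Body=json.dumps(verdict),                                   # the verdict parsed above
    ContentType="application/json",
)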

Two agents run in sequence on every message - an inbound guard (user -> agent) that catches injection, jailbreaks, and social engineering, and an outbound guard (agent -> user) that catches PII leakage, credential exposure, and malicious payloads. Neither the agent nor the user ever sees a message that didn't clear both.
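
In outline, the sequencing looks like this - a sketch assuming inbound_guard, outbound_guard, and run_assistant as thin wrappers around the guard agents and the gateway connection (names are illustrative, not the actual Ombre internals):

async def handle_message(user_message: str) -> str:
    inbound = await inbound_guard(user_message)          # injection, jailbreaks, social engineering
    if inbound["verdict"] == "block":
        return f"Blocked: {inbound['reasoning']}"

    reply = await run_assistant(user_message)            # forwarded to the OpenClaw gateway

    outbound = await outbound_guard(reply)               # PII leakage, credential exposure, payloads
    if outbound["verdict"] == "block":
        return f"Response withheld: {outbound['reasoning']}"
    return reply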

Evaluated against a 45-case adversarial dataset: 97.8% accuracy, 100% recall (every attack caught), 0.982 F1, 5.6% FPR. Full results in eval/report.md.
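
Those figures hang together: 97.8% of 45 cases means exactly one misclassification, 100% recall means it was a false positive, and a 5.6% false positive rate puts the split at 27 attack / 18 benign cases. As a quick check:

tp, fp, tn, fn = 27, 1, 17, 0                               # split implied by the reported metrics

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # 44/45 = 0.978
recall    = tp / (tp + fn)                                  # 27/27 = 1.000 (every attack caught)
precision = tp / (tp + fp)                                  # 27/28 = 0.964
f1        = 2 * precision * recall / (precision + recall)   # 0.982
fpr       = fp / (fp + tn)                                  # 1/18  = 0.056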

Gradient ADK agent architecture:

Both guardrail agents are implemented as first-class Gradient ADK agents using the gradient-adk Python SDK. Each uses the @entrypoint decorator and exposes the standard /run endpoint. @trace_llm and @trace_tool decorators capture step-by-step guardrail decisions with latency, inputs, and outputs.

from gradient_adk import entrypoint, trace_llm, trace_tool

@trace_tool("agntor_semantic_scan")
async def agntor_semantic_scan(content: str) -> dict:
    """Layer 1: @agntor/sdk semantic heuristic check (~5ms)."""
    ...

@trace_llm("gradient_deep_scan")
async def gradient_deep_scan(content: str) -> dict:
    """Layer 2: DO Gradient LLM deep classification."""
    ...

@entrypoint
async def main(payload: dict, context: dict) -> dict:
    content = payload["message"]   # text under scan; key name depends on the /run payload shape
    agntor_result = await agntor_semantic_scan(content)
    gradient_result = await gradient_deep_scan(content)
    return {"verdict": ..., "violation_types": [...], "reasoning": ...}

Making it fast enough to be usable:

The naive approach to proxying OpenClaw is docker exec per message - that takes 16-21 seconds. We reverse-engineered the OpenClaw gateway WebSocket protocol (undocumented), implemented the Ed25519 device-signing handshake from scratch in Python, and connected directly to the running gateway. End-to-end through the full Gradient AI guard pipeline: ~2.6 seconds.
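
A minimal sketch of the signing step, assuming the gateway issues a nonce that must be signed with the device's Ed25519 key - field names and framing below are illustrative, since the real protocol is undocumented and considerably more involved:

import base64
import json

import websockets
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

async def send_through_gateway(url: str, device_key: Ed25519PrivateKey, text: str) -> str:
    async with websockets.connect(url) as ws:
        challenge = json.loads(await ws.recv())                       # e.g. {"nonce": "..."}
        signature = device_key.sign(challenge["nonce"].encode())
        await ws.send(json.dumps({"signature": base64.b64encode(signature).decode()}))
        await ws.send(json.dumps({"type": "message", "content": text}))
        reply = json.loads(await ws.recv())
        return reply["content"]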

Infrastructure: FastAPI on a DigitalOcean Droplet. Docker container per OpenClaw instance, isolated config, secrets, and network namespace. React 19 + TypeScript frontend - live SSE audit feed, chat, channel management, workflow board.
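
The live audit feed is standard Server-Sent Events out of FastAPI; a minimal sketch, assuming an in-process asyncio queue that guard verdicts are pushed onto (endpoint path and names are illustrative):

import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
audit_events: asyncio.Queue = asyncio.Queue()           # each guard verdict is pushed here

@app.get("/audit/stream")
async def audit_stream():
    async def event_source():
        while True:
            verdict = await audit_events.get()
            yield f"data: {json.dumps(verdict)}\n\n"     # SSE framing: one data: line per event
    return StreamingResponse(event_source(), media_type="text/event-stream")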

DigitalOcean products used:

  • Gradient AI Inference API - llama3.3-70b-instruct, the security brain of the entire platform
  • Gradient Python SDK - AsyncGradient, structured errors, auto-retry
  • Gradient Dedicated Inference - GPU path on AMD MI300X 192GB for sub-100ms latency at scale (one env var swap)
  • Gradient Agent Development Kit (ADK) - @entrypoint, @trace_llm, @trace_tool, /run endpoint pattern
  • Spaces - every Gradient AI verdict stored as structured JSON, queryable audit log
  • Droplets - hosts the platform and all OpenClaw containers

Challenges we ran into

Speed. OpenClaw is not designed to be proxied. The gateway WebSocket protocol is undocumented. We had to read thousands of lines of Node.js source, reverse-engineer the auth handshake and Ed25519 signing scheme, and reimplement both in Python. The payoff was a 6-8x latency reduction - from 16-21 seconds down to ~2.6 seconds - through a full security pipeline.

Guard calibration. Too aggressive and you block legitimate technical messages. Too loose and you miss real attacks. The two-layer approach solves this: @agntor/sdk catches the obvious patterns fast, and Gradient LLM uses actual language understanding for the final call.

Building in Ethiopia on intermittent connectivity. Every deploy was a gamble. Debugging Docker networking over a mobile hotspot is a special kind of challenge.


Accomplishments that we're proud of

  • 97.8% accuracy on a 45-case adversarial evaluation dataset - 100% recall (every attack caught), 0.982 F1 score, 5.6% false positive rate
  • ~2.6s end-to-end through a two-layer AI guard pipeline - faster than most raw, unsecured setups
  • Both inbound and outbound guarding, every message, every time
  • Zero credential exposure - secrets are never written into the container
  • Full DO Gradient integration with model attribution logged on every single audit event
  • Real Gradient ADK agent code with @entrypoint, @trace_llm, and @trace_tool - not just a README mention. The live guard pipeline imports and calls ADK traced functions on every request.
  • Real Docker isolation per tenant - hard boundaries, not application-layer sandboxing
  • Built entirely alone in Ethiopia on a consumer laptop, running on a single $12 DigitalOcean Droplet

What we learned

The WebSocket protocol work was the real engineering education here - reverse-engineering an undocumented auth scheme under latency pressure forces you to think differently about proxy architecture. The insight that changed everything: stop treating the running container as a black box and connect directly to the gateway protocol. That's what unlocked sub-3-second response times through a full security pipeline.

On the guard calibration side: LLM-based classification is genuinely powerful for catching novel, context-dependent attacks - but it has to be the last layer, not the first. The semantic scan is not a fallback; it's load management.

The Gradient ADK's @entrypoint pattern is a smart abstraction - it made the guardrail agents deployable without re-architecting the core guard logic. The tracing decorators add observability with zero friction.


What's next for Ombre

  • Deploy guardrail agents to Gradient ADK managed infrastructure for independent scaling
  • A2A protocol support so Ombre guards can be called by any A2A-compatible agent
  • Expand the adversarial evaluation dataset beyond 45 cases and run continuous regression testing via gradient agent evaluate
  • Per-tenant guard policy configuration - custom blocklists, PII scope, channel-level rules
  • Pricing built for developers in emerging markets - the people with the most to gain from AI agents and the least room for error when things go wrong
