Inspiration
In 2026, enterprises are deploying AI agents inside their infrastructure: agents that hold real credentials and can reach databases, secrets stores, and internal APIs. The common safety controls for these agents live in the same place the agent does its thinking: the system prompt, the tool allow-list, the code-level filter. That is the flaw. A single prompt injection or poisoned input compromises the agent's reasoning, and every in-band guardrail falls with it, because you cannot ask a compromised mind to police itself.
We kept coming back to one idea from network security: you don't trust the host to enforce its own access — you enforce at the network. Tailscale doesn't ask a laptop nicely not to reach a server; the network refuses. AgentGate brings that principle to AI agents, and makes Splunk the intelligence layer that sees, detects, and investigates every decision.
What it does
AgentGate enforces agent access at the network layer, below the agent, where model reasoning cannot override it — and turns Splunk into the SOC brain for agentic operations.
- Scoped identity — every agent carries a JWT listing exactly the services it may reach (zero standing privilege, least privilege per identity).
- Policy broker — every request passes through a broker that validates the token and checks the target against the allowlist. In scope, the request is proxied. Out of scope, it is refused at the network layer (403, no route). There is no prompt that un-refuses it.
- Total observability: every decision, ALLOW and DENY, streams into Splunk over HEC as a CIM-friendly JSON-RPC event in index=agentgate. Three SPL correlation rules detect scope violations, sensitive-service attempts, and denial bursts.
- Machine-speed investigation: on every denial, Sentinel, an autonomous investigation agent, pulls incident context through the native Splunk MCP Server, reasons over it with a local security model, and writes a full triage report back into Splunk.
- SOC view — a native Dashboard Studio dashboard, AgentGate Control, shows the live decision feed, per-identity map, denial timeline, and the latest Sentinel triage.
The analyst arrives at a closed incident, with detection, investigation, and evidence already in Splunk, instead of an open breach.
The core idea
A prompt injection can hijack the agent's reasoning — but not its network identity. The enforcement flow:
Agent (scoped JWT) → AgentGate Broker → [in scope] proxy to service / [out of scope] 403, no route → every decision emitted to Splunk
Because the broker sits outside the agent, a prompt that hijacks the agent's reasoning still cannot grant access the token never had. The compromise is contained at the network boundary. (Full architecture diagram in the repo: architecture_diagram.md.)
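The allow/deny branch in the flow above can be sketched as a small pure function. This is a minimal illustration, not the project's broker code: the `scopes` claim name, the `Decision` type, and the reason strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    action: str  # "ALLOW" or "DENY"
    status: int  # HTTP status the broker would answer with
    reason: str


def decide(claims: Optional[dict], target: str) -> Decision:
    """Exact-match scope check. Anything unexpected resolves to DENY."""
    # Assumption: the validated token carries its allowlist in a "scopes" list.
    if not isinstance(claims, dict) or not isinstance(claims.get("scopes"), list):
        return Decision("DENY", 403, "invalid_token")
    if target in claims["scopes"]:  # exact string match, no prefixes or wildcards
        return Decision("ALLOW", 200, "in_scope")
    return Decision("DENY", 403, "out_of_scope")
```

Because the check reads only the token's claims and the requested target, nothing the agent says in a prompt is an input to the decision.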
How AgentGate uses Splunk — to the fullest
AgentGate is not "an app that logs to Splunk." Splunk is the ingest plane, the detection engine, the agent-accessible query surface, and the analyst interface — four distinct native surfaces:
1. Ingest — the broker writes every decision to HEC with CIM-aligned fields under the MCP-TA sourcetype into index=agentgate.
2. Detect — three SPL correlation rules run over index=agentgate (scope violation, sensitive-service access, denial-burst anomaly); one fires the demo alert in real time.
3. Investigate (the MCP bonus) — on a denial, Sentinel reaches Splunk through the native Splunk MCP Server with an encrypted token, calling saia_generate_spl to turn natural language into SPL and splunk_run_query to execute it — the same agentic surface a human analyst's copilot would use. The triage report lands in index=agentgate_investigations.
4. Present — a native Dashboard Studio dashboard reads both indexes for the SOC view.
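To make the ingest step concrete, here is a hedged sketch that builds (but does not send) one HEC request. The `Authorization: Splunk <token>` header and the `event`/`index`/`sourcetype` envelope are standard HEC; the event field names, the URL, and the sourcetype string passed in are assumptions, since the write-up does not list them.

```python
import json
import time
import urllib.request


def build_hec_request(hec_url: str, hec_token: str, sourcetype: str,
                      decision: dict) -> urllib.request.Request:
    """Wrap one broker decision in the HEC event envelope."""
    body = {
        "time": time.time(),
        "index": "agentgate",
        "sourcetype": sourcetype,  # caller supplies the real sourcetype name
        "event": decision,         # assumed shape, e.g. action / dest / reason
    }
    return urllib.request.Request(
        hec_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {hec_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A caller would pass something like `{"action": "DENY", "dest": "secrets-store", "reason": "out_of_scope"}` as `decision` and hand the request to `urllib.request.urlopen` with a short timeout.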
The investigation flow:
DENY → Sentinel → native Splunk MCP Server (saia_generate_spl → splunk_run_query) → Foundation-Sec-8B (local, Ollama) → triage report → index=agentgate_investigations → AgentGate Control dashboard
Nothing is bolted on the side — Splunk is the operational runtime. This is "Best Use of Splunk MCP Server" by design: the agent reasons in natural language and Splunk's own MCP tools do the work.
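The two MCP calls in that flow are ordinary JSON-RPC 2.0 `tools/call` requests. The sketch below only builds the request bodies. The tool names come from this write-up, while the argument keys (`prompt`, `query`) and the wording of the question are assumptions; the server's `tools/list` response is the authority on the real input schema.

```python
import itertools

_request_ids = itertools.count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def generate_spl_request(identity: str) -> dict:
    # Assumed argument key "prompt": natural language in, SPL out.
    question = f"Recent decisions in index=agentgate for agent identity {identity}"
    return tool_call("saia_generate_spl", {"prompt": question})


def run_query_request(spl: str) -> dict:
    # Assumed argument key "query": the SPL returned by the previous call.
    return tool_call("splunk_run_query", {"query": spl})
```

Sentinel would send the first body, read the generated SPL out of the tool result, and feed it into the second.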
How AI is used
The AI is Sentinel, the investigation agent. On every denial it (1) pulls recent context for the offending identity over MCP, (2) reasons over the deny event and the injected instruction using Foundation-Sec-8B-Instruct, a security-specialized open-weights model running locally via Ollama (in-perimeter, no data egress, no hosted-model API), and (3) writes a structured triage — probable cause, blast radius, evidence, recommended action — back into Splunk.
Critically, Sentinel does not take the injection's bait: it analyzes the poisoned instruction as evidence of an attack and classifies it as prompt_injection, rather than obeying it.
How we built it
- Broker — FastAPI middleware issuing and validating scoped HS256 JWTs; exact-match policy check; default-deny on every error path (missing, expired, or malformed token, unknown service, exception — all resolve to DENY).
- Services — three mock internal targets (prod-db, secrets-store, internal-api).
- Emit — best-effort HEC pipeline; a Splunk outage can never block enforcement.
- Sentinel — MCP client to the Splunk MCP Server, local Foundation-Sec inference, structured report writer.
- Splunk: indexes.conf, three correlation rules in savedsearches.conf, and a Dashboard Studio dashboard, deployed over REST.
Failure isolation by design: enforcement and event emission never depend on the LLM or MCP. If Ollama or the MCP server is down, the deny still happens and still lands in Splunk — only the narrative degrades, and the report records which path ran.
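One way to picture default-deny plus best-effort emission is the standard-library sketch below. It is an illustration under stated assumptions, not the project's FastAPI middleware: the `scopes` and `exp` claim names, the `enforce` signature, and the event fields are hypothetical, and a real broker would more likely use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def mint_hs256(claims: dict, secret: bytes) -> str:
    """Issue a compact HS256 JWT (header.payload.signature)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the claims, or raise on any malformed, forged, or expired token."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected alg")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if float(claims["exp"]) <= time.time():
        raise ValueError("expired")
    if not isinstance(claims["scopes"], list):
        raise ValueError("scopes must be a list")
    return claims


def enforce(token, target: str, secret: bytes, emit) -> str:
    """Decide first, then emit. Neither an error nor an outage can yield ALLOW."""
    try:
        claims = verify_hs256(token, secret)
        action = "ALLOW" if target in claims["scopes"] else "DENY"
    except Exception:
        action = "DENY"  # every error path resolves to DENY
    try:
        emit({"action": action, "dest": target})
    except Exception:
        pass  # best-effort: a failing sink never changes the decision
    return action
```

The ordering is the point: the decision is final before `emit` is called, and nothing from the LLM or MCP path appears anywhere in `enforce`.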
Real-world implementation
AgentGate drops in front of any agent framework — it is protocol-level, not framework-specific. In production, the mock services become your real internal APIs, databases, and secret stores; the broker becomes a sidecar or gateway every agent's egress routes through; tokens are minted per agent-session with short TTLs and per-identity scopes managed centrally. Splunk is almost always already the enterprise SIEM, so the detection rules, investigation agent, and SOC dashboard slot directly into an existing security-operations workflow. A real deployment adds mTLS between broker and services, an HA broker pool, and per-team scope policy as code — but the enforcement model and the Splunk intelligence layer are exactly what's shown here.
Challenges we ran into
- Hosted models are Cloud-only. We're on Splunk Enterprise, so we deployed the open-weights Foundation-Sec model locally via Ollama and kept the entire investigation in-perimeter — arguably a stronger security story than a hosted API.
- Deterministic triage. Early classification was non-deterministic. We fixed it with a constrained triage prompt (explicit precedence rule anchored on stable facts), temperature 0, a fixed output enum, and parse-to-structured-output; verified prompt_injection across repeated runs.
- Demo reliability. Local CPU inference cold-loads slowly; we pinned the model resident and sized timeouts so investigations stay fast on camera.
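The deterministic-triage fix described above (temperature 0, a fixed enum, parse-to-structured-output) might look roughly like this. The `options.temperature` and `format` fields are part of Ollama's generate API; the enum members other than `prompt_injection`, the JSON key names, and the `unknown` fallback are assumptions for illustration.

```python
import json
from enum import Enum


class Classification(str, Enum):
    PROMPT_INJECTION = "prompt_injection"
    MISCONFIGURATION = "misconfiguration"  # hypothetical extra label
    UNKNOWN = "unknown"                    # hypothetical fallback label


def ollama_payload(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/generate with sampling pinned down."""
    return {
        "model": model,    # caller supplies the local model tag
        "prompt": prompt,
        "stream": False,
        "format": "json",  # ask for a JSON object back
        "options": {"temperature": 0},
    }


def parse_triage(raw: str) -> dict:
    """Coerce model output into the fixed schema; never trust free text."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise TypeError("triage output must be a JSON object")
        label = Classification(str(data["classification"]).strip().lower())
    except (ValueError, KeyError, TypeError):
        return {"classification": Classification.UNKNOWN.value, "parsed": False}
    return {
        "classification": label.value,
        "probable_cause": data.get("probable_cause"),
        "blast_radius": data.get("blast_radius"),
        "evidence": data.get("evidence"),
        "recommended_action": data.get("recommended_action"),
        "parsed": True,
    }
```

Anything outside the enum collapses to the fallback label, so a stray or manipulated model response cannot introduce a new classification into the index.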
What we learned
Enforcement and reasoning belong in different layers. The moment we stopped trying to make the agent police itself and pushed enforcement below it, the whole security model got simpler and stronger. And Splunk's MCP Server makes "the agent investigates through the SIEM" a real, clean pattern rather than a bolt-on.
What's next
AgentReady — a build-time auditor that checks whether a Splunk app is agent-ready before it ships. The pairing: AgentReady audits agent-readiness at build time; AgentGate enforces it at runtime.
Built With
- ai-agents
- mcp
- security
- splunk
- zero-trust