Inspiration

AI agents are quickly becoming part of real enterprise operations. They can read tickets, search logs, call APIs, interact with Model Context Protocol servers, and automate workflows across systems like Splunk, Jira, proxy services, and internal tools.

That creates a new security problem: AI agents are no longer just chat interfaces. They are becoming powerful non-human identities.

If an agent is manipulated through prompt injection, uses a direct token path, or accesses tools outside an approved workflow, it can create the same kind of risk as a compromised service account: sensitive data access, unauthorized actions, and external exfiltration.

I created ShadowOps Sentinel to address that gap. Traditional security monitoring is good at tracking users, servers, and applications, but AI-agent activity needs its own investigation layer. I wanted to build a Splunk-native way to detect, explain, and contain rogue AI-agent behavior.

What it does

ShadowOps Sentinel is a Splunk Enterprise security app for agentic operations.

It ingests simulated telemetry across AI-agent activity, authentication events, Jira ticket activity, Model Context Protocol access events, proxy and external egress activity, Splunk audit logs, and detection output.

The dashboard correlates those events into a full attack-chain view. In the main demo scenario, it detects a rogue local productivity copilot agent that reads a ticket containing a prompt injection, searches sensitive Splunk indexes, and posts data to an external webhook.
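The correlation idea behind that scenario can be sketched in Python. This is an illustration rather than the app's actual detection logic, which lives in SPL. The stage names and the `agent_id`, `action`, and `ts` fields are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stage names for the demo chain; the real field values may differ.
CHAIN_STAGES = [
    "read_injected_ticket",
    "search_sensitive_index",
    "external_webhook_post",
]


def find_completed_chains(events):
    """Return the agent IDs whose events hit every chain stage in order."""
    by_agent = defaultdict(list)
    for event in events:
        by_agent[event["agent_id"]].append(event)

    flagged = []
    for agent_id, agent_events in by_agent.items():
        agent_events.sort(key=lambda e: e["ts"])
        next_stage = 0
        for event in agent_events:
            # Advance only when the next expected stage appears.
            if next_stage < len(CHAIN_STAGES) and event["action"] == CHAIN_STAGES[next_stage]:
                next_stage += 1
        if next_stage == len(CHAIN_STAGES):
            flagged.append(agent_id)
    return sorted(flagged)


sample_events = [
    {"ts": 1, "agent_id": "copilot-local", "action": "read_injected_ticket"},
    {"ts": 2, "agent_id": "copilot-local", "action": "search_sensitive_index"},
    {"ts": 3, "agent_id": "copilot-local", "action": "external_webhook_post"},
    {"ts": 1, "agent_id": "triage-bot", "action": "search_sensitive_index"},
    {"ts": 2, "agent_id": "triage-bot", "action": "read_injected_ticket"},
]
print(find_completed_chains(sample_events))  # ['copilot-local']
```

Requiring the stages in order keeps an agent that only touches a sensitive index, without the injection and egress steps, from being flagged as a full chain.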

I designed the project around a simple SOC workflow:

Detect → Explain → Govern MCP Trust → Contain → Validate

Key capabilities include rogue AI-agent detection, prompt-injection chain correlation, MCP safe-path versus direct-token bypass visibility, explainable risk scoring with a “Why This Fired” panel, simulated containment actions, an MCP Server Trust Registry, a validation scorecard, and evidence artifacts for analyst review.

How I built it

I built ShadowOps Sentinel end to end as a Splunk app with a deterministic sample dataset and dashboard.

I created:

A Python event generator that creates agentic operations telemetry
A JSONL sample dataset with 513 events
A Splunk app under splunk_app/ShadowOpsSentinel
Dashboard XML for Splunk Enterprise
SPL detection and validation queries
Lookup files for agent and MCP server trust context
Evidence artifacts for the critical detection
Evaluation and threat-model documentation
README, setup instructions, and architecture documentation
A public GitHub repository for the complete project
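A deterministic generator of this kind can be sketched as follows. This is not the project's actual generator. The sourcetype names and the nested event.sourcetype field come from this write-up, while the seed, the timestamps, and the `agent_id` and `sequence` fields are assumptions made for illustration.

```python
import json
import random

# Sourcetype names are taken from the write-up; every other field is illustrative.
SOURCETYPES = [
    "shadowops:agent",
    "shadowops:auth",
    "shadowops:jira",
    "shadowops:mcp_access",
    "shadowops:proxy",
    "shadowops:splunk_audit",
    "shadowops:detection",
]


def generate_events(count, seed=42):
    """Build `count` synthetic events; a fixed seed makes every run identical."""
    rng = random.Random(seed)
    base_ts = 1_700_000_000  # arbitrary fixed start time (assumption)
    events = []
    for i in range(count):
        events.append({
            "time": base_ts + i * 30,
            "event": {
                "sourcetype": rng.choice(SOURCETYPES),
                "agent_id": f"agent-{rng.randint(1, 5):02d}",
                "sequence": i,
            },
        })
    return events


def to_jsonl(events):
    """Serialize one JSON object per line, with stable key order."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


jsonl = to_jsonl(generate_events(513))
print(len(jsonl.splitlines()))  # 513
```

Seeding a private `random.Random` instance, rather than the module-level generator, keeps the output reproducible even if other code also draws random numbers.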

One important engineering detail I handled was normalizing the event source type. When I uploaded the data to Splunk, the upload metadata sourcetype appeared as _json, but the true logical source type existed inside the nested field event.sourcetype.

I fixed this by extracting the real source type with SPL:

index=shadowops | spath path=event.sourcetype output=real_sourcetype

That allowed the dashboard to correctly separate events such as shadowops:agent, shadowops:auth, shadowops:jira, shadowops:mcp_access, shadowops:proxy, shadowops:splunk_audit, and shadowops:detection.
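The same extraction can be mirrored in Python to sanity-check a JSONL file before uploading it. This is a sketch that assumes each line is a JSON object with the logical source type under event.sourcetype. The function names and the "unknown" fallback are hypothetical.

```python
import json
from collections import Counter


def real_sourcetype(raw_line, default="unknown"):
    """Return the nested event.sourcetype value, or `default` if it is absent."""
    record = json.loads(raw_line)
    event = record.get("event")
    if isinstance(event, dict):
        return event.get("sourcetype", default)
    return default


def count_by_sourcetype(lines):
    """Count events per logical source type, skipping blank lines."""
    return Counter(real_sourcetype(line) for line in lines if line.strip())


sample = [
    '{"event": {"sourcetype": "shadowops:agent"}}',
    '{"event": {"sourcetype": "shadowops:proxy"}}',
    '{"event": {"sourcetype": "shadowops:agent"}}',
    '{"message": "no nested event"}',
]
counts = count_by_sourcetype(sample)
print(counts)
```

A nonzero "unknown" count is a quick signal that some records would fall outside every real_sourcetype filter in the dashboard.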

I also built the explainable risk-scoring panel. Instead of showing a black-box alert, the dashboard breaks down the detection into contributing signals such as prompt injection, direct-token bypass, sensitive Splunk search activity, external webhook egress, unregistered-agent behavior, and attack-chain completion.
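An additive score with a per-signal breakdown is one simple way to get this kind of explanation. The sketch below uses the signal names listed above, but the weights are invented for illustration and are not the values used in the app.

```python
# Hypothetical weights; the app's actual scoring values are not given in this write-up.
SIGNAL_WEIGHTS = {
    "prompt_injection": 25,
    "direct_token_bypass": 20,
    "sensitive_splunk_search": 20,
    "external_webhook_egress": 15,
    "unregistered_agent": 10,
    "attack_chain_complete": 10,
}


def explain_risk(observed_signals):
    """Return the total score plus the signals that contributed to it."""
    contributions = [
        (name, weight)
        for name, weight in SIGNAL_WEIGHTS.items()
        if name in observed_signals
    ]
    score = sum(weight for _, weight in contributions)
    return {"score": score, "why_this_fired": contributions}


result = explain_risk(
    {"prompt_injection", "direct_token_bypass", "external_webhook_egress"}
)
print(result["score"])  # 60
for name, weight in result["why_this_fired"]:
    print(f"+{weight:>2}  {name}")
```

Because the score is a plain sum, every point in the total traces back to a named signal, which is the property an analyst needs from a "Why This Fired" view.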

I built the containment workflow as a simulated SOC response flow. It shows token revocation, webhook blocking, agent quarantine, incident creation, evidence export, and analyst-summary generation.
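A simulated response flow like this can be modeled as an ordered list of steps that are recorded rather than executed. The sketch below is an assumption about shape, not the app's implementation: the step identifiers, the `ContainmentRun` class, and the log fields are all hypothetical.

```python
from dataclasses import dataclass, field

# Step order follows the write-up; the identifiers themselves are illustrative.
CONTAINMENT_STEPS = [
    "revoke_token",
    "block_webhook",
    "quarantine_agent",
    "create_incident",
    "export_evidence",
    "generate_analyst_summary",
]


@dataclass
class ContainmentRun:
    agent_id: str
    log: list = field(default_factory=list)

    def execute(self):
        """Record each step as simulated; no external system is called."""
        for step in CONTAINMENT_STEPS:
            self.log.append(
                {"agent_id": self.agent_id, "step": step, "status": "simulated"}
            )
        return self.log


run = ContainmentRun("copilot-local")
for entry in run.execute():
    print(entry["step"], entry["status"])
```

Keeping the steps in a data structure means a later live integration could replace the body of the loop with real API calls without changing the workflow order.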

Challenges I ran into

The biggest challenge was making the project feel like a real security product rather than just a dashboard.

The first version showed the events, but it did not fully explain why the alert mattered. I improved it by adding a “Why This Fired” risk explanation panel, a containment workflow, MCP server trust metadata, validation metrics, and threat-model documentation.

Another challenge was the Splunk sourcetype issue. The data loaded successfully, but Splunk treated the upload sourcetype as _json. The dashboard originally expected metadata sourcetypes like shadowops:agent, so several panels returned incorrect results. I fixed this by extracting event.sourcetype from the JSON body and updating the dashboard SPL to use real_sourcetype.

I also had to balance realism with reproducibility. For a hackathon demo, the project needed to be easy for judges to install and test, so I used deterministic sample data and simulated containment actions instead of requiring live enterprise integrations.

Accomplishments that I am proud of

I am proud that I built a full security workflow inside Splunk, not just a static demo.

ShadowOps Sentinel includes:

513 events ingested into index=shadowops
1 critical rogue-agent detection
Full attack-chain correlation
Explainable risk scoring
MCP trust registry
Simulated containment workflow
Validation scorecard
Public GitHub repository with setup instructions, sample data, dashboard XML, detection queries, lookups, and architecture documentation

The project is designed to be understandable in a short demo while still being grounded in real security operations concepts.

What I learned

I learned that monitoring AI agents requires thinking about them as identities, not just logs.

The important security questions are:

Which agent performed the action?
Which user or caller was behind it?
Was the action approved through the expected MCP path?
Did the agent access sensitive data?
Did it communicate externally?
Can the alert be explained with evidence?
Can the response be automated or simulated safely?

I also learned that explainability matters. A high risk score is not enough. A SOC analyst needs to know exactly which evidence caused the alert and what action should happen next.

What's next for ShadowOps Sentinel

The next step would be to connect ShadowOps Sentinel to live enterprise systems and move from simulated containment to real integrations.

Future improvements include live MCP server descriptor monitoring, real token revocation through an identity provider, proxy or firewall API integration for domain blocking, Jira or ServiceNow incident creation, additional attack variants for validation, behavioral anomaly scoring for agent sessions, Splunk SOAR playbook integration, and Splunk AppInspect/content packaging.

The long-term goal is to make ShadowOps Sentinel a security control plane for AI-agent operations: a place where security teams can detect rogue agent behavior, explain the evidence, govern MCP trust, and respond before agentic automation becomes an enterprise blind spot.
