Inspiration

The average enterprise SOC receives over 10,000 alerts per day. Analysts spend the majority of their time not investigating threats, but doing the manual labor that precedes investigation: copying alert data between tabs, querying multiple indices to reconstruct timelines, cross-referencing user and host activity, and writing up findings in a ticket format that may or may not capture the full picture. The cognitive work of understanding what actually happened gets compressed into whatever time is left after the plumbing is done.

The bottleneck in security operations is not detection. Modern SIEMs are excellent at surfacing signals. The bottleneck is the gap between "an alert fired" and "here is a complete, defensible account of the incident." That gap is measured in hours per incident, and it's where threats slip through, analysts burn out, and institutional knowledge walks out the door.

We didn't want to build a chatbot that summarizes logs. We wanted to build a reasoning agent that does the actual investigative work -- tracing how events connect causally, identifying what systems and users fall within the blast radius, and recommending prioritized response actions. And we wanted every claim in the output to be cited to the specific event that produced it, so the result is something an analyst can trust, verify, and defend.

That's Meridian.

What it does

A production deployment is live at meridian.metisos.co.

Meridian is a context-aware reasoning agent for the security operations center. When a detection fires, Meridian doesn't surface a summary. It conducts an autonomous investigation and produces a complete incident report containing three components:

Causal chain -- Meridian traces the sequence of events that led to the detection, reconstructing the attack path or failure sequence from source events. It answers "what happened and in what order," establishing directional causation rather than mere temporal correlation.

Blast radius -- Meridian maps which users, hosts, services, and data stores are affected or potentially affected by the incident. It traverses entity relationships (user-to-host, host-to-service, service-to-data) to answer "how bad is this and what else is at risk."

Ranked actions -- Meridian generates prioritized response recommendations based on the severity, scope, and nature of the incident. It answers "what should we do first," giving analysts a decision framework rather than an open-ended problem.
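As a rough illustration of the blast-radius traversal described above, here is a minimal sketch using breadth-first search over an entity graph. The graph contents, the `blast_radius` name, and the `max_hops` cutoff are all assumptions made for the example; the write-up does not specify how Meridian stores or walks entity relationships.

```python
from collections import deque

# Hypothetical entity graph: each key maps to the entities it can directly affect.
EDGES = {
    "user:alice": ["host:web-01"],
    "host:web-01": ["service:billing-api"],
    "service:billing-api": ["data:customers-db"],
    "host:db-02": ["service:reporting"],
}


def blast_radius(graph, seeds, max_hops=3):
    """Return {entity: hop_count} for every entity reachable from the seeds."""
    reached = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        entity = queue.popleft()
        hops = reached[entity]
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(entity, []):
            if neighbor not in reached:
                reached[neighbor] = hops + 1
                queue.append(neighbor)
    return reached


radius = blast_radius(EDGES, ["user:alice"])
```

The hop count gives a crude "directly affected" versus "potentially affected" distinction: entities at hop 1 are adjacent to the compromised seed, while deeper hops are at-risk candidates that still need evidence.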

Every claim in the report is cited to the specific event that produced it. If Meridian says a user account was involved, it points to the log entry that proves it. This makes reports defensible, auditable, and trustworthy in a way that generic LLM-generated summaries are not.

Meridian uses 768-dimensional embeddings (nomic-embed-text-v1.5) to semantically index ingested events, enabling it to surface connections that keyword-based search would miss. It maintains a durable casebook in MongoDB, so context accumulates across incidents rather than resetting with every alert.

How we built it

Meridian's architecture connects two MCP integrations into a unified reasoning pipeline, driven by a Gemini 3.1 Pro reasoning core, on top of a protocol layer that decouples state from compute.

Splunk MCP Server serves as the detection and event source. Security events, alerts, and telemetry are queried directly from Splunk via MCP, giving the reasoning agent live access to the operational data it needs to investigate. Structured SPL queries handle retrieval for known patterns, while the agent dynamically constructs queries based on what it discovers during investigation.

MongoDB MCP Server serves as the persistence and semantic search layer. Ingested events are normalized, embedded using nomic-embed-text-v1.5 (768d), and stored in MongoDB alongside raw event data. MongoDB Atlas Vector Search powers the semantic retrieval side of the system, enabling Meridian to find non-obvious connections between events that keyword-based queries would miss. Investigation state, completed reports, and citation indexes are also persisted in MongoDB, giving the agent durable memory across investigations.
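To make the semantic retrieval step concrete, here is a sketch of an Atlas Vector Search aggregation pipeline built as plain Python data. The `$vectorSearch` stage and its `index`, `path`, `queryVector`, `numCandidates`, and `limit` fields are standard Atlas syntax; the index name `events_vector_index` and the field names `embedding` and `raw` are assumptions, since the write-up does not name them.

```python
EMBEDDING_DIM = 768  # nomic-embed-text-v1.5 output size, as stated above


def build_vector_search(query_vector, limit=10, num_candidates=200,
                        index="events_vector_index", path="embedding"):
    """Build an aggregation pipeline for a $vectorSearch query.

    The index and field names are hypothetical placeholders.
    """
    if len(query_vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(query_vector)}")
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,  # candidates considered before ranking
                "limit": limit,                   # results returned
            }
        },
        # Keep the raw event and expose the similarity score for later ranking.
        {"$project": {"_id": 1, "raw": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search([0.0] * EMBEDDING_DIM, limit=5)
```

A pipeline like this would be passed to a collection's `aggregate` call; the all-zero query vector is only a placeholder for a real embedding.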

Gemini 3.1 Pro on Vertex AI is the reasoning core. Meridian's agent is built on @google/genai with Vertex's location: "global", exploiting Gemini 3's long-horizon planning to walk a seven-step investigate() procedure inside a single agentic loop. The model's native MCP tool-use lets it switch fluidly between Splunk (live telemetry) and MongoDB (semantic recall + investigation memory) without orchestration glue. For one-sentence outputs like CISO-voice business impact, we use thinkingConfig: { thinkingBudget: 0 } so internal reasoning tokens don't starve the output cap -- a Gemini-3-specific tuning detail that mattered in practice.

Under the hood, Meridian is the first surface on MetisOS -- a protocol stack that decouples state from compute. Every ingested event is wrapped as a versioned ContextSync artifact addressed by a ctx:// URI, stamped with USC (Universal Spatiotemporal Coordinates) so cross-tier matching becomes a closed-form formula rather than a heuristic. The agent's citations aren't free-text strings -- they're URIs into a content-addressed protocol with a provenance log behind them. That's what makes "every claim is cited" enforceable rather than aspirational.
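The idea of a content-addressed citation can be sketched in a few lines. This is not the ContextSync format, which the write-up does not spell out: the URI layout, the `ctx_uri` helper, and the choice of SHA-256 over canonical JSON are all assumptions, and the sketch leaves out versioning and USC stamping entirely.

```python
import hashlib
import json


def ctx_uri(event, namespace="events"):
    """Derive a hypothetical ctx:// URI from the event's content.

    Canonical JSON (sorted keys, no extra whitespace) means the same event
    always hashes to the same address, regardless of key order.
    """
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()
    return f"ctx://{namespace}/sha256-{digest}"


a = ctx_uri({"host": "web-01", "action": "login_failed"})
b = ctx_uri({"action": "login_failed", "host": "web-01"})  # same content, different key order
c = ctx_uri({"host": "web-01", "action": "login_ok"})      # different content
```

Because the address is derived from the content, a citation cannot silently point at an event that has since been altered: any change to the event produces a different URI.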

The reasoning layer ties all of this together. When a detection fires, the agent plans its investigation, pulls relevant events through a hybrid approach (structured queries via Splunk MCP for known patterns, semantic similarity search via MongoDB for unexpected correlations), constructs the causal chain, maps blast radius through entity traversal, generates ranked response actions, and produces a structured report with inline citations pointing back to source events.

The Meridian Control Center, built on Next.js 16 and React 19 Server Components, provides a UI for reviewing investigations, drilling into cited evidence, and providing feedback that improves future reasoning. The entire system is open source under Apache 2.0 at github.com/metisos/meridian.

Challenges we ran into

Citation accuracy was harder than reasoning. Getting an LLM to reason about security events is relatively straightforward. Getting it to only make claims it can back with specific evidence is a fundamentally different problem. Early iterations would generate plausible-sounding conclusions and retrofit citations that didn't fully support them. We went through multiple iterations of our citation pipeline to ensure every statement traces back to a real event. This required tight coupling between the retrieval layer and the generation layer -- constraining the model to reason from evidence rather than generating conclusions and searching for evidence afterward.
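One way to enforce the evidence-first constraint is a gate that rejects any claim whose citations were not part of the retrieved evidence set. The sketch below assumes a simple `Claim` structure and string event identifiers; both are illustrative and not taken from Meridian's codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    citations: list = field(default_factory=list)  # event identifiers backing the claim


def uncited_claims(claims, retrieved_ids):
    """Return claims that cite nothing, or cite an event that was never retrieved."""
    known = set(retrieved_ids)
    return [
        claim for claim in claims
        if not claim.citations or any(cid not in known for cid in claim.citations)
    ]


retrieved = {"ctx://events/a1", "ctx://events/b2"}
report = [
    Claim("alice logged in from a new IP", ["ctx://events/a1"]),
    Claim("the attacker exfiltrated data", []),                 # no evidence at all
    Claim("web-01 was rebooted", ["ctx://events/zz"]),          # cites an unknown event
]
rejected = uncited_claims(report, retrieved)
```

A check like this only proves that a citation exists and resolves; whether the cited event actually supports the claim still has to be handled by how the model is constrained during generation.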

Balancing recall and precision in event retrieval. Cast the net too wide and the reasoning agent drowns in irrelevant events, producing bloated reports full of tangential connections. Cast it too narrow and it misses the critical connection that explains the incident. Tuning the hybrid retrieval approach across both Splunk (structured) and MongoDB (semantic, with Reciprocal Rank Fusion on top of $vectorSearch + $text) required extensive iteration against real-world alert scenarios to find the right balance.
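Reciprocal Rank Fusion itself is small enough to show in full. Each document scores the sum of 1 / (k + rank) over every ranked list it appears in. The constant k = 60 is a commonly used default and an assumption here, as are the event identifiers; the write-up does not say which value Meridian uses or where the fusion runs.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["evt-3", "evt-1", "evt-7"]  # e.g. results from $vectorSearch
text_hits = ["evt-1", "evt-9", "evt-3"]    # e.g. results from $text
fused = reciprocal_rank_fusion([vector_hits, text_hits])
# fused == ["evt-1", "evt-3", "evt-9", "evt-7"]
```

Events that appear in both lists rise to the top even when neither retriever ranked them first, which is the behavior that helps when semantic and keyword search disagree.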

Making causal chains directional, not just correlational. "These events happened around the same time" is not the same as "this event caused that event." Teaching the reasoning agent to distinguish temporal correlation from actual causation required structured reasoning chains that force the model to justify each causal link with evidence. This is where many AI security tools fail -- they produce impressive-looking timelines that confuse co-occurrence with causation.
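A structural check can catch the most obvious failures before a causal link reaches the report. The sketch below assumes hypothetical `Event` and `CausalLink` types and tests two things: that the cause precedes the effect, and that the link cites evidence. Temporal order is necessary but not sufficient for causation, so this is a guard rail rather than a substitute for the model's justification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    id: str
    timestamp: datetime


@dataclass(frozen=True)
class CausalLink:
    cause: Event
    effect: Event
    evidence: tuple  # ids of the events that justify this link


def link_problems(link):
    """Return a list of reasons the link should be rejected (empty if none)."""
    problems = []
    if link.cause.timestamp >= link.effect.timestamp:
        problems.append("cause does not precede effect")
    if not link.evidence:
        problems.append("no supporting evidence cited")
    return problems


phish = Event("evt-1", datetime(2025, 1, 1, 9, 0))
login = Event("evt-2", datetime(2025, 1, 1, 9, 5))
ok = CausalLink(phish, login, ("evt-1", "evt-2"))
bad = CausalLink(login, phish, ())
```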

Coordinating two MCP servers in a single agent workflow. The agent needs to fluidly move between Splunk (for live event data) and MongoDB (for semantic search and investigation memory) within a single reasoning loop. Getting the tool routing right -- so the agent knows when to query Splunk for fresh data versus when to search MongoDB for historical patterns and prior investigation context -- required careful orchestration design and a meta-tool layer (search_tools, list_tools, call_tool) that lets the model reason about its own tool surface.
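The meta-tool idea can be illustrated with a small in-memory registry that exposes the three operations named above. Only the names `search_tools`, `list_tools`, and `call_tool` come from the write-up; the `ToolRegistry` class, the registered tool names, and the substring-based search are assumptions for the sake of a runnable example.

```python
class ToolRegistry:
    """Hypothetical registry that lets a model inspect and invoke its tools."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description, handler):
        self._tools[name] = (description, handler)

    def list_tools(self):
        return sorted(self._tools)

    def search_tools(self, query):
        # Naive substring match over names and descriptions.
        q = query.lower()
        return sorted(
            name for name, (description, _) in self._tools.items()
            if q in name.lower() or q in description.lower()
        )

    def call_tool(self, name, **arguments):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        _, handler = self._tools[name]
        return handler(**arguments)


registry = ToolRegistry()
registry.register(
    "splunk.search",
    "Run an SPL query against live telemetry",
    lambda query: {"source": "splunk", "query": query},
)
registry.register(
    "mongodb.vector_search",
    "Semantic search over stored events and prior investigations",
    lambda text: {"source": "mongodb", "text": text},
)
```

With a layer like this, the model can ask `search_tools("semantic")` to find the right tool for historical context, then route the request through `call_tool` instead of having every tool schema loaded into its prompt at once.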

Accomplishments that we're proud of

Every claim is cited. In a field saturated with AI tools that generate confident-sounding output no one can verify, Meridian produces reports where every statement can be traced to source data -- by ctx:// URI, not by paraphrase. In security operations, unverifiable claims are worse than no claims at all.

Seconds, not hours. Meridian completes in seconds an investigation that would typically take a Tier 1 analyst 30 to 60 minutes. With 7,810+ events ingested and a growing set of completed investigations in our demo environment, the system demonstrates that fast autonomous investigation is practically achievable.

Built on Gemini 3.1 Pro. Meridian was designed from day one around Gemini 3's long-horizon reasoning and native MCP tool-use. The seven-step investigate() procedure works because the model can hold a multi-step investigative plan in its head without losing the thread mid-loop, something the prior generation of models genuinely struggled with.

A real agent, not a chatbot. Meridian doesn't wait for questions. It receives a detection, plans its investigation, executes multi-step retrieval and reasoning across two data platforms, and produces a structured output. It uses tools (Splunk MCP queries, MongoDB vector search, entity traversal) to accomplish a complex task autonomously.

Open source from day one. Meridian is Apache 2.0. Security tooling should be inspectable, auditable, and extensible. Organizations should be able to verify exactly how their incident reports are generated rather than trusting a black box.

What we learned

Citation is a design constraint, not a feature. We initially treated citations as a post-processing step. That fails. Citations have to shape how the agent reasons from the start. The model should never generate a claim and then search for supporting evidence -- it should reason from evidence and articulate what the evidence shows. This inversion was the single most important architectural decision we made, and it's why we built a typed protocol (ContextSync) underneath the agent instead of leaving citations as free-text strings.

MCP changes the integration model fundamentally. Building on MCP for both Splunk and MongoDB simplified how Meridian connects to its data sources. Instead of maintaining custom API integrations, we rely on a standardized protocol through which the agent discovers and calls each platform's tools in the same way. Meridian coordinates across two entirely different data platforms (a SIEM and a document database) through the same protocol pattern, which for us is strong evidence that MCP is a sound foundation for agentic tooling.

State and compute want to be decoupled. The closer we got to production, the clearer it became that the agent (compute) and the artifact store (state) had to evolve independently. In our experience, coupling them is what broke earlier generations of security automation: the rules and the data get tangled, and the system rots. ContextSync exists because we needed a content-addressed, versioned, spatiotemporally-stamped state layer that the agent could trust without owning.

Persistent investigation memory changes the game. When Meridian can reference prior investigations stored in MongoDB, it stops treating every alert as a blank slate. Patterns that took 10 minutes to diagnose the first time get recognized in seconds the next time. This is the kind of institutional knowledge that currently exists only in the heads of senior analysts and disappears when they leave.

What's next for Meridian

Multi-source ingestion. The MCP-based architecture means adding new security data sources is an integration problem, not an architecture problem. We're expanding to ingest detections from Microsoft Sentinel and CrowdStrike alongside Splunk, enabling cross-platform investigations where the causal chain spans multiple security tools.

Investigation memory at scale. Today Meridian's casebook holds a small set of investigations. We're building toward hundreds, creating an institutional knowledge base where past investigations inform future ones. When a similar attack pattern surfaces six months later, Meridian should recognize it immediately and cite the prior incident.

Collaborative investigation. We want multiple analysts to be able to interact with an ongoing Meridian investigation: adding context, correcting reasoning, and guiding the agent toward areas it may have missed. Human-in-the-loop not as a safety net, but as a force multiplier.

MITRE ATT&CK mapping. We plan to map causal chains automatically to ATT&CK techniques and tactics, giving analysts an immediate framework-aligned view for reporting, compliance, and threat intelligence sharing.

Feedback loops. When analysts accept, modify, or reject Meridian's findings, those decisions become training signal. Over time the system calibrates to each organization's environment, learning which patterns matter and which are noise in their specific context.

