Inspiration

Every day, engineering teams merge code that passes all their security scanners - and still ships vulnerabilities. We observed fundamental limitations in existing tools:

Pattern-matching tools (Semgrep, SonarQube) can't understand business logic - they miss vulnerabilities like "User A can see User B's invoices"

Hardcoded rule engines can't adapt to different PR contexts - they run the same checks on a typo fix as a payment system change

Single-repo analyzers are repo-blind - they can't detect "this API change breaks 3 downstream services"

Binary pass/fail results provide no reasoning - teams don't understand why something failed

We were inspired by how senior consultants actually work. They don't run the same checklist on every engagement. They assess context first, understand regulatory complexity (DORA, EU AI Act), and explain their reasoning.

What if we could build a platform that audits every change across an organization the way a senior consultant would?


What it does

Sentinel.ai is a self-orchestrating multi-agent PR security and compliance system where Claude dynamically decides which specialized agents to spawn based on PR context.

Core Capabilities:

Dynamic Agent Orchestration - Claude (tools builder) analyzes each PR and decides: "This touches auth code, I need Security + Business Logic agents. This is a typo fix, skip most agents."

Semantic Intent Tracking - Detects when PR description says "fix typo" but code adds an admin endpoint

Cross-Repo Impact Analysis - Finds "This API change breaks 3 downstream services" using a knowledge graph of service dependencies that is built once and updated on every run

Business Logic Vulnerability Detection - Catches IDOR, access control issues, race conditions that static scanners fundamentally cannot detect

Regulatory Compliance - DORA Article-level checking, EU AI Act risk classification with effort-estimated remediation

Full Explainability (XAI) - Every finding has a 10-50 step reasoning chain showing exactly how Claude reached its conclusion
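As an illustration only, a finding with an attached reasoning chain could be modelled as below. The class names, fields, and the sample evidence string are assumptions made for this sketch, not Sentinel.ai's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ReasoningStep:
    index: int          # 1-based position in the chain
    thought: str        # what the agent concluded at this step
    evidence: str = ""  # file, line, or tool output backing the thought


@dataclass
class Finding:
    title: str
    severity: str
    agent: str
    chain: list[ReasoningStep] = field(default_factory=list)

    def add_step(self, thought: str, evidence: str = "") -> None:
        # Steps are numbered in the order they are recorded.
        self.chain.append(ReasoningStep(len(self.chain) + 1, thought, evidence))

    def explain(self) -> str:
        # Render the header followed by one numbered line per step.
        lines = [f"[{self.severity.upper()}] {self.title} (agent: {self.agent})"]
        for step in self.chain:
            suffix = f" <- {step.evidence}" if step.evidence else ""
            lines.append(f"  {step.index}. {step.thought}{suffix}")
        return "\n".join(lines)


# Hypothetical finding, for illustration.
finding = Finding("IDOR in get_invoice", "high", "business_logic_auditor")
finding.add_step("Endpoint returns an invoice by id", "get_invoice body")
finding.add_step("No check ties the invoice to the requesting user")
print(finding.explain())
```

Keeping the steps as structured records rather than free text is one way to let a dashboard render, filter, or audit the chain later.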


How we built it

Architecture:

┌──────────────────────────────────────────────────────────────┐
│                    ZED IDE EXTENSION                         │
│        Custom commands for change analysis in editor         │
└────────────────────────────┬─────────────────────────────────┘
                             │ MCP Protocol
                             ▼
┌──────────────────────────────────────────────────────────────┐
│                       MCP SERVER                             │
│        Interfaces between Zed agent and backend tool         │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│            BACKEND SERVICES (Docker Containerized)           │
│                                                              │
│  • Meta-Orchestrator (Claude decides which agents to spawn)  │
│  • Agent Executor (ReAct loops with tool use)                │
│  • Inference Engine (CrusoeAI for model inference)           │
│  • Knowledge Graph (cross-repo dependency tracking)          │
│  • Tracing & Token Tracking (PaidAI - sustainable usage)     │
│                                                              │
│  AGENTS: Security | Intent | Cross-Repo |                    │
│          Business Logic | DORA | AI Act                      │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│                 DASHBOARD (Docker Containerized)             │
│                                                              │
│  • Real-time analysis via WebSocket                          │
│  • Audit trail visualization                                 │
│  • Dependency graph explorer                                 │
│  • Project and Org Management                                │
│  • GitHub / GitLab / Git Integration (Any CI/CD platform)    │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│                    LANDING PAGE (Lovable)                    │
│                                                              │
│  • Marketing & Product showcase                              │
│  • Interactive demo                                          │
│  • Documentation & Onboarding                                │
└──────────────────────────────────────────────────────────────┘ 

Key Technical Components:

1. Meta-Orchestrator - Claude analyzes PR context and outputs an execution plan:

{
    "risk_assessment": "high",
    "agents": [
        {"name": "security_scanner", "priority": 10},
        {"name": "business_logic_auditor", "priority": 9},
        {"name": "cross_repo_analyzer", "dependencies": ["security_scanner"]}
    ],
    "execution_order": [["security_scanner", "business_logic_auditor"], ["cross_repo_analyzer"]]
}
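The execution_order in a plan like this can be derived from the agents' dependencies and priorities. The sketch below shows one possible way to do that, by layering agents into stages. The function name plan_stages is hypothetical, and it assumes agents without a priority default to 0.

```python
def plan_stages(agents):
    """Group agents into stages; each stage depends only on earlier stages."""
    deps = {a["name"]: set(a.get("dependencies", [])) for a in agents}
    priority = {a["name"]: a.get("priority", 0) for a in agents}

    # Reject plans that reference agents which are not part of the plan.
    unknown = {d for ds in deps.values() for d in ds} - set(deps)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    stages, done = [], set()
    while len(done) < len(deps):
        # An agent is ready once all of its dependencies have run.
        ready = [n for n in deps if n not in done and deps[n] <= done]
        if not ready:
            raise ValueError("dependency cycle in plan")
        ready.sort(key=lambda n: -priority[n])  # higher priority first
        stages.append(ready)
        done.update(ready)
    return stages


plan = {
    "risk_assessment": "high",
    "agents": [
        {"name": "security_scanner", "priority": 10},
        {"name": "business_logic_auditor", "priority": 9},
        {"name": "cross_repo_analyzer", "dependencies": ["security_scanner"]},
    ],
}
print(plan_stages(plan["agents"]))
```

Agents in the same stage have no dependencies on each other, so they could be run concurrently. The same function also doubles as a sanity check on a plan whose execution_order was produced by the model.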

2. ReAct Agent Executor - Each agent runs Think → Act → Observe loops:

while not done:  # `done` is set once the agent signals it has finished
    thought = await llm.reason(context)             # Think
    tool_call = await llm.decide_action(tools)      # Act
    result = await tools.execute(tool_call)         # Observe
    audit_log.append((thought, tool_call, result))  # Log everything, one tuple per step
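For readers who want to run the loop end to end, here is a self-contained sketch with a scripted stand-in for the model. ScriptedLLM, the step dictionary format, the "finish" action, the max_steps cap, and the toy scan_secrets tool are all assumptions made for illustration, not the project's real interfaces.

```python
import asyncio


class ScriptedLLM:
    """Stand-in for a model client: replays a fixed list of decisions."""

    def __init__(self, script):
        self._script = list(script)

    async def next_step(self, context):
        # A real client would send `context` to the model here.
        return self._script.pop(0)


async def scan_secrets(path):
    # Toy tool: always reports a clean file.
    return f"no secrets found in {path}"


TOOLS = {"scan_secrets": scan_secrets}


async def run_agent(llm, tools, task, max_steps=10):
    context, audit_log = [task], []
    for _ in range(max_steps):                    # hard cap on iterations
        step = await llm.next_step(context)       # Think + Act
        if step["action"] == "finish":
            audit_log.append((step["thought"], "finish", step["answer"]))
            return step["answer"], audit_log
        result = await tools[step["action"]](**step["args"])  # Observe
        audit_log.append((step["thought"], step["action"], result))
        context.append(result)                    # feed the observation back
    raise RuntimeError("agent did not finish within max_steps")


script = [
    {"thought": "Check the diff for credentials",
     "action": "scan_secrets", "args": {"path": "app/settings.py"}},
    {"thought": "Nothing suspicious", "action": "finish", "answer": "pass"},
]
answer, audit_log = asyncio.run(run_agent(ScriptedLLM(script), TOOLS, "Review PR"))
print(answer, len(audit_log))
```

The step cap is a design choice worth keeping in any real version: it bounds token spend if the model never decides to finish.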

3. Tool Registry - 12+ specialized tools across categories:

  • Code Analysis: parse_diff, get_file_content, search_codebase
  • Security: scan_secrets, check_vulnerabilities
  • Cross-Repo: query_repo_graph, find_contract_consumers, check_breaking_changes
  • Business Logic: extract_business_rules, generate_adversarial_test
  • Compliance: check_dora_requirements, classify_ai_risk_level
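One simple way to organize such a registry is a decorator that files each tool under its category, so an agent can be handed only the categories it needs. The sketch below is an assumption about how this could look. The tool bodies are toy stand-ins that only borrow the names parse_diff and scan_secrets from the list above.

```python
from typing import Callable

# category name -> {tool name -> callable}
REGISTRY: dict[str, dict[str, Callable]] = {}


def tool(category: str):
    """Register the decorated function under `category`."""
    def decorator(fn):
        REGISTRY.setdefault(category, {})[fn.__name__] = fn
        return fn
    return decorator


@tool("Code Analysis")
def parse_diff(diff: str) -> list[str]:
    """Return the lines a unified diff adds, without the leading '+'."""
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


@tool("Security")
def scan_secrets(text: str) -> bool:
    """Toy check: flag text containing the 'AKIA' prefix of AWS access key ids."""
    return "AKIA" in text


def tools_for(categories):
    """Collect the tools an agent may use, given its allowed categories."""
    return {
        name: fn
        for category in categories
        for name, fn in REGISTRY.get(category, {}).items()
    }


security_agent_tools = tools_for(["Code Analysis", "Security"])
print(sorted(security_agent_tools))
```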

4. Knowledge Graph - PostgreSQL storing repo relationships for cross-repo queries
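A cross-repo query of this kind can be expressed as a recursive SQL query over a dependency table. The sketch below uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs without a server. The table name, column names, and service names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE service_dependency (
    consumer TEXT NOT NULL,   -- service that makes the call
    provider TEXT NOT NULL,   -- service that exposes the API
    PRIMARY KEY (consumer, provider)
);
INSERT INTO service_dependency VALUES
    ('billing-api', 'invoice-service'),
    ('web-frontend', 'billing-api'),
    ('reporting', 'invoice-service');
""")

# Walk the graph from the changed service to every direct or transitive consumer.
# UNION (rather than UNION ALL) drops duplicates, so cycles cannot loop forever.
DOWNSTREAM = """
WITH RECURSIVE impacted(name) AS (
    SELECT consumer FROM service_dependency WHERE provider = ?
    UNION
    SELECT d.consumer FROM service_dependency d
    JOIN impacted i ON d.provider = i.name
)
SELECT name FROM impacted ORDER BY name;
"""


def downstream_of(service):
    return [row[0] for row in conn.execute(DOWNSTREAM, (service,))]


print(downstream_of("invoice-service"))
```

PostgreSQL also supports WITH RECURSIVE, so the query itself should carry over with only the parameter placeholder changed to match the driver in use.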

5. Zed Extension + MCP Server - IDE integration for inline analysis

Built collaboratively using Zed's multiplayer rooms and agentic AI capabilities for rapid prototyping.


Challenges we ran into

Ideation paralysis - Existing market players (Snyk, Entire, Syncable, Cursor, Codacy) made us question our differentiation. Solution: We realized they share a common limitation: they can't reason about business logic or cross-repo impact, and they don't build a knowledge graph of an organization's Git repositories. We keep that graph in long-term and immutable short-term memory buckets, so it acts as a strict security guardrail and also as a reusable, sustainable ops asset that saves tokens the next time your org works on the same repo.

Expertise inequality - Team had varying depths in security, compliance, and AI/ML. Solution: Pair programming in Zed rooms enabled real-time knowledge transfer.

Scope creep via generative AI - Claude generates code fast. We built features we didn't need because we could. Solution: Ruthless prioritization - "Does this help find IDOR vulnerabilities? No? Cut it."

Multi-tool integration nightmare - We were simultaneously integrating PaidAI billing, Zed extension development, the MCP server protocol, and knowledge graph construction.

Zed extension limitations - We wanted more control over prompt I/O, better visibility into MCP server logs, and shared dev servers in collaborative sessions. The extension ecosystem needs more documentation for agentic workflows.


Accomplishments that we're proud of

Finding vulnerabilities scanners miss - Our business logic agent catches IDOR vulnerabilities that Semgrep, SonarQube, and CodeQL all pass:

PR #142: "Fix invoice formatting" (3 lines)
❌ BLOCKED — IDOR Vulnerability

def get_invoice(request, invoice_id):
    return Invoice.objects.get(id=invoice_id)  # No ownership check!

Reasoning: [12 steps showing how Claude identified the implicit 
business rule and generated an adversarial test]
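To make the missing ownership check concrete, here is a framework-free sketch of the vulnerable handler, a fixed version, and the kind of adversarial test an agent might generate. The in-memory INVOICES store, the owner_id field, and the user ids are hypothetical stand-ins for the real models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: int
    owner_id: int   # hypothetical field linking an invoice to its owner
    total: float


INVOICES = {
    1: Invoice(1, owner_id=10, total=99.0),
    2: Invoice(2, owner_id=20, total=45.5),
}


def get_invoice_vulnerable(user_id, invoice_id):
    # Mirrors the flagged code: any user can read any invoice by id.
    return INVOICES[invoice_id]


def get_invoice_fixed(user_id, invoice_id):
    invoice = INVOICES.get(invoice_id)
    # Treat "not yours" like "not found" so ids cannot be probed.
    if invoice is None or invoice.owner_id != user_id:
        raise PermissionError("invoice not found")
    return invoice


def leaks_foreign_invoice(handler):
    """Adversarial test: user 10 requests invoice 2, which belongs to user 20."""
    try:
        handler(10, 2)
    except PermissionError:
        return False
    return True


print(leaks_foreign_invoice(get_invoice_vulnerable))
print(leaks_foreign_invoice(get_invoice_fixed))
```

In a Django codebase like the one in the example, the equivalent fix would be to scope the lookup to the requesting user, assuming the model has a field that records ownership.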

Cross-repo intelligence - To our knowledge, the first tool to detect breaking changes across microservices before merge

Full explainability (XAI) - Every decision has a traceable reasoning chain, not just "AI found issue"

Dynamic orchestration - 60% fewer unnecessary checks on simple PRs, 3x faster analysis

Regulatory compliance depth - DORA Article-level checks and EU AI Act risk classification, with effort-estimated remediation like a real consultant would provide


What we learned

Dynamic agent orchestration beats hardcoded pipelines - Letting Claude decide the analysis pipeline produces dramatically better results than static configurations

ReAct loops enable true reasoning - The Think → Act → Observe pattern transformed agents from "prompt-and-pray" to genuine iterative analysis

Knowledge graphs unlock cross-repo intelligence - Building semantic dependency graphs was the breakthrough for detecting breaking changes and for making the ops cycle reusable and sustainable

Pair programming scales with Zed - Real-time collaboration with agentic AI capabilities enabled rapid prototyping of complex systems

Regulatory compliance requires reasoning, not patterns - Checking "Does logging meet DORA Article 11 requirements?" requires understanding intent, not pattern matching

Set token guardrails early - Learned this the expensive way 💸


What's next for Sentinel.ai

VS Code Extension - Port Zed integration to broader IDE ecosystem

GitHub Action - Native CI/CD integration for automated PR analysis

More Compliance Frameworks - SOC 2, HIPAA, PCI-DSS agents

Fine-tuned Models - Domain-specific models for faster, cheaper analysis

Community Rules - Shareable business rule definitions across organizations


Built With

  • anthropic-claude
  • zed
  • postgresql
  • crusoe
  • mcp-protocol
  • react
  • loveable
  • opentelemetry
  • tools
