Inspiration
Every day, engineering teams merge code that passes all their security scanners - and still ships vulnerabilities. We observed fundamental limitations in existing tools:
Pattern-matching tools (Semgrep, SonarQube) can't understand business logic - they miss vulnerabilities like "User A can see User B's invoices"
Hardcoded rule engines can't adapt to different PR contexts - they run the same checks on a typo fix as a payment system change
Single-repo analyzers are repo-blind - they can't detect "this API change breaks 3 downstream services"
Binary pass/fail results provide no reasoning - teams don't understand why something failed
We were inspired by how senior consultants actually work. They don't run the same checklist on every engagement. They assess context first, understand regulatory complexity (DORA, EU AI Act), and explain their reasoning.
What if we could build a platform that audits every change across an organization the way a senior consultant would?
What it does
Sentinel.ai is a self-orchestrating multi-agent PR security and compliance system where Claude dynamically decides which specialized agents to spawn based on PR context.
Core Capabilities:
Dynamic Agent Orchestration - Claude (tools builder) analyzes each PR and decides: "This touches auth code, I need Security + Business Logic agents. This is a typo fix, skip most agents."
Semantic Intent Tracking - Detects when PR description says "fix typo" but code adds an admin endpoint
Cross-Repo Impact Analysis - Finds "This API change breaks 3 downstream services" using a knowledge graph of service dependencies (built and updated on every analysis)
Business Logic Vulnerability Detection - Catches IDOR, access control issues, race conditions that static scanners fundamentally cannot detect
Regulatory Compliance - DORA Article-level checking, EU AI Act risk classification with effort-estimated remediation
Full Explainability (XAI) - Every finding has a 10-50 step reasoning chain showing exactly how Claude reached its conclusion
How we built it
Architecture:
┌──────────────────────────────────────────────────────────────┐
│  ZED IDE EXTENSION                                           │
│  Custom commands for change analysis in editor               │
└────────────────────────────┬─────────────────────────────────┘
                             │ MCP Protocol
                             ▼
┌──────────────────────────────────────────────────────────────┐
│  MCP SERVER                                                  │
│  Interfaces between Zed agent and backend tool               │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│  BACKEND SERVICES (Docker Containerized)                     │
│                                                              │
│  • Meta-Orchestrator (Claude decides which agents to spawn)  │
│  • Agent Executor (ReAct loops with tool use)                │
│  • Inference Engine (CrusoeAI for model inference)           │
│  • Knowledge Graph (cross-repo dependency tracking)          │
│  • Tracing & Token Tracking (PaidAI - sustainable usage)     │
│                                                              │
│  AGENTS: Security | Intent | Cross-Repo |                    │
│          Business Logic | DORA | AI Act                      │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│  DASHBOARD (Docker Containerized)                            │
│                                                              │
│  • Real-time analysis via WebSocket                          │
│  • Audit trail visualization                                 │
│  • Dependency graph explorer                                 │
│  • Project and Org Management                                │
│  • GitHub / GitLab / Git Integration (Any CI/CD platform)    │
└────────────────────────────┬─────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────────┐
│  LANDING PAGE (Loveable)                                     │
│                                                              │
│  • Marketing & Product showcase                              │
│  • Interactive demo                                          │
│  • Documentation & Onboarding                                │
└──────────────────────────────────────────────────────────────┘
Key Technical Components:
1. Meta-Orchestrator - Claude analyzes PR context and outputs an execution plan:
{
  "risk_assessment": "high",
  "agents": [
    {"name": "security_scanner", "priority": 10},
    {"name": "business_logic_auditor", "priority": 9},
    {"name": "cross_repo_analyzer", "dependencies": ["security_scanner"]}
  ],
  "execution_order": [["security_scanner", "business_logic_auditor"], ["cross_repo_analyzer"]]
}
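One way such a plan could be consumed is to treat each inner list of execution_order as a wave: agents in a wave run concurrently, and a wave starts only after the previous one has finished. The sketch below assumes that reading of the plan; run_agent and execute_plan are hypothetical names used for illustration, not part of Sentinel.ai.

```python
import asyncio

# Plan shape copied from the example above.
PLAN = {
    "risk_assessment": "high",
    "agents": [
        {"name": "security_scanner", "priority": 10},
        {"name": "business_logic_auditor", "priority": 9},
        {"name": "cross_repo_analyzer", "dependencies": ["security_scanner"]},
    ],
    "execution_order": [
        ["security_scanner", "business_logic_auditor"],
        ["cross_repo_analyzer"],
    ],
}


async def run_agent(name: str, upstream: dict) -> dict:
    """Hypothetical agent call; a real one would run a ReAct loop."""
    await asyncio.sleep(0)  # placeholder for model and tool latency
    return {"agent": name, "saw": sorted(upstream)}


async def execute_plan(plan: dict) -> dict:
    deps = {a["name"]: a.get("dependencies", []) for a in plan["agents"]}
    results: dict = {}
    for wave in plan["execution_order"]:
        # Every dependency must have finished in an earlier wave.
        for name in wave:
            missing = [d for d in deps.get(name, []) if d not in results]
            if missing:
                raise ValueError(f"{name} scheduled before {missing}")
        # Agents inside one wave are independent, so run them concurrently.
        outputs = await asyncio.gather(
            *(run_agent(n, {d: results[d] for d in deps.get(n, [])}) for n in wave)
        )
        results.update(zip(wave, outputs))
    return results


results = asyncio.run(execute_plan(PLAN))
```

Validating dependencies before each wave means a malformed plan from the model fails loudly instead of running an agent without its inputs.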
2. ReAct Agent Executor - Each agent runs Think → Act → Observe loops:
while not done:
    thought = await llm.reason(context)             # Think
    tool_call = await llm.decide_action(tools)      # Act: choose the next tool call
    result = await tools.execute(tool_call)         # Observe
    audit_log.append((thought, tool_call, result))  # Log every step as one record
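For readers who want to run the pattern end to end, here is a minimal self-contained sketch. It assumes a scripted stand-in for the model (ScriptedLLM), a Step record, and a step cap; all three are illustrative inventions, not Sentinel.ai's actual classes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    thought: str
    tool: Optional[str] = None          # None means "ready to answer"
    args: dict = field(default_factory=dict)


class ScriptedLLM:
    """Hypothetical stand-in for the model: replays a fixed list of steps."""

    def __init__(self, steps):
        self._steps = iter(steps)

    async def next_step(self, context: list) -> Step:
        return next(self._steps)


async def react_loop(llm, tools: dict, max_steps: int = 10) -> list:
    context, audit_log = [], []
    for _ in range(max_steps):                        # hard cap: the loop always ends
        step = await llm.next_step(context)           # Think, then pick an action
        if step.tool is None:                         # final answer, nothing to run
            audit_log.append((step.thought, None, None))
            break
        result = tools[step.tool](**step.args)        # Act
        context.append(result)                        # Observe
        audit_log.append((step.thought, step.tool, result))
    return audit_log


TOOLS = {"parse_diff": lambda path: f"1 hunk in {path}"}
SCRIPT = [
    Step("Need to see what changed", "parse_diff", {"path": "billing/views.py"}),
    Step("One hunk, no ownership check: report it"),
]
audit_log = asyncio.run(react_loop(ScriptedLLM(SCRIPT), TOOLS))
```

The step cap is a design choice worth keeping in any real version: without it, a model that never emits a final answer keeps spending tokens.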
3. Tool Registry - 12+ specialized tools across categories:
- Code Analysis: parse_diff, get_file_content, search_codebase
- Security: scan_secrets, check_vulnerabilities
- Cross-Repo: query_repo_graph, find_contract_consumers, check_breaking_changes
- Business Logic: extract_business_rules, generate_adversarial_test
- Compliance: check_dora_requirements, classify_ai_risk_level
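A registry like this can be sketched with a decorator that files each function under its category, so the orchestrator can hand an agent only the tools it needs. The tool names below come from the list above; their bodies, the tool decorator, and tools_for are toy assumptions for illustration.

```python
from typing import Callable

TOOL_REGISTRY: dict[str, dict] = {}


def tool(category: str) -> Callable:
    """Register the decorated function under `category`."""
    def register(fn: Callable) -> Callable:
        TOOL_REGISTRY[fn.__name__] = {"category": category, "fn": fn, "doc": fn.__doc__}
        return fn
    return register


@tool("code_analysis")
def parse_diff(diff_text: str) -> list[str]:
    """Return the paths of files touched by a unified diff."""
    return [line[6:] for line in diff_text.splitlines() if line.startswith("+++ b/")]


@tool("security")
def scan_secrets(text: str) -> bool:
    """Toy check: flag text that contains an AWS-style access key prefix."""
    return "AKIA" in text


def tools_for(categories: set[str]) -> list[str]:
    """Names of registered tools in the given categories, sorted."""
    return sorted(n for n, t in TOOL_REGISTRY.items() if t["category"] in categories)
```

Keeping the docstring in the registry is deliberate: it can double as the tool description shown to the model.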
4. Knowledge Graph - PostgreSQL storing repo relationships for cross-repo queries
5. Zed Extension + MCP Server - IDE integration for inline analysis
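To make the knowledge graph (component 4) concrete, here is a sketch of a "who breaks if this service changes" query. It assumes a single edge table of consumer/provider pairs; the table, column, and service names are hypothetical, and sqlite3 stands in for PostgreSQL only so the example runs without a server. The WITH RECURSIVE query itself is standard SQL that PostgreSQL also accepts.

```python
import sqlite3

# In-memory stand-in for the PostgreSQL knowledge graph (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depends_on (consumer TEXT, provider TEXT)")
conn.executemany(
    "INSERT INTO depends_on VALUES (?, ?)",
    [
        ("billing-api", "invoice-service"),
        ("mobile-gateway", "billing-api"),
        ("reporting", "invoice-service"),
        ("search", "catalog"),
    ],
)


def downstream_consumers(service: str) -> list[str]:
    """Every service that directly or transitively depends on `service`."""
    rows = conn.execute(
        """
        WITH RECURSIVE impacted(name) AS (
            SELECT consumer FROM depends_on WHERE provider = ?
            UNION
            SELECT d.consumer FROM depends_on d
            JOIN impacted i ON d.provider = i.name
        )
        SELECT name FROM impacted ORDER BY name
        """,
        (service,),
    ).fetchall()
    return [r[0] for r in rows]
```

UNION rather than UNION ALL drops duplicate rows, which also stops the recursion from looping forever if the dependency graph contains a cycle.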
Built collaboratively using Zed's multiplayer rooms and agentic AI capabilities for rapid prototyping.
Challenges we ran into
Ideation paralysis - Existing market leaders (Snyk, Entire, Syncable, Cursor, Codacy) made us question our differentiation. Solution: We realized they all share the same limitations: they can't reason about business logic or cross-repo impacts, and they don't build a knowledge graph of Git history. Keeping that graph in long-term and immutable short-term memory buckets makes it both a strict security guardrail and a reusable, sustainable ops cycle, saving tokens the next time your org works on a repo.
Expertise inequality - Team had varying depths in security, compliance, and AI/ML. Solution: Pair programming in Zed rooms enabled real-time knowledge transfer.
Scope creep via generative AI - Claude generates code fast. We built features we didn't need because we could. Solution: Ruthless prioritization - "Does this help find IDOR vulnerabilities? No? Cut it."
Multi-tool integration nightmare - We were simultaneously integrating PaidAI billing, Zed extension development, the MCP server protocol, and knowledge graph construction.
Zed extension limitations - We wanted more control over prompt I/O, MCP server logs visibility, and shared dev servers in collaborative sessions. The extension ecosystem needs more documentation for agentic workflows.
Accomplishments that we're proud of
Finding vulnerabilities scanners miss - Our business logic agent catches IDOR vulnerabilities that Semgrep, SonarQube, and CodeQL all pass:
PR #142: "Fix invoice formatting" (3 lines)
❌ BLOCKED — IDOR Vulnerability
def get_invoice(request, invoice_id):
return Invoice.objects.get(id=invoice_id) # No ownership check!
Reasoning: [12 steps showing how Claude identified the implicit
business rule and generated an adversarial test]
Cross-repo intelligence - First tool to detect breaking changes across microservices before merge
Full explainability (XAI) - Every decision has a traceable reasoning chain, not just "AI found issue"
Dynamic orchestration - 60% fewer unnecessary checks on simple PRs, 3x faster analysis
Regulatory compliance depth - DORA Article-level and AI Act risk classification with effort-estimated remediation like a real consultant
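The missing ownership check in the get_invoice example above, and the kind of adversarial test that exposes it, can be sketched in plain Python. The in-memory store, the owner_id field, and the function names are assumptions for illustration; in a Django codebase the equivalent fix would be to filter the query by the requesting user as well as the invoice ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: int
    owner_id: int
    total_cents: int


# Hypothetical in-memory store standing in for Invoice.objects.
INVOICES = {
    1: Invoice(1, owner_id=10, total_cents=5000),
    2: Invoice(2, owner_id=20, total_cents=900),
}


class NotFound(Exception):
    """Raised for missing invoices and for invoices owned by someone else."""


def get_invoice_unsafe(user_id: int, invoice_id: int) -> Invoice:
    return INVOICES[invoice_id]  # IDOR: any user can read any invoice


def get_invoice(user_id: int, invoice_id: int) -> Invoice:
    invoice = INVOICES.get(invoice_id)
    # Same error for "missing" and "not yours" so IDs cannot be enumerated.
    if invoice is None or invoice.owner_id != user_id:
        raise NotFound(invoice_id)
    return invoice


def adversarial_test() -> bool:
    """User 10 asks for user 20's invoice; True means access was denied."""
    try:
        get_invoice(user_id=10, invoice_id=2)
    except NotFound:
        return True
    return False
```

Nothing in the unsafe version is syntactically suspicious, which is why pattern-based scanners tend to pass it: the bug is the absence of a rule that only exists in the business domain.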
What we learned
Dynamic agent orchestration beats hardcoded pipelines - Letting Claude decide the analysis pipeline produces dramatically better results than static configurations
ReAct loops enable true reasoning - The Think → Act → Observe pattern transformed agents from "prompt-and-pray" to genuine iterative analysis
Knowledge graphs unlock cross-repo intelligence - Building semantic dependency graphs was the breakthrough for detecting breaking changes and for making analysis reusable across runs, which keeps the ops cycle sustainable.
Pair programming scales with Zed - Real-time collaboration with agentic AI capabilities enabled rapid prototyping of complex systems
Regulatory compliance requires reasoning, not patterns - Checking "Does logging meet DORA Article 11 requirements?" requires understanding intent, not pattern matching
Set token guardrails early - Learned this the expensive way 💸
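A token guardrail can be as small as a shared budget object that every agent must charge before calling the model. This is a minimal sketch under that assumption; the class names and the limits used in the usage lines are hypothetical, not the tracking used in Sentinel.ai.

```python
class TokenBudgetExceeded(RuntimeError):
    """Raised when a charge would push usage past the limit."""


class TokenBudget:
    """Hard cap on tokens spent across all agents for one PR analysis."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def charge(self, agent: str, tokens: int) -> int:
        """Record the spend and return the remaining budget, or refuse."""
        if self.used + tokens > self.limit:
            raise TokenBudgetExceeded(
                f"{agent} needs {tokens}, only {self.limit - self.used} left"
            )
        self.used += tokens
        return self.limit - self.used


budget = TokenBudget(limit=10_000)
remaining = budget.charge("security_scanner", 6_000)
```

Checking before the call, rather than tallying afterwards, is the point: a refused charge costs nothing, while an overrun discovered in the invoice already has.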
What's next for Sentinel.ai
VS Code Extension - Port Zed integration to broader IDE ecosystem
GitHub Action - Native CI/CD integration for automated PR analysis
More Compliance Frameworks - SOC 2, HIPAA, PCI-DSS agents
Fine-tuned Models - Domain-specific models for faster, cheaper analysis
Community Rules - Shareable business rule definitions across organizations
Built With
- anthropic-claude
- zed
- postgresql
- crusoe
- mcp-protocol
- react
- loveable
- opentelemetry
- tools
- cybersecurity
- dora
- euaiact
- javascript
- rust
- security