Inspiration

Insider threats cost enterprises an average of $15.4 million annually, yet most security systems detect breaches only after the damage is done. Traditional SIEM tools are batch-based: they analyze logs hours or days later, when sensitive data has already been exfiltrated.

We asked ourselves: What if we could catch the bad actor in the moment they act?

We envisioned a system that combines the real-time streaming power of Confluent Kafka with the contextual intelligence of Google Vertex AI to make split-second security decisions. The name "Moment" reflects our core philosophy: security decisions must happen in the moment, not after the fact.

What it does

Moment is a real-time AI-powered enterprise security platform that detects and blocks insider threats as they happen.

Core Capabilities:

  • Streams every enterprise action through Confluent Kafka (file downloads, admin access, bulk exports)
  • Calculates risk scores using multiple signals: action frequency, geographic anomalies, resource sensitivity, and user role
  • Makes AI-powered decisions using Google Vertex AI (Gemini) to block, throttle, escalate, or allow actions
  • Provides explainable reasoning for every decision—no black-box AI
  • Visualizes threats in real-time with a professional security operations dashboard
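
To make the multi-signal risk score concrete, here is a minimal sketch that combines the four signals listed above as a weighted sum. The weights, lookup tables, field names, and the `frequency_cap` parameter are all illustrative assumptions, not values taken from Moment's code.

```python
from dataclasses import dataclass

# Hypothetical weights; the actual weighting used by Moment is not documented here.
WEIGHTS = {"frequency": 0.3, "geo": 0.25, "sensitivity": 0.3, "role": 0.15}

# Hypothetical lookup tables mapping categories to a 0..1 signal.
SENSITIVITY = {"public": 0.0, "internal": 0.4, "confidential": 0.8, "restricted": 1.0}
ROLE_RISK = {"admin": 0.2, "employee": 0.5, "contractor": 0.8}


@dataclass
class Action:
    user_role: str
    resource_sensitivity: str
    actions_last_5_min: int   # recent action count for this user
    unusual_location: bool    # True if the location differs from the user's norm


def risk_score(action: Action, frequency_cap: int = 50) -> float:
    """Combine the four signals into a single score in [0, 1]."""
    signals = {
        # Frequency saturates at 1.0 once the cap is reached.
        "frequency": min(action.actions_last_5_min / frequency_cap, 1.0),
        "geo": 1.0 if action.unusual_location else 0.0,
        # Unknown categories fall back to a neutral 0.5.
        "sensitivity": SENSITIVITY.get(action.resource_sensitivity, 0.5),
        "role": ROLE_RISK.get(action.user_role, 0.5),
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())
```

Because the weights sum to 1 and every signal is clamped to the 0..1 range, the score stays in the same range as the thresholds used by the decision engine.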

Key Features:

  • Sub-100ms decision latency for production-grade performance
  • Hybrid decision engine combining fast rule-based decisions with AI for complex cases
  • ksqlDB real-time aggregations for windowed user behavior analysis
  • Attack scenario simulations (Insider Threat, Brute Force, Data Exfiltration)
  • Live Confluent metrics showing actual cluster performance

How we built it

Confluent Stack (Full Integration):

Component        Purpose
Apache Kafka     3 topics for events, signals, and decisions
Schema Registry  Avro serialization for data contracts
ksqlDB           Real-time SQL analytics with 5-minute tumbling windows
Metrics API      Live cluster monitoring in the dashboard

Google Cloud:

Service    Purpose
Vertex AI  Gemini 1.5 Flash for intelligent, context-aware decisions
Cloud Run  Serverless deployment with auto-scaling

Tech Stack:

  • Backend: Python/FastAPI with async WebSocket streaming
  • Frontend: Modular ES6 JavaScript with real-time updates
  • Visualization: Chart.js for risk trend analysis
  • Templating: Jinja2 with component-based architecture

Challenges we ran into

1. Async/Threading Deadlocks: Our frequency tracker used threading locks that deadlocked with Python's async event loop. We resolved this by leveraging ksqlDB aggregations instead of in-memory tracking.

2. AI Latency vs. Throughput: Vertex AI calls take 200-500ms, but security decisions need to be instant. We built a hybrid decision engine with a rule-based fast-path for clear cases (risk < 0.3 or > 0.8) and AI for ambiguous middle-ground decisions.
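
A minimal sketch of that routing, assuming the 0.3 and 0.8 thresholds from the text. The `decide` function, the `timeout_s` budget, the escalate-on-timeout policy, and `fake_model` (a stand-in for the Vertex AI call, not the real client) are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Awaitable, Callable

LOW_RISK = 0.3   # below this, allow without calling the model (from the text)
HIGH_RISK = 0.8  # above this, block without calling the model (from the text)


async def decide(
    risk: float,
    ask_model: Callable[[float], Awaitable[str]],
    timeout_s: float = 1.0,  # hypothetical budget for the slow path
) -> str:
    """Return a decision: rules for clear cases, the model for the middle band."""
    if risk < LOW_RISK:
        return "allow"
    if risk > HIGH_RISK:
        return "block"
    try:
        return await asyncio.wait_for(ask_model(risk), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Hypothetical policy: hand off to a human if the model is too slow.
        return "escalate"


async def fake_model(risk: float) -> str:
    """Stand-in for the Vertex AI call; not the real client."""
    await asyncio.sleep(0)
    return "throttle"
```

Only the middle band pays the model's latency; the two rule branches return without any I/O.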

3. ksqlDB Data Format Mismatch: Events were published to Kafka before their risk scores were calculated, so the records ksqlDB saw did not include risk scores. We implemented a local aggregation fallback that mirrors ksqlDB's windowed behavior.
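
A local fallback of this kind could look like the sketch below, assuming the 5-minute tumbling windows mentioned earlier. The `TumblingWindowCounter` class and its methods are hypothetical, and aligning windows to the Unix epoch is a choice made for this sketch.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute tumbling windows, matching the ksqlDB setup


class TumblingWindowCounter:
    """Counts events per (user, window); windows are fixed-size and never overlap."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self._counts: dict[tuple[str, int], int] = defaultdict(int)

    def _window_start(self, timestamp: float) -> int:
        # Round the timestamp down to the start of its window.
        return int(timestamp // self.window_seconds) * self.window_seconds

    def record(self, user_id: str, timestamp: float) -> int:
        """Add one event and return the user's count in that event's window."""
        key = (user_id, self._window_start(timestamp))
        self._counts[key] += 1
        return self._counts[key]

    def count(self, user_id: str, timestamp: float) -> int:
        """Return the user's count for the window containing the timestamp."""
        return self._counts.get((user_id, self._window_start(timestamp)), 0)
```

This sketch never evicts old windows, so a long-running service would also need to drop keys whose window has closed.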

4. WebSocket Blocking: Long-running simulations blocked the WebSocket handler. We now launch them with asyncio.create_task() as fire-and-forget tasks, so the handler returns immediately.

5. Confluent Service Graceful Degradation: Not all Confluent services are always available. We built fallback mechanisms so the app continues working with reduced features.
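
One way to structure such fallbacks is a small wrapper that tries the live service first and reports whether it had to degrade. This is a sketch under assumptions: `with_fallback`, the logger name, and the two metrics functions are hypothetical, and `fetch_cluster_metrics` simulates an outage instead of calling the real Metrics API.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("moment.fallback")  # hypothetical logger name


def with_fallback(
    primary: Callable[[], T], fallback: Callable[[], T], feature: str
) -> tuple[T, bool]:
    """Try the primary source; on any failure use the fallback.

    Returns (value, degraded) so callers can show a reduced-feature notice.
    """
    try:
        return primary(), False
    except Exception as exc:  # broad on purpose: any outage should degrade, not crash
        logger.warning("%s unavailable, using fallback: %s", feature, exc)
        return fallback(), True


def fetch_cluster_metrics() -> dict:
    """Stand-in for a Metrics API call that is currently failing."""
    raise ConnectionError("metrics endpoint unreachable")


def placeholder_metrics() -> dict:
    """Hypothetical placeholder shown when live metrics are missing."""
    return {"status": "unavailable"}
```

Returning the `degraded` flag alongside the value lets the dashboard mark a panel as running on fallback data instead of silently showing stale or empty numbers.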

Accomplishments that we're proud of

  • Full Confluent Stack Integration: Not just Kafka. We also use Schema Registry, ksqlDB, and the Metrics API
  • Production-Ready Performance: <100ms decision latency on the rule-based fast path of our hybrid engine
  • Explainable AI: Every blocked action has human-readable reasoning
  • Beautiful Real-Time Dashboard: Professional SOC-style interface with live updates
  • Attack Scenario Library: Pre-built simulations demonstrating real threat patterns
  • Deployed & Live: Running on Cloud Run at production scale

What we learned

  • Streaming + AI is powerful but tricky: The latency characteristics of AI models don't naturally fit real-time streaming. Hybrid approaches are essential.
  • ksqlDB is underrated: Writing SQL for stream processing is incredibly productive compared to custom code.
  • Schema Registry matters: Data contracts prevent subtle bugs when multiple services consume the same topics.
  • Graceful degradation is essential: Cloud services fail. Build fallbacks from day one.
  • Async Python has gotchas: Mixing threading and asyncio requires careful design.

What's next for Moment

Priority  Feature
1         ML Model Training - Train custom models on historical enterprise data
2         Multi-Tenant Support - Allow multiple organizations to use the platform
3         Integration Connectors - Connect to Okta, AWS CloudTrail, Google Workspace
4         Automated Response - Trigger remediation workflows beyond blocking
5         Compliance Reporting - Generate SOC 2, GDPR audit reports
6         Mobile Alerts - Push notifications for security teams
