Inspiration

As a Kubernetes operator working with Cilium, I've experienced the pain of manually creating CiliumNetworkPolicies. The process is time-consuming, error-prone, and often results in overly permissive policies that don't follow the least-privilege principle.

I realized that Cilium's Hubble already captures all the network traffic I need, so why not use that observability data to generate the right policies automatically? This would bridge the gap between Cilium's powerful eBPF-based observability (Hubble) and its security capabilities (CiliumNetworkPolicies), making network policy management accessible to everyone.

What it does

Cilium PolicyPilot is a CLI tool that transforms real network traffic observed by Hubble into secure, least-privilege CiliumNetworkPolicies. It automates the entire policy creation workflow:

  • Learn: Captures and parses Hubble network flows from JSON files
  • Propose: Generates CiliumNetworkPolicies based on observed traffic patterns
  • Verify: Validates policy syntax and structure before deployment
  • Explain: Creates beautiful HTML reports with network topology graphs

The tool takes Hubble flow data and produces production-ready CiliumNetworkPolicy YAML files that can be directly applied to your Kubernetes cluster. It automatically handles DNS egress rules, port aggregation, and policy validation to ensure compliance with Cilium's requirements.
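To make the Learn step concrete, here is a minimal sketch of how flow records could be read from a file that holds either a single JSON object or newline-delimited JSON. It is not PolicyPilot's actual parser. The Flow and Endpoint structs are simplified assumptions (real Hubble output carries many more fields and may nest them under a top-level flow key), and ParseFlows and IPVersion are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// IPVersion accepts both a string form ("IPv4") and an integer form (4).
type IPVersion int

// UnmarshalJSON normalizes the two representations to 4 or 6.
func (v *IPVersion) UnmarshalJSON(b []byte) error {
	var n int
	if err := json.Unmarshal(b, &n); err == nil {
		*v = IPVersion(n)
		return nil
	}
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	switch strings.ToLower(s) {
	case "ipv4":
		*v = 4
	case "ipv6":
		*v = 6
	default:
		return fmt.Errorf("unknown ipVersion %q", s)
	}
	return nil
}

// Endpoint and Flow are simplified, assumed shapes for illustration only.
type Endpoint struct {
	Namespace string   `json:"namespace"`
	PodName   string   `json:"pod_name"`
	Labels    []string `json:"labels"`
}

type Flow struct {
	Source      Endpoint `json:"source"`
	Destination Endpoint `json:"destination"`
	// encoding/json falls back to case-insensitive key matching,
	// so both "IP" and "ip" populate this field.
	IP struct {
		Version IPVersion `json:"ipVersion"`
	} `json:"IP"`
}

// ParseFlows decodes consecutive JSON values, which covers both a
// single object and newline-delimited JSON (one object per line).
func ParseFlows(data []byte) ([]Flow, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	var flows []Flow
	for {
		var f Flow
		err := dec.Decode(&f)
		if errors.Is(err, io.EOF) {
			return flows, nil
		}
		if err != nil {
			return nil, err
		}
		flows = append(flows, f)
	}
}

func main() {
	sample := []byte(`{"source":{"pod_name":"frontend"},"destination":{"pod_name":"cart"},"IP":{"ipVersion":"IPv4"}}
{"source":{"pod_name":"cart"},"destination":{"pod_name":"redis"},"ip":{"ipVersion":4}}`)
	flows, err := ParseFlows(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range flows {
		fmt.Printf("%s -> %s (IPv%d)\n", f.Source.PodName, f.Destination.PodName, f.IP.Version)
	}
}
```

Streaming with json.Decoder avoids splitting the input on newlines by hand, so the same code path serves both file layouts.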

How I built it

Built entirely in Go 1.23+ with a modular architecture:

  1. Flow Parsing (internal/hubble/): Parses Hubble JSON/NDJSON formats, handles field normalization, and extracts network metadata (endpoints, ports, protocols, namespaces)

  2. Policy Synthesis (internal/synth/): Groups flows by destination endpoint and generates ingress rules based on observed source endpoints. Automatically splits large port lists (>40 ports) to comply with Cilium's validation limits and adds DNS egress rules for proper connectivity.

  3. Validation (internal/verify/): Validates policy YAML syntax, structure, and CiliumNetworkPolicy-specific requirements

  4. Visualization (internal/graph/ + internal/explain/): Generates network topology graphs using Mermaid.js and creates comprehensive HTML reports with statistics, policy details, and visual network diagrams

  5. CLI Interface (cmd/cpp/): Built with Cobra framework, providing an intuitive command-line interface with flags for input/output files, namespace filtering, and more
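The policy synthesis step described in item 2 can be sketched roughly as follows: group flows by destination, then collect one ingress rule per observed source with a de-duplicated port list. This is an illustration under stated assumptions, not the code in internal/synth/. The Flow and IngressRule types, the use of a single app label to identify workloads, and the Synthesize function are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Flow is a simplified, hypothetical view of one observed connection.
type Flow struct {
	SrcApp, DstApp string // workload identity, assumed to be an "app" label
	Namespace      string // destination namespace
	Port           int
	Protocol       string // "TCP" or "UDP"
}

// IngressRule says which source may reach a destination on which ports.
type IngressRule struct {
	FromApp string
	Ports   []string // "port/PROTOCOL", de-duplicated, sorted as strings
}

// Synthesize groups flows by destination ("namespace/app") and returns
// one ingress rule per observed source, ordered by source name.
func Synthesize(flows []Flow) map[string][]IngressRule {
	// destination -> source -> set of "port/PROTOCOL"
	seen := map[string]map[string]map[string]bool{}
	for _, f := range flows {
		dst := f.Namespace + "/" + f.DstApp
		if seen[dst] == nil {
			seen[dst] = map[string]map[string]bool{}
		}
		if seen[dst][f.SrcApp] == nil {
			seen[dst][f.SrcApp] = map[string]bool{}
		}
		seen[dst][f.SrcApp][fmt.Sprintf("%d/%s", f.Port, f.Protocol)] = true
	}

	out := map[string][]IngressRule{}
	for dst, bySrc := range seen {
		srcs := make([]string, 0, len(bySrc))
		for s := range bySrc {
			srcs = append(srcs, s)
		}
		sort.Strings(srcs)
		for _, s := range srcs {
			ports := make([]string, 0, len(bySrc[s]))
			for p := range bySrc[s] {
				ports = append(ports, p)
			}
			sort.Strings(ports)
			out[dst] = append(out[dst], IngressRule{FromApp: s, Ports: ports})
		}
	}
	return out
}

func main() {
	rules := Synthesize([]Flow{
		{SrcApp: "frontend", DstApp: "cart", Namespace: "default", Port: 7070, Protocol: "TCP"},
		{SrcApp: "checkout", DstApp: "cart", Namespace: "default", Port: 7070, Protocol: "TCP"},
		{SrcApp: "cart", DstApp: "redis", Namespace: "default", Port: 6379, Protocol: "TCP"},
	})
	for dst, rs := range rules {
		for _, r := range rs {
			fmt.Printf("%s: allow from %s on %v\n", dst, r.FromApp, r.Ports)
		}
	}
}
```

Only traffic that was actually observed ends up in a rule, which is what keeps the result close to least privilege. The trade-off is that any legitimate flow missing from the capture window would be denied.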

The project follows clean architecture principles with clear separation of concerns, comprehensive error handling, and extensive documentation.

Challenges I ran into

  1. Hubble Format Variations: Hubble outputs different JSON formats (single object vs NDJSON) with inconsistent field naming (IP vs ip, string vs integer ipVersion). Solved by implementing robust parsing logic that handles multiple formats and normalizes field names.

  2. Cilium Policy Limits: Discovered Cilium's 40-port limit per toPorts[].ports array during validation. Implemented automatic port splitting logic that breaks large port lists into multiple PortRule entries while maintaining functionality.

  3. Large Graph Visualization: Mermaid.js couldn't render graphs with 1000+ edges. Implemented edge aggregation that combines multiple flows between the same endpoints and simplifies labels for better visualization.

  4. DNS Connectivity: Initial policies were ingress-only, causing DNS resolution failures. Added automatic DNS egress rules for kube-dns and the kube-system namespace to ensure basic connectivity.

  5. Policy Completeness: Ensuring generated policies follow least-privilege while maintaining application functionality required careful analysis of traffic patterns and endpoint relationships.
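The port splitting from challenge 2 amounts to chunking a list. The sketch below shows one way to do it, assuming local stand-in types (PortProtocol and PortRule here are not imported from Cilium's API package) and a hypothetical SplitPorts helper.

```go
package main

import "fmt"

// maxPortsPerRule mirrors the 40-port limit described above.
const maxPortsPerRule = 40

// PortProtocol and PortRule are local stand-ins for illustration.
type PortProtocol struct {
	Port     string
	Protocol string
}

type PortRule struct {
	Ports []PortProtocol
}

// SplitPorts breaks ports into consecutive chunks of at most limit
// entries, returning one PortRule per chunk. Order is preserved.
func SplitPorts(ports []PortProtocol, limit int) []PortRule {
	if limit <= 0 {
		limit = maxPortsPerRule
	}
	var rules []PortRule
	for start := 0; start < len(ports); start += limit {
		end := start + limit
		if end > len(ports) {
			end = len(ports)
		}
		// Copy the chunk so rules do not alias the caller's slice.
		chunk := append([]PortProtocol(nil), ports[start:end]...)
		rules = append(rules, PortRule{Ports: chunk})
	}
	return rules
}

func main() {
	var ports []PortProtocol
	for p := 8000; p < 8095; p++ {
		ports = append(ports, PortProtocol{Port: fmt.Sprint(p), Protocol: "TCP"})
	}
	for i, r := range SplitPorts(ports, maxPortsPerRule) {
		fmt.Printf("rule %d: %d ports\n", i, len(r.Ports))
	}
}
```

Because every chunk is attached to the same source selector, the union of the resulting rules allows exactly the same traffic as the original oversized list.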

Accomplishments that I'm proud of

Complete End-to-End Workflow: From Hubble flows to deployable CiliumNetworkPolicies in minutes

Production-Ready Code: Comprehensive error handling, validation, and edge case handling

Beautiful Visualizations: Interactive HTML reports with network topology graphs that make complex policies easy to understand

Real-World Testing: Successfully tested with the HipsterShop microservices example, generating 8 policies that work correctly in a live Kubernetes cluster

Comprehensive Documentation: 700+ line README with architecture diagrams, examples, troubleshooting guides, and best practices

Robust Parsing: Handles multiple Hubble output formats, field name variations, and type inconsistencies gracefully

Policy Intelligence: Automatically handles DNS rules, port aggregation, and Cilium validation requirements

What I learned

  • eBPF Observability Power: Hubble's flow data contains rich metadata that can be leveraged for security automation

  • Cilium Policy Complexity: Understanding CiliumNetworkPolicy structure, validation rules, and best practices required a deep dive into the Cilium documentation

  • Go Best Practices: Building a maintainable CLI tool with proper error handling, testing, and modular architecture

  • Network Policy Security: The importance of least-privilege policies and how to generate them from observed traffic patterns

  • Visualization Challenges: Rendering complex network topologies requires careful aggregation and simplification strategies

  • Real-World Integration: Testing with actual Kubernetes clusters revealed edge cases that unit tests couldn't catch (DNS, port limits, etc.)
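As an illustration of the aggregation strategy mentioned under Visualization Challenges, the sketch below collapses all flows between the same pair of endpoints into one edge and labels it with counts instead of individual ports. The Edge and AggEdge types and the Aggregate and Mermaid functions are hypothetical. The emitted text uses Mermaid's flowchart syntax with generated node IDs so that endpoint names never need escaping as identifiers.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Edge is one observed flow between two named endpoints.
type Edge struct {
	Src, Dst string
	Port     int
}

// AggEdge summarizes every flow between one (Src, Dst) pair.
type AggEdge struct {
	Src, Dst string
	Flows    int // number of flows merged into this edge
	Ports    int // number of distinct destination ports
}

// Aggregate merges edges that share source and destination and
// returns them sorted by source, then destination.
func Aggregate(edges []Edge) []AggEdge {
	type key struct{ src, dst string }
	flows := map[key]int{}
	ports := map[key]map[int]bool{}
	for _, e := range edges {
		k := key{e.Src, e.Dst}
		flows[k]++
		if ports[k] == nil {
			ports[k] = map[int]bool{}
		}
		ports[k][e.Port] = true
	}
	out := make([]AggEdge, 0, len(flows))
	for k, n := range flows {
		out = append(out, AggEdge{Src: k.src, Dst: k.dst, Flows: n, Ports: len(ports[k])})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Src != out[j].Src {
			return out[i].Src < out[j].Src
		}
		return out[i].Dst < out[j].Dst
	})
	return out
}

// Mermaid renders aggregated edges as a left-to-right flowchart.
func Mermaid(agg []AggEdge) string {
	ids := map[string]string{}
	var b strings.Builder
	b.WriteString("graph LR\n")
	// id declares a node the first time a name is seen.
	id := func(name string) string {
		if v, ok := ids[name]; ok {
			return v
		}
		v := fmt.Sprintf("n%d", len(ids))
		ids[name] = v
		fmt.Fprintf(&b, "  %s[%q]\n", v, name)
		return v
	}
	for _, e := range agg {
		s, d := id(e.Src), id(e.Dst)
		fmt.Fprintf(&b, "  %s -->|\"%d flows, %d ports\"| %s\n", s, e.Flows, e.Ports, d)
	}
	return b.String()
}

func main() {
	agg := Aggregate([]Edge{
		{"frontend", "cart", 7070},
		{"frontend", "cart", 7070},
		{"frontend", "cart", 8080},
		{"cart", "redis", 6379},
	})
	fmt.Print(Mermaid(agg))
}
```

The number of rendered edges is then bounded by the number of distinct endpoint pairs rather than by the number of flows, which is what brings large captures back within what the renderer can handle.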

What's next for Cilium PolicyPilot (CPP)

Short-term:

  • Direct Hubble API integration for real-time flow capture
  • Egress policy generation from observed outbound traffic
  • Policy diff and update capabilities for existing policies

Medium-term:

  • L7 (HTTP) policy support with path and method matching
  • Policy recommendation engine based on security best practices
  • Integration with GitOps workflows (ArgoCD, Flux)

Long-term:

  • Web UI for policy visualization and management
  • Policy testing framework with simulated traffic
  • Multi-cluster policy synchronization
  • Machine learning for anomaly detection in network patterns

The goal is to make CiliumNetworkPolicy management as easy as kubectl apply, while ensuring security and compliance by default.

