Inspiration

As cloud computing expands, the cost of its energy consumption and carbon emissions often goes unnoticed. Kubernetes clusters are frequently over-provisioned, so workloads reserve resources they never use. We built GreenOps Agent to address this waste, using AI-driven optimization to make Kubernetes infrastructure smarter, greener, and more efficient.

What it does

GreenOps Agent monitors Kubernetes workloads by collecting real-time CPU and memory metrics from Prometheus and fetching live carbon intensity data for the electricity grid. Using an LSTM machine learning model, it predicts upcoming workload spikes and generates actionable optimization recommendations. Users can interact with it through a CLI or a REST API. It also supports a mock mode, so development and testing do not require a live Kubernetes cluster.
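To make the idea concrete, here is a minimal sketch of how observed usage, a forecast peak, and grid carbon intensity could be combined into recommendations. It is an illustration rather than the project's actual logic: the `WorkloadSample` type, the `recommend` function, and the `headroom` and `high_carbon` thresholds are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class WorkloadSample:
    """Hypothetical container for the inputs the agent is described as using."""
    name: str
    cpu_request_cores: float     # CPU requested in the pod spec
    cpu_used_cores: float        # observed usage, e.g. from Prometheus
    predicted_peak_cores: float  # forecast peak, e.g. from the LSTM model


def recommend(sample: WorkloadSample, carbon_g_per_kwh: float,
              headroom: float = 1.2, high_carbon: float = 400.0) -> list[str]:
    """Return human-readable suggestions for one workload.

    headroom and high_carbon are illustrative defaults, not project values.
    """
    suggestions = []
    # Size the request to the larger of current and predicted usage, plus headroom.
    target = max(sample.cpu_used_cores, sample.predicted_peak_cores) * headroom
    if target < sample.cpu_request_cores:
        suggestions.append(
            f"{sample.name}: lower CPU request from "
            f"{sample.cpu_request_cores:.2f} to {target:.2f} cores"
        )
    elif target > sample.cpu_request_cores:
        suggestions.append(
            f"{sample.name}: raise CPU request from "
            f"{sample.cpu_request_cores:.2f} to {target:.2f} cores"
        )
    # Flag carbon-heavy periods so deferrable work can wait for a cleaner grid.
    if carbon_g_per_kwh >= high_carbon:
        suggestions.append(
            f"{sample.name}: grid carbon intensity is high "
            f"({carbon_g_per_kwh:.0f} gCO2eq/kWh); defer non-urgent jobs"
        )
    return suggestions


if __name__ == "__main__":
    api = WorkloadSample("api", cpu_request_cores=2.0,
                         cpu_used_cores=0.4, predicted_peak_cores=0.5)
    for line in recommend(api, carbon_g_per_kwh=450.0):
        print(line)
```

Taking the larger of current and predicted usage means a forecast spike blocks a downsizing suggestion that current usage alone would justify.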

How we built it

We built GreenOps Agent in Python, with FastAPI serving the REST API and Typer powering the CLI. Metrics are collected via the Prometheus HTTP API, and carbon data is fetched from the ElectricityMaps API. Predictions come from a lightweight LSTM model built with TensorFlow/Keras. The entire system is containerized with Docker and deployable to Kubernetes clusters with Helm charts. GitHub Actions automates our CI/CD workflows, running tests and image builds on each change.
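As a rough illustration of the metrics collection step, the sketch below builds an instant-query URL for the Prometheus HTTP API (`/api/v1/query`) and parses a response in the documented instant-vector shape. The helper names, the base URL, the PromQL expression, and the canned response values are assumptions made for this example, not code from the project. It uses only the standard library and does no network I/O.

```python
from __future__ import annotations

import json
from urllib.parse import urlencode

# Assumed PromQL: per-pod CPU usage in cores over the last 5 minutes.
CPU_QUERY = "sum by (pod) (rate(container_cpu_usage_seconds_total[5m]))"


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict[str, float]:
    """Map each series' `pod` label to its sample value."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error')}")
    usage = {}
    for series in payload["data"]["result"]:
        pod = series["metric"].get("pod", "<unknown>")
        # Each instant sample is [unix_timestamp, "value-as-string"].
        _timestamp, value = series["value"]
        usage[pod] = float(value)
    return usage


# Canned response with made-up values, standing in for a live server.
SAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "web-1"}, "value": [1700000000.0, "0.25"]},
            {"metric": {"pod": "worker-1"}, "value": [1700000000.0, "1.5"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("http://prometheus:9090", CPU_QUERY))
    print(parse_instant_vector(SAMPLE_BODY))
```

Keeping URL construction and parsing separate from the HTTP call also makes it easy to feed canned responses through the same parser, which is one way a mock mode could work.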

Challenges we ran into

  • Balancing rich feature sets (real-time monitoring, prediction, carbon integration) within a 48-hour hackathon window.
  • Handling API authentication and rate limits efficiently without compromising performance.
  • Designing a time-efficient, accurate LSTM model given limited historical data and compute constraints.
  • Ensuring modular, scalable, and clean code architecture under intense time pressure.

Accomplishments that we’re proud of

  • Delivering a fully operational, end-to-end AI-enhanced Kubernetes agent within 48 hours.
  • Successfully integrating diverse systems: Prometheus metrics, ElectricityMaps carbon data, LSTM predictions.
  • Maintaining high code quality, modular structure, extensive documentation, and automated CI/CD.
  • Building a mock data engine for easier local development, testing, and demonstrations.

What we learned

  • How to pragmatically combine DevOps practices with environmental sustainability goals.
  • The challenges and importance of carbon-aware infrastructure optimization.
  • The power of predictive analytics (even with lightweight ML models) in improving resource planning.
  • The significance of good project planning, modular coding, and testing even under strict time constraints.

What’s next for GreenOps Agent

  • Slack and Teams Integration: Real-time carbon alerts and optimization notifications.
  • Enhanced Energy Metrics: Incorporate Kepler or similar tools for direct pod-level energy usage tracking.
  • Autoscaling Extensions: Implement predictive autoscaling based on ML forecasts.
  • Dashboard Development: Create a visual interface for cluster health, carbon trends, and optimization history.
  • Multi-cluster Management: Support optimization across multiple Kubernetes clusters and cloud regions for global efficiency.

GreenOps Agent aims to not only optimize workloads but also make a positive impact on the planet. Together, we can build a greener cloud future!

Built With

  • docker
  • docker-compose
  • electricitymap-api
  • fastapi
  • github-actions
  • helm
  • kubernetes
  • prometheus
  • pytest
  • python-3.10+
  • tensorflow/keras
  • typer
  • visual-studio-code