Inspiration: Every engineering team has watched a production system silently degrade at 2am while everyone sleeps. CPU climbs, response times creep up, and by the time a human notices, users are already affected.

We built SentinelGo because we believe cloud infrastructure should be able to watch itself and respond before anyone has to wake up. Modern AWS environments generate enormous amounts of metric data through CloudWatch — most of it never acted on in real time. We wanted to change that: turn raw metric streams into intelligent, automated responses using the speed and concurrency of Go.

What it does: SentinelGo is a fully serverless, autonomous infrastructure monitoring and self-healing system built on AWS and written in Go.

Every 5 minutes, an EventBridge rule triggers a Go Lambda function that polls five CloudWatch metric streams (CPU utilization, network throughput, Lambda error rates, RDS connections, and ALB response times). Each stream gets its own goroutine, so all five are polled concurrently and the whole fan-out completes in under 200ms.
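The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual poller: FetchFunc, Result, PollAll, fakeFetch, runDemo, and the metric names in demoMetrics are all hypothetical, and fakeFetch stands in for a real CloudWatch call so the example runs without AWS credentials.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// FetchFunc is a hypothetical stand-in for one CloudWatch query.
type FetchFunc func(ctx context.Context, metric string) ([]float64, error)

// Result holds the outcome of polling a single metric.
type Result struct {
	Metric string
	Values []float64
	Err    error
}

// PollAll starts one goroutine per metric and blocks until all finish.
// Each goroutine writes only to its own slice index, so no mutex is needed.
func PollAll(ctx context.Context, metrics []string, fetch FetchFunc) []Result {
	results := make([]Result, len(metrics))
	var wg sync.WaitGroup
	for i, m := range metrics {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			vals, err := fetch(ctx, m)
			results[i] = Result{Metric: m, Values: vals, Err: err}
		}(i, m)
	}
	wg.Wait()
	return results
}

// demoMetrics lists assumed metric names for the five streams.
var demoMetrics = []string{
	"CPUUtilization", "NetworkIn", "Errors",
	"DatabaseConnections", "TargetResponseTime",
}

// fakeFetch simulates a slow remote call and honors cancellation.
func fakeFetch(ctx context.Context, metric string) ([]float64, error) {
	select {
	case <-time.After(20 * time.Millisecond):
		return []float64{1, 2, 3}, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// runDemo polls all demo metrics with the fake fetcher.
func runDemo() []Result {
	return PollAll(context.Background(), demoMetrics, fakeFetch)
}

func main() {
	start := time.Now()
	for _, r := range runDemo() {
		fmt.Printf("%s: %d points, err=%v\n", r.Metric, len(r.Values), r.Err)
	}
	fmt.Println("elapsed:", time.Since(start))
}
```

Because the five calls overlap, total latency is close to that of the slowest single call rather than the sum of all five.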

Each metric stream is analyzed by a sliding-window z-score detector. When a data point lies more than 2.5 standard deviations from the mean of its 1-hour baseline, SentinelGo classifies it as an anomaly and takes immediate action:

· Publishes a structured alert to an SNS topic, delivering an email notification with full anomaly context: metric name, current value, baseline mean, and severity classification.
· For CPU anomalies, automatically triggers EC2 Auto Scaling to increase the desired capacity of the affected group, adding capacity without human intervention.
· Logs every decision as structured JSON to CloudWatch Logs for full audit traceability.
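The detection step can be illustrated with a small sketch. It assumes the baseline window excludes the point being tested, uses the population standard deviation, and treats deviations in either direction as anomalous; none of these choices is confirmed by the project, and ZScore, IsAnomaly, and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// threshold is the anomaly cutoff in standard deviations (2.5 per the text).
const threshold = 2.5

// ZScore reports how many standard deviations latest lies from the mean
// of window. ok is false when the window has fewer than two points or
// zero variance, in which case no z-score is defined.
func ZScore(window []float64, latest float64) (z float64, ok bool) {
	if len(window) < 2 {
		return 0, false
	}
	var sum float64
	for _, v := range window {
		sum += v
	}
	mean := sum / float64(len(window))

	var sq float64
	for _, v := range window {
		d := v - mean
		sq += d * d
	}
	std := math.Sqrt(sq / float64(len(window))) // population std (assumption)
	if std == 0 {
		return 0, false
	}
	return (latest - mean) / std, true
}

// IsAnomaly applies the threshold to the absolute z-score (assumption:
// both spikes and drops count).
func IsAnomaly(window []float64, latest float64) bool {
	z, ok := ZScore(window, latest)
	return ok && math.Abs(z) > threshold
}

func main() {
	// Twelve 5-minute CPU samples, i.e. a 1-hour baseline (made-up values).
	baseline := []float64{40, 42, 41, 39, 40, 41, 40, 42, 39, 41, 40, 40}
	for _, latest := range []float64{41, 90} {
		z, _ := ZScore(baseline, latest)
		fmt.Printf("latest=%.0f z=%.2f anomaly=%v\n", latest, z, IsAnomaly(baseline, latest))
	}
}
```

A flat baseline yields a standard deviation of zero, so the sketch declines to score it instead of dividing by zero; a real detector would need a policy for that case.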

The entire system is deployed with a single SAM command and costs $0 at idle.

How we built it: SentinelGo is written entirely in Go using the official aws-sdk-go-v2 SDK and the aws-lambda-go runtime. We structured the project into three internal packages:

· metrics/poller.go: fans out CloudWatch GetMetricStatistics calls across goroutines using sync.WaitGroup, collecting 60 minutes of data per metric at 5-minute resolution.
· detector/detector.go: implements a sliding-window z-score algorithm in pure Go. No external ML library needed: mean and standard deviation are computed over a configurable window, and anomalies are classified as "warning" (z > 2.5) or "critical" (z > 3.75).
· notifier/notifier.go: dispatches SNS alerts with structured JSON payloads and calls the EC2 Auto Scaling API to increment desired capacity on CPU anomalies.
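To make the severity thresholds and the structured payload concrete, here is a hedged sketch of how a z-score might be turned into an alert body. The Alert type, its JSON field names, the z_score field, Classify, and BuildAlert are all hypothetical, and the sketch compares the absolute z-score against the thresholds, which is an assumption. The actual SNS publish and Auto Scaling calls are deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// Alert is a hypothetical payload carrying the context the text lists:
// metric name, current value, baseline mean, and severity.
type Alert struct {
	Metric   string  `json:"metric"`
	Value    float64 `json:"value"`
	Mean     float64 `json:"baseline_mean"`
	ZScore   float64 `json:"z_score"` // extra field, not named in the text
	Severity string  `json:"severity"`
}

// Classify maps a z-score to "critical" (|z| > 3.75), "warning"
// (|z| > 2.5), or "" when the point is not anomalous.
func Classify(z float64) string {
	a := math.Abs(z)
	switch {
	case a > 3.75:
		return "critical"
	case a > 2.5:
		return "warning"
	default:
		return ""
	}
}

// BuildAlert returns the JSON body for an anomalous point, or ok=false
// when the point does not cross the warning threshold.
func BuildAlert(metric string, value, mean, z float64) (body []byte, ok bool) {
	sev := Classify(z)
	if sev == "" {
		return nil, false
	}
	b, err := json.Marshal(Alert{
		Metric: metric, Value: value, Mean: mean, ZScore: z, Severity: sev,
	})
	if err != nil {
		return nil, false
	}
	return b, true
}

func main() {
	if body, ok := BuildAlert("CPUUtilization", 91.2, 40.4, 4.1); ok {
		fmt.Println(string(body))
	}
}
```

The resulting bytes could serve both as the SNS message body and as the structured log line, which keeps the alert and the audit record identical.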

Infrastructure is defined entirely in a SAM template.yaml with least-privilege IAM: the policy grants only the exact API actions the Lambda needs. Deployment goes through a three-line Makefile: cross-compile for Linux, zip, sam deploy.

Challenges we ran into

Accomplishments that we're proud of

What we learned

What's next for SentinelGo

Built With

  • amazon
  • amazon-web-services
  • auto
  • automation
  • aws-lambda-go
  • aws-sdk-go-v2
  • cloudwatch
  • ec2
  • eventbridge
  • go
  • go.uber.org/zap
  • iam
  • infrastructure
  • lambda
  • sam
  • scaling
  • serverless
  • sns