Inspiration

Modern AWS environments quickly grow crowded with IAM users, roles, and policies. A single accidental attachment of AdministratorAccess can expose your organization to massive risk—often discovered only after an audit or, worse, a breach.

We wanted a self-healing security layer that:

  • Detects dangerous IAM changes in real time
  • Instantly rolls them back if needed
  • Explains what happened in plain language
  • Runs for (almost) free, with no servers to manage

Built mainly on Lambda, EventBridge, and S3, IAM Policy Monitor is our lightweight but powerful guardrail for AWS IAM.

What it does

  1. Real-time detection – CloudTrail events stream into a Detector Lambda that evaluates every IAM API call against YAML-defined rules.
  2. AI triage – For medium-severity changes, the Detector queries Claude 3 Sonnet via Amazon Bedrock to explain the potential risk.
  3. Automated rollback – High and critical violations go straight to an SQS queue, where a Remediator Lambda detaches or deletes the offending policy—usually within seconds.
  4. Audit + analytics – Every event is written to versioned S3 buckets, partitioned for low-cost querying in Athena.
  5. Alerting + dashboards – Slack/SNS alerts plus a CloudWatch dashboard give real-time visibility into IAM trends.

All of this deploys with a single terraform apply and runs (almost) entirely within the AWS free tier.
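
As a rough illustration of steps 2 to 4 above, the sketch below routes a finding by severity and builds a date-partitioned S3 key for the audit trail. The function names, the "log" fallback for lower severities, and the key layout are assumptions made for this example, not the project's actual code.

```python
from datetime import datetime, timezone

# Severity labels taken from the description above; the sets themselves are illustrative.
AI_TRIAGE = {"medium"}
AUTO_REMEDIATE = {"high", "critical"}


def route(severity: str) -> str:
    """Return the next hop for a finding: 'remediate', 'triage', or 'log'."""
    level = severity.lower()
    if level in AUTO_REMEDIATE:
        return "remediate"  # would be sent to the SQS queue for the Remediator
    if level in AI_TRIAGE:
        return "triage"     # would be explained via Bedrock
    return "log"            # assumed fallback: audit trail only


def audit_key(event_id: str, event_time: datetime, prefix: str = "events") -> str:
    """Build a Hive-style partitioned key (year=/month=/day=) that Athena can prune on."""
    t = event_time.astimezone(timezone.utc)
    return f"{prefix}/year={t:%Y}/month={t:%m}/day={t:%d}/{event_id}.json"
```

Keeping the routing decision in a pure function like this makes it easy to unit-test without touching SQS or Bedrock.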

How we built it

  • Runtime – Python 3.13 Lambdas packaged as a reproducible ZIP, built with Bash.
  • Infrastructure – Terraform module (~2 KLOC), with optional submodules for Bedrock, Athena, dashboards, and CodeBuild tests.
  • Rules engine – YAML-as-code evaluated with glob matching; no heavyweight policy-as-code tools.
  • Safety net – A remediator-config.json file lets teams exclude critical users or policies.
  • CI/CD – GitHub Actions runs lint/tests and builds; CodeBuild runs live sandbox tests.
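
To make the rules engine and safety net more concrete, here is a minimal sketch of glob-based rule matching using Python's standard fnmatch module. The rule fields (id, severity, event, policy_arn), the sample rules, and the shape of the exclusion config are all hypothetical; the real YAML schema and remediator-config.json layout may differ.

```python
from fnmatch import fnmatchcase

# Rules as they might look after loading the YAML file; field names are hypothetical.
RULES = [
    {"id": "admin-attach", "severity": "critical",
     "event": "Attach*Policy",
     "policy_arn": "arn:aws:iam::aws:policy/AdministratorAccess"},
    {"id": "broad-managed", "severity": "medium",
     "event": "AttachRolePolicy",
     "policy_arn": "arn:aws:iam::aws:policy/*FullAccess"},
]

# Assumed shape for remediator-config.json: glob lists of things never to touch.
EXCLUSIONS = {"principals": ["break-glass-*"], "policies": []}


def match_rules(event_name, policy_arn, rules=RULES):
    """Return every rule whose event and policy globs both match."""
    return [
        r for r in rules
        if fnmatchcase(event_name, r["event"])
        and fnmatchcase(policy_arn, r["policy_arn"])
    ]


def is_excluded(principal, policy_arn, exclusions=EXCLUSIONS):
    """True if the principal or the policy is on the exclusion list."""
    return (
        any(fnmatchcase(principal, p) for p in exclusions.get("principals", []))
        or any(fnmatchcase(policy_arn, p) for p in exclusions.get("policies", []))
    )
```

A remediator built this way would call is_excluded() before acting on any match, so an excluded break-glass user is never rolled back automatically.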

Challenges we ran into

  • Cold starts – The Bedrock SDK bloated the ZIP. We trimmed dependencies and lazy-loaded the client to keep p95 cold-start latency under 600 ms.
  • IAM eventual consistency – Detach*Policy changes took up to 2 minutes to show up in ListAttached* calls. We added retries with backoff.
  • Overbroad protection – Our early blocklist prevented remediation of AdministratorAccess. We refined it and added S3-based overrides.
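
The first two fixes above can be sketched roughly as follows: a client that is only created on first use, and a polling helper with exponential backoff. The names bedrock_client and wait_until, along with the default attempt count and delays, are illustrative assumptions rather than the project's actual values.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def bedrock_client():
    """Create the Bedrock runtime client on first use, not at import time."""
    import boto3  # deferred so invocations that never call Bedrock skip this setup
    return boto3.client("bedrock-runtime")


def wait_until(check, attempts=6, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Poll check() with exponential backoff; return True once it succeeds.

    check is any zero-argument callable, e.g. one that lists attached
    policies and reports whether the detached policy is gone.
    """
    delay = base_delay
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:  # no point sleeping after the last try
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

Passing sleep as a parameter keeps the helper testable without real waiting; in a Lambda, the total wait would also need to fit inside the function timeout.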

Accomplishments that we're proud of

  • Secure by default: no publicly exposed endpoints and strict least-privilege permissions.
  • Average time from detection to remediation: 18 seconds.
  • Entire solution—Terraform, Lambda, tests, docs—fits in < 5 MB.

What we learned

  • A simple rules engine beats complex frameworks when scoped well.
  • Bedrock is worth the latency/cost for ambiguous cases.
  • Terraform's filemd5() trick enables hot-reload of config files without redeploy.

What's next for IAM Policy Monitor

  • Multi-account support – One deploy, org-wide impact via EventBridge Pipes.
  • Custom remediators – Teams can plug in their own logic, e.g., to update SCPs.
  • Vault-style approvals – Slack buttons for "Fix / Ignore / Justify" workflows.
  • OpenTelemetry – Trace flows into X-Ray for observability.
