I wanted to build something that feels like real production engineering, not just a demo endpoint. The goal was to design a service that stays stable during failures, scales under load, and gives operators clear visibility during incidents.

What it does

OmniGuard provides:

- Health and metrics endpoints for observability
- User, URL, and event APIs
- Bulk CSV ingestion for seeded data
- URL short-code redirect support
- Cache-aware API behavior and compatibility endpoints
- Monitoring dashboards and alert routing
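To make the short-code redirect feature concrete, here is a minimal sketch of one common way to derive short codes: base-62 encoding of a numeric row ID. This is an assumption for illustration only; the write-up does not say how OmniGuard generates its codes, and the names `ALPHABET`, `encode_id`, and `decode_code` are hypothetical.

```python
import string

# 62 symbols: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_letters
BASE = len(ALPHABET)


def encode_id(n: int) -> str:
    """Turn a non-negative integer ID into a base-62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were collected least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_code(code: str) -> int:
    """Invert encode_id; raises ValueError on characters outside ALPHABET."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

A redirect handler could then call `decode_code` on the path segment and look up the target URL by ID, returning a 404 when decoding fails or no row matches.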

How I built it

I built the app in Python using Flask, then added a full operations stack around it:

- API layer with structured responses and error handling
- Data workflows with CSV-backed entities and API-driven CRUD
- Nginx for ingress/load distribution
- Redis-backed caching patterns
- Prometheus + Grafana + Alertmanager for monitoring/alerts
- k6 load testing for scalability evidence
- Pytest + GitHub Actions for automated quality checks
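The caching item in the stack above can be illustrated with a cache-aside lookup. The sketch below uses an in-memory `TTLCache` class as a stand-in for a Redis client so it runs without any services; the class, the `get_url` helper, the `url:` key prefix, and the 60-second TTL are all hypothetical and not taken from the OmniGuard code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis client (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, str]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_url(
    cache: TTLCache,
    short_code: str,
    load_from_db: Callable[[str], Optional[str]],
    ttl_seconds: float = 60.0,
) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the loader, then fill."""
    key = f"url:{short_code}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(short_code)
    if value is not None:
        # Only successful lookups are cached; misses always hit the loader.
        cache.set(key, value, ttl_seconds)
    return value
```

With a real Redis client the same shape applies: read with `GET`, and on a miss write the loaded value back with an expiry so stale entries age out on their own.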

Challenges I ran into

- Balancing framework migration with backward-compatible API behavior
- Aligning endpoint contracts to strict external grader expectations
- Handling hidden edge cases in bulk import and entity workflows
- Keeping observability and resilience artifacts consistent with implementation
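As an example of the kind of bulk-import edge cases mentioned above, the following sketch validates CSV rows and collects per-line errors instead of aborting on the first bad row. The column names (`id`, `email`), the `import_users` function, and the specific checks (missing fields, duplicate IDs, stray whitespace) are assumptions chosen for illustration, not OmniGuard's actual schema or rules.

```python
import csv
import io
from typing import Dict, List, Tuple

# Hypothetical required columns for a user import.
REQUIRED_FIELDS = ("id", "email")


def import_users(csv_text: str) -> Tuple[List[Dict[str, str]], List[Tuple[int, str]]]:
    """Parse CSV text into (valid_rows, errors), where errors are (line, message)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    valid: List[Dict[str, str]] = []
    errors: List[Tuple[int, str]] = []
    seen_ids = set()

    for row in reader:
        line_no = reader.line_num  # source line of the row just read
        # Drop overflow columns (stored under the None key) and trim whitespace;
        # short rows are padded with None, which is treated as empty.
        cleaned = {k: (v or "").strip() for k, v in row.items() if k is not None}

        missing = [f for f in REQUIRED_FIELDS if not cleaned.get(f)]
        if missing:
            errors.append((line_no, "missing " + ", ".join(missing)))
            continue
        if cleaned["id"] in seen_ids:
            errors.append((line_no, "duplicate id " + cleaned["id"]))
            continue

        seen_ids.add(cleaned["id"])
        valid.append(cleaned)

    return valid, errors
```

Returning the errors alongside the accepted rows lets an API report exactly which lines were rejected, which is useful when an external grader expects a specific response for partially valid uploads.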

What I learned

I learned how important contract-driven development is, especially when tests are external and strict. I also got hands-on experience with operational reliability: failure modes, runbooks, load testing, and alert validation.

What I am proud of

I am proud that OmniGuard is not only feature-complete but also operationally grounded, with health checks, dashboards, alerting, and documented recovery procedures.
