Production-Ready URL Shortener

## What it does

A production-grade URL shortener API with full lifecycle management:

- Create short URLs with `POST /shorten` (custom or auto-generated codes)
- Redirect via `GET /<code>` with a 302, logging every click
- Update URLs with `PUT /urls/<id>` (title, destination, active status)
- Deactivate with `DELETE /urls/<id>` (soft delete, preserving the audit trail)
- Monitor via `GET /health` (deep check: DB + Redis) and `GET /metrics` (system stats)

Every operation is validated, rate-limited, cached, logged, and auditable.
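
As a rough illustration of the "custom or auto-generated codes" behaviour of `POST /shorten`, here is a minimal sketch. The writeup does not describe how codes are actually generated, so the base62 alphabet, the 7-character length, and the names `generate_code` and `choose_code` are all assumptions made for this example.

```python
import secrets
import string
from typing import Optional, Set

# Hypothetical constants and function names; none of them come from the project.
ALPHABET = string.ascii_letters + string.digits  # 62 URL-safe characters
CODE_LENGTH = 7  # assumed length


def generate_code(length: int = CODE_LENGTH) -> str:
    """Return a random base62 code of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def choose_code(taken: Set[str], custom: Optional[str] = None,
                max_attempts: int = 5) -> str:
    """Pick the short code for a new URL.

    A custom code must be non-empty, alphanumeric, and unused.
    Without a custom code, random codes are tried until a free one is found.
    """
    if custom is not None:
        if not custom or any(ch not in ALPHABET for ch in custom):
            raise ValueError("custom code must be non-empty and alphanumeric")
        if custom in taken:
            raise ValueError("custom code is already in use")
        return custom
    for _ in range(max_attempts):
        code = generate_code()
        if code not in taken:
            return code
    raise RuntimeError("could not find a free code")
```

In a real service the `taken` set would be replaced by a uniqueness check against the database, ideally backed by a unique constraint so that concurrent requests cannot claim the same code.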

## How we built it

Application Layer:

- Flask + Peewee ORM + PostgreSQL
- Redis for caching redirect lookups (300s TTL, invalidated on mutation)
- Flask-Limiter for rate limiting (200 req/min global, 30 req/min on writes)
- Structured JSON logging with an `X-Request-ID` header for request tracing
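
The caching behaviour (300s TTL, invalidated on mutation) follows the usual cache-aside pattern, which can be sketched as below. This is not the project's code: `InMemoryCache` is a small stand-in that mimics only the `get`, `setex`, and `delete` calls of a redis-py client so the example runs without a Redis server, and `resolve`, `invalidate`, and the `url:` key prefix are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 300  # TTL quoted in the writeup


class InMemoryCache:
    """Stand-in exposing the get/setex/delete subset of a redis-py client.

    Note: a real redis-py client returns bytes from get() unless it is
    created with decode_responses=True; this stand-in returns str.
    """

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # entry has outlived its TTL
            del self._data[key]
            return None
        return value

    def setex(self, key: str, ttl: int, value: str) -> None:
        self._data[key] = (value, time.monotonic() + ttl)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def resolve(code: str, cache: InMemoryCache,
            load_from_db: Callable[[str], Optional[str]]) -> Optional[str]:
    """Cache-aside lookup: try the cache first, fall back to the database."""
    key = f"url:{code}"  # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return cached
    target = load_from_db(code)
    if target is not None:
        cache.setex(key, CACHE_TTL_SECONDS, target)
    return target


def invalidate(code: str, cache: InMemoryCache) -> None:
    """Call after any update or soft delete so stale targets are not served."""
    cache.delete(f"url:{code}")
```

Deleting the key on every mutation means a changed or deactivated URL takes effect on the next redirect; the TTL then only bounds how long an entry can survive if an invalidation is ever missed.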

Infrastructure:

- 3 Gunicorn instances, each with 4 workers × 2 threads (3 × 4 × 2 = 24 concurrent handlers in total)
- Nginx load balancer with `max_fails` / `fail_timeout` for health-aware routing
- Docker Compose with health checks, resource limits, and `restart: always`
- Non-root container user, security headers (`X-Content-Type-Options`, `X-Frame-Options`)
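
The handler count can be checked with a short calculation. The sketch below uses the names of real Gunicorn settings (`bind`, `worker_class`, `workers`, `threads`), which are normally placed in a Python config file such as `gunicorn.conf.py`. The worker and thread counts come from the list; the port and the explicit `gthread` worker class are assumptions, as is the `INSTANCES` constant.

```python
# Per-instance Gunicorn settings (setting names are real; some values are assumed).
bind = "0.0.0.0:8000"     # assumed port, not stated in the writeup
worker_class = "gthread"  # assumed; threaded worker, needed when threads > 1
workers = 4               # worker processes per instance (from the writeup)
threads = 2               # threads per worker (from the writeup)

# Capacity arithmetic for the whole deployment.
INSTANCES = 3  # hypothetical constant: app containers behind Nginx
handlers_per_instance = workers * threads
total_handlers = INSTANCES * handlers_per_instance

print(handlers_per_instance)  # 8
print(total_handlers)         # 24
```

The 24 handlers are an upper bound on requests being processed at the same moment; anything beyond that waits in each instance's listen backlog.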

Quality & Operations:

- 59 pytest tests at 73% coverage, running on in-memory SQLite for speed
- GitHub Actions CI that fails any push that drops coverage below 70%
- k6 load testing, ramping from 50 to 200 to 500 concurrent users
- Connection pooling via `PooledPostgresqlDatabase` (20 max connections, 300s stale timeout)

Documentation:

- `README.md`: setup, API reference, architecture diagram
- `RUNBOOK.md`: start/stop/restart, troubleshooting, alert response
- `DECISIONS.md`: 9 architectural decision records with rationale
- `FAILURE_MODES.md`: 9 failure scenarios, capacity limits, known limitations
- `SLO.md`: availability, latency, and error rate targets with actuals
- `BOTTLENECK_REPORT.md`: before/after performance analysis

## Challenges we faced

The 27% Problem. Our first load test at 500 concurrent users had a 27% error rate. We were still serving traffic from Flask's built-in development server, which is meant for local development and could not keep up with that many concurrent connections. The fix wasn't obvious at first (we suspected the database), but profiling showed the bottleneck was the WSGI server layer itself. Switching to Gunicorn with horizontal scaling dropped the error rate to 0%.