Short Description

Link-Shrink is a Flask + PostgreSQL URL shortener built for the MLH Production Engineering Hackathon Incident Response track. It goes beyond CRUD by adding JSON logging, Prometheus metrics, Grafana dashboards, Alertmanager notifications to Discord, a runbook, and a simulated incident investigation workflow.

Project URL

https://link-shrink.duckdns.org

GitHub Repo

https://github.com/Shreyp087/PE-Hackathon-Template-2026

Video Demo

https://youtu.be/i1zeoz1skAs

Inspiration

Most demo apps stop at “it works.” We wanted to build the next layer: “how do we know it broke, how fast can we detect it, and how do we recover under pressure?” The MLH Incident Response track felt like the perfect excuse to turn a simple URL shortener into a small but realistic production system with monitoring, alerting, incident documentation, and operational playbooks.

What It Does

Link-Shrink lets users create short links and redirect through them, backed by PostgreSQL for persistence and Flask for the application layer. On top of that, it exposes production-grade observability:

  • Structured JSON logs with timestamps, log levels, and components
  • Prometheus metrics for traffic, latency, errors, and internal DB failures
  • A /system endpoint for CPU, memory, and disk telemetry without SSH
  • Grafana dashboards covering the four Golden Signals
  • Prometheus alert rules for service down, high error rate, slow response time, and high CPU
  • Alertmanager routing to Discord with firing and resolved notifications
  • A runbook for on-call response
  • A simulated post-incident report documenting a realistic SEV-2 event

This means the project is not just a working app; it is a working service with the tooling needed to detect, diagnose, and respond to incidents.
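
To make the structured-logging bullet concrete, here is a minimal sketch of a one-JSON-object-per-line formatter built only on Python's standard `logging` module. It is an illustration, not the project's actual code: the `JsonFormatter` and `build_logger` names, the `link_shrink` logger name, and the exact field names (`timestamp`, `level`, `component`, `message`) are assumptions.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # UTC timestamp in ISO 8601 form, taken from the record itself.
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            # "component" is a hypothetical field; fall back to the logger name.
            "component": getattr(record, "component", record.name),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name: str = "link_shrink") -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (assumed setup)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    # The "extra" dict attaches the hypothetical component tag to the record.
    log.info("short link created", extra={"component": "shortener"})
```

Writing to stdout rather than a file keeps the app unaware of where logs end up, so whatever runs the process can collect them.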

How We Built It

The core application is a Flask service using Peewee ORM with PostgreSQL as the database. We kept the product simple on purpose so we could focus on operational depth.

The observability stack is built with Prometheus, Grafana, Alertmanager, and Discord webhooks. The app exports metrics at /metrics, host telemetry at /system, and writes structured JSON logs to stdout so they can be collected and inspected without directly SSHing into the server. Grafana is provisioned with a Golden Signals dashboard, and Prometheus is configured with alert rules for outage, error rate, latency, and CPU pressure.
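
The kind of payload a /system endpoint might return can be sketched with the standard library alone. This is an assumption-laden illustration rather than the project's implementation: the `system_snapshot` and `read_meminfo` names and the response shape are hypothetical, memory figures are read from `/proc/meminfo` and so only appear on Linux, and CPU is approximated by the 1-minute load average instead of a utilization percentage.

```python
import os
import shutil


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Return total/available memory in kB on Linux, or {} if unavailable."""
    wanted = {"MemTotal": "total_kb", "MemAvailable": "available_kb"}
    result = {}
    try:
        with open(path, errors="replace") as fh:
            for line in fh:
                key, _, rest = line.partition(":")
                if key in wanted:
                    # Lines look like "MemTotal:   16384 kB".
                    result[wanted[key]] = int(rest.split()[0])
    except OSError:
        pass  # Not Linux, or /proc is not mounted.
    return result


def system_snapshot(disk_path: str = "/") -> dict:
    """Collect CPU, memory, and disk figures into a JSON-serializable dict."""
    disk = shutil.disk_usage(disk_path)
    try:
        load_1m = os.getloadavg()[0]
    except (AttributeError, OSError):
        load_1m = None  # getloadavg is not available on every platform
    percent_used = round(disk.used / disk.total * 100, 1) if disk.total else None
    return {
        "cpu": {"load_1m": load_1m, "cores": os.cpu_count()},
        "memory": read_meminfo(),
        "disk": {
            "total_bytes": disk.total,
            "used_bytes": disk.used,
            "percent_used": percent_used,
        },
    }


if __name__ == "__main__":
    print(system_snapshot())
```

In a Flask app, a route handler could return this dict through `flask.jsonify`, which is what lets an on-call engineer check host health with a single HTTP request.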

For incident-response maturity, we added:

  • A runbook with alert-by-alert response steps
  • Simulation scripts to generate realistic failure modes
  • A post-incident report documenting a root-cause-analysis workflow
  • CI smoke tests that validate key endpoints and logging behavior on every push
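
A smoke check for "logging behavior" could look roughly like the sketch below, which scans captured stdout and reports any line that is not a JSON object with the expected fields. The `check_log_lines` helper and the `REQUIRED_FIELDS` tuple are hypothetical; the field names mirror the timestamps, log levels, and components mentioned earlier, but their exact spelling is an assumption.

```python
import json

# Assumed field names; adjust to whatever the service actually emits.
REQUIRED_FIELDS = ("timestamp", "level", "component")


def check_log_lines(raw_output: str) -> list:
    """Return a list of problems in captured log output; empty means pass."""
    problems = []
    for number, line in enumerate(raw_output.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        if not isinstance(entry, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            problems.append(f"line {number}: missing {', '.join(missing)}")
    return problems


if __name__ == "__main__":
    sample = '{"timestamp": "t", "level": "INFO", "component": "app"}\noops'
    for problem in check_log_lines(sample):
        print(problem)
```

A CI job could fail the build whenever the returned list is non-empty, which would catch a stray `print` or an unformatted traceback before it reaches production logs.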

The service is deployed behind Caddy for HTTPS and reverse proxying, which made it easy to expose the app and the observability tools cleanly.
