Inspiration
Our inspiration came from the most unexpected place: the problem statement of our hackathon xD
What it does
It is exactly what we claim: a scalable URL shortener. We can spin instances up or down based on demand.
How we built it
- Flask API with the Peewee ORM and PostgreSQL.
- Two app instances behind an Nginx load balancer; Redis caches redirects and backs shared rate limiting.
- Prometheus scrapes metrics every 15 seconds, Grafana displays the four golden signals, and Alertmanager sends alerts to Discord via a webhook proxy.
- GitHub Actions CI runs 161 tests on every push with a 70% coverage gate.
- Docker Compose orchestrates all 8 services with restart policies for chaos resilience.
- Locust load tests show it handles 200+ concurrent users.
Challenges we ran into
- CI coverage gate — the hard 70% threshold meant CI failed every time we added features without tests. It forced test-first discipline on us.
- Rate limiter vs. load testing — all Locust users share one IP, so they hit rate limits instantly. We had to mark 429s as expected behavior, not errors.
- PostgreSQL sequence collision — a bulk CSV import with explicit IDs didn't advance the auto-increment sequence, so new URL creation silently failed until we reset the sequences with setval().
- Gunicorn stale DB connections — everything worked on the dev server but failed 100% of the time under Gunicorn: forked workers inherited a dead connection from the parent process.
- Podman DNS — Nginx cached container IPs at startup, so after restarts every request got a 502 until Nginx was manually restarted.
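The setval() fix for the sequence collision can be sketched as a small helper that builds the realignment statement. The `urls` table and `id` column are assumed names here; in the real app the generated SQL would be run through Peewee's `database.execute_sql()`:

```python
def sequence_reset_sql(table: str, pk: str = "id") -> str:
    """Build the PostgreSQL statement that realigns an auto-increment
    sequence after a bulk import that inserted explicit primary keys.

    pg_get_serial_sequence finds the sequence backing the column, and
    setval moves it to MAX(pk) so the next INSERT gets a fresh ID.
    COALESCE handles the empty-table case.
    """
    return (
        f"SELECT setval(pg_get_serial_sequence('{table}', '{pk}'), "
        f"COALESCE((SELECT MAX({pk}) FROM {table}), 1));"
    )

# After the CSV import, run the generated statement once per affected table:
print(sequence_reset_sql("urls"))
```

The failure mode is nasty precisely because it is silent: inserts succeed until the sequence catches up to an imported ID, then start colliding.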
Accomplishments that we're proud of
The one we're proudest of: even though the problem statement only asked us to test with 500 Locust users, we ran with 1000 at a 0% error rate!
What we learned
We learned a lot about the basics of production engineering, and that just building an end-to-end Flask + React application is almost never enough. Production needs fault tolerance and scalability, and making our application bulletproof is what we learned today.
What's next for the project
For future scalability, we would move to Kubernetes, which provides a lot of features out of the box and would make the product even more scalable. Coupled with KEDA autoscaling, we should be on track to handle hundreds of thousands of users :)