Incident Response / Observability:
#41 — Structured Logging: Configure JSON logs (timestamps, INFO/WARN/ERROR)
#43 — Screenshot of clean JSON logs
#45 — Manual Check: View logs without SSH
#70 — Manual Check: View logs without SSH (duplicate, also closed)
#46 — Set Traps: Configure alerts for "Service Down" and "High Error Rate"
#47 — Fire Drill: Connect alerts to a channel (Discord)
This was my favorite part. I've always wanted to build something with a bot, and now I know how.
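For anyone curious what the structured logging item (#41) can look like in practice, here is a minimal sketch using only Python's standard library. It is not the project's actual configuration: the language, the `JsonFormatter` and `build_logger` names, and the field names are all assumptions made for illustration. Python calls the level `WARNING`, so the sketch maps it to `WARN` to match the wording in #41.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        # Python names the level "WARNING"; shorten it to "WARN" (an assumption
        # made here to match the INFO/WARN/ERROR wording of the checklist).
        level = "WARN" if record.levelname == "WARNING" else record.levelname
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": level,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("service started")
    log.warning("slow response")
    log.error("upstream request failed")
```

One JSON object per line keeps the output easy to filter by `level` in whatever log viewer the host provides, which is also what makes a "High Error Rate" alert straightforward to define.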
Reliability:
#13 — Screenshot of a blocked deploy due to a failed test
Scalability:
#39 — "Bottleneck Report" (2-3 sentences on what you fixed)
Documentation (looks like you ended up doing some of these after all):
#17 — Failure Manual: Document what happens when things break
#20 — Link to "Failure Mode" documentation
#53 — The Runbook: "In Case of Emergency" guide
#56 — Link to the Runbook
#59 — README: Setup instructions
#62 — Deploy Guide: How to go live + rollback
#64 — Config: List all Environment Variables
#65 — Runbooks: Step-by-step guides for specific alerts