Make being on-call easier by exposing observable stats of your backend infrastructure through a RedisTimeSeries-based Slack bot. The bot can provide stats about your systems on the go, for example when you are:
- Traveling home on the train
- Sitting in a coffee shop
- Riding in an Uber
- Paged at 3 AM on a Saturday morning
What it does
The ops bot gives you quick observations about what is happening in your system while you are on call, right after you receive an alert. You can ask:
- What was the max latency in the last 30 minutes on Nginx?
- Were there any connection drops in the last 20 minutes?
- What were the max ops on the database in the last 5 minutes?
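Questions like these map naturally onto a RedisTimeSeries range query with a max aggregation. The following is a minimal sketch, not the bot's actual code: the function name `max_over_window` and the key `nginx:latency` are hypothetical, and `client` is assumed to be any object with an `execute_command` method, such as a redis-py client connected to a server with the RedisTimeSeries module loaded.

```python
import time


def max_over_window(client, key, minutes, now_ms=None):
    """Return the largest sample stored under `key` in the last `minutes`.

    Returns None when the window contains no samples.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # RedisTimeSeries timestamps are in ms
    window_ms = minutes * 60 * 1000
    start_ms = now_ms - window_ms

    # TS.RANGE key from to AGGREGATION max bucketDuration
    rows = client.execute_command(
        "TS.RANGE", key, start_ms, now_ms, "AGGREGATION", "max", window_ms
    )

    # Buckets are aligned to the epoch by default, so the window can span
    # two buckets; the max of the bucket maxima is the max of the window.
    values = [float(value) for _timestamp, value in rows]
    return max(values) if values else None
```

With redis-py, a call might look like `max_over_window(redis.Redis(), "nginx:latency", 30)`, where the key name depends on how the exporters label their series.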
How I built it
Using RedisTimeSeries, Python Flask, the Redis-Timeseries-Adapter, and the MongoDB and Redis exporters.
Challenges I ran into
Routing requests in Flask to the right handler for each kind of stat.
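One way to keep the Flask side simple is to accept every Slack message on a single endpoint and hand the message text to a small dispatcher that decides which stat is being asked for. The sketch below shows only that dispatcher, under assumptions: the `<stat> <minutes>` message format, the `STATS` table, and the key names are all hypothetical and not taken from the project.

```python
import re

# Hypothetical mapping from a stat keyword to a RedisTimeSeries key
# and the aggregator to apply over the requested window.
STATS = {
    "latency": ("nginx:latency", "max"),
    "drops": ("nginx:connection_drops", "sum"),
    "ops": ("db:ops", "max"),
}

# Accepts messages such as "latency 30" or "drops 20m".
COMMAND = re.compile(r"^\s*(\w+)\s+(\d+)\s*m?\s*$")


def route(text):
    """Parse '<stat> <minutes>' into (key, aggregator, minutes).

    Returns None when the message is malformed or names an unknown stat.
    """
    match = COMMAND.match(text.lower())
    if not match:
        return None
    stat, minutes = match.group(1), int(match.group(2))
    if stat not in STATS or minutes <= 0:
        return None
    key, aggregator = STATS[stat]
    return key, aggregator, minutes
```

A Flask view could call `route()` on the incoming message text and reply with a short help message whenever it returns `None`, so adding a new stat only means adding a row to the table.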
Accomplishments that I'm proud of
Providing low-latency access to observable stats of your backend system through the Slack bot, available on the go from the Slack mobile app.
What I learned
Using RedisTimeSeries, Flask, Slack incoming and outgoing webhooks, the Redis-Timeseries-Adapter, and the MongoDB and Redis exporters.
What's next for Story of being on-call
Adding more features to the Slack bot so it can collect stats from more components of the backend systems. Making the Slack bot available in the Slack App Directory for other teams to use.