Keeping track of AWS cost and optimizing it is difficult and time-consuming for AWS customers. Most AWS customers I talk to are concerned about cost management and they've mentioned the following problems:
- They're often overwhelmed by the complexity of AWS pricing.
- They often don't know how to monitor AWS cost or don't have time to do it effectively.
- Some are familiar with AWS Cost Explorer, but don't have time to visit the AWS console every day and monitor cost closely.
- They're afraid they will spend too much money on AWS at the end of the month. Some have learned the hard way and I've heard a few stories about a "bad AWS billing surprise".
While there are tools out there, such as AWS Cost Explorer and third-party products (Cloudyn, Cloudability, etc.), there is no easy way to integrate AWS Cost and Usage information with popular collaboration tools, such as Slack.
In many cases, it doesn't make sense for customers to pay for sophisticated cost management tools, which start at a few hundred dollars per month. This is especially true for customers that spend less than $5,000 per month on AWS, and even for some larger ones.
MiserBot saves AWS customers money and it increases productivity by making AWS cost information easily accessible, in a user-friendly chatbot format, from a proven collaboration tool such as Slack.
What it does
In this initial version, MiserBot supports the following features:
- Integration with Slack (this is the only entry point to use the bot at this time)
- IAM Role creation and bot OAuth authorization flow.
- It returns the total accumulated AWS cost for the current month.
- It informs customers when a new Cost and Usage report is ready, and it tells them the new total AWS cost for the month.
- It returns a list of AWS services and their respective cost (e.g. EC2, S3, etc.)
- It returns a list of AWS usage types and their cost in the current month (e.g. t2.2xlarge box usage, EBS storage, etc.)
- It returns a list of AWS resource IDs and their associated cost (e.g. EC2 instance ID, EBS volume ID, etc.). The list is sorted in descending order of cost. This feature allows customers to see which resources incur the most cost. This view is not available in AWS Cost Explorer.
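As an illustration of the cost-by-resource view, here is a minimal Python sketch that sums cost per resource ID from a Cost and Usage report in CSV form and sorts the result in descending order. It is not MiserBot's actual implementation (the bot relies on Athena queries in the backend); the function name and the sample data are made up for this example, and it assumes the report includes the `lineItem/ResourceId` and `lineItem/UnblendedCost` columns.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List, Tuple


def cost_by_resource(report_csv: str) -> List[Tuple[str, Decimal]]:
    """Sum unblended cost per resource ID, most expensive resource first."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(report_csv)):
        resource_id = row["lineItem/ResourceId"]
        if not resource_id:
            # Some line items are not tied to a specific resource; skip them.
            continue
        totals[resource_id] += Decimal(row["lineItem/UnblendedCost"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data, for illustration only.
SAMPLE_REPORT = """lineItem/ResourceId,lineItem/UnblendedCost
i-0abc,1.25
vol-0def,0.40
i-0abc,2.75
,0.10
"""

if __name__ == "__main__":
    for resource_id, cost in cost_by_resource(SAMPLE_REPORT):
        print(f"{resource_id}: ${cost}")
```

Using `Decimal` instead of `float` avoids rounding drift when many small line items are added together.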
How I built it
MiserBot is implemented as a conversational bot using Amazon Lex, fulfilled by Lambda functions in the backend.
A considerable portion of MiserBot consists of processing customers' AWS Cost and Usage reports, analyzing them and making the digested data available to the Lambda functions that fulfill Lex intents.
As a result, there are multiple processes running in the backend, making data available for the bot.
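To make the Lex-plus-Lambda arrangement concrete, here is a minimal sketch of a fulfillment function, assuming the Lex V1 event and response format (`currentIntent` in the request, a `dialogAction` of type `Close` in the response). The intent name `GetTotalCost` and the `DIGESTED_DATA` dictionary are hypothetical stand-ins; in MiserBot the data would come from the backend processes described above, not from a constant.

```python
# Hypothetical stand-in for the digested cost data produced by the backend.
DIGESTED_DATA = {"total_cost_month_to_date": 1234.5}


def close(message_text):
    """Build a Lex V1 response that ends the conversation with a text message."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message_text},
        }
    }


def lambda_handler(event, context):
    """Entry point invoked by Lex to fulfill an intent."""
    intent_name = event["currentIntent"]["name"]
    if intent_name == "GetTotalCost":  # hypothetical intent name
        total = DIGESTED_DATA["total_cost_month_to_date"]
        return close(f"Your AWS cost so far this month is ${total:,.2f}.")
    return close("Sorry, I don't know how to answer that yet.")
```

Keeping the handler this thin means the expensive work (report processing and queries) stays in the backend, and the function only looks up results that are already prepared.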
MiserBot is implemented 100% Serverless, using the following AWS services:
- API Gateway
- CloudWatch Events
- DynamoDB
- DynamoDB Streams
Cost and Usage Report Processor
This is the block that fetches Cost and Usage reports from customers and then processes results. It takes care of the following tasks:
- Customers create an IAM Role that gives Concurrency Labs the following permissions: read-only access to AWS resources, and write access only to the AWS Cost and Usage Report service.
- A CloudWatch Events schedule triggers a function that looks for new Cost and Usage reports. This function uses control data stored in a DynamoDB table to determine if new reports are ready for processing. If new reports are found, it starts Step Functions executions to process them.
- A Step Functions state machine orchestrates the tasks required to process Cost and Usage reports: 1) Copy reports to a Concurrency Labs S3 bucket and do some transformations, 2) Create/Update Athena tables, 3) Pre-warm data by executing a number of Athena queries and storing results in S3, 4) Update control data in DynamoDB.
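The scheduled check and the orchestration described in this list can be sketched as follows. This is an illustration under stated assumptions, not the code in the repository: the control data is modeled as a plain dictionary of report key to last-processed timestamp (ISO 8601 strings in a single consistent format, so that string comparison matches chronological order), and the state names and task ARNs are hypothetical. The state machine is expressed in Amazon States Language as a simple chain of four `Task` states.

```python
# Hypothetical state names, one per processing step described above.
STEPS = [
    "CopyAndTransformReports",
    "UpdateAthenaTables",
    "PrewarmAthenaQueries",
    "UpdateControlData",
]


def find_new_reports(available, processed):
    """Return report keys whose last-modified timestamp is newer than the
    timestamp recorded in the control data (or that were never processed)."""
    return sorted(
        key
        for key, last_modified in available.items()
        if last_modified > processed.get(key, "")
    )


def build_state_machine(task_arns):
    """Build an Amazon States Language definition that runs STEPS in order."""
    states = {}
    for index, name in enumerate(STEPS):
        state = {"Type": "Task", "Resource": task_arns[name]}
        if index + 1 < len(STEPS):
            state["Next"] = STEPS[index + 1]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": STEPS[0], "States": states}


if __name__ == "__main__":
    available = {"acct-a/report": "2017-07-02T00:00:00Z", "acct-b/report": "2017-07-01T00:00:00Z"}
    processed = {"acct-a/report": "2017-07-01T00:00:00Z", "acct-b/report": "2017-07-01T00:00:00Z"}
    print(find_new_reports(available, processed))
```

In a real deployment, each key returned by `find_new_reports` would start one state machine execution, and the last step would write the new timestamp back to the control table so the report is not processed twice.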
Code is publicly available in the following GitHub repo: https://github.com/ConcurrenyLabs/aws-cost-analysis
This is the actual chatbot interface that gives customers cost and usage data.
Code is available in the following private BitBucket repo: https://bitbucket.org/concurrencylabs/miserbot
Lex definition is available under
Challenges I ran into
One challenge was Lex's built-in integration with Slack. I tried it and, although it's a great time saver, I found it insufficient for the features I want the bot to have from day one.
The #1 reason I decided to implement my own integration is that Lex's built-in one doesn't give me access to Slack OAuth tokens. These tokens are essential for MiserBot to initiate chat messages in situations such as:
- When the bot has been added successfully to the Slack channel and I want to communicate instructions for creating the required IAM role.
- When a new cost and usage report is available.
- When a cost anomaly or recommendation is found.
- New feature announcements.
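For context on why the OAuth token matters, here is a minimal sketch of a bot-initiated message using Slack's `chat.postMessage` Web API method, which requires a token in the `Authorization` header. The function names, the token and the channel ID are placeholders for this example; this is not MiserBot's actual integration code.

```python
import json
import urllib.request

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_post_message_request(token, channel, text):
    """Build (but do not send) the HTTP request for a bot-initiated message."""
    body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_MESSAGE_URL,
        data=body,
        headers={
            "Content-Type": "application/json; charset=utf-8",
            # The OAuth token obtained during the authorization flow.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def post_message(token, channel, text):
    """Send the message and return Slack's decoded JSON response."""
    request = build_post_message_request(token, channel, text)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `post_message(token, "C0123456", "A new Cost and Usage report is ready.")` would cover the "new report available" case in the list above, with the token being the one stored for that customer's workspace.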
The #2 reason I built my own integration with Slack is that Lex's built-in integration limits flexibility regarding message formatting options, such as buttons, attachments, icons, emojis, fields, etc. By building my own integration, I can do anything that Slack supports in terms of message formatting.
In general, working with a type of interface that is relatively new to me, such as bots, was an interesting and challenging experience.
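As an example of the formatting options mentioned here, the sketch below builds a Slack message payload that uses an emoji, an attachment with fields and a button. It assumes Slack's attachment-style message format; the function name, callback ID, button labels and cost figures are invented for illustration and are not taken from MiserBot.

```python
def cost_by_service_message(channel, costs):
    """Build a Slack message payload listing cost per service, highest first."""
    fields = [
        {"title": service, "value": f"${amount:,.2f}", "short": True}
        for service, amount in sorted(
            costs.items(), key=lambda item: item[1], reverse=True
        )
    ]
    return {
        "channel": channel,
        "text": ":moneybag: AWS cost by service (month to date)",
        "attachments": [
            {
                "fallback": "AWS cost by service",
                "color": "#36a64f",
                "fields": fields,
                # Hypothetical identifier used to route button clicks.
                "callback_id": "cost_by_service",
                "actions": [
                    {
                        "name": "drilldown",
                        "text": "Show cost by resource",
                        "type": "button",
                        "value": "cost_by_resource",
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    import json

    print(json.dumps(cost_by_service_message("C0123456", {"S3": 12.5, "EC2": 1234.0}), indent=2))
```

A payload like this can be sent with the same Web API call used for plain-text messages, which is the kind of control the built-in Lex integration does not expose.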
Accomplishments that I'm proud of
I built a complex platform using 100% serverless components. I've been using these components already for my own cost analysis and optimization tasks and they have saved me a lot of time. I hope the chatbot interface makes it even easier to reduce AWS cost. I will for sure use it with my clients.
What I learned
Prior to this project, I hadn't used the AWS Serverless Application Model (SAM) extensively. I made it my goal to model the bot infrastructure 100% using SAM. This was a good learning experience and I'm planning to keep using SAM in future projects.
Using Lex has been a learning experience too. I will continue to use it for other projects.
What's next for MiserBot - AWS Chatbot Challenge
I'm just getting started with MiserBot. At this time it supports basic functionality, such as total cost, cost by service, cost by usage type and cost by resource.
These are examples of the coming features in the pipeline:
- Add support to gather customer feedback directly from the bot. I want to make it easy for customers to tell the bot the features they'd like to see.
- Add graphs and visualizations to responses.
- Add more queries, such as: usage cost associated with a specific resource (e.g. compute time, EBS storage, data transfer for a specific EC2 instance)
- Add cost anomaly detection (compare against previous cost and trends)
- Integration with JIRA for assigning tasks to team members (e.g. investigate an anomaly, or a cost item).
- Integration with other channels, such as Facebook and Twilio.
- Integration with CloudWatch Billing Alarms. Send a message to Slack when a cost threshold has been exceeded.
- Fine-grained billing alerts (i.e. monitor when cost exceeds a threshold for a specific AWS resource, tag or service).
- Calculate and compare cost using AWS Price List API.
- Add more "help" utterances: explain price dimensions, cost saving tips.
- Find cost-saving recommendations (e.g. reserved instance recommendations, under-utilized resources, etc.)