Automated Video Encoding Pipeline
- Source Code: GitHub Repo
- Demo Video: YouTube Link
- Architecture: Direct Link
- Presentation: PDF Version (Google Drive)
This is a simple, straightforward architecture that lets you build and automate your very own video encoding workflow on Graviton 2 processors. It's a complete solution in the form of a CloudFormation template, and very little manual setup is required.
You can configure whatever you want according to your requirements. This setup focuses on Graviton 2 because it offers up to 40% higher performance at 20% lower cost than comparable x86-based instances. Even Netflix uses these processors.
You can run this setup as is; whenever you want to trigger an encode, just upload a valid video file to the Input "folder" in the S3 bucket created by this template. S3 doesn't actually have the concept of folders/directories; objects whose keys share a prefix (such as Input/) are simply displayed as a folder to help visualize the data.
Whenever you add an object under this prefix, an S3 event invokes a Lambda function, which submits a job to the AWS Batch queue, i.e., kicks off the video encoding process.
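The trigger step above can be sketched in Python. This is a minimal, hypothetical version of such a Lambda handler, not the one in the actual template; the queue and job-definition names are placeholders:

```python
# Hypothetical sketch of a Lambda handler that turns an S3 "ObjectCreated"
# event into an AWS Batch job. Queue/job-definition names are placeholders.
import os
import urllib.parse


def build_job_request(event, queue="encode-queue", job_def="ffmpeg-encode"):
    """Extract bucket/key from the S3 event and build submit_job kwargs."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # S3 event keys are URL-encoded (spaces arrive as '+')
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return {
        "jobName": os.path.basename(key).replace(".", "-"),
        "jobQueue": queue,
        "jobDefinition": job_def,
        "containerOverrides": {
            "environment": [
                {"name": "INPUT_BUCKET", "value": bucket},
                {"name": "INPUT_KEY", "value": key},
            ]
        },
    }


def handler(event, context):
    import boto3  # imported lazily so the pure logic above is testable offline
    batch = boto3.client("batch")
    return batch.submit_job(**build_job_request(event))
```

The input file's location is passed to the job as environment variables, so the container doesn't need to know anything about the event format.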
You can read about this project in more detail in Building an automated Video Encoding Pipeline on AWS.
I had never heard of AWS's Graviton processors, and when I came across them and saw their capabilities, I couldn't resist automating something I used to do manually a few years back.
Back then, with very limited storage space, I had to run my precious videos through heavy encoding on my sluggish computer. It took at least 40 minutes to encode a 20-minute video, and that was at 848x480 resolution. Running multiple videos through the same tool manually wasn't ideal either. I've always wanted to automate that process, as it can be handy not just for people like me, but for businesses as well.
What it does
It's a completely automated system in the AWS cloud that takes videos and compresses them into smaller files with minimal loss of quality. The user can control almost every aspect of the encoding settings without directly interacting with FFmpeg or any other command-line utility. The project is built entirely as a CloudFormation template, which makes it even easier for people and businesses to try out Graviton 2.
This project can also serve as a way for businesses to test and port their existing workloads to Graviton 2 processors without the headache of manual setup: just spin up a CloudFormation stack from this template, test your workload, and decide whether you're ready for the switch.
How It Works
At the base of this pipeline is an S3 bucket configured to invoke a Lambda function on every “ObjectCreated” event. The Lambda function takes the event info and creates a job in AWS Batch.
Graviton 2 instances run in the Batch compute environment, and ECS provisions Docker containers on them. The encodes run inside these containers, and when a job is done, the final encoded file is placed in the S3 bucket of our choice.
So, all you have to do is … add a file in that S3 bucket and this whole pipeline will take care of everything else.
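Inside each container, the job boils down to: download the input from S3, run FFmpeg, upload the result. A rough, hypothetical sketch of such an entrypoint (the OUTPUT_BUCKET variable, paths, and the specific FFmpeg flags are illustrative, not the project's actual settings):

```python
# Rough sketch of a container entrypoint for a Batch encode job, assuming
# INPUT_BUCKET/INPUT_KEY are set by the Lambda and OUTPUT_BUCKET by the job
# definition. Paths, bucket names, and encode settings are illustrative.
import os
import subprocess


def ffmpeg_cmd(src, dst, crf=28, preset="medium"):
    """Build an ffmpeg command that re-encodes src to H.264 at the given CRF."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-crf", str(crf), "-preset", preset,
        "-c:a", "copy",  # pass the audio stream through untouched
        dst,
    ]


def run_job():
    import boto3  # only needed inside the container, imported lazily
    s3 = boto3.client("s3")
    key = os.environ["INPUT_KEY"]
    src = "/tmp/" + os.path.basename(key)
    dst = "/tmp/encoded-" + os.path.basename(key)
    s3.download_file(os.environ["INPUT_BUCKET"], key, src)
    subprocess.run(ffmpeg_cmd(src, dst), check=True)
    s3.upload_file(dst, os.environ["OUTPUT_BUCKET"], "Output/" + os.path.basename(key))
```

Because the encode runs entirely inside the container, swapping codecs or quality settings is just a matter of changing the command the script builds.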
What this project doesn't do
Do keep in mind that this is NOT a complete replacement for AWS Elastic Transcoder. Why do I say so? Let’s say we have a 20-second video clip. If I encode it on Elastic Transcoder, I’ll be paying for 60 seconds (60 seconds is the minimum you pay for). With this pipeline, I don’t have to.
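The billing difference is easy to make concrete. A tiny sketch of the 60-second-minimum rule described above (the function name and the exact rounding are mine, for illustration):

```python
# Illustration of the 60-second billing minimum described above:
# Elastic Transcoder bills at least 60 seconds per output, however short the clip.
import math


def transcoder_billed_seconds(duration_s, minimum_s=60):
    """Seconds actually billed for an output of the given duration."""
    return max(minimum_s, math.ceil(duration_s))
```

So a 20-second clip is billed as a full 60 seconds, while this pipeline only pays for the compute the encode actually uses.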
Also, we have an added advantage of customizing this template to our requirements or needs.
How I built it
Initially I went through multiple iterations of how this whole thing should be built. I began with a very manual setup: creating resources like an EC2 instance and then installing FFmpeg on it to encode videos. But that wasn't a handy solution and would have been difficult for others to use. It took me a few weeks of trial and error to write a CloudFormation template that actually works in 2021.
Challenges I ran into
First things first, this was my first time actually writing a CloudFormation template. Manually creating the services via the AWS Console is easy, but it isn't "transportable". I found some older CloudFormation templates for creating Batch jobs; as expected, they were outdated, and I now have 5+ CloudFormation stacks that are stuck and won't delete.
Working with something like a CloudFormation template is tricky because you can miss one very small thing and it'll haunt you for days.
I got stuck for a few days while building the "Compute Environment" in AWS Batch, and even Stack Overflow couldn't come to the rescue. So I tried joining the AWS Slack and Discord channels; as expected, no help there either. But a friend noticed the mistake and helped me out of the pickle. Here's the Stack Overflow question for interested folks: Can't perform Sts:AssumeRole
Last but not least, I worked on this whole thing alone (architecture design, implementation, documentation, 2 POC web apps, demo video, video editing, etc.). It's challenging to build such a piece on your own, especially when you have a full-time job as an SWE. It's difficult to spend workdays and weekends doing the same thing.
Accomplishments that I am proud of
It's a big one. The idea of building everything into a single CloudFormation template was pretty interesting, but midway I had almost given up due to the issues I kept running into. Finally, though, I was able to take everything I wanted and put it into a working CloudFormation template.
Another thing I'm proud of is that on top of building this AWS CloudFormation template, I built 2 POCs to showcase how this pipeline could be used. Another feat: there's almost no up-to-date documentation on how to use aws-sdk with Angular. Still, I was able to work it out, run those 2 web apps, and communicate with AWS services just fine.
What I learned
I had an idea of what Docker was, but I had never worked with it and didn't fully understand how it works. With this project, I can say I have a much better understanding of Docker now. I was able to build multi-arch Docker images and to use Amazon ECR and ECS.
What's next for Automated Video Encoding and Transcoding Pipeline
There are a few things that still need some manual setup (not more than 5 minutes of following short guides). I'd like to remove that from the equation altogether.
When I started writing the video encoding Python script, I had many ideas on how to make it much more flexible. I've implemented some of them, but I know a few areas I can optimize further to make this whole architecture much more customizable and flexible.
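One direction for that flexibility is letting users tune encodes through a few friendly options instead of raw FFmpeg flags. A hypothetical sketch of what such a mapping could look like (option names and defaults are invented here, not taken from the actual script):

```python
# Hypothetical sketch: map user-friendly options to ffmpeg arguments so users
# never touch FFmpeg directly. Option names and defaults are invented.
DEFAULTS = {"codec": "libx264", "quality": 28, "preset": "medium", "scale": None}


def options_to_args(options=None):
    """Translate a small dict of friendly options into an ffmpeg argument list."""
    opts = {**DEFAULTS, **(options or {})}
    args = ["-c:v", opts["codec"], "-crf", str(opts["quality"]),
            "-preset", opts["preset"]]
    if opts["scale"]:  # e.g. "848:480" to downscale the output
        args += ["-vf", "scale=" + opts["scale"]]
    return args
```

Each new capability (two-pass encoding, different codecs, audio settings) then becomes one more key in the options dict rather than a change to the pipeline itself.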
I haven't made the Angular apps publicly available for uploading or streaming video, because if they were public, I'd quickly exhaust my AWS resource quotas. I've given a proper walkthrough in the demo video instead; that should be more than enough.
Also, this project is more of an architectural idea/implementation, and those 2 web apps are just POCs (proofs of concept) to show how this pipeline can be utilized. You can, however, go through the CloudFormation template and the code of both POCs on my GitHub.