Inspiration

The inspiration behind this project was to explore how Computer Vision Deep Learning models can be deployed and served as scalable microservices using fully serverless AWS infrastructure. The goal was to build a lightweight, low-maintenance backend that delivers fast ML inference with minimal operational overhead, using AWS SAM, API Gateway, and Lambda (container image-based) deployment.

What it does

This project provides serverless API endpoints for four key vision-based deep learning tasks:

  • Image Classification
  • Road Segmentation
  • Image Super Resolution
  • Object Detection

The whole inference pipeline is deployed as a single containerized Lambda function, triggered by API Gateway routes.
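As a rough sketch of that single-function, multi-route design: the handler below dispatches API Gateway paths to per-task inference functions. The route paths and function names are illustrative placeholders, not the project's actual code; in the real container they would invoke OpenVINO / OpenCV DNN models loaded at startup.

```python
import json

# Hypothetical per-task inference functions; the real project would run
# OpenVINO (classification, segmentation, super resolution) and
# OpenCV DNN (detection) models here.
def classify(body): return {"task": "classification"}
def segment(body): return {"task": "segmentation"}
def super_resolve(body): return {"task": "super_resolution"}
def detect(body): return {"task": "object_detection"}

ROUTES = {
    "/classify": classify,
    "/segment": segment,
    "/superres": super_resolve,
    "/detect": detect,
}

def lambda_handler(event, context):
    # With Lambda proxy integration, API Gateway puts the request
    # path in event["path"]; one container serves every route.
    handler = ROUTES.get(event.get("path", ""))
    if handler is None:
        return {"statusCode": 404,
                "body": json.dumps({"error": "unknown route"})}
    return {"statusCode": 200,
            "body": json.dumps(handler(event.get("body")))}
```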

How we built it

We built the project using the following stack:

  • AWS Serverless Application Model (SAM) for Infrastructure as Code (IaC)

  • AWS Lambda (Container Image deployment)

  • Amazon API Gateway for REST API exposure

  • Flask (inside Lambda container) as the API framework

  • Intel OpenVINO for optimized inference on classification, segmentation, and super-resolution models

  • OpenCV DNN for object detection

  • Dockerfile for container build & deployment
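A SAM template for this kind of container-image function looks roughly like the fragment below. The resource name, memory/timeout values, and the single `/classify` event are illustrative assumptions; each real endpoint would add its own `Api` event.

```yaml
Globals:
  Api:
    # Needed so API Gateway treats image payloads as binary;
    # SAM expects "~1" in place of "/" in the MIME types.
    BinaryMediaTypes:
      - image~1jpeg
      - image~1png

Resources:
  VisionModelsFunction:      # illustrative resource name
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image     # container-image deployment
      MemorySize: 2048       # CV models need generous memory
      Timeout: 30
      Events:
        ClassifyApi:
          Type: Api
          Properties:
            Path: /classify
            Method: post
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: .
```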

Challenges we ran into

  • Lambda Image Size Limits: Fitting all required OpenVINO runtime files + models within AWS Lambda container size limits was a challenge.

  • Binary Media Types on API Gateway: We had to configure BinaryMediaTypes so API Gateway would handle image input/output correctly. Sending an image in a POST request to the Lambda endpoints was also tricky: the image has to be Base64-encoded on the way in (while staying within the request body size limit), and the response has to be decoded from Base64 back into a NumPy array of the image before it can be processed.

  • OpenCV and OpenVINO compatibility: Making sure all ML frameworks and model files worked inside the AWS Lambda container environment.
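The Base64 handling described above can be sketched with the standard library alone. The helper names are hypothetical; the final `cv2.imdecode` step is left as a comment because it depends on the OpenCV build inside the deployed container.

```python
import base64

def decode_request_image(event):
    """Extract raw image bytes from an API Gateway proxy event.

    When a BinaryMediaTypes entry matches, API Gateway sets
    isBase64Encoded and delivers the body as a Base64 string.
    """
    body = event["body"]
    if event.get("isBase64Encoded", False):
        return base64.b64decode(body)
    return body.encode() if isinstance(body, str) else body

def encode_response_image(image_bytes, content_type="image/png"):
    """Wrap processed image bytes in a Base64-encoded proxy response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": content_type},
        "isBase64Encoded": True,
        "body": base64.b64encode(image_bytes).decode("ascii"),
    }

# Inside the Lambda, the decoded bytes would next become a NumPy array:
#   img = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR)
```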

Accomplishments that we're proud of

  • Successfully deploying multiple CV models as independent API endpoints from a single Lambda container

  • Building an API-driven, production-style ML inference service with zero EC2 servers or dedicated backend infrastructure

  • Running real-time inference for image classification, semantic segmentation, object detection, and super resolution fully serverlessly

  • Having a Dockerized, OpenVINO-accelerated AI backend running on AWS Lambda at low cost

What we learned

  • How to build container-based Lambda functions for ML inference

  • Deep dive into AWS SAM deployment strategies for container apps

  • Working with OpenVINO Runtime and OpenCV DNN inside cloud environments

  • Fine-tuning API Gateway + Lambda for ML-specific payloads like base64 images

What's next for AWS Serverless Vision Models As Service

  • Add support for more CV models

  • Add S3 pre-signed POST URLs so inference result images can be uploaded directly to the S3 bucket

  • Integrate a DynamoDB database to persist inference results for Lambda API endpoints, enabling fast retrieval based on image ID

  • Set up SNS-to-SES subscriptions for email notifications, triggered by both failed and successful inference requests

  • Integrate with front-end app clients for real-time ML inference demos

  • Implement S3-triggered batch inference for large datasets

  • Add API Key Authorization or Cognito-based authentication for production hardening

  • Enable Lambda SnapStart (once supported for container images) to mitigate cold starts

  • Explore cost optimization and concurrency scaling for higher traffic workloads

Project Sponsor

Yasmeen AI
