Inspiration
The inspiration behind this project was to explore how Computer Vision Deep Learning models can be deployed and served as scalable microservices using fully serverless AWS infrastructure. The goal was to build a lightweight, low-maintenance backend that delivers fast ML inference with minimal operational overhead, using AWS SAM, API Gateway, and Lambda (container image-based) deployment.
What it does
This project provides serverless API endpoints for four key vision-based deep learning tasks:
- Image Classification
- Road Segmentation
- Image Super Resolution
- Object Detection
The whole inference pipeline is deployed as a single containerized Lambda function, triggered by API Gateway routes.
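As a rough sketch of the single-Lambda, multi-route pattern described above (handler names and paths are illustrative, not the project's actual code), an API Gateway proxy event can be dispatched to a per-task handler like this:

```python
import json

# Illustrative per-task handlers; in the real service these would run
# OpenVINO / OpenCV DNN inference and return a base64-encoded result image.
def classify(body):
    return {"task": "classification", "input_bytes": len(body)}

def segment(body):
    return {"task": "road-segmentation", "input_bytes": len(body)}

def upscale(body):
    return {"task": "super-resolution", "input_bytes": len(body)}

def detect(body):
    return {"task": "object-detection", "input_bytes": len(body)}

ROUTES = {
    "/classify": classify,
    "/segment": segment,
    "/superres": upscale,
    "/detect": detect,
}

def lambda_handler(event, context):
    """Route an API Gateway proxy event to the matching task handler."""
    handler = ROUTES.get(event.get("path", ""))
    if handler is None:
        return {"statusCode": 404, "body": json.dumps({"error": "unknown route"})}
    result = handler(event.get("body") or "")
    return {"statusCode": 200, "body": json.dumps(result)}
```

In the actual project this routing is handled by Flask inside the container; the sketch just shows why one container image can back several independent endpoints.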
How we built it
We built the project using the following stack:
- AWS Serverless Application Model (SAM) for Infrastructure as Code (IaC)
- AWS Lambda (container image deployment)
- Amazon API Gateway for REST API exposure
- Flask (inside the Lambda container) as the API framework
- Intel OpenVINO for optimized inference on the classification, segmentation, and super-resolution models
- OpenCV DNN for object detection
- A Dockerfile for the container build and deployment
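A minimal SAM template for this pattern might look like the following (resource names, paths, and sizing are illustrative, not the project's actual template):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  VisionApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      BinaryMediaTypes:
        - "image~1jpeg"   # "/" is escaped as "~1" in SAM templates
        - "image~1png"

  InferenceFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image    # container-image Lambda
      MemorySize: 2048
      Timeout: 60
      Events:
        Classify:
          Type: Api
          Properties:
            RestApiId: !Ref VisionApi
            Path: /classify
            Method: post
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: .
```

`sam build` uses the `Metadata` block to build the image from the Dockerfile, and `PackageType: Image` tells SAM to deploy it as a container-image function rather than a zip package.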
Challenges we ran into
Lambda Image Size Limits: Fitting all required OpenVINO runtime files + models within AWS Lambda container size limits was a challenge.
Binary Media Types on API Gateway: We had to configure BinaryMediaTypes so that API Gateway handled image input and output correctly. Sending an image in a POST request to the Lambda endpoints was also tricky: the image has to be encoded to Base64 to respect the request body size limits, and the response has to be decoded from Base64 back into a NumPy array before it can be processed.
OpenCV and OpenVINO compatibility: Making sure all ML frameworks and model files worked inside the AWS Lambda container environment.
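The Base64 round trip described in the challenges above can be sketched as follows (standard library only; function names are illustrative, and in the real service `cv2.imdecode` turns the decoded bytes into a NumPy array):

```python
import base64
import json

def encode_image_for_request(image_bytes: bytes) -> str:
    """Build a JSON request body with the image as a Base64 string,
    so it survives API Gateway's text-oriented payload handling."""
    return json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})

def decode_image_from_body(body: str) -> bytes:
    """Recover the raw image bytes from a JSON request/response body.

    Inside the Lambda, these bytes would then be decoded into a NumPy
    array for inference, e.g.:
        img = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR)
    """
    return base64.b64decode(json.loads(body)["image"])
```

The same encode/decode pair is used in both directions: the client encodes the input image, and the Lambda encodes the result image back into the response body.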
Accomplishments that we're proud of
- Successfully deploying multiple CV models as independent API endpoints from a single Lambda container
- Building an API-driven, production-style ML inference service with zero EC2 servers or dedicated backend infrastructure
- Running real-time inference for image classification, semantic segmentation, object detection, and super resolution fully serverlessly
- Having a Dockerized, OpenVINO-accelerated AI backend running on AWS Lambda at low cost
What we learned
- How to build container-based Lambda functions for ML inference
- Deep dive into AWS SAM deployment strategies for container apps
- Working with OpenVINO Runtime and OpenCV DNN inside cloud environments
- Fine-tuning API Gateway + Lambda for ML-specific payloads like Base64 images
What's next for AWS Serverless Vision Models As Service
- Add support for more CV models
- Add S3 pre-signed POST URLs for direct image uploads to the S3 bucket that stores inference results
- Integrate a DynamoDB table to persist inference results from the Lambda API endpoints, enabling fast retrieval by image ID
- Set up SNS-to-SES subscriptions for email notifications on both failed and successful inference requests
- Integrate with front-end app clients for real-time ML inference demos
- Implement S3-triggered batch inference for large datasets
- Add API key authorization or Cognito-based authentication for production hardening
- Enable Lambda SnapStart (once supported for container images) to mitigate cold starts
- Explore cost optimization and concurrency scaling for higher-traffic workloads
Built With
- ai
- amazon-web-services
- lambda
- ml
- python
- serverless
- vision
