Inspiration

The inspiration behind this project was to explore how Computer Vision Deep Learning models can be deployed and served as scalable microservices using fully serverless AWS infrastructure. The goal was to build a lightweight, low-maintenance backend that delivers fast ML inference with minimal operational overhead, using AWS SAM, API Gateway, and Lambda (container image-based) deployment.

What it does

This project provides serverless API endpoints for four key vision-based deep learning tasks:

  • Image Classification
  • Road Segmentation
  • Image Super Resolution
  • Object Detection

The whole inference pipeline is deployed as a single containerized Lambda function, triggered by API Gateway routes.
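As a rough sketch of that single-function, multi-route design: the handler below dispatches API Gateway paths to per-task inference functions. The route paths and function names are illustrative placeholders, not the project's actual code; in the real container they would invoke OpenVINO / OpenCV DNN models loaded at startup.

```python
import json

# Hypothetical per-task inference functions; the real project would run
# OpenVINO (classification, segmentation, super resolution) and
# OpenCV DNN (detection) models here.
def classify(body): return {"task": "classification"}
def segment(body): return {"task": "segmentation"}
def super_resolve(body): return {"task": "super_resolution"}
def detect(body): return {"task": "object_detection"}

ROUTES = {
    "/classify": classify,
    "/segment": segment,
    "/superres": super_resolve,
    "/detect": detect,
}

def lambda_handler(event, context):
    # With Lambda proxy integration, API Gateway puts the request
    # path in event["path"]; one container serves every route.
    handler = ROUTES.get(event.get("path", ""))
    if handler is None:
        return {"statusCode": 404,
                "body": json.dumps({"error": "unknown route"})}
    return {"statusCode": 200,
            "body": json.dumps(handler(event.get("body")))}
```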

How we built it

We built the project using the following stack:

  • AWS Serverless Application Model (SAM) for Infrastructure as Code (IaC)

  • AWS Lambda (Container Image deployment)

  • Amazon API Gateway for REST API exposure

  • Flask (inside Lambda container) as the API framework

  • Intel OpenVINO for optimized inference on classification, segmentation, and super-resolution models

  • OpenCV DNN for object detection

  • Dockerfile for container build & deployment
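A SAM template for this kind of container-image function looks roughly like the fragment below. The resource name, memory/timeout values, and the single `/classify` event are illustrative assumptions; each real endpoint would add its own `Api` event.

```yaml
Globals:
  Api:
    # Needed so API Gateway treats image payloads as binary;
    # SAM expects "~1" in place of "/" in the MIME types.
    BinaryMediaTypes:
      - image~1jpeg
      - image~1png

Resources:
  VisionModelsFunction:      # illustrative resource name
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image     # container-image deployment
      MemorySize: 2048       # CV models need generous memory
      Timeout: 30
      Events:
        ClassifyApi:
          Type: Api
          Properties:
            Path: /classify
            Method: post
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: .
```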

Challenges we ran into

  • Lambda Image Size Limits: Fitting all required OpenVINO runtime files + models within AWS Lambda container size limits was a challenge.

  • Binary Media Types on API Gateway: We had to configure BinaryMediaTypes so API Gateway would handle image input/output correctly. Sending an image in a POST request to the Lambda endpoints was also tricky: the image has to be Base64-encoded on the way in (while staying within the request body size limit), and the response has to be decoded from Base64 back into a NumPy array of the image before it can be processed.

  • OpenCV and OpenVINO compatibility: Making sure all ML frameworks and model files worked inside the AWS Lambda container environment.
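The Base64 handling described above can be sketched with the standard library alone. The helper names are hypothetical; the final `cv2.imdecode` step is left as a comment because it depends on the OpenCV build inside the deployed container.

```python
import base64

def decode_request_image(event):
    """Extract raw image bytes from an API Gateway proxy event.

    When a BinaryMediaTypes entry matches, API Gateway sets
    isBase64Encoded and delivers the body as a Base64 string.
    """
    body = event["body"]
    if event.get("isBase64Encoded", False):
        return base64.b64decode(body)
    return body.encode() if isinstance(body, str) else body

def encode_response_image(image_bytes, content_type="image/png"):
    """Wrap processed image bytes in a Base64-encoded proxy response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": content_type},
        "isBase64Encoded": True,
        "body": base64.b64encode(image_bytes).decode("ascii"),
    }

# Inside the Lambda, the decoded bytes would next become a NumPy array:
#   img = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR)
```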

Accomplishments that we're proud of

  • Successfully deploying multiple CV models as independent API endpoints from a single Lambda container

  • Building an API-driven, production-style ML inference service with zero EC2 servers or dedicated backend infrastructure

  • Running real-time inference for image classification, semantic segmentation, object detection, and super resolution fully serverlessly

  • Having a Dockerized, OpenVINO-accelerated AI backend running on AWS Lambda at low cost

What we learned

  • How to build container-based Lambda functions for ML inference

  • Deep dive into AWS SAM deployment strategies for container apps

  • Working with OpenVINO Runtime and OpenCV DNN inside cloud environments

  • Fine-tuning API Gateway + Lambda for ML-specific payloads like base64 images

What's next for AWS Serverless Vision Models As Service

  • Add support for more CV models

  • Add S3 pre-signed POST URLs so inference result images can be uploaded directly to the S3 bucket

  • Integrate a DynamoDB database to persist inference results for Lambda API endpoints, enabling fast retrieval based on image ID

  • Set up SNS-to-SES subscriptions for email notifications, triggered by both failed and successful inference requests

  • Integrate with front-end app clients for real-time ML inference demos

  • Implement S3-triggered batch inference for large datasets

  • Add API Key Authorization or Cognito-based authentication for production hardening

  • Enable Lambda SnapStart (once supported for container images) to mitigate cold starts

  • Explore cost optimization and concurrency scaling for higher traffic workloads

Project Sponsor

Yasmeen AI
