For my entry into the AI Hackathon, I wanted to build something that used a variety of AWS offerings rather than a single service.

The basic premise was to build a pipeline where you provide an image containing some text and have the text extracted, the language identified, the sentiment evaluated and, if needed, a translation into English provided. There are some real-world use cases where this could be useful, e.g. taking a picture of the whiteboard after a meeting to keep a record of the discussion and use the text for indexing, or seeing an article about a favourite celebrity in a foreign language and finding out what it says about them.
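To make the flow concrete, here is a minimal sketch of the four steps in Python, not the code the project actually runs. The function name `process_image` and the shape of the returned dictionary are my own choices for illustration. The clients are passed in as parameters and are assumed to be boto3 clients for Rekognition, Comprehend and Translate. Note that Comprehend's sentiment analysis only supports a subset of languages, so a real handler would need to cope with an unsupported language code.

```python
def process_image(image_bytes, rekognition, comprehend, translate):
    """Extract text from an image, then detect language, sentiment and translate.

    The three clients are assumed to be boto3 clients, e.g.
    boto3.client("rekognition"), boto3.client("comprehend"),
    boto3.client("translate").
    """
    # Step 1: extract text. Rekognition returns both LINE and WORD
    # detections; keeping only LINE entries avoids duplicating every word.
    detections = rekognition.detect_text(Image={"Bytes": image_bytes})
    lines = [
        d["DetectedText"]
        for d in detections["TextDetections"]
        if d["Type"] == "LINE"
    ]
    text = "\n".join(lines)

    result = {"text": text, "language": None, "sentiment": None, "translation": None}
    if not text:
        # Nothing was detected, so there is nothing to analyse.
        return result

    # Step 2: identify the language, taking the highest-scoring candidate.
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    language = max(languages, key=lambda item: item["Score"])["LanguageCode"]
    result["language"] = language

    # Step 3: evaluate the sentiment of the text in its original language.
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode=language)
    result["sentiment"] = sentiment["Sentiment"]

    # Step 4: translate into English only if the text is not already English.
    if language != "en":
        translated = translate.translate_text(
            Text=text,
            SourceLanguageCode=language,
            TargetLanguageCode="en",
        )
        result["translation"] = translated["TranslatedText"]

    return result
```

Passing the clients in, rather than creating them inside the function, keeps the sketch easy to try out with stand-in objects before pointing it at real AWS services.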

Originally I wanted to build a loosely coupled architecture using SQS to move the messages through the pipeline, or use a state machine to co-ordinate all the actions, but after building a proof of concept of the Rekognition, Comprehend and Translate parts of the solution, it became clear that their performance made that extra engineering unnecessary. That being said, a future extension of the solution would probably be better served by decoupling all the actions to make it more robust in handling exceptions or unexpected performance degradations from the AWS services.

For the security and privacy part of the solution, the pipeline is only addressable through a single email address, and the results are returned only to the original sender. Additionally, all emails and attachments are deleted from the S3 buckets after processing. No sensitive data is logged to the CloudWatch logs either.
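As an illustration of the delete-after-processing idea, the following is a small sketch rather than the project's actual handler. The helper name `process_and_delete` and the `handler` callback are hypothetical, and `s3` is assumed to be a boto3 S3 client. Deleting in a `finally` block, so the object is removed even when processing fails, is my own assumption about the desired behaviour.

```python
def process_and_delete(s3, bucket, key, handler):
    """Read an object from S3, process it, and always delete it afterwards.

    `s3` is assumed to be a boto3 S3 client; `handler` is a hypothetical
    callback that receives the raw bytes of the stored email or attachment.
    """
    try:
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        return handler(body)
    finally:
        # Runs whether or not the handler raised, so no email or
        # attachment is left behind in the bucket.
        s3.delete_object(Bucket=bucket, Key=key)
```

If failed messages need to be kept for debugging, the delete would have to move out of the `finally` block, at the cost of weakening the privacy guarantee.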

Tools used

  1. Python
  2. AWS Lambda
  3. AWS Rekognition
  4. AWS Comprehend
  5. AWS Translate
  6. AWS Simple Email Service (SES)
  7. AWS S3

Known limitations

  1. The AWS Rekognition DetectText action can detect at most 50 words in a single image.
  2. Rekognition only supports JPEG and PNG images.
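Given the format limitation, one option is to check an attachment before sending it to Rekognition. The sketch below is only an illustration of that idea and not something the project is known to do. The function name `image_format` is hypothetical. It inspects the standard JPEG and PNG file signatures at the start of the data instead of trusting the file extension.

```python
# Standard file signatures ("magic bytes") for the two supported formats.
JPEG_SIGNATURE = b"\xff\xd8\xff"
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def image_format(data):
    """Return "jpeg" or "png" for supported images, otherwise None."""
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    if data.startswith(PNG_SIGNATURE):
        return "png"
    return None
```

An attachment for which this returns `None` could be rejected with a friendly reply instead of triggering an error from Rekognition.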


To use the tool, simply send an email with an image attached to the pipeline's address.
