Mail Image OCR

[Image: Workflow]
[Image: Working Solution Example 2]
[Image: Working Solution Example 1]

For my entry into the AI Hackathon, I wanted to build something that made use of the variety of offerings from AWS, rather than relying on a single service.

The basic premise was to build a pipeline where you provide an image containing some text and have the text extracted, its language identified, its sentiment evaluated and, if needed, a translation into English provided. There are some real-world use cases where this could be useful, e.g. taking a picture of the whiteboard after a meeting to keep a record of the discussion and using the text for indexing, or seeing an article about a favourite celebrity in a foreign language and finding out what it says about them.

Originally I wanted to build a loosely coupled architecture using SQS to move the messages through the pipeline, or use a state model to co-ordinate all the actions. After building a proof of concept of the Rekognition, Comprehend and Translate parts of the solution, though, it became clear that they performed well enough that this would have been over-engineering. That being said, a future extension of the solution would probably be better served by decoupling all the actions, to make it more robust in handling exceptions or unexpected performance degradation in the AWS services.
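As a rough illustration of the simple, synchronous version of the pipeline, the steps could be chained in a single function along the lines below. This is a minimal sketch rather than the actual hackathon code: the names `OcrResult`, `make_clients` and `process_image` are hypothetical, the region is a placeholder, and it assumes the boto3 clients for Rekognition, Comprehend and Translate. It also assumes sentiment is evaluated on the original text, which only works for the languages Comprehend's sentiment analysis supports.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class OcrResult:
    text: str
    language: str
    sentiment: str
    translation: Optional[str]  # None when the text is already in English


def make_clients(region: str = "eu-west-1") -> tuple:
    """Create real AWS clients (needs boto3 and credentials; the region is an assumption)."""
    import boto3  # imported lazily so the pipeline logic can be exercised without AWS

    return (
        boto3.client("rekognition", region_name=region),
        boto3.client("comprehend", region_name=region),
        boto3.client("translate", region_name=region),
    )


def process_image(image_bytes: bytes, rekognition: Any, comprehend: Any, translate: Any) -> OcrResult:
    """Run text extraction, language detection, sentiment and optional translation in sequence."""
    # 1. Extract text. Keep LINE detections only; WORD detections repeat the same text.
    detections = rekognition.detect_text(Image={"Bytes": image_bytes})["TextDetections"]
    text = "\n".join(d["DetectedText"] for d in detections if d["Type"] == "LINE")
    if not text:
        raise ValueError("no text detected in image")

    # 2. Identify the dominant language (the candidate with the highest score).
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    language = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    # 3. Evaluate sentiment on the original text (assumes the language is supported).
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode=language)["Sentiment"]

    # 4. Translate into English only when needed.
    translation = None
    if language != "en":
        translation = translate.translate_text(
            Text=text, SourceLanguageCode=language, TargetLanguageCode="en"
        )["TranslatedText"]

    return OcrResult(text=text, language=language, sentiment=sentiment, translation=translation)
```

With real credentials this would be called as `process_image(image_bytes, *make_clients())`. Passing the clients in as arguments keeps each step easy to stub out, which would also make it simpler to split the steps apart later if the decoupled design were revisited.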

For the security and privacy part of the solution, the pipeline can only be reached through a single email address, AIHackathon@perrie.uk, and the results are returned only to the original sender. Additionally, all emails and attachments are deleted from the S3 buckets after processing. No sensitive data is written to the CloudWatch logs either.
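The "reply only to the sender, then delete" behaviour could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual implementation: it assumes the raw email is stored as a single S3 object, that images arrive as ordinary attachments, and that `s3` is a boto3 S3 client. The function names and the `process` callback are hypothetical.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr
from typing import Any, Callable, List, Tuple


def extract_sender_and_images(raw_email: bytes) -> Tuple[str, List[bytes]]:
    """Return the sender address and the decoded bytes of every image attachment."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    # parseaddr strips any display name, leaving only the address to reply to.
    sender = parseaddr(str(msg.get("From", "")))[1]
    images = [
        part.get_payload(decode=True)
        for part in msg.iter_attachments()
        if part.get_content_maintype() == "image"
    ]
    return sender, images


def handle_stored_email(
    s3: Any, bucket: str, key: str, process: Callable[[str, List[bytes]], Any]
) -> Any:
    """Load a stored email, hand it to `process`, and always delete it afterwards."""
    raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    try:
        sender, images = extract_sender_and_images(raw)
        # `process` is expected to run the pipeline and reply to `sender` only.
        return process(sender, images)
    finally:
        # Remove the email and its attachments even if processing failed.
        s3.delete_object(Bucket=bucket, Key=key)
```

Putting the delete in a `finally` block is one way to make sure a failed run does not leave a message behind in the bucket. Logging only the object key, never the sender or the extracted text, would keep the logs free of sensitive data.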