MenuVision

Inspiration

At restaurants, I often found myself looking up pasta types like penne, fettuccine, or linguine just to remember what they looked like.
I also lived in Thailand for five years, where I recognized well-known local dishes like pad thai or tom yum kung on the menu, but most others were unfamiliar, and I couldn’t even imagine how they might look.

MenuVision was created to solve this problem: to help people understand and visualize what’s on a menu, especially in a foreign country or culture.

What it does

MenuVision is a web app that:

Extracts text from uploaded menu PDFs using Amazon Textract
Translates foreign languages into English using Amazon Translate
Displays the translated text for users to explore
Lets users select a dish name and generates a realistic image of it using Amazon Bedrock
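
The extraction and translation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `lines_from_textract` and `translate_lines` are hypothetical, the response is assumed to be a standard Textract result with a `Blocks` list, and the Translate client is assumed to be a boto3 client passed in by the caller.

```python
from typing import Any, Dict, List


def lines_from_textract(response: Dict[str, Any]) -> List[str]:
    """Collect the text of every non-empty LINE block in a Textract response."""
    return [
        block["Text"].strip()
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and block.get("Text", "").strip()
    ]


def translate_lines(
    translate_client: Any, lines: List[str], target: str = "en"
) -> List[Dict[str, str]]:
    """Translate each menu line, keeping the original next to the translation.

    translate_client is assumed to be boto3.client("translate"). Passing
    "auto" as the source language asks Amazon Translate to detect it.
    """
    results = []
    for line in lines:
        reply = translate_client.translate_text(
            Text=line,
            SourceLanguageCode="auto",
            TargetLanguageCode=target,
        )
        results.append({"original": line, "translated": reply["TranslatedText"]})
    return results
```

Keeping the original line beside its translation is one way to let the user pick a dish by either name; whether MenuVision stores both is an assumption.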

How I built it

I used:

Flask for the backend
HTML/CSS/JavaScript for the frontend
Amazon S3 for storing PDFs and image output
AWS Step Functions to orchestrate the flow
Three Lambda functions:
- extractTextLambda for text extraction via Textract
- translateTextLambda for translation via Amazon Translate
- generateImageLambda for creating AI food images using Bedrock
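
To make the orchestration concrete, here is a hedged sketch of how a linear Step Functions definition chaining Lambda functions could be built. The helper name `build_chain_definition`, the state names, and the ARNs are all placeholders, and it is an assumption that the steps run as one chain; in the real app the image step may well be started separately once the user picks a dish.

```python
import json
from typing import Any, Dict, List, Tuple


def build_chain_definition(steps: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Build an Amazon States Language definition that runs tasks in order.

    steps is a list of (state_name, lambda_arn) pairs. Each state is a Task
    that hands its output to the next one; the last state ends the execution.
    """
    if not steps:
        raise ValueError("at least one step is required")
    states: Dict[str, Any] = {}
    for index, (name, arn) in enumerate(steps):
        state: Dict[str, Any] = {"Type": "Task", "Resource": arn}
        if index + 1 < len(steps):
            state["Next"] = steps[index + 1][0]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": steps[0][0], "States": states}


def definition_json(steps: List[Tuple[str, str]]) -> str:
    """Serialize the definition as the JSON string Step Functions expects."""
    return json.dumps(build_chain_definition(steps), indent=2)
```

For example, `definition_json([("ExtractText", "<extractTextLambda ARN>"), ("TranslateText", "<translateTextLambda ARN>")])` yields a two-state machine in which the output of the first Lambda becomes the input of the second.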

Pre-signed S3 URLs let the browser upload PDFs directly to S3. I ran and tested the app locally.
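
A pre-signed upload URL can be issued along these lines. This is a sketch under assumptions: the helper names `new_upload_key` and `create_upload_url`, the `uploads/` prefix, and the five-minute expiry are illustrative choices, and `s3_client` is assumed to be a boto3 S3 client created by the Flask backend.

```python
import uuid
from typing import Any, Dict


def new_upload_key(filename: str, prefix: str = "uploads/") -> str:
    """Build a unique object key so two uploads of 'menu.pdf' never collide."""
    safe_name = filename.replace("/", "_").replace("\\", "_")
    return f"{prefix}{uuid.uuid4().hex}-{safe_name}"


def create_upload_url(
    s3_client: Any, bucket: str, key: str, expires_in: int = 300
) -> Dict[str, str]:
    """Return a short-lived URL that the browser can PUT a PDF to.

    s3_client is assumed to be boto3.client("s3"). Because ContentType is
    part of the signed parameters, the browser has to send the same
    Content-Type header with its PUT request.
    """
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key, "ContentType": "application/pdf"},
        ExpiresIn=expires_in,
    )
    return {"url": url, "key": key}
```

A Flask route could return this dictionary as JSON, and the frontend would then upload the file with a `PUT` to `url`. The bucket also needs a CORS rule that allows `PUT` from the page's origin, or the browser will block the request.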

Challenges I ran into

CORS errors while connecting the frontend with Flask and AWS resources
Textract failing on some non-standard or unsupported PDF formats
Handling foreign text that was romanized or partially translated
Prompt tuning for consistent, high-quality image generation
Synchronizing the frontend display with the completion of the Step Functions execution
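
One common way to handle the last challenge is to poll the execution until it reaches a terminal status. The sketch below is an assumption about how that could look, not a description of what MenuVision does: `wait_for_execution` is a hypothetical helper, and `sfn_client` is assumed to be a boto3 Step Functions client.

```python
import json
import time
from typing import Any, Callable

TERMINAL_FAILURES = {"FAILED", "TIMED_OUT", "ABORTED"}


def wait_for_execution(
    sfn_client: Any,
    execution_arn: str,
    poll_seconds: float = 2.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll describe_execution until the run succeeds, fails, or times out.

    On success the execution's JSON output is parsed and returned.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        description = sfn_client.describe_execution(executionArn=execution_arn)
        status = description["status"]
        if status == "SUCCEEDED":
            return json.loads(description.get("output") or "null")
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"execution ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError(f"execution still {status} after {timeout_seconds}s")
        sleep(poll_seconds)
```

An alternative that keeps Flask requests short is to expose a status endpoint that calls `describe_execution` once and let the frontend poll that endpoint on a timer.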

Accomplishments that I am proud of

Successfully integrating multiple AWS services with smooth coordination
Creating an intuitive user experience with real-time translation and image generation
Making the app visually appealing and usable, even in a local environment
Building a full-stack AI product that solves a relatable global problem

What I learned

A deeper understanding of AWS Lambda, Step Functions, Textract, Translate, and Bedrock
How to structure and connect serverless workflows
Frontend-to-backend orchestration using pre-signed S3 uploads and JSON APIs
The value of prompt engineering and error handling in AI apps
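
As an illustration of the last point, a prompt template with basic input checks might look like the sketch below. The wording of the prompt, the length limit, and the name `build_dish_prompt` are assumptions made for this example. The actual prompt and Bedrock model used in MenuVision are not shown here.

```python
import re
from typing import Optional

MAX_DISH_NAME_LENGTH = 80  # illustrative limit, not taken from the project


def build_dish_prompt(dish_name: str, cuisine: Optional[str] = None) -> str:
    """Turn a dish name selected from the menu into an image prompt.

    Whitespace is normalized and empty or overly long names are rejected
    up front, so a bad selection fails before any model call is made.
    """
    cleaned = re.sub(r"\s+", " ", dish_name).strip()
    if not cleaned:
        raise ValueError("dish name is empty")
    if len(cleaned) > MAX_DISH_NAME_LENGTH:
        raise ValueError("dish name is too long")
    subject = f"{cleaned}, a {cuisine} dish" if cuisine else cleaned
    return (
        f"A realistic, appetizing photo of {subject}, "
        "plated on a restaurant table, natural lighting, no text or watermark"
    )
```

Using one fixed template for every dish is a simple way to keep the generated images consistent in style, since only the dish name varies between requests.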