🌟 Inspiration

When travelers visit Vietnam, they face two major challenges:

  1. Fear of Being Scammed 💸: Tourist traps are everywhere. Overpriced bills, fake "local" restaurants, and inflated prices for foreigners create anxiety and distrust. Many travelers end up paying 2-3x the fair price without even knowing it.

  2. Food Paralysis 🤔: Vietnamese cuisine is incredibly diverse, with hundreds of regional dishes. Travelers don't know:

    • What dishes to try
    • How to eat them properly (phở, bánh xèo, nem rán...)
    • Which places are authentic vs tourist traps
    • How to communicate in restaurants when they don't speak Vietnamese

The Result? Travelers miss out on authentic experiences, waste money, and never truly discover the incredible food culture Vietnam has to offer.

Phở.AI solves this. We combine AI-powered computer vision, natural language processing, and local knowledge to give travelers superpowers in Vietnamese restaurants.


🎯 What It Does

Phở.AI is your all-in-one Vietnamese food assistant with 5 powerful features:

📸 Menu Scanner

  • Snap a photo of any Vietnamese menu → Instant translation & explanation
  • Get detailed info on each dish: ingredients, taste profile, spice level, allergens
  • Learn how to eat it properly with cultural context
  • Available in English & Vietnamese

🍲 Food Recognition

  • Don't know what you're eating? Take a photo and find out
  • Learn the dish name, origin story, cultural significance
  • Get proper eating instructions (utensils, condiments, dipping sauces)
  • See fair price estimates for your current location

🗣️ Voice Assistant

  • Speak in your language → AI translates to Vietnamese
  • Supports: English, Korean (한국어), Chinese (中文), Japanese (日本語)
  • Get pronunciation help so locals understand you
  • Translate common restaurant phrases

🎯 Smart Recommendations

  • Get a personalized food itinerary for your trip
  • Filter by: budget, dietary restrictions (halal, vegetarian, gluten-free)
  • Discover hidden local gems tourists don't know about
  • Optimized routes based on your travel days and locations

💰 Price Check & Scam Alert

  • Scan your bill → AI checks if prices are fair
  • Instant scam alerts if you're being overcharged
  • See average prices for each item in your area
  • Detailed breakdown: which items are fair vs overpriced

🚀 How We Built It

Tech Stack

Frontend:

  • Next.js 14 (App Router) with TypeScript
  • Tailwind CSS for styling
  • Shadcn/ui components
  • React Webcam for camera integration

AI & Vision:

  • Gemini-3.0_Flash for image analysis, OCR, and NLP
  • Custom prompts for Vietnamese food expertise
  • Multi-language translation pipeline

Backend:

  • Next.js API Routes
  • IndexedDB for client-side history/caching
  • Image compression for performance

Infrastructure:

  • Vercel for hosting
  • OpenStreetMap Nominatim for geolocation
  • Web Speech API for voice features

Key Technical Achievements

  1. Smart Image Compression - Automatically compresses images to at most 5MB while maintaining enough quality for AI analysis
  2. Offline History - Uses IndexedDB to cache analysis results, saving API costs
  3. Multi-language Support - Built i18n system supporting Vietnamese and English
  4. Responsive Camera - Works on desktop and mobile, with a file upload fallback
  5. Location-Aware Pricing - Geolocation integration for accurate price estimates
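
To illustrate the location-aware pricing step, here is a minimal sketch of how a reverse-geocoding request to the public Nominatim endpoint could be assembled and how its result could be turned into a short area description. The names `buildReverseGeocodeUrl` and `describeArea`, the coordinate rounding, and the choice of address fields are assumptions made for this example, not our exact code.

```typescript
// Subset of the fields Nominatim may return under "address".
// Which keys are present varies by place, so every field is optional.
interface NominatimAddress {
  suburb?: string;
  city_district?: string;
  city?: string;
  state?: string;
  country?: string;
}

// Build a reverse-geocoding URL for the public Nominatim endpoint.
// Coordinates are rounded to 5 decimals (roughly metre-level precision).
function buildReverseGeocodeUrl(lat: number, lon: number): string {
  const params: Array<[string, string]> = [
    ["format", "jsonv2"],
    ["lat", lat.toFixed(5)],
    ["lon", lon.toFixed(5)],
    ["accept-language", "en"],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://nominatim.openstreetmap.org/reverse?${query}`;
}

// Pick the most specific area names available to use as pricing context.
function describeArea(address: NominatimAddress): string {
  const parts = [
    address.city_district ?? address.suburb,
    address.city ?? address.state,
    address.country,
  ].filter((part): part is string => typeof part === "string" && part.length > 0);
  return parts.join(", ");
}
```

The resulting string (for example a district plus city) can then be passed into the AI prompt as location context.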

💪 Challenges We Ran Into

1. Menu OCR Accuracy

Vietnamese menus often have:

  • Handwritten text
  • Mixed Vietnamese-English
  • Low contrast photos
  • Decorative fonts

Solution: Refined our Gemini prompts with context about Vietnamese cuisine and tested them against 50+ real menu photos.

2. Price Database

No existing API that we could find offers Vietnamese street food prices by district.

Solution: Built prompts that draw on Gemini's general knowledge of Vietnamese prices and add location context to make the estimates more accurate.
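
As a rough idea of what such a prompt could look like, the sketch below assembles a price-check prompt from scanned bill items and an area description. The `BillItem` shape, the function name `buildPriceCheckPrompt`, the prompt wording, and the requested JSON keys are all hypothetical, chosen only to show the pattern of injecting location context.

```typescript
// One line of a scanned bill. The field names are assumptions for this sketch.
interface BillItem {
  name: string;
  priceVnd: number;
}

// Assemble a prompt that gives the model the location and the charged prices,
// and asks for a structured verdict per item.
function buildPriceCheckPrompt(items: BillItem[], area: string): string {
  const itemLines = items.map(
    (item, index) => `${index + 1}. ${item.name}: ${item.priceVnd} VND`
  );
  return [
    "You are an expert on Vietnamese food prices.",
    `The customer is eating in: ${area}.`,
    "For each bill item below, estimate a typical local price range in VND",
    'and label the charged price as "fair" or "overpriced".',
    "Respond with a JSON array of objects with the keys:",
    "name, chargedVnd, typicalMinVnd, typicalMaxVnd, verdict.",
    "",
    "Bill items:",
    ...itemLines,
  ].join("\n");
}
```

Because the price ranges come from the model rather than a curated database, they are best treated as estimates.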

3. Image Size Limits

The Gemini API has a 20MB request limit, and phone photos are often 10-15MB before base64 encoding adds roughly a third to their size.

Solution: Implemented smart compression using Canvas API that reduces size by 70% while preserving OCR quality.
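
The core decisions in Canvas-based compression can be sketched without touching the DOM: how far to scale the image down, and which JPEG quality to settle on. The helper names `fitWithin` and `pickQuality`, the quality ladder, and the 2048px cap used in the note below are assumptions for illustration. In a browser, the `encodedBytesAt` callback would draw the resized image onto a canvas and measure the output of `canvas.toDataURL("image/jpeg", quality)`.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale dimensions down so the longer side is at most maxSide,
// preserving aspect ratio. Images that already fit are left unchanged.
function fitWithin(original: Size, maxSide: number): Size {
  const longest = Math.max(original.width, original.height);
  if (longest <= maxSide) {
    return { width: original.width, height: original.height };
  }
  const scale = maxSide / longest;
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}

// Walk the quality ladder from high to low and return the first quality
// whose encoded size fits under maxBytes. If none fits, the last (lowest)
// quality tried is returned; an empty ladder yields 1.
function pickQuality(
  encodedBytesAt: (quality: number) => number,
  maxBytes: number,
  qualities: number[] = [0.9, 0.8, 0.7, 0.6, 0.5]
): number {
  let chosen = 1;
  for (const quality of qualities) {
    chosen = quality;
    if (encodedBytesAt(quality) <= maxBytes) {
      break;
    }
  }
  return chosen;
}
```

For a 4032x3024 phone photo and a 2048px cap, `fitWithin` gives 2048x1536. The quality search then only needs to run on the already-downscaled image.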

4. Cross-Browser Voice Recognition

Web Speech API has inconsistent browser support.

Solution: Built a fallback system with clear user guidance and browser detection.
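
A minimal sketch of the detection step, assuming the fallback is a typed-text input: some browsers expose the recognizer as `SpeechRecognition`, others only as the prefixed `webkitSpeechRecognition`, and some not at all. The `SpeechGlobals` interface, the `VoiceMode` values, and the guidance wording are hypothetical names for this example.

```typescript
// Minimal view of the globals we probe. In a browser this would be `window`,
// cast to this shape.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

type VoiceMode = "speech" | "text-input";

// Prefer the standard constructor, then the webkit-prefixed one,
// and fall back to typed input when neither exists.
function detectVoiceMode(globals: SpeechGlobals): VoiceMode {
  const ctor = globals.SpeechRecognition ?? globals.webkitSpeechRecognition;
  return typeof ctor === "function" ? "speech" : "text-input";
}

// Short guidance message shown to the user for each mode.
function voiceGuidance(mode: VoiceMode): string {
  return mode === "speech"
    ? "Tap the microphone and speak."
    : "Voice input is not supported in this browser. Type your phrase instead.";
}
```

Passing the globals in as a parameter keeps the check easy to test outside a browser.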

5. Mobile Camera Access

Browsers require HTTPS for camera access in production.

Solution: Deployed on Vercel with automatic HTTPS and added file upload as a backup.
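
The choice between live camera and file upload can be reduced to two checks, sketched below under the assumption that the caller reads them from the browser (`window.isSecureContext` and the presence of `navigator.mediaDevices.getUserMedia`). The `CaptureEnv` shape and the function name `chooseCaptureMode` are hypothetical.

```typescript
// Facts about the current page, gathered by the caller from browser globals.
interface CaptureEnv {
  isSecureContext: boolean; // true on HTTPS pages and on localhost
  hasGetUserMedia: boolean; // true when navigator.mediaDevices.getUserMedia exists
}

type CaptureMode = "camera" | "upload";

// Use the live camera only when the page is in a secure context and the API
// is present; otherwise fall back to a plain file upload.
function chooseCaptureMode(env: CaptureEnv): CaptureMode {
  return env.isSecureContext && env.hasGetUserMedia ? "camera" : "upload";
}
```

The user can still deny the permission prompt at runtime, so the upload path is worth keeping reachable even in "camera" mode.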


🏆 Accomplishments We're Proud Of

  • Shipped 5 complete features in a tight timeline
  • 94%+ AI accuracy on menu translation (tested with 50+ menus)
  • Mobile-first design that works on any device
  • Zero database costs - history and cached results live in IndexedDB on the client
  • Bilingual UI with seamless language switching
  • Production-ready with proper error handling and loading states
  • Accessible - keyboard navigation, screen reader support

📚 What We Learned

Technical Skills

  • Gemini API mastery - learned to craft effective prompts for vision + NLP tasks
  • Next.js 14 App Router - modern React patterns with server/client components
  • IndexedDB - client-side database for offline-first apps
  • Image optimization - balancing quality vs API limits
  • Geolocation APIs - reverse geocoding without API keys

Product Design

  • User empathy - talked to travelers to understand real pain points
  • Feature prioritization - focused on high-impact features first
  • Progressive enhancement - built fallbacks for unsupported features

AI/ML Insights

  • Gemini-3.0_Flash is incredibly fast and cost-effective for vision tasks
  • Prompt engineering is crucial - in our testing, small wording changes improved accuracy by 30% or more
  • Context matters - providing location/culture context improves AI responses

🔮 What's Next

Short Term

  • [ ] User accounts - save favorite dishes, history sync
  • [ ] Offline mode - PWA with cached translations
  • [ ] More languages - Spanish, French, German
  • [ ] Restaurant reviews - community ratings and tips
  • [ ] Map integration - find recommended places near you

Long Term

  • [ ] AI Chat - conversational food advisor
  • [ ] Dietary tracking - calories, allergens, nutrition
  • [ ] Social features - share itineraries, follow foodies
  • [ ] AR menu overlay - point camera at menu for instant AR translation
  • [ ] Marketplace - book food tours, cooking classes
