Inspiration

Most people want to eat healthier, waste less food, and spend less on groceries.
Yet one simple question remains surprisingly difficult to answer every day:

“What should I eat today?”

When standing in front of the fridge, people often don’t know whether what they are about to eat actually fits their health goals—such as calorie limits, protein intake, or dietary restrictions. At the same time, many households lose track of what they have already bought, leading to ingredients expiring before they are used.

Food decisions are deeply personal. However, most nutrition advice and food apps provide generic recommendations that ignore these personal constraints. Existing health apps mainly act as passive trackers. They log what you have eaten but provide little help in deciding what you should eat next, especially based on what is already in your fridge.

As a result, many everyday food decisions lead to less healthy meals, wasted ingredients, and unnecessary grocery spending.

That’s why we set out to build an AI system that actively helps people make smarter food decisions in real time—turning everyday kitchen moments into opportunities for healthier living and more sustainable food consumption.


What it does

We built an agentic AI assistant that acts as both a personal dietitian and grocery manager.

It can see your fridge, predict ingredient spoilage, and autonomously design your next meal plan to meet your nutrition goals while minimizing food waste.

Users can:

  • Set personal goals and constraints (maintain health, lose weight, build muscle)
  • Define allergies, dietary preferences, time availability, and budget constraints
  • Upload a fridge photo, meal photo, or grocery receipt
  • Provide customized instructions or requests through chat

The agent will:

  • Store and update user eating habits, health goals, and budget preferences
  • Automatically detect ingredients, portion sizes, and spoilage risks from uploaded images
  • Plan personalized meals by balancing:
    • available ingredients
    • nutrition targets
    • sustainability metrics
    • cost constraints
  • Retrieve recipes through a Retrieval-Augmented Generation (RAG) pipeline
  • Provide:
    • recommended recipes
    • step-by-step cooking instructions
    • nutrition breakdowns
    • ingredient substitutions
    • missing ingredients
    • optional grocery lists

As users interact with the system, the agent continuously updates memory and replans, creating a feedback loop of personalization and improvement.


How we built it

We combined several technologies and frameworks to build a multimodal agentic system.

Vision and Language Models

  • OpenAI / Gemini models perform semantic segmentation on images and OCR on receipts
  • Extract structured ingredient lists and quantities from user inputs
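Concretely, the extraction step can target a small structured schema like the sketch below. This is an illustration only: the field names (`spoilage_days`, `confidence`, etc.) are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical target schema for the vision model's output: the model is
# prompted to return JSON that parses into this structure.
@dataclass
class DetectedIngredient:
    name: str
    quantity: float      # amount in `unit` below
    unit: str            # e.g. "g", "ml", "count"
    spoilage_days: int   # estimated days until spoilage
    confidence: float    # detection confidence, 0..1

@dataclass
class FridgeScan:
    source: str  # "fridge_photo" | "meal_photo" | "receipt"
    ingredients: list[DetectedIngredient] = field(default_factory=list)

    def urgent(self, within_days: int = 2) -> list[DetectedIngredient]:
        """Ingredients that should be used soon (feeds the spoilage planner)."""
        return [i for i in self.ingredients if i.spoilage_days <= within_days]
```

Downstream stages (spoilage prioritization, planning) then work with this typed list instead of raw model text.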

Agent Orchestration

  • A central agent loop following the ReAct (Reason + Act) framework: perceive → prioritize → retrieve context → query recipe → formulate plan → reflect → act

  • We utilized Railtracks to manage the agent pipeline, tool execution, and safe fallback behavior across the planning loop.
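The stage loop above can be sketched in plain Python. This is a framework-agnostic illustration of the control flow, not the Railtracks API; the bounded replan stands in for the safe-fallback behavior mentioned above.

```python
from typing import Any, Callable

# Stages mirror the loop described above:
# perceive -> prioritize -> retrieve context -> query recipe -> plan -> reflect -> act
STAGES = ["perceive", "prioritize", "retrieve_context",
          "query_recipe", "formulate_plan", "reflect", "act"]

def run_agent_loop(handlers: dict[str, Callable[[dict], Any]],
                   state: dict, max_reflections: int = 2) -> dict:
    """Run each stage in order; `reflect` may send the loop back to
    `formulate_plan`, bounded by max_reflections as a fallback against
    infinite replanning."""
    i, reflections = 0, 0
    while i < len(STAGES):
        stage = STAGES[i]
        state[stage] = handlers[stage](state)
        if (stage == "reflect" and state[stage] == "replan"
                and reflections < max_reflections):
            reflections += 1
            i = STAGES.index("formulate_plan")
            continue
        i += 1
    return state
```

Each handler is a tool or model call in practice; the loop only fixes the ordering and the reflect-then-replan edge.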

Memory Layer

  • Stores user nutrition goals
  • Dietary restrictions
  • Grocery budgets
  • Historical meals and preferences
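A minimal sketch of what one memory record might hold (field names are assumptions for illustration, not the project's actual schema):

```python
from dataclasses import dataclass, field

# Illustrative shape of the memory layer described above.
@dataclass
class UserMemory:
    nutrition_goals: dict[str, float] = field(default_factory=dict)  # e.g. {"protein_g": 120}
    restrictions: set[str] = field(default_factory=set)              # e.g. {"peanuts"}
    weekly_budget: float = 0.0
    meal_history: list[str] = field(default_factory=list)            # recipe ids

    def record_meal(self, recipe_id: str) -> None:
        """Append to history so the planner can avoid repetition."""
        self.meal_history.append(recipe_id)

    def allows(self, ingredients: list[str]) -> bool:
        """True if no ingredient violates a stored restriction."""
        return not self.restrictions.intersection(ingredients)
```

The planner reads this record before each run and writes back after each interaction, which is what closes the personalization feedback loop.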

Recipe Retrieval

  • External recipe API for detailed recipe instructions and nutrition data
  • RAG pipeline built on curated recipe datasets with embeddings and vector search
  • Retrieves grounded recipes matching ingredient availability and nutrition goals
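The ranking step at the core of this retrieval can be shown with a toy example. Hand-rolled vectors stand in here for the real embeddings and vector store; only the cosine-similarity ranking is illustrated.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float],
             recipes: list[tuple[str, list[float]]],
             k: int = 2) -> list[str]:
    """recipes: (recipe_id, embedding) pairs. Returns top-k ids by similarity.
    In the real pipeline the vector store performs this search at scale."""
    ranked = sorted(recipes, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [rid for rid, _ in ranked[:k]]
```

The query embedding is built from the user's available ingredients and goals, so the top-k recipes are grounded in what is actually in the fridge.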

Constraint-Based Planning

A constraint solver balances:

  • ingredient spoilage urgency
  • macro targets
  • calorie limits
  • cooking time
  • grocery budget
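One simple way to balance these factors is a weighted penalty over soft constraints, with hard cut-offs for budget and calories. The sketch below illustrates the idea; it is not the actual solver, and the weights and field names are assumptions.

```python
from typing import Optional

def score_plan(plan: dict, goals: dict) -> Optional[float]:
    """Lower is better; None means a hard constraint was violated."""
    if plan["cost"] > goals["budget"]:           # hard: grocery budget
        return None
    if plan["calories"] > goals["calorie_limit"]:  # hard: calorie limit
        return None
    penalty = 0.0
    # soft: waste risk from urgent ingredients the plan leaves unused
    penalty += goals["w_spoilage"] * plan["unused_urgent_items"]
    # soft: distance from the protein target
    penalty += goals["w_macro"] * abs(plan["protein_g"] - goals["protein_g"])
    # soft: cooking time beyond the user's availability
    penalty += goals["w_time"] * max(0, plan["cook_minutes"] - goals["max_minutes"])
    return penalty

def best_plan(plans: list[dict], goals: dict) -> Optional[dict]:
    """Pick the feasible plan with the lowest penalty, if any."""
    feasible = [(score_plan(p, goals), p) for p in plans]
    feasible = [(s, p) for s, p in feasible if s is not None]
    return min(feasible, key=lambda sp: sp[0])[1] if feasible else None
```

Treating budget and calories as hard constraints while weighting the rest lets the agent always return a feasible plan when one exists, rather than an "optimal" plan that breaks a dietary limit.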

User Interface

A lightweight Streamlit prototype UI allows users to:

  • upload images
  • set goals
  • interact with the agent through chat
  • receive recommendations

Tech Stack (Current Implementation)

| Layer | Stack |
| --- | --- |
| Frontend | Next.js 14 (App Router), React 18, Tailwind CSS |
| Backend API | FastAPI, Pydantic, SQLAlchemy, Uvicorn |
| Agent Orchestration | Railtracks (stage-based ReAct workflow) |
| LLM / Vision / Embedding | Google Gemini API (gemini-3.1-pro, gemini-embedding-001) |
| Retrieval | ChromaDB vector store (via Railtracks) + recipe API metadata |
| Auth | AWS Cognito (email OTP + JWT verification) |
| Storage | SQLite (memory/file mode + snapshot persistence) |
| Async Jobs | FastAPI BackgroundTasks |
| Deployment | Frontend on Vercel, Backend on EC2 + Nginx + systemd |
| CI/CD | GitHub Actions backend deploy workflow (SSH deploy to EC2) |

Challenges we ran into

  • Agent orchestration complexity
    Coordinating perception, planning, retrieval, and execution within a reliable agent loop required careful workflow design.

  • Ingredient detection from real-world images
    Fridge images often contain cluttered layouts and partially visible ingredients, making detection and interpretation challenging.

  • Balancing multiple constraints simultaneously
    The agent must optimize across nutrition goals, ingredient availability, spoilage risk, cooking time, and budget.

  • Time constraints of a hackathon environment
    Integrating multimodal perception, agent orchestration, and recipe retrieval while maintaining a functional MVP required careful prioritization.


Accomplishments that we're proud of

  • Built a fully agentic workflow rather than a simple chatbot
    Our system uses a structured ReAct-style loop that allows the agent to perceive multimodal inputs, reason over constraints, retrieve knowledge through RAG, and generate actionable meal plans.

  • Integrated multimodal food understanding
    The agent interprets fridge photos, meal photos, and grocery receipts to automatically extract ingredients and understand the user’s real food environment.

  • Combined health optimization with sustainability goals
    The system balances nutrition targets, ingredient spoilage risk, cooking time, and grocery budgets to promote healthier eating while reducing food waste.

  • Implemented a practical planning pipeline
    The agent retrieves grounded recipes through a RAG pipeline and validates plans against calorie and dietary constraints before presenting them.

  • Designed an extensible agent architecture
    By separating planning, tools, memory, and retrieval layers, the system can easily integrate future data sources such as wearables, environmental metrics, or grocery delivery APIs.

  • Created a real-world actionable assistant
    The agent doesn’t simply suggest meals—it transforms abstract health goals into concrete daily decisions. By providing step-by-step recipes, nutrition breakdowns, ingredient substitutions, and optimized grocery lists, the system helps users take immediate action toward healthier eating while reducing food waste and unnecessary spending.


What we learned

Building an agentic system requires more than simply connecting a language model to tools. Effective agent behavior requires a clear separation between planning, memory, retrieval, and execution layers.

We also learned that multimodal input significantly improves usability. Allowing users to simply take a photo of their fridge or meal dramatically reduces the friction associated with manual food logging.

Another key lesson was the importance of constraint-based planning. Real-world decision systems must consider multiple competing factors simultaneously, and designing an agent capable of balancing those constraints is essential for producing practical recommendations.


What's next for SmartDiet Copilot

  • Expand multimodal inputs: integrate wearable health data (sleep, steps, heart rate) and lab results
  • Refine recipe retrieval: improve recipe diversity, filtering, and portion control
  • Calendar integrations: bi-directional sync for meal prep scheduling and grocery planning
  • Shopping integration: automatically order missing ingredients or suggest local alternatives
  • Social features: share meal plans and sustainability progress with friends