SousChef

Inspiration

We identified two major pain points for modern home cooks: food waste due to disorganized inventory, and decision fatigue when planning meals. People are busy and need more than just a recipe book; they need a genuine kitchen co-pilot. This led us to create Sous Chef: an intelligent assistant that merges computer vision with sophisticated reasoning to manage your pantry and provide personalized, real-time cooking guidance, transforming the chaotic kitchen experience into a smooth, enjoyable one.

What It Does: Your Smart Kitchen Assistant

Our app is a smart sous-chef designed to make home cooking easier and more creative, from the moment you get home from the store to serving the final meal. It works in three simple steps:

Instantly Track Your Groceries

How: Just snap a picture of your groceries.

What it does: Our "Grocery Agent" (using the Gemini API) sees, identifies, and counts every item. It automatically adds "5 apples" or "2 lbs ground beef" to a live, searchable inventory, so you always know what you have.
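As a rough illustration of the inventory update described above, here is a minimal sketch of merging detected items into an in-memory inventory. The `DetectedItem` type, its field names, and `merge_into_inventory` are hypothetical names chosen for this example; the project's actual data model is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedItem:
    """One item reported by the vision step (hypothetical shape)."""
    name: str
    quantity: float
    unit: str      # e.g. "count" or "lb"
    location: str  # "fridge" or "pantry"


def merge_into_inventory(inventory, detections):
    """Return a new inventory with detected quantities added.

    Entries are keyed by (normalized name, unit), so "Apple" and "apple"
    land in the same entry. The input inventory is not modified.
    """
    merged = dict(inventory)
    for item in detections:
        key = (item.name.strip().lower(), item.unit)
        entry = merged.get(key, {"quantity": 0, "location": item.location})
        merged[key] = {
            "quantity": entry["quantity"] + item.quantity,
            "location": entry["location"],
        }
    return merged
```

Keying on both name and unit avoids adding "2 lb" to a count of "5"; a fuller version would probably need unit conversion as well.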

Get Personalized Recipes You'll Actually Make

How: Ask for a recipe based on what you have, your diet, or your cravings.

What it does: Our "Recipe Agent" thinks like a chef. You can talk to it: "I have ground beef, but make it high-protein and I hate mushrooms." It will take your inventory, follow your rules, and create the perfect recipe for you.
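One way to picture how a request like the one above could be turned into structured input for the agent is sketched below. The function `build_recipe_request` and its output keys are assumptions made for illustration, not the Recipe Agent's actual interface.

```python
def build_recipe_request(inventory, must_use=(), avoid=(), goals=()):
    """Combine inventory and user rules into one structured request.

    All names are normalized to lowercase so that "Mushrooms" in the
    inventory matches "mushrooms" in the user's dislikes.
    """
    have = {name.strip().lower() for name in inventory}
    avoid_set = {name.strip().lower() for name in avoid}
    must = [name.strip().lower() for name in must_use]
    return {
        # Ingredients the recipe may draw on, minus anything disliked.
        "available": sorted(have - avoid_set),
        # Requested ingredients that are actually in stock.
        "must_use": [m for m in must if m in have],
        # Requested ingredients that would need to be bought.
        "missing": [m for m in must if m not in have],
        "avoid": sorted(avoid_set),
        "goals": list(goals),
    }
```

For the example request, `build_recipe_request(["Ground Beef", "Mushrooms", "Onion"], must_use=["ground beef"], avoid=["mushrooms"], goals=["high-protein"])` leaves mushrooms out of `available` while keeping ground beef in `must_use`.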

Cook Hands-Free with a Voice Guide

How: Pick a recipe and start cooking.

What it does: We provide step-by-step instructions with clear images (using Imagen) for each step. Best of all, it's voice-activated. If you have messy hands and a question, just ask, "Hey Sous-Chef, how long do I brown the onions?" or "I burned them, what now?" and get an instant answer without touching your screen.

How We Built It

The core of Sous Chef leverages the Google AI suite for specialized functions:

Groceries Detection & Classification: We used the Gemini API's multimodal capabilities for rapid image analysis, object identification, counting, and classification (fridge/pantry).
Agent Framework: We built the Recipe Agent on the agent reasoning pipeline of Google's Agent Development Kit (ADK), allowing it to process natural-language commands, look up ingredients in the database, and generate a structured, personalized recipe.
Media Generation: We integrated Imagen, Google's image generation model, to generate context-specific images for each critical step of the recipe, pairing the written instructions with visuals.
Voice Interface: We used ElevenLabs to provide a seamless, high-quality voice interface, enabling true hands-free interaction while the user is cooking.
Data Persistence: Ingredient and recipe data is managed via a MongoDB database for robust, scalable inventory tracking. We also used ngrok for local development and deployment testing.
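To make the hand-off between the detection step and the database more concrete, here is a minimal sketch that parses a JSON list of detected items and builds MongoDB-style update documents. The JSON field names (`name`, `quantity`, `unit`, `location`) and the function `to_upsert_specs` are assumptions for illustration; only the `$inc` and `$set` update operators are standard MongoDB.

```python
import json

VALID_LOCATIONS = {"fridge", "pantry"}


def to_upsert_specs(model_json):
    """Turn the model's JSON output into (filter, update) pairs.

    Each pair is meant to be applied as an upsert, so a new item is
    created on first sight and its quantity is incremented afterwards.
    """
    specs = []
    for item in json.loads(model_json):
        location = item["location"]
        if location not in VALID_LOCATIONS:
            raise ValueError(f"unexpected location: {location!r}")
        filter_doc = {"name": item["name"].strip().lower(), "unit": item["unit"]}
        update_doc = {
            "$inc": {"quantity": item["quantity"]},
            "$set": {"location": location},
        }
        specs.append((filter_doc, update_doc))
    return specs
```

With pymongo, each pair could then be passed to `collection.update_one(filter_doc, update_doc, upsert=True)`; validating the location first keeps a malformed model response from writing bad data.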

Challenges We Ran Into

Our initial challenge was defining what a true "agent" is. Early on, we realized we were just using simple prompt engineering, which led to a necessary and impactful pivot. We moved to designing a multi-step reasoning engine for the Recipe Agent, which made the system more complex but also considerably more capable. We also attempted to incorporate video generation (e.g., using a tool like Veo), but due to complexity and time constraints, we pivoted to using Imagen for still images, ensuring we could deliver the vital visual guidance component on time.

Accomplishments We're Proud Of

Creating a True Reasoning Agent: Our biggest success was the shift from simple prompting to a structured, multi-tool AI Agent architecture that can dynamically modify recipes and handle complex, conditional requests.
Seamless Vision Integration: Getting the Gemini API to accurately identify and inventory multiple, often overlapping items from a single photo was a significant technical feat.
Improved Workflow & Execution: We successfully employed rapid prototyping, which allowed us to identify bottlenecks (like the video generation) and pivot early, resulting in a much more complete and polished final product than our previous hackathon attempts.

What We Learned

We gained deep, practical knowledge in several key areas: AI Agent Design and the difference between prompting and reasoning; implementing real-time data persistence with non-relational databases; and the critical importance of iterative prototyping and recognizing when to pivot on ambitious feature sets.

What's Next for Sous Chef

Advanced Agents: We plan to introduce an Expiry Date Agent that proactively suggests recipes to use up items nearing their spoil date.
Video Integration: We plan to revisit Google’s video generation tools to provide full, follow-along video demonstrations for recipes.
Mobile App Development: Convert the prototype into a fully optimized, market-ready mobile application.
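The planned Expiry Date Agent could start from something as simple as ranking items by days left. The sketch below assumes each inventory item carries an expiry date, which the current prototype may not store; `expiring_soon` and the `within_days` threshold are hypothetical.

```python
from datetime import date


def expiring_soon(items, today, within_days=3):
    """Return (name, days_left) pairs for items close to expiry.

    `items` maps an item name to its expiry date. Results are sorted so
    the most urgent item comes first. Items already past their date show
    up with a negative days_left rather than being dropped.
    """
    soon = [
        (name, (expires - today).days)
        for name, expires in items.items()
        if (expires - today).days <= within_days
    ]
    return sorted(soon, key=lambda pair: pair[1])
```

The resulting list could be fed to the Recipe Agent as ingredients to prioritize.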