Chefeze

Inspiration

Every home cook knows the frustration: hands covered in flour, a sauce reducing on the stove, and the recipe you need is locked behind a screen you can't touch. Traditional cooking apps demand constant visual attention and manual interaction — exactly what you can't give when you're actually cooking.

We asked ourselves: what if your kitchen had an AI sous-chef that could listen to you, see what you're doing, know what's in your pantry, respect your allergies, and guide you step by step — all without you ever touching your phone?

That question became Chefeze — a real-time, voice-first cooking copilot powered by Gemini Live and Google ADK. The name blends "chef" with "eze" (ease), capturing the mission: make cooking with AI genuinely effortless.

Three things drove the design from day one:

Hands-free is non-negotiable. A cooking assistant you have to type into isn't solving the real problem.
Grounded, not hallucinated. Recipe suggestions must come from real data — actual pantry contents, verified allergen databases, real ingredient prices — not fabricated convenience.
Trust through transparency. When the agent warns you about cross-contamination or tells you a dish is over budget, it must show its evidence, not just assert confidence.

What it does

Chefeze is a multimodal live cooking agent that operates through voice and vision in real time:

Talk naturally — Push-to-Talk or fully hands-free Cook Mode with barge-in (interrupt the AI mid-sentence when your timer goes off).
Show your kitchen — Point your camera at your pantry for instant ingredient extraction, or enable Cook Mode's continuous camera stream where the AI monitors your cooking for safety hazards (dangerous temperatures, cross-contamination, knife safety) at up to 2 frames per second.
Get grounded answers — Every recipe suggestion passes through a 5-step hybrid GraphRAG pipeline: vector search (pgvector HNSW), full-text search (tsvector + trigram), knowledge graph expansion (ltree, 1-2 hops), cross-retriever fusion (Reciprocal Rank Fusion, $k=60$), and a confidence gate ($\theta = 0.15$) — all running on PostgreSQL 17.
Stay safe — Real-time allergen detection across 14 EU allergen categories using Open Food Facts data, with STOP/WARN/TIP alert levels. A dedicated food safety sentinel checks internal temperatures, danger zone time limits, and cross-contamination risks.
Play the Budget Challenge — Set a budget and number of people, and the AI gamifies meal planning with real ingredient prices from the Open Prices API. Earn badges like under_budget, zero_waste, or pantry_master.
See your food before you cook it — Gemini generates photorealistic hero images and illustrated step cards, uploaded to Cloud Storage and displayed inline.
Save, remix, and fork recipes — Build a personal cookbook. Remix any recipe into a variant (e.g., "make it vegetarian") with visible lineage and a "View original" link.
Cook in your language — Full 4-language support (English, Portuguese BR, Portuguese PT, Spanish) across voice, UI, units, and currency — with 775 i18n keys per locale.

How we built it

Architecture

Chefeze is a monorepo with three main layers:

┌─────────────────────────────────────────────────┐
│  PWA / Android (Ionic React + Capacitor + Vite) │
│  Audio capture (16kHz PCM) · Camera frames      │
│  14-type card renderer · Audio visualizer orb   │
└───────────────────┬─────────────────────────────┘
                    │ WebSocket
┌───────────────────▼─────────────────────────────┐
│  FastAPI WebSocket Gateway (state machine)      │
│  Resume tokens · Sequence validation · Barge-in │
│  Session compression at 70% context capacity    │
└───────────────────┬─────────────────────────────┘
                    │ ADK streaming bridge
┌───────────────────▼─────────────────────────────┐
│  Chef Agent (Google ADK Hub-Specialist Pattern) │
│                                                 │
│  ┌──────────┐ ┌──────────┐ ┌──────────────────┐│
│  │ Safety   │ │ Budget   │ │ Creative         ││
│  │Specialist│ │Specialist│ │Specialist        ││
│  └──────────┘ └──────────┘ └──────────────────┘│
│  ┌──────────┐ ┌──────────┐                     │
│  │Retrieval │ │  Game    │   9 ADK Tools       │
│  │Specialist│ │ Master   │   55 Cuisine Skills │
│  └──────────┘ └──────────┘                     │
└───────────────────┬─────────────────────────────┘
                    │
┌───────────────────▼─────────────────────────────┐
│  Data Layer                                     │
│  PostgreSQL 17 (pgvector + ltree + pg_trgm)     │
│  Redis 7.4 (sessions, rate limits, flags)       │
│  Google Cloud Storage (images)                  │
│  MCP: Open Food Facts · Open Prices             │
└─────────────────────────────────────────────────┘

Multi-Agent Orchestration

The core of Chefeze is a hub-specialist agent pattern built on Google ADK. A coordinator agent receives every user turn and routes it to the right specialist using confidence-scored intent detection:

Specialist	Responsibility	Key Tools
Safety	Allergen checks, food safety, cross-contamination	`check_allergens`, `food_safety_sentinel`
Budget	Cost estimation, budget challenge scoring	`estimate_cost` (MCP Open Prices)
Creative	Recipe composition, image generation	`compose_recipe`, `generate_images`
Retrieval	Knowledge base search, recipe discovery	`retrieve_recipes` (GraphRAG)
Game Master	Challenge gamification, badge scoring	`score_challenge`, `ui_action_plan`

Intent routing uses multi-signal confidence scoring with ambiguity detection (threshold 0.35, gap 0.10). Safety signals always get a priority weight of $1.5\times$, because catching a peanut allergy matters more than suggesting a garnish.
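
A minimal sketch of how routing with those numbers might look, assuming each specialist already has a raw confidence in [0, 1] from some upstream signal scorer. The `route` function, the score dictionary, and the `"clarify"` fallback are hypothetical names for illustration. Only the threshold (0.35), the gap (0.10), and the 1.5× safety weight come from the description above.

```python
from typing import Dict

THRESHOLD = 0.35     # minimum weighted confidence needed to route at all
GAP = 0.10           # minimum lead over the runner-up to count as unambiguous
SAFETY_WEIGHT = 1.5  # safety scores are boosted before comparison


def route(raw_scores: Dict[str, float]) -> str:
    """Return the specialist to call, or "clarify" when the intent is ambiguous.

    Falling back to a clarifying question is an assumption of this sketch.
    """
    weighted = {
        name: score * (SAFETY_WEIGHT if name == "safety" else 1.0)
        for name, score in raw_scores.items()
    }
    ranked = sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return "clarify"
    best_name, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best < THRESHOLD or best - runner_up < GAP:
        return "clarify"
    return best_name
```

With this weighting, a raw safety score of 0.40 outranks a raw creative score of 0.45, which is the kind of bias toward safety the 1.5× weight is meant to produce.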

The GraphRAG Pipeline

Recipe retrieval is a hybrid pipeline rather than a single vector lookup. Its stages, from query to ranked result:

Query Normalization — Keyword-based intent extraction (diet, allergens, equipment, budget, cuisine) across 4 languages. No LLM call needed.
Embedding — gemini-embedding-2-preview (3072-dim, multimodal). Content-hash cache prevents redundant API calls.
Vector Retrieval — HNSW cosine search over pgvector halfvec(3072), top-40 candidates, similarity threshold $\geq 0.30$.
Lexical Retrieval — Dual-path: PostgreSQL tsvector full-text search + pg_trgm trigram similarity ($> 0.3$), both unaccent-normalized for diacritics. Run in parallel, union-deduplicated.
Graph Expansion — 1-2 hop traversal of an ontology graph (ingredients, substitutions, cuisines, techniques, allergen flags, diet tags) using ltree paths.
Scoring Fusion — Reciprocal Rank Fusion ($k=60$) across retrieval channels, then:

$$ \text{score} = 0.40 \cdot s_{\text{vec}} + 0.15 \cdot s_{\text{lex}} + 0.20 \cdot c_{\text{pantry}} - 0.10 \cdot p_{\text{cost}} - 0.05 \cdot p_{\text{equip}} + 0.10 \cdot b_{\text{personal}} $$

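Here $s_{\text{vec}}$ and $s_{\text{lex}}$ are the vector-similarity and lexical-relevance scores, $c_{\text{pantry}}$ is pantry coverage, $p_{\text{cost}}$ and $p_{\text{equip}}$ are the cost and equipment penalties, and $b_{\text{personal}}$ is the personalization boost. Allergen-matched recipes are hard-blocked ($\text{score} = -\infty$). A confidence gate at $\theta = 0.15$ filters noise.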
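
The fusion and scoring steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, all inputs are assumed to be normalized to [0, 1], and the gate is assumed to be inclusive (`>=`). How the RRF output feeds $s_{\text{vec}}$ and $s_{\text{lex}}$ is not specified above, so the sketch keeps the two steps separate.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List

RRF_K = 60   # k from the fusion step
GATE = 0.15  # confidence gate theta


def rrf(rankings: Iterable[List[str]], k: int = RRF_K) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every channel's ranking."""
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, recipe_id in enumerate(ranking, start=1):
            fused[recipe_id] += 1.0 / (k + rank)
    return dict(fused)


def final_score(
    s_vec: float,
    s_lex: float,
    c_pantry: float,
    p_cost: float,
    p_equip: float,
    b_personal: float,
    allergen_hit: bool = False,
) -> float:
    """Weighted score from the formula above; allergen matches are hard-blocked."""
    if allergen_hit:
        return -math.inf
    return (
        0.40 * s_vec
        + 0.15 * s_lex
        + 0.20 * c_pantry
        - 0.10 * p_cost
        - 0.05 * p_equip
        + 0.10 * b_personal
    )


def passes_gate(score: float, theta: float = GATE) -> bool:
    """Drop low-confidence candidates (inclusive comparison is an assumption)."""
    return score >= theta
```

Because a blocked recipe scores $-\infty$, it can never pass the gate, so no downstream ranking step can resurface it.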

Gemini Models Used

Purpose	Model
Live voice streaming	`gemini-2.5-flash-native-audio-preview`
Text/vision reasoning	`gemini-3-flash-preview`
Multimodal embeddings (3072-dim)	`gemini-embedding-2-preview`
Image generation (hero)	`gemini-3-pro-image-preview`
Image generation (steps)	`gemini-2.5-flash-image`
Cook mode frame safety	`gemini-3-flash-preview` (vision)

Privacy and Safety

Consent-gated memory: user preferences, allergies, and taste profiles are only persisted when the user opts in. Three consent levels: private (session-only), personalized (turns persisted), contribute (full memory graph).
Guardrail engine: ALLOW / REDACT / REWRITE / BLOCK / ESCALATE policies on both input and output, enforced via ADK callbacks.
Semantic log redaction: 17 sensitive field patterns globally redacted from structured logs.
GDPR deletion: DELETE /auth/me with FK-ordered cascade across 6 child tables.
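
One way to picture the consent gate is as an ordered enum checked before any write. The sketch below assumes the three levels are cumulative (each level permits everything the previous one does), which the description suggests but does not state. The `Consent` class and both helper functions are hypothetical names.

```python
from enum import IntEnum


class Consent(IntEnum):
    PRIVATE = 0       # session-only, nothing persisted
    PERSONALIZED = 1  # conversation turns persisted
    CONTRIBUTE = 2    # full memory graph


def may_persist_turns(level: Consent) -> bool:
    """Turns are stored only from the personalized level upward."""
    return level >= Consent.PERSONALIZED


def may_write_memory_graph(level: Consent) -> bool:
    """The memory graph is written only at the contribute level."""
    return level >= Consent.CONTRIBUTE
```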

Testing and Quality

We invested heavily in layered validation:

Layer	Count	What it proves
Backend unit/integration	3,594	Domain logic, DB writes, WS auth, GraphRAG pipeline, tool side effects
Frontend unit/component	1,942	Component behavior, hooks, state, accessibility (axe-core)
Ingestion pipeline	110	Embedding, seeding, ontology graph correctness
E2E (Playwright)	114	Cross-boundary journeys: login → live → pantry → recipe → challenge → safety
Security/abuse	40+	Auth bypass, BOLA/IDOR, brute force, prompt injection, secret scanning
i18n parity	775 × 4	Key coverage across all 4 locales

Total: 5,760+ automated tests, with a 75% backend coverage gate enforced in CI.

Challenges we ran into

1. Real-time interruption safety. When a user says "wait, stop" while the agent is mid-sentence describing a recipe, you need true barge-in — not a polite queue. We implemented server-owned audio queues with drop-oldest backpressure (4 bounded asyncio queues: audio_in=32, audio_out=64, control=32, tool_results=16) and a WebSocket state machine that handles interruption as a first-class event, not an edge case.
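
A drop-oldest policy on a bounded `asyncio.Queue` can be sketched as follows. The queue names and sizes are taken from the paragraph above. The helper functions (`put_drop_oldest`, `flush`, `make_queues`) are hypothetical, and using a flush of the outbound audio queue to model barge-in is an assumption of this sketch.

```python
import asyncio
from typing import Any, Dict


def make_queues() -> Dict[str, asyncio.Queue]:
    """The four bounded queues, with the sizes quoted above."""
    return {
        "audio_in": asyncio.Queue(maxsize=32),
        "audio_out": asyncio.Queue(maxsize=64),
        "control": asyncio.Queue(maxsize=32),
        "tool_results": asyncio.Queue(maxsize=16),
    }


def put_drop_oldest(queue: asyncio.Queue, item: Any) -> bool:
    """Enqueue without blocking; evict the oldest item if the queue is full.

    Returns True when an item was dropped to make room.
    """
    dropped = False
    if queue.full():
        try:
            queue.get_nowait()
            dropped = True
        except asyncio.QueueEmpty:
            pass
    queue.put_nowait(item)
    return dropped


def flush(queue: asyncio.Queue) -> int:
    """Discard everything still queued, e.g. pending audio on barge-in."""
    discarded = 0
    while True:
        try:
            queue.get_nowait()
        except asyncio.QueueEmpty:
            return discarded
        discarded += 1
```

Neither helper awaits, so within a single event loop each call runs to completion without another task seeing a half-updated queue.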

2. Grounding vs. hallucination in live voice. It's tempting to let the LLM freestyle recipe suggestions. But a cooking copilot that invents ingredients you don't have or misses your peanut allergy is worse than useless — it's dangerous. Building the full GraphRAG pipeline with allergen hard-blocks and pantry coverage scoring was the hardest engineering investment, but it's what makes the agent trustworthy.

3. Vision at kitchen speed. Cook Mode captures camera frames at up to 2 FPS for safety analysis. But network hiccups, slow model responses, and frame backlogs can cascade. We added a circuit breaker (3 failures → open), rate limiting, a vision queue with max depth 8 and drop-oldest policy, and per-session failure tracking. The system degrades gracefully instead of crashing.
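
The circuit breaker idea can be illustrated with a small class. Only the "3 failures → open" rule comes from the text. The cooldown (`reset_after_s`), the half-open trial behavior, and every name below are assumptions made to keep the sketch complete.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then retries after a cooldown."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after_s: float = 30.0,  # hypothetical cooldown, not from the source
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_failures = max_failures
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """True when a call may proceed (closed, or cooldown elapsed)."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self._reset_after_s

    def record_success(self) -> None:
        """A success closes the breaker and clears the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure; at the limit, open (or re-open) the breaker."""
        self._failures += 1
        if self._failures >= self._max_failures:
            self._opened_at = self._clock()
```

A caller would check `allow()` before sending a frame to the vision model and skip the frame when it returns False, which is one way to degrade instead of piling up retries.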

4. Multilingual voice + UI + data consistency. Supporting 4 languages isn't just translating strings — it means locale-aware voice prompts, unit conversions (metric/imperial), currency formatting, recipe retrieval fallback chains (requested locale → en → any), and unaccent-normalized search across Portuguese diacritics. We wrote 775 i18n keys per locale and enforced parity in CI.
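
The fallback chain and accent-insensitive matching can be sketched in plain Python. The function names and the locale codes are hypothetical, and `strip_accents` only approximates what PostgreSQL's `unaccent` does on the database side. It is not the same implementation.

```python
import unicodedata
from typing import Dict, List, Optional


def fallback_chain(requested: str) -> List[str]:
    """Requested locale first, then English."""
    chain = [requested]
    if requested != "en":
        chain.append("en")
    return chain


def pick_translation(translations: Dict[str, str], requested: str) -> Optional[str]:
    """Resolve requested locale -> en -> any available translation."""
    for locale in fallback_chain(requested):
        if locale in translations:
            return translations[locale]
    # "any": fall back to whichever translation exists, if there is one
    return next(iter(translations.values()), None)


def strip_accents(text: str) -> str:
    """Lowercase and remove combining marks so 'Açúcar' matches 'acucar'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()
```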

5. Privacy-safe observability. We needed operator telemetry (session durations, tool call rates, error patterns) without leaking user content into logs. The solution combined semantic-key global redaction in structlog (17 field patterns), a /metrics endpoint protected by admin authentication, and span attribute scrubbing before OpenTelemetry export.
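
Key-based redaction fits naturally into a structlog processor, which is any callable taking `(logger, method_name, event_dict)` and returning the event dict. The sketch below is dependency-free and uses a handful of made-up key patterns. The 17 patterns mentioned above are not listed in this writeup, so none of these should be read as the real set.

```python
import re
from typing import Any, Dict

# Hypothetical key patterns for illustration only.
SENSITIVE_KEY = re.compile(r"(password|token|secret|email|transcript)", re.IGNORECASE)
REDACTED = "[REDACTED]"


def redact(value: Any) -> Any:
    """Recursively replace values whose key looks sensitive."""
    if isinstance(value, dict):
        return {
            key: REDACTED if SENSITIVE_KEY.search(str(key)) else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(inner) for inner in value]
    return value


def redact_processor(logger: Any, method_name: str, event_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Matches the structlog processor signature; scrubs the event before rendering."""
    return redact(event_dict)
```

Matching on key names rather than values keeps the check cheap and predictable, at the cost of missing sensitive text logged under an innocuous key.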

Accomplishments that we're proud of

A working live voice agent that you can actually interrupt, that sees your kitchen, and that grounds every suggestion in real pantry and allergen data — not a demo with pre-recorded responses.
5,760+ tests with zero mocks at the integration layer — every DB write, WebSocket handshake, and GraphRAG query hits real PostgreSQL and Redis.
14 allergen safety regression tests covering edge cases like casein detection, cross-contamination warnings, and multilingual allergen names.
The scoring fusion formula that balances vector similarity, lexical relevance, pantry coverage, cost penalty, equipment penalty, and personalization — making recipe suggestions feel genuinely tailored.
55 cuisine skills with structured flavor principles, safety notes, and guardrails — from Brazilian to Japanese to Mediterranean — that activate progressively per session.

What we learned

Real-time agent quality comes from state handling, not prompt engineering. The difference between a demo and a product is how the agent behaves when the WebSocket drops, the user interrupts, or the MCP circuit breaker opens. We spent more time on retry/offline/interruption logic than on prompt tuning.
Specialist agents work best when the live bridge owns state and boundaries. Letting each specialist run independently caused chaos. The hub-specialist pattern with explicit tool allowlists, turn limits, and timeout budgets per specialist made the system predictable.
Docker Compose plus targeted browser tests beat broad, mock-heavy confidence. Running tests inside Docker Compose against real PostgreSQL and Redis caught bugs that mocked tests would have missed, including a migration that worked in SQLite but failed on real Postgres.
The last-mile hackathon risk is proof packaging, not missing features. We had a working product days before the deadline. The hard part was capturing evidence, recording the demo, and making the submission tell a coherent story.
Privacy and safety aren't features you bolt on — they're architecture decisions. Consent-gated memory, semantic log redaction, and allergen hard-blocks had to be designed into the data model and agent routing from the start. Retrofitting them would have been a rewrite.

What's next for Chefeze

Pantry intelligence — Auto-compose meals from what you already have, with expiration-aware prioritization.
Personal recipe library — Upload your own PDFs and handwritten recipe photos; the agent parses and indexes them into your private knowledge base.
Google Identity Platform — Replace identity-key auth with Google Sign-In for seamless mobile login.
Real-device accessibility — VoiceOver and TalkBack validation on physical iOS and Android devices.
Community recipes — Let users publish and discover recipes from other Chefeze cooks.

Built With

docker
gcp
gemini
gemini-live-api
google-adk
postgresql
pydantic-ai
python
react
redis

Updates

Rafael Bittencourt started this project — Mar 16, 2026 07:47 PM EDT
