HelloChef — Hands-Free AI Cooking Assistant (video updated; it's also part of my coursework now)

Built for the Google DeepMind Gemini 3 Hackathon


Most “AI cooking” apps are just recipe viewers with a better UI. But the UI was never the real problem.

The real problem starts after you hit play on a recipe.

Your hands are messy. The video has run ahead of you. You missed an ingredient. You keep pausing, rewinding, guessing.

Screens don’t belong in kitchens.

So I built HelloChef — an AI that actually cooks with you instead of just showing you instructions.


What HelloChef actually does

You start a recipe and stop touching your phone completely.

  • Say “next step” → moves forward, syncs video to exact timestamp, reads instructions
  • Say “set timer 5 minutes” → timer starts instantly
  • Say “what temperature?” → contextual answer from the recipe
  • Say “go to step 3” → jumps with video sync
  • Say “I’m done” → marks complete and updates inventory

This runs on real-time voice with tool-calling — not command matching.
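As a rough illustration of that distinction, the commands above can be modeled as function (tool) declarations the live model chooses to call, rather than keywords matched against a transcript. The tool names, schemas, and the tiny dispatcher below are assumptions for the sketch, not the actual HelloChef code:

```python
# Hypothetical tool declarations for the voice commands above.
# A live model with tool-calling picks one of these and supplies
# structured arguments; no string matching on the user's words.
TOOL_DECLARATIONS = [
    {
        "name": "next_step",
        "description": "Advance to the next recipe step and sync video.",
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "set_timer",
        "description": "Start a kitchen timer.",
        "parameters": {
            "type": "object",
            "properties": {"minutes": {"type": "integer"}},
            "required": ["minutes"],
        },
    },
    {
        "name": "go_to_step",
        "description": "Jump to a specific step and seek the video.",
        "parameters": {
            "type": "object",
            "properties": {"step": {"type": "integer"}},
            "required": ["step"],
        },
    },
]

def dispatch(call_name: str, args: dict, state: dict) -> dict:
    """Execute a tool call from the model against the cooking session state."""
    if call_name == "next_step":
        state["step"] += 1
    elif call_name == "set_timer":
        state["timers"].append(args["minutes"] * 60)  # store seconds
    elif call_name == "go_to_step":
        state["step"] = args["step"]
    return state

state = {"step": 1, "timers": []}
dispatch("set_timer", {"minutes": 5}, state)
dispatch("next_step", {}, state)
# state is now {"step": 2, "timers": [300]}
```

The point of the shape: the model decides *which* tool fires and with *what* arguments, so "give me five minutes on the clock" and "set timer 5 minutes" both land on the same `set_timer` call.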


Where it gets interesting

This isn’t one AI feature slapped on top. It’s a system.

  • Converts YouTube cooking videos into step-by-step structured recipes with exact timestamps
  • Scans your fridge → suggests recipes you can actually cook right now
  • Tracks inventory → auto-deducts ingredients after cooking
  • Generates recipes from text → respects diet & allergies
  • Supports 28 voice languages
  • Works as a PWA — no install friction

Under the hood (what most people skip)

  • Real-time audio streaming with tool execution (Gemini Live API)
  • Video understanding → transcript → structured recipe pipeline
  • Strict JSON schema outputs (so AI doesn’t hallucinate structure)
  • WebSocket-based voice loop with sub-2s interaction latency
  • Model fallback chain to avoid breaking during outages
  • Full auth + secure API proxy (no exposed keys)
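To show what "strict JSON schema outputs" buys you, here is a minimal validation sketch. The field names (`index`, `instruction`, `start_s`) are assumptions standing in for the real recipe schema; the idea is that model output which doesn't match the contract is rejected instead of trusted:

```python
import json

# Assumed contract for one recipe step: position in the recipe,
# the instruction text, and the exact video timestamp in seconds.
REQUIRED_STEP_FIELDS = {"index": int, "instruction": str, "start_s": (int, float)}

def parse_recipe(raw: str) -> list:
    """Parse model output and reject anything off-schema."""
    steps = json.loads(raw)["steps"]
    for step in steps:
        for field, typ in REQUIRED_STEP_FIELDS.items():
            if not isinstance(step.get(field), typ):
                raise ValueError(f"bad or missing field: {field}")
    return steps

raw = '{"steps": [{"index": 1, "instruction": "Dice the onion", "start_s": 42}]}'
steps = parse_recipe(raw)
# steps[0]["start_s"] == 42 — a usable seek target for video sync
```

Enforcing the schema at the boundary is what lets "go to step 3" seek the video deterministically: the timestamp is validated data, not free text.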

This is not a demo hack. It’s production-grade architecture.


The hard truth

The difficult part wasn’t building features. It was making everything work together in real time without breaking UX.

  • Audio streaming in browsers is unreliable
  • Video timestamps drift if prompts are weak
  • Tool calls + voice responses can desync
  • Rate limits kill demos if you don’t plan for them
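The rate-limit point is where the fallback chain mentioned earlier earns its keep. A hedged sketch of the pattern — model names, the error type, and `call_model` are placeholders, not the real Gemini client API:

```python
# Sketch of a model fallback chain: try models in order and move on
# when one is rate-limited, so a single 429 doesn't kill the demo.
class RateLimited(Exception):
    pass

MODEL_CHAIN = ["primary-model", "fallback-model", "last-resort-model"]

def generate(prompt: str, call_model) -> str:
    """Return the first successful response along the chain."""
    last_err = None
    for model in MODEL_CHAIN:
        try:
            return call_model(model, prompt)
        except RateLimited as err:
            last_err = err  # try the next model in the chain
    raise last_err  # every model failed; surface the last error

# Simulate the primary model being rate-limited:
def fake_call(model, prompt):
    if model == "primary-model":
        raise RateLimited(model)
    return f"{model}: ok"

print(generate("hi", fake_call))  # → fallback-model: ok
```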

Most projects ignore this. That’s why they feel fake.


What this proves

Voice + AI is not about chat. It’s about action + context + timing.

When the system:

  • understands what you're doing
  • knows where you are
  • executes actions instantly

…it stops feeling like AI and starts feeling like a capability.


What’s next

  • Step-level memory (your custom tweaks remembered)
  • Short-form video support (Reels/TikTok → recipes)
  • Weekly meal planning based on your inventory
  • Smarter pantry (expiry + usage prediction)
  • Community-driven recipes

Bottom line

HelloChef is not a recipe app.

It’s a shift from: “read instructions” → “guided execution in real time.”

That’s where AI actually becomes useful.


If you want to try it: https://pakao-ai-gemini-3-hackathon.netlify.app

GitHub: https://github.com/iShelar/cooking-ai


Built With

  • cloud-run
  • cloud-storage
  • custom-temporal-cooking-state-engine
  • fastapi
  • firebase (auth, fcm)
  • firebase-admin
  • firestore
  • firestore-auth
  • firestore-fcm
  • gemini-2.5-live
  • gemini-3
  • gemini-3-flash
  • google-genai
  • netlify (frontend)
  • node.js/fastapi backend
  • postgresql
  • pyjwt
  • python
  • react-19
  • react-native (mobile app)
  • real-time voice agent
  • redis
  • speech
  • tailwind-css (cdn)
  • typescript
  • uvicorn
  • vector database
  • vision
  • vite
  • vite-plugin-pwa
  • websockets