Inspiration
Cooking at home should be enjoyable, not frustrating. Your hands are covered in flour and you can't scroll. Your favorite cookbook sits unused because it's not searchable. You get to the store and realize you forgot half the ingredients. We built CookShelf to solve these everyday cooking problems with Google Gemini's latest multimodal AI capabilities.
What it does
CookShelf is your hands-free cooking companion powered by Google Gemini 3 Flash and Gemini 2.5 Flash. It brings together voice, vision, search grounding, and image generation in a single app.
🎙️ Talk to Your Kitchen Assistant (Gemini 2.5 Flash Native Audio)
Have natural conversations with your AI chef while cooking:
- "What's the next step?" - Navigate recipes hands-free
- "Set a timer for 15 minutes" - Control timers with voice
- "Add eggs to shopping list" - Update lists while your hands are busy
- "Do I have chicken?" - Check your pantry instantly
The AI responds naturally using the Gemini 2.5 Flash Native Audio API for real-time, low-latency voice conversations. You can interrupt mid-sentence, just like talking to a real person.
📸 Scan Any Recipe (Gemini 3 Flash Vision)
Turn physical cookbooks into searchable digital recipes:
- Point your camera at cookbook pages, magazines, or recipe cards
- Gemini 3 Flash Vision instantly extracts and structures the recipe
- Recognizes ingredients, instructions, cooking times, and servings
- Works with handwritten recipes, printed books, and screenshots
🔗 Extract Recipes from Websites (Gemini 3 Flash Grounding)
Save recipes from anywhere on the web:
- Paste any recipe URL
- Gemini 3 Flash with Google Search Grounding extracts the actual recipe content from cluttered web pages
- Filters out ads, life stories, and unnecessary content
- Gets you just the ingredients and instructions you need
🎨 Generate Recipe Images (Gemini 2.5 Flash Image Generation)
Create beautiful visual guides for your recipes:
- Gemini 2.5 Flash Image generates cover photos for recipes without images
- Creates step-by-step visual instructions
- Helps visualize the final dish
- Makes your recipe library visually appealing
🥘 Smart Pantry & Shopping Lists
- Track available ingredients in real-time
- Get AI recipe suggestions based on what you have
- Automatically generate shopping lists from recipes
- Update pantry with voice commands while putting groceries away
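Generating a shopping list from a recipe comes down to subtracting what the pantry already holds. Here is a minimal sketch of that idea in TypeScript. The `IngredientLine` shape and the `buildShoppingList` helper are hypothetical names for illustration, and matching by name only (ignoring quantities) is a simplifying assumption, not a description of CookShelf's actual logic.

```typescript
// Hypothetical shape; the write-up does not describe CookShelf's data model.
interface IngredientLine {
  name: string;
  quantity: number;
  unit: string;
}

// Returns the recipe ingredients that are not already in the pantry.
// Matching is by case-insensitive name only; quantities are not compared.
function buildShoppingList(
  recipe: IngredientLine[],
  pantryNames: string[],
): IngredientLine[] {
  const inPantry = new Set(pantryNames.map((n) => n.trim().toLowerCase()));
  return recipe.filter(
    (item) => !inPantry.has(item.name.trim().toLowerCase()),
  );
}
```

A fuller version would also compare quantities and units, so that having one egg does not hide a recipe that needs three.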
👨‍🍳 Interactive Cooking Mode
- Voice-guided step-by-step instructions
- Automatic timers that start when mentioned in recipes
- Ask questions about techniques or substitutions
- Navigate forward/backward through steps hands-free
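Starting timers automatically requires spotting durations inside the step text. One way to do that, sketched below, is a regular expression over each instruction. The `DetectedTimer` type and `detectTimers` function are hypothetical names, and a regex is an assumed approach; the app may instead let Gemini identify durations.

```typescript
// Hypothetical result shape for a duration found in a recipe step.
interface DetectedTimer {
  label: string; // the matched text, e.g. "15 minutes"
  seconds: number;
}

// Seconds per unit, keyed by the singular form of each supported unit.
const UNIT_SECONDS: Record<string, number> = {
  second: 1,
  sec: 1,
  minute: 60,
  min: 60,
  hour: 3600,
  hr: 3600,
};

// Finds every "<number> <unit>" duration in a step, in order of appearance.
function detectTimers(step: string): DetectedTimer[] {
  const pattern =
    /(\d+(?:\.\d+)?)\s*(seconds?|secs?|minutes?|mins?|hours?|hrs?)\b/gi;
  const timers: DetectedTimer[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(step)) !== null) {
    const amount = parseFloat(match[1]);
    // Strip a trailing plural "s" so "minutes" and "minute" share one key.
    const unit = match[2].toLowerCase().replace(/s$/, "");
    timers.push({
      label: match[0],
      seconds: Math.round(amount * UNIT_SECONDS[unit]),
    });
  }
  return timers;
}
```

This sketch does not handle ranges such as "10-12 minutes" (it would pick up only "12 minutes") or spelled-out numbers.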
📚 Organized Recipe Library
- All recipes in one searchable place
- Tag and categorize (quick dinners, vegetarian, etc.)
- Cloud sync across devices
- Store photos of finished dishes
How we built it - Gemini Integration Deep Dive
CookShelf integrates four Gemini capabilities across two models, Gemini 3 Flash and Gemini 2.5 Flash:
1. Gemini 2.5 Flash Native Audio API
The core of our voice assistant:
- Real-time bidirectional audio streaming with <100ms latency
- 13 custom function tools: recipe navigation, timer control, shopping lists, pantry management
- Voice Activity Detection for natural turn-taking
- Barge-in support - interrupt AI mid-response
- Streams 16 kHz audio in 50 ms chunks
- WebSocket bridge coordinates mobile app → backend → Gemini API
- Maintains conversation context throughout cooking sessions
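The audio figures in this list imply a fixed frame size: at 16 kHz, a 50 ms interval holds 800 samples, or 1,600 bytes if each sample is 16 bits. The sketch below shows one way to slice a capture buffer into such frames. The 16-bit mono format and the names `SAMPLE_RATE_HZ`, `CHUNK_MS`, and `chunkPcm` are assumptions for illustration, not CookShelf's actual code.

```typescript
// Assumed capture format: 16-bit mono PCM (not stated in the write-up).
const SAMPLE_RATE_HZ = 16000;
const CHUNK_MS = 50;
const SAMPLES_PER_CHUNK = (SAMPLE_RATE_HZ * CHUNK_MS) / 1000; // 800 samples

// Splits a PCM buffer into 50 ms frames. The final frame may be shorter
// when the input length is not a multiple of the frame size.
function chunkPcm(samples: Int16Array): Int16Array[] {
  const chunks: Int16Array[] = [];
  for (let start = 0; start < samples.length; start += SAMPLES_PER_CHUNK) {
    // subarray returns a view on the same memory, so nothing is copied.
    chunks.push(samples.subarray(start, start + SAMPLES_PER_CHUNK));
  }
  return chunks;
}
```

Each frame would then be encoded and sent over the WebSocket bridge; the encoding step is left out here because the write-up does not specify it.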
2. Gemini 3 Flash Vision
Powers recipe scanning:
- OCR extraction from physical cookbooks and magazines
- Analyzes recipe images to identify structure
- Extracts quantities, units, and ingredient names accurately
- Handles various fonts, layouts, and image qualities
- Parses handwritten recipes and messy recipe cards
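Model output from a scan is best treated as untrusted input until it has been checked. Assuming the model is prompted to return JSON, the sketch below validates a response before it is saved. The `ScannedRecipe` shape and `parseScannedRecipe` helper are hypothetical; the write-up does not document the schema CookShelf uses.

```typescript
// Hypothetical target shape for a scanned recipe.
interface ScannedIngredient {
  name: string;
  quantity: number | null; // null when the source gives no amount ("salt to taste")
  unit: string | null;
}

interface ScannedRecipe {
  title: string;
  servings: number | null;
  ingredients: ScannedIngredient[];
  instructions: string[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isScannedIngredient(v: unknown): v is ScannedIngredient {
  return (
    isRecord(v) &&
    typeof v["name"] === "string" &&
    (v["quantity"] === null || typeof v["quantity"] === "number") &&
    (v["unit"] === null || typeof v["unit"] === "string")
  );
}

// Parses raw model text and returns a recipe, or null if the text is not
// valid JSON or does not match the expected shape.
function parseScannedRecipe(raw: string): ScannedRecipe | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;

  const title = data["title"];
  const servings = data["servings"];
  const ingredients = data["ingredients"];
  const instructions = data["instructions"];

  if (typeof title !== "string") return null;
  if (!Array.isArray(ingredients) || !ingredients.every(isScannedIngredient)) {
    return null;
  }
  if (
    !Array.isArray(instructions) ||
    !instructions.every((s) => typeof s === "string")
  ) {
    return null;
  }
  return {
    title,
    // A missing or non-numeric servings value is treated as unknown.
    servings: typeof servings === "number" ? servings : null,
    ingredients,
    instructions,
  };
}
```

Returning null instead of throwing lets the app fall back to a retry or a manual-edit screen when a scan comes back malformed.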
3. Gemini 3 Flash with Google Search Grounding
Extracts recipes from web URLs:
- Takes any recipe website URL as input
- Grounding with Google Search finds and verifies recipe content
- Filters out ads, comments, and irrelevant text
- Extracts structured ingredient lists with correct quantities
- Identifies recipe metadata (prep time, cook time, servings)
- Converts cluttered blogs into clean recipe format
4. Gemini 2.5 Flash Image
Creates visual content for recipes:
- Generates appealing cover images for recipes without photos
- Creates step-by-step visual guides
- Produces photorealistic images of final dishes
- Helps users visualize cooking techniques
- Makes the recipe library visually consistent
Technical Architecture
Mobile: React Native with Expo 54
- Voice-first UI with real-time visual feedback
- Offline-capable with local state persistence
- 4-tab navigation: Recipes, Kitchen Mode, Shopping & Pantry, Profile
Backend: Node.js/Express with WebSocket
- Real-time audio streaming bridge to the Gemini 2.5 Flash Native Audio API
- Tool execution system allowing Gemini to trigger mobile actions
- Coordinates between mobile device, multiple Gemini models, and backend services
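A tool execution system like the one described here can be pictured as a registry that maps tool names to handlers. The following is a minimal sketch with two illustrative tools. The names `ToolCall`, `ToolResult`, `toolHandlers`, and `executeTool`, the tool names themselves, and the in-memory state are all hypothetical; the real payload shape depends on the Gemini SDK in use, and the real handlers would forward actions to the mobile app.

```typescript
// Hypothetical tool-call shape, not the Gemini SDK's actual types.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  ok: boolean;
  message: string;
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Minimal in-memory state standing in for the mobile app.
const shoppingList: string[] = [];
let currentStepIndex = 0; // zero-based; step 1 is index 0

const toolHandlers = new Map<string, ToolHandler>([
  [
    "next_step",
    () => {
      currentStepIndex += 1;
      return { ok: true, message: `Moved to step ${currentStepIndex + 1}` };
    },
  ],
  [
    "add_to_shopping_list",
    (args) => {
      const item = args["item"];
      // Arguments come from the model, so validate before acting on them.
      if (typeof item !== "string" || item.trim() === "") {
        return { ok: false, message: "Missing item name" };
      }
      shoppingList.push(item.trim());
      return { ok: true, message: `Added ${item.trim()}` };
    },
  ],
]);

// Looks up the handler for a tool call and runs it; unknown tools fail softly
// so the model can be told what went wrong instead of crashing the session.
function executeTool(call: ToolCall): ToolResult {
  const handler = toolHandlers.get(call.name);
  if (handler === undefined) {
    return { ok: false, message: `Unknown tool: ${call.name}` };
  }
  return handler(call.args);
}
```

A `Map` is used instead of a plain object so that names such as `toString` cannot resolve to inherited properties.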
Data & Storage:
- Supabase PostgreSQL with Row Level Security
- Cloudflare R2 for image storage
- Zustand + AsyncStorage for offline support
- Cloud sync across devices
Development:
- Monorepo structure with shared TypeScript contracts
- Type-safe API communication
- Production-ready deployment via EAS
- Built for Android with iOS compatibility
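Shared TypeScript contracts typically take the form of a discriminated union that both the mobile app and the backend import. Below is a sketch of what such a contract might look like for the WebSocket bridge. The message names and fields (`ServerMessage`, `tool_call`, `callId`, and so on) are hypothetical and not taken from the CookShelf codebase.

```typescript
// Hypothetical messages the backend could send to the mobile app.
type ServerMessage =
  | { type: "audio_chunk"; data: string } // base64-encoded audio
  | {
      type: "tool_call";
      callId: string;
      name: string;
      args: Record<string, unknown>;
    }
  | { type: "interrupted" }; // the user barged in; stop playback

// Handles every message variant. The `never` assignment in the default
// branch makes the compiler flag any variant added later but not handled.
function describeServerMessage(msg: ServerMessage): string {
  switch (msg.type) {
    case "audio_chunk":
      return `play ${msg.data.length} base64 chars`;
    case "tool_call":
      return `run ${msg.name}`;
    case "interrupted":
      return "stop playback";
    default: {
      const unhandled: never = msg;
      return unhandled;
    }
  }
}
```

Because both sides compile against the same union, a renamed field or a new message type surfaces as a type error at build time, not as a runtime mismatch.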
Built With
- cloudflare-r2
- expo-54
- express.js
- gemini-live-audio-api
- google-gemini-2.5-flash
- node.js
- postgresql
- react-native
- supabase
- typescript
- websocket
- zustand
