🍳 CookifyAI – Your AI Sous Chef

🔥 Inspiration

Cooking can be intimidating for beginners who struggle with timing, techniques, and presentation.
The goal was to build an AI sous chef that provides real-time, conversational guidance while helping users create visually appealing dishes.

Inspired by the rise of cooking content on social media, we wanted to bridge the gap between learning to cook and creating engaging food videos.


🍽️ What It Does

CookifyAI is a voice-powered cooking assistant that combines:

  • 🎙️ Real-time, natural voice conversations with AI guidance
  • 🧠 Visual food analysis for plating, finishing touch suggestions, and presentation tips
  • 🎥 Automatic video generation for social media platforms like TikTok and Instagram Reels

Users can talk naturally to the AI while cooking and get instant feedback or recommendations, just like having a real sous chef by their side!


🧠 How We Built It

CookifyAI is built using a hybrid AI architecture, integrating multiple advanced technologies:

  • 🗣️ OpenAI Realtime API – natural, low-latency (sub-200ms) voice-to-voice conversations
  • 🍽️ AWS Bedrock Agent (PreparationAgent) – domain-specific cooking expertise
  • 🎥 AWS Nova Models – image analysis and AI-driven video generation
  • 🔗 LiveKit – real-time WebRTC communication infrastructure
  • ⚙️ FastAPI – async backend handling multiple AI services
  • 💻 Next.js / React + TypeScript – responsive, modern frontend
  • 🎨 Tailwind CSS – sleek, animated user interface

The system routes voice interactions through OpenAI, leverages Bedrock for culinary intelligence, and uses Nova for visual creativity.
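
The routing described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `Request` type, the `kind` values, and the handler names (`handle_voice`, `handle_cooking`, `handle_visual`) are hypothetical, and the real OpenAI, Bedrock, and Nova calls are replaced with stubs.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Request:
    kind: str      # hypothetical values: "voice", "cooking", "visual"
    payload: str


# Stub handlers. In a real system each would call the corresponding
# service (voice model, cooking agent, image/video model).
async def handle_voice(payload: str) -> str:
    return f"voice:{payload}"


async def handle_cooking(payload: str) -> str:
    return f"cooking:{payload}"


async def handle_visual(payload: str) -> str:
    return f"visual:{payload}"


# Dispatch table mapping a request kind to its handler.
ROUTES = {
    "voice": handle_voice,
    "cooking": handle_cooking,
    "visual": handle_visual,
}


async def route(request: Request) -> str:
    """Send the request to the handler registered for its kind."""
    handler = ROUTES.get(request.kind)
    if handler is None:
        raise ValueError(f"unknown request kind: {request.kind!r}")
    return await handler(request.payload)
```

A dispatch table keeps each service behind one async function, so a handler can be swapped or stubbed in tests without touching the routing logic.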


⚙️ Challenges We Ran Into

  • Latency optimization across multiple AI services for smooth, real-time voice interaction
  • Session management to maintain conversation context between models
  • Audio clarity and stability using WebRTC under concurrent sessions
  • Model coordination between OpenAI (conversation) and Bedrock (cooking expertise)
  • File handling and image validation for user uploads
  • Real-time synchronization of voice, visual, and text outputs during live sessions
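
As one illustration of the upload validation challenge listed above, a backend might check the size and the leading "magic bytes" of an upload before passing it to image analysis. This is a sketch only: the 10 MiB limit, the accepted formats, and the function names are assumptions, not details from the project.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit (10 MiB)

# Leading signatures ("magic bytes") of two common image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def detect_image_type(data: bytes) -> Optional[str]:
    """Return "jpeg", "png", or "webp" based on the file header, else None."""
    for name, signature in SIGNATURES.items():
        if data.startswith(signature):
            return name
    # WebP is a RIFF container with "WEBP" at byte offsets 8..11.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def validate_upload(data: bytes) -> str:
    """Raise ValueError for empty, oversized, or unrecognized uploads."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    kind = detect_image_type(data)
    if kind is None:
        raise ValueError("unsupported or corrupt image")
    return kind
```

Checking the header bytes rather than the file extension or the client-supplied content type catches renamed or mislabeled files. It does not prove the rest of the file is a valid image.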

🏆 Accomplishments We’re Proud Of

  • 🤝 Seamlessly integrated OpenAI, AWS Bedrock, and AWS Nova
  • ⚡ Achieved sub-200ms voice response time for natural cooking conversations
  • 🧑‍🍳 Built a beautiful, intuitive interface supporting both voice and visual modes
  • 🎬 Created automated video generation for shareable, social-media-ready content
  • 🛠️ Implemented robust error handling and fallback systems
  • ☁️ Designed a scalable architecture supporting concurrent users
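
The error handling and fallback item above could take many forms. One common pattern is a timeout around the primary service with a cheaper fallback. The sketch below assumes that pattern; `with_fallback`, `slow_expert`, and `canned_tip` are hypothetical names, and the stubs stand in for real service calls.

```python
import asyncio
from typing import Awaitable, Callable


async def with_fallback(
    primary: Callable[[], Awaitable[str]],
    fallback: Callable[[], Awaitable[str]],
    timeout_s: float = 2.0,
) -> str:
    """Try the primary call; on timeout or connection error, use the fallback."""
    try:
        return await asyncio.wait_for(primary(), timeout=timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        return await fallback()


# Hypothetical stubs standing in for real services.
async def slow_expert() -> str:
    await asyncio.sleep(1.0)  # simulates a slow upstream call
    return "detailed expert answer"


async def canned_tip() -> str:
    return "Keep stirring and check back in a minute."
```

For example, `asyncio.run(with_fallback(slow_expert, canned_tip, timeout_s=0.05))` returns the canned tip, because the stubbed expert call exceeds the timeout and is cancelled.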

💡 What We Learned

  • 🤖 Hybrid AI integration unlocks powerful new capabilities by combining model strengths
  • ⚙️ Real-time voice AI requires careful attention to latency, buffering, and network optimization
  • 🎨 UX design is critical for multimodal interactions (voice, text, visual)
  • 📸 AWS Nova models excel at multimodal content creation
  • 🧩 Graceful degradation is essential when relying on multiple APIs
  • 🗣️ Voice-first interfaces redefine how users engage with AI-driven applications
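
To make the latency lesson above concrete, per-stage timings can be recorded and compared against a budget. This is an illustrative sketch: the 200 ms budget mirrors the sub-200ms figure quoted in this writeup, while the stage names and the helpers `timed` and `over_budget` are hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

BUDGET_MS = 200.0  # mirrors the sub-200ms target mentioned in this writeup


@contextmanager
def timed(stage: str, timings: Dict[str, float]) -> Iterator[None]:
    """Record the wall-clock duration of a stage, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000.0


def over_budget(timings: Dict[str, float], budget_ms: float = BUDGET_MS) -> bool:
    """Return True when the summed stage durations exceed the budget."""
    return sum(timings.values()) > budget_ms
```

Wrapping each stage (for instance a hypothetical `"transcribe"` or `"respond"` step) in `with timed("stage", timings):` shows which stage consumes most of the budget before any optimization work begins.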

Made with ❤️ by the CookifyAI Team
