CookifyAI: Your AI Sous Chef
Inspiration
Cooking can be intimidating for beginners who struggle with timing, techniques, and presentation.
The goal was to build an AI sous chef that provides real-time, conversational guidance while helping users create visually appealing dishes.
Inspired by the rise of cooking content on social media, we wanted to bridge the gap between learning to cook and creating engaging food videos.
What It Does
CookifyAI is a voice-powered cooking assistant that combines:
- Real-time, natural voice conversations with AI guidance
- Visual food analysis for plating, finishing touch suggestions, and presentation tips
- Automatic video generation for social media platforms like TikTok and Instagram Reels
Users can talk naturally to the AI while cooking and get instant feedback or recommendations, just like having a real sous chef by their side!
How We Built It
CookifyAI is built using a hybrid AI architecture, integrating multiple advanced technologies:
- OpenAI Realtime API: natural, low-latency (sub-200 ms) voice-to-voice conversations
- AWS Bedrock Agent (PreparationAgent): domain-specific cooking expertise
- AWS Nova Models: image analysis and AI-driven video generation
- LiveKit: real-time WebRTC communication infrastructure
- FastAPI: async backend handling multiple AI services
- Next.js / React + TypeScript: responsive, modern frontend
- Tailwind CSS: sleek, animated user interface
The system routes voice interactions through OpenAI, leverages Bedrock for culinary intelligence, and uses Nova for visual creativity.
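The routing described above can be pictured as a small dispatch table. This is a minimal sketch, not the project's actual backend: the `Request` type, the `kind` labels, and the three stub handlers are hypothetical names introduced for illustration, and the stubs return tagged strings instead of calling OpenAI, Bedrock, or Nova.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    kind: str     # hypothetical labels: "voice", "cooking", or "visual"
    payload: str


# Stub handlers standing in for the real services; no network calls are made.
def handle_voice(payload: str) -> str:
    return f"openai-realtime:{payload}"


def handle_cooking(payload: str) -> str:
    return f"bedrock-agent:{payload}"


def handle_visual(payload: str) -> str:
    return f"nova:{payload}"


# One entry per service, mirroring the split described in the text.
ROUTES: Dict[str, Callable[[str], str]] = {
    "voice": handle_voice,
    "cooking": handle_cooking,
    "visual": handle_visual,
}


def route(request: Request) -> str:
    """Send the request to the handler registered for its kind."""
    handler = ROUTES.get(request.kind)
    if handler is None:
        raise ValueError(f"unknown request kind: {request.kind!r}")
    return handler(request.payload)
```

Keeping the mapping in one table means a new service could be added by registering one more handler, without touching the dispatch logic.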
Challenges We Ran Into
- Latency optimization across multiple AI services for smooth, real-time voice interaction
- Session management to maintain conversation context between models
- Audio clarity and stability using WebRTC under concurrent sessions
- Model coordination between OpenAI (conversation) and Bedrock (cooking expertise)
- File handling and image validation for user uploads
- Real-time synchronization of voice, visual, and text outputs during live sessions
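As one illustration of the upload-validation challenge listed above, the sketch below checks an uploaded file's size and leading bytes before it would be passed on for analysis. It is an assumption about how such a check might look, not the project's code: the 5 MB limit, the function names, and the choice to accept only PNG and JPEG are all hypothetical.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical size limit

# Well-known file signatures (magic bytes) for the two accepted formats.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def detect_image_type(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None."""
    for signature, mime in _SIGNATURES.items():
        if data.startswith(signature):
            return mime
    return None


def validate_upload(data: bytes) -> str:
    """Reject empty, oversized, or unrecognized uploads; return the MIME type."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("upload exceeds size limit")
    mime = detect_image_type(data)
    if mime is None:
        raise ValueError("unsupported or unrecognized image format")
    return mime
```

Checking the bytes rather than the filename extension avoids trusting client-supplied metadata, though a signature check alone does not prove the rest of the file is a valid image.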
Accomplishments We're Proud Of
- Seamlessly integrated OpenAI, AWS Bedrock, and AWS Nova
- Achieved sub-200 ms voice response time for natural cooking conversations
- Built a beautiful, intuitive interface supporting both voice and visual modes
- Created automated video generation for shareable, social-media-ready content
- Implemented robust error handling and fallback systems
- Designed a scalable architecture supporting concurrent users
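A fallback system of the kind mentioned above might, in an async backend, try providers in order and move on when one is slow or fails. The sketch below is a guess at the general pattern using only `asyncio` from the standard library; `first_successful`, `slow_primary`, and `canned_fallback` are hypothetical names, and the timeout value is arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

Provider = Callable[[], Awaitable[str]]


async def first_successful(providers: Sequence[Provider], timeout_s: float) -> str:
    """Return the first provider result that arrives within the timeout."""
    last_error: Optional[BaseException] = None
    for provider in providers:
        try:
            # wait_for cancels the provider call if it exceeds the timeout.
            return await asyncio.wait_for(provider(), timeout=timeout_s)
        except Exception as exc:  # timeout or provider failure: try the next one
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


# Hypothetical providers: a slow primary service and an instant canned reply.
async def slow_primary() -> str:
    await asyncio.sleep(1.0)
    return "primary answer"


async def canned_fallback() -> str:
    return "fallback tip"


if __name__ == "__main__":
    print(asyncio.run(first_successful([slow_primary, canned_fallback], 0.05)))
```

Ordering the list from most capable to most reliable lets the user still get some response when an upstream API degrades.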
What We Learned
- Hybrid AI integration unlocks powerful new capabilities by combining model strengths
- Real-time voice AI requires careful attention to latency, buffering, and network optimization
- UX design is critical for multimodal interactions (voice, text, visual)
- AWS Nova models excel at multimodal content creation
- Graceful degradation is essential when relying on multiple APIs
- Voice-first interfaces redefine how users engage with AI-driven applications
Made with ❤️ by the CookifyAI Team
Built With
- amazon-web-services
- bedrock
- kiro
- livekit
- nova
- openai

