Inspiration
While learning to cook and following recipes, we noticed that many cooking instructions assume prior knowledge. Terms like “a pinch,” “medium flame,” “sauté lightly,” or even ingredient names like “cardamom (elaichi)” are obvious to experienced cooks but confusing to beginners and non-native speakers. Most recipe platforms rely heavily on text, which often fails to show what an instruction actually looks like in practice. This gap between written instructions and real-world understanding inspired us to build Vyanjan AI.
What it does
Vyanjan AI is a visual cooking assistant that converts a recipe into simple, beginner-friendly steps. For each step, it generates a structured JSON prompt and a corresponding visual explanation. The JSON describes the scene, ingredients, quantities, and actions, while the image shows exactly what the step should look like. This removes ambiguity and helps users understand cooking instructions without guessing.
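To make the idea of a per-step JSON prompt concrete, here is a minimal sketch of what one such prompt could look like. The field names (`scene`, `action`, `ingredients`, `quantity`) and the example values are assumptions chosen for illustration; they follow the description above but are not the project's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Ingredient:
    # Hypothetical fields: quantity is kept as text so it can carry units.
    name: str
    quantity: str


@dataclass
class StepPrompt:
    # Hypothetical schema covering scene, ingredients, quantities, and action.
    step_number: int
    instruction: str
    scene: str
    action: str
    ingredients: list[Ingredient] = field(default_factory=list)


def to_json(step: StepPrompt) -> str:
    """Serialize a step prompt to readable JSON (nested dataclasses included)."""
    return json.dumps(asdict(step), indent=2, ensure_ascii=False)


# Illustrative step; the values are made up for this example.
example = StepPrompt(
    step_number=1,
    instruction="Heat the oil and add the cumin seeds.",
    scene="small steel pan on a gas stove, medium flame, overhead view",
    action="sprinkling cumin seeds into hot oil",
    ingredients=[
        Ingredient(name="cumin seeds", quantity="1 teaspoon"),
        Ingredient(name="cooking oil", quantity="2 tablespoons"),
    ],
)

print(to_json(example))
```

Keeping quantities as explicit text with units is one way to replace a vague term such as “a pinch” with something a beginner can measure.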
How we built it
We designed Vyanjan AI around a structured pipeline rather than free-form prompting. A recipe is first broken down into clear, simple steps. Each step is converted into a detailed JSON prompt that explicitly defines the visual intent. This JSON is then used to generate realistic cooking images using a local image generation pipeline compatible with FIBO. The frontend displays the JSON transparently alongside each generated image, making the AI process explainable and easy to understand.
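The pipeline described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `split_into_steps`, `build_prompt`, `run_pipeline`, and `fake_generator` are hypothetical names, the sentence-based step splitter is a deliberately naive stand-in, and the image generator is passed in as a plain callable so that no particular image-generation API is assumed.

```python
import json
import re
from typing import Callable


def split_into_steps(recipe_text: str) -> list[str]:
    """Naive stand-in for step extraction: one step per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", recipe_text.strip())
    return [p.strip() for p in parts if p.strip()]


def build_prompt(step_number: int, instruction: str) -> dict:
    """Build a structured prompt for one step (hypothetical fields)."""
    return {
        "step_number": step_number,
        "instruction": instruction,
        "scene": "home kitchen, stovetop, overhead view",
        "style": "realistic photo",
    }


def run_pipeline(recipe_text: str, generate_image: Callable[[str], str]) -> list[dict]:
    """Recipe text -> steps -> JSON prompts -> images, paired for display."""
    results = []
    for number, instruction in enumerate(split_into_steps(recipe_text), start=1):
        prompt = build_prompt(number, instruction)
        prompt_json = json.dumps(prompt, indent=2)
        image_ref = generate_image(prompt_json)  # the local generator plugs in here
        results.append({"prompt": prompt, "prompt_json": prompt_json, "image": image_ref})
    return results


def fake_generator(prompt_json: str) -> str:
    """Placeholder generator: returns a file name instead of rendering an image."""
    step = json.loads(prompt_json)["step_number"]
    return f"step_{step}.png"


results = run_pipeline(
    "Heat oil in a pan. Add cumin seeds. Stir until fragrant.",
    fake_generator,
)
for item in results:
    print(item["image"], "<-", item["prompt"]["instruction"])
```

Because each result keeps the JSON next to the image reference, a frontend can render the two side by side, which matches the transparent display described above.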
Challenges we ran into
One major challenge was ensuring reliability in image generation, as external APIs were unstable during development. To solve this, we shifted to a local image generation setup, which let us maintain consistency and avoid downtime. Another challenge was balancing detail and simplicity: the JSON prompts had to be rich enough to produce accurate visuals, yet still readable enough to follow in a demonstration.
Accomplishments that we’re proud of
We successfully built a system that doesn’t hide AI behavior behind a black box. By exposing the JSON prompts and pairing them with visuals, Vyanjan AI demonstrates transparency, structure, and real-world usability. We’re especially proud of creating a beginner-focused experience that bridges the gap between abstract cooking instructions and practical understanding.
What we learned
We learned that clarity is more important than complexity. A well-structured system with clear intent can be more impactful than a feature-heavy application. We also learned the value of explainable AI: showing how an output is generated builds more trust than simply presenting the output itself.
What’s next for Vyanjan AI
Next, we plan to expand Vyanjan AI beyond cooking by adapting the same visual-grounding approach to other instruction-based domains such as education and DIY tasks. We also aim to improve automatic step extraction and integrate real-time image generation directly into the workflow, making the system more dynamic and scalable.
Built With
- css
- html
- javascript
- prompt
- python