## 💡 What inspired us

Walking through a bustling Malaysian hawker centre is a feast for the senses: the clatter of spatulas on hot woks, the smoky aroma of wok hei, and the vibrant colors of diverse dishes. For tourists, expats, and even locals with specific dietary needs, however, this culinary paradise can be a minefield. Language barriers, handwritten menu boards, and complex hidden ingredients often lead to hesitation. Questions like "Does this contain peanuts?", "Is this Muslim-Friendly?", or "What exactly is in this dark sauce?" are hard to answer just by looking. We built HawkerLens to bridge this gap, so that everyone can explore authentic street food safely and confidently.

## 🛠️ How we built it

HawkerLens is a modern, responsive web application designed mobile-first, perfect for on-the-go scanning at food stalls.

- **Frontend:** React 18, TypeScript, Vite, and Tailwind CSS for a snappy, beautiful user interface.
- **AI engine:** We leaned heavily on the Google Gemini API:
  - `gemini-3-flash-preview` handles blazing-fast multimodal vision analysis. It processes images of menus or dishes and returns strictly typed JSON containing ingredients, cultural origins, and safety warnings.
  - `gemini-2.5-flash-image` powers our generative AI chat. When a user describes a dish via text or voice, this model generates a high-quality, authentic-looking image of the food.
- **Voice integration:** The native Web Speech API lets users describe dishes hands-free.

## 🚧 Challenges we faced

- **Strict data structuring:** LLMs are notoriously bad at outputting consistent, complex JSON. We had to use Gemini's `responseSchema` to force the model to return exact types (e.g., ensuring Halal status is always one of `"Halal" | "Non-Halal" | "Muslim-Friendly" | "Unknown"`).
- **Cultural nuance:** Street food varies wildly. A "Rojak" in Penang is very different from a "Rojak" in Kuala Lumpur.
We had to carefully engineer our system prompts to prioritize Malaysian hawker context over generic global food data.
- **Allergen risk modeling:** We needed a way to quantify the risk of cross-contamination and hidden allergens, so we conceptualized an internal risk probability model.

### The Allergen Risk Probability Model

To ensure user safety, we evaluate the probability that an allergen is present. Let $D$ be the identified dish and $a_i$ a specific allergen (e.g., peanuts, shellfish). For a user with $n$ allergies, the total risk score is the complement of the probability that none of the allergens are present:

$$R(D) = 1 - \prod_{i=1}^{n} \left(1 - P(a_i \mid D)\right)$$

where $P(a_i \mid D)$ is the conditional probability of allergen $a_i$ existing in dish $D$, weighted by the visual confidence score $C_i$ and the historical recipe context $H_i$:

$$P(a_i \mid D) = w_C \, C_i + w_H \, H_i$$

If $R(D)$ exceeds a critical threshold, HawkerLens automatically flags the dish with a high-priority Emergency View warning.

## 🧠 What we learned

- **Advanced prompt engineering:** We learned how to write highly constrained system instructions that turn a generative AI into a deterministic data-extraction engine.
- **Multimodal workflows:** We successfully chained different AI models together: a text/vision model identifies a dish from a voice prompt, and that identification is piped into an image-generation model to visualize it.
- **Accessibility matters:** Integrating the Web Speech API taught us the importance of multimodal inputs. Not everyone can type out a complex query while holding a tray of food in a crowded market!

## 🚀 What's next for HawkerLens

We plan to introduce real-time AR (Augmented Reality) bounding boxes over live camera feeds, letting users pan their phone across a hawker centre and see translated names, prices, and Halal statuses floating above the stalls.
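To make the strict data structuring concrete, here is a minimal sketch of the kind of schema we pass to Gemini via `responseSchema`. The field names are illustrative rather than our exact production schema, and the SDK wiring is shown only as a comment, since the exact schema helpers vary by SDK version:

```typescript
// Illustrative OpenAPI-style response schema (field names are examples, not our
// exact production schema). Gemini's responseSchema constrains decoding so that
// halalStatus can only ever be one of the four allowed strings.
const dishAnalysisSchema = {
  type: "object",
  properties: {
    dishName: { type: "string" },
    ingredients: { type: "array", items: { type: "string" } },
    halalStatus: {
      type: "string",
      enum: ["Halal", "Non-Halal", "Muslim-Friendly", "Unknown"],
    },
    allergenWarnings: { type: "array", items: { type: "string" } },
  },
  required: ["dishName", "ingredients", "halalStatus"],
} as const;

// Attached when creating the model, roughly like this:
// const model = genAI.getGenerativeModel({
//   model: "gemini-...",
//   generationConfig: {
//     responseMimeType: "application/json",
//     responseSchema: dishAnalysisSchema,
//   },
// });

console.log(dishAnalysisSchema.properties.halalStatus.enum.join(" | "));
```

Because the enum is closed, the parsed JSON can be typed on the frontend as the exact union `"Halal" | "Non-Halal" | "Muslim-Friendly" | "Unknown"` with no defensive string matching.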
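The allergen risk score can be sketched in a few lines of TypeScript. The weighted blend of visual confidence and recipe history, and the 0.6/0.4 weights, are illustrative choices for this sketch, not our calibrated production values:

```typescript
// Sketch of the allergen risk score: the probability that at least one of the
// user's allergens is present, R = 1 - Π(1 - P(a_i | D)).
interface AllergenEstimate {
  allergen: string;
  visualConfidence: number; // C_i: how strongly the vision model detected it (0..1)
  recipeFrequency: number;  // H_i: how often recipes for this dish include it (0..1)
}

// Illustrative per-allergen probability: a weighted blend of the two signals.
// The 0.6/0.4 weights are placeholders, not calibrated values.
function allergenProbability(e: AllergenEstimate, wVisual = 0.6, wRecipe = 0.4): number {
  return wVisual * e.visualConfidence + wRecipe * e.recipeFrequency;
}

function totalRisk(estimates: AllergenEstimate[]): number {
  // Probability that none of the allergens are present.
  const nonePresent = estimates
    .map((e) => 1 - allergenProbability(e))
    .reduce((acc, p) => acc * p, 1);
  return 1 - nonePresent;
}

const risk = totalRisk([
  { allergen: "peanut", visualConfidence: 0.7, recipeFrequency: 0.9 },
  { allergen: "shrimp paste", visualConfidence: 0.2, recipeFrequency: 0.5 },
]);
// peanut: 0.6*0.7 + 0.4*0.9 = 0.78; shrimp paste: 0.6*0.2 + 0.4*0.5 = 0.32
// total: 1 - (0.22 * 0.68) ≈ 0.85, well above any plausible warning threshold
console.log(risk.toFixed(2));
```

The complement-of-product form means the score rises monotonically as allergens are added and saturates at 1, which matches the intuition that several moderate risks together warrant the Emergency View.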

## Built With

- React 18
- TypeScript
- Vite
- Tailwind CSS
- Google Gemini API
- Web Speech API
