Inspiration
Deciding where to eat is a struggle we have all faced. Whether you're with friends who simply can't agree, or staring blankly at a delivery app with far too many options, food choice paralysis is real. We wanted to build something that made the decision fun and engaging, not just another list to scroll through. The swipe mechanic pioneered by dating apps is intuitive and addictive, so we asked: why hasn't anyone done this for food? GrubR is the answer: swipe right on what looks good, and let our GrubR agent figure out where to send you.
What it does
GrubR is a Tinder-style restaurant and dish discovery app. Users swipe through food options while GrubR's agentic AI learns their preferences in real time, surfacing better and better suggestions. The result is a more personalized, frictionless dining decision: no more endless scrolling, no more "I don't care, you pick." Just swipe, taste, and go.
How we built it
GrubR is a full-stack agentic application built in two core layers.

Frontend & API (Next.js / TypeScript): The swipe UI, routing, and API layer are all built in Next.js. Each swipe, chat message, and food image interaction fires events to the backend, feeding the agent a continuous stream of signals.

Agentic AI Backend (Python + Gemini on Vertex AI): The core of GrubR is an agentic system powered by Gemini via Vertex AI. This is not a simple prompt-response loop. The agent:
- Reasons step-by-step before producing any recommendation: it thinks through the user's swipe history, stated cravings, and semantic preference profile before deciding what to surface next
- Autonomously calls tools and APIs, including image analysis, embedding lookups, and external data sources, without being explicitly told which tool to invoke on each turn
- Uses multimodal Gemini capabilities to visually understand food images and map them to flavor profiles and cuisine categories
- Builds a semantic preference model via embeddings, so that "I want something spicy and comforting" maps meaningfully to past swipes even without exact keyword matches
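As a rough illustration of the embedding-based preference matching described above, here is a minimal sketch. The three-dimensional vectors are toy stand-ins for real Gemini embeddings, and `rank_dishes` is a hypothetical helper, not our production code:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_dishes(craving_vec, swiped_right):
    # swiped_right: list of (dish_name, embedding) from past right-swipes.
    # Rank past dishes by semantic similarity to the current craving,
    # so "spicy and comforting" surfaces related dishes without any
    # keyword overlap.
    scored = [(cosine(craving_vec, vec), name) for name, vec in swiped_right]
    return [name for _, name in sorted(scored, reverse=True)]

# Toy embeddings standing in for real model output
history = [
    ("ramen", [0.9, 0.1, 0.4]),
    ("salad", [0.1, 0.9, 0.1]),
    ("curry", [0.8, 0.2, 0.5]),
]
craving = [0.85, 0.15, 0.45]  # embedding for "spicy and comforting"
print(rank_dishes(craving, history))
```

In practice the vectors would come from an embedding model and live in a vector store, but the ranking principle is the same.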
The Python backend handles all agent orchestration and communicates with Vertex AI for Gemini inference, returning structured recommendations to the Next.js frontend.

Deployment: the frontend runs on Vercel, and the API backend runs on Fly.io.
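The swipe-signal flow from frontend to backend can be sketched roughly like this; the event schema and `PreferenceProfile` class are hypothetical illustrations of the idea, not the actual backend:

```python
from dataclasses import dataclass, field

@dataclass
class PreferenceProfile:
    # Running tallies of cuisine signals accumulated from swipes
    # (hypothetical schema; the real profile is embedding-based).
    likes: dict = field(default_factory=dict)

    def record(self, cuisine: str, liked: bool) -> None:
        self.likes[cuisine] = self.likes.get(cuisine, 0) + (1 if liked else -1)

    def top_cuisines(self, n: int = 3) -> list:
        return sorted(self.likes, key=self.likes.get, reverse=True)[:n]

def handle_swipe(profile: PreferenceProfile, event: dict) -> None:
    # Each frontend swipe fires an event shaped like:
    #   {"dish": "...", "cuisine": "...", "direction": "right" | "left"}
    profile.record(event["cuisine"], event["direction"] == "right")

profile = PreferenceProfile()
for e in [
    {"dish": "tonkotsu ramen", "cuisine": "japanese", "direction": "right"},
    {"dish": "caesar salad", "cuisine": "american", "direction": "left"},
    {"dish": "pad thai", "cuisine": "thai", "direction": "right"},
]:
    handle_swipe(profile, e)
print(profile.top_cuisines())  # prints ['japanese', 'thai', 'american']
```

The agent consumes this continuously updated profile on every turn when deciding what to surface next.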
Challenges we ran into
- Cold-start problem: getting the agent to produce meaningful recommendations from the very first interaction, before it has a rich preference history to reason from, required careful prompt design and smart default priors
- Multimodal coordination: connecting the image recognition pipeline, the embedding store, and the chat interface into one coherent agent loop, rather than three disconnected features, was the core architectural challenge
- Structured agentic output: getting Gemini to reason step-by-step and reliably return parseable, structured recommendations required significant prompt engineering iteration
- Latency: chaining tool calls (image analysis → embedding lookup → recommendation generation) introduced latency that we had to optimize to keep the swipe experience feeling snappy
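One common way to harden structured model output, sketched here under assumed field names (`dish`, `cuisine`, `reason`), is to extract and validate JSON defensively so the caller can re-prompt on failure:

```python
import json
import re

REQUIRED = {"dish", "cuisine", "reason"}  # assumed schema, for illustration

def parse_recommendation(raw: str):
    """Extract the first JSON object from a model response and validate it.

    Returns None when the text contains no parseable object with the
    required fields, so the caller can retry with a corrective prompt.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        return None
    try:
        rec = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return rec if REQUIRED.issubset(rec) else None

# Model output often wraps JSON in prose or markdown fences:
raw = ('Sure! Here you go:\n```json\n'
       '{"dish": "ramen", "cuisine": "japanese", "reason": "warm and rich"}\n'
       '```')
print(parse_recommendation(raw))
```

Pairing a parser like this with an explicit JSON schema in the prompt cut down most of the malformed responses we saw during iteration.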
Accomplishments that we're proud of
- Built a genuinely agentic system: Gemini autonomously decides which tools to call, reasons through preference signals, and drives the recommendation loop without hardcoded logic dictating each step
- Three distinct Gemini capabilities (NLP chat, image recognition, semantic embeddings) working together seamlessly as a single agent
- A swipe UX that is smooth, fast, and actually fun
- Shipped a fully deployed, working product within the hackathon window
What we learned
- What it actually means to build an agent versus calling an AI API: autonomous tool use and step-by-step reasoning make the difference
- How to use Vertex AI to orchestrate multimodal Gemini workflows in production
- Semantic embeddings as a preference model are a powerful pattern: they capture taste in a way that keyword filtering never could
- Designing for agentic latency: users tolerate a thinking moment if the result feels intelligent, not random
What's next for GrubR
- Real restaurant data via the Google Places API: swipes and cravings map to actual nearby spots with directions
- Group mode: everyone swipes independently, and the Gemini agent finds consensus across the group's preference embeddings
- Persistent memory: the agent retains your taste profile across sessions, getting smarter every time you use it
- Voice input: lean further into Gemini's multimodal strengths so you can just say "something warm and cheesy" on the way out the door
- A mobile app (React Native) for true on-the-go use
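The group-mode consensus idea could be as simple as averaging per-user preference vectors before ranking candidates against the shared result; this is a hypothetical simplification of what the agent would do, not a committed design:

```python
def group_consensus(profiles):
    # profiles: list of per-user preference embeddings (equal length).
    # The element-wise mean gives a single consensus vector; the agent
    # would then score candidate dishes against this shared vector
    # instead of any one member's profile.
    n = len(profiles)
    return [sum(vals) / n for vals in zip(*profiles)]

# Toy two-person group with three-dimensional embeddings
alice = [0.9, 0.1, 0.2]
bob = [0.3, 0.7, 0.4]
print(group_consensus([alice, bob]))
```

A real version would likely weight members or detect hard vetoes, but a mean vector is a reasonable starting point for finding dishes everyone can live with.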
Built With
- fly.io
- next.js
- python
- react
- typescript