Inspiration
Everyone has a junk drawer. Most people have an entire junk house — shelves of things they forgot they owned, food that expired last month, medicine bought twice because no one remembered the first bottle. The problem isn't organization. It's that keeping an inventory by hand is tedious enough that nobody does it.
We wanted to build something that removes the friction entirely. Not a spreadsheet, not a manual entry form — just point your phone at something and let AI handle the rest. Dobby is the house-elf that manages your home so you don't have to.
What it does
Dobby is an AI-powered home inventory agent with two core workflows:
🧠 Smart Intake Agent: Photograph a single item or an entire grocery receipt. Gemini Vision extracts every item along with its name, category, quantity, and expiry date. A multi-step reasoning agent then queries your existing inventory via the MongoDB MCP server, identifies the best cabinet for each item based on semantic similarity, and presents a pre-filled confirmation. One tap saves everything.
🔍 Inventory Discovery Agent: Ask your home a question in plain English: "Any food expiring soon?", "What can I drink?", "Do we have dish soap?" The agent runs a keyword search across Elasticsearch and passes the candidates to Gemini, which reasons about intent, substitutes, and relevance, then returns structured results with match types and exact locations.
Both agents are deployed as a Python FastAPI service on Google Cloud Run, with an iOS companion app for on-the-go access and a web UI for browser-based demos.
How we built it
Backend — Google Cloud Run
- Python + FastAPI, containerized with Docker
- Two agent modules (intake_agent.py, discovery_agent.py), each implementing a multi-step reasoning loop
- Gemini 2.5 Flash via Vertex AI for both vision extraction and text reasoning
- MongoDB Atlas as the primary inventory store
- Elasticsearch for fast full-text candidate search
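The Discovery agent's first two steps (a broad keyword search, then handing the hits to Gemini) might look roughly like the sketch below. The field names, the match-type labels, and both helper names are assumptions made for illustration, since the real index mapping and prompt are not described here; only the request-body shape, a standard Elasticsearch multi_match query, comes from Elasticsearch itself.

```python
import json

# Hypothetical field names; the actual index mapping is not part of this write-up.
SEARCH_FIELDS = ["name^2", "category", "location"]


def build_candidate_query(question: str, size: int = 25) -> dict:
    """Build an Elasticsearch request body for a broad keyword search."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": question,
                "fields": SEARCH_FIELDS,
                "fuzziness": "AUTO",  # tolerate small typos in the question
            }
        },
    }


def build_reasoning_prompt(question: str, candidates: list[dict]) -> str:
    """Combine the user's question and the search hits into one LLM prompt."""
    return (
        "You answer questions about a home inventory.\n"
        f"Question: {question}\n"
        f"Candidate items (JSON): {json.dumps(candidates)}\n"
        # The match-type labels below are assumed, not the project's real schema.
        "Return a JSON list of objects with keys 'name', 'match_type' "
        "('exact', 'substitute', or 'related'), and 'location'."
    )
```

The dictionary from build_candidate_query would be sent as the body of a search request against the inventory index, and the hits passed to build_reasoning_prompt. Keeping the search deliberately broad and letting the model filter is what allows a question like "What can I drink?" to surface items whose names never mention drinking.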
MongoDB MCP Integration
Rather than querying MongoDB directly from Python, the intake agent spawns the official @mongodb-js/mongodb-mcp-server as a stdio subprocess and calls its find tool through the MCP protocol. Node.js 22 and the MCP server are co-installed in the same Docker container alongside Python. This keeps the agent architecture modular — the reasoning logic has no direct database dependency.
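A minimal sketch of that call path, using the MCP Python SDK's stdio client, is shown here. The database and collection names, the binary name on PATH, the connection-string environment variable, and the exact argument names passed to the find tool are all assumptions for illustration and should be checked against the MCP server's documentation; this is not the project's actual code.

```python
def build_find_arguments(
    database: str, collection: str, query_filter: dict, limit: int = 100
) -> dict:
    """Arguments for the server's `find` tool (argument names are assumed)."""
    return {
        "database": database,
        "collection": collection,
        "filter": query_filter,
        "limit": limit,
    }


async def fetch_cabinets(connection_string: str) -> list[str]:
    """Spawn the MCP server over stdio, call `find`, and return raw text results."""
    # Imported lazily so this module still loads where `mcp` is not installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    server = StdioServerParameters(
        command="mongodb-mcp-server",  # assumed name of the globally installed binary
        args=[],
        env={"MDB_MCP_CONNECTION_STRING": connection_string},  # assumed variable name
    )
    # One subprocess and one MCP session per call: nothing is shared between requests.
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "find",
                arguments=build_find_arguments("dobby", "cabinets", {}),  # hypothetical names
            )
            # Tool results arrive as content parts; keep the textual ones.
            return [part.text for part in result.content if hasattr(part, "text")]
```

From synchronous code this would be driven with asyncio.run(fetch_cabinets(uri)). Because the reasoning code only sees a tool name and a dictionary of arguments, swapping the data store means changing the server command rather than the agent logic.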
iOS App
- SwiftUI + CoreData + CloudKit for local-first storage with family sharing
- Fully bilingual (English / Chinese) via a LanguageManager singleton; language selection persists and propagates instantly across all views
- Communicates with the Cloud Run backend for all AI operations; no API keys on device
Web Demo UI
- Vanilla HTML/CSS/JS served directly from Cloud Run
- Two-tab interface mirroring both agent workflows for easy browser-based demonstration
Challenges we ran into
Transitive dependency conflicts. Adding mcp==1.27.2 to an existing Python environment cascaded into version floor violations across pydantic (>=2.11.0), uvicorn (>=0.31.1), and httpx (>=0.27.1) — all previously pinned below those minimums. The fix was switching from exact == pins to minimum >= bounds and letting pip resolve the full graph.
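For a requirements file with many entries, the switch from exact pins to lower bounds can be done mechanically. The helper below is a hypothetical illustration of that rewrite, not a tool used in the project; it only handles simple name==version lines and leaves anything else (comments, existing >= bounds, lines with environment markers) untouched.

```python
import re

# Matches "name==1.2.3", optionally with extras such as "uvicorn[standard]==0.31.1".
_EXACT_PIN = re.compile(r"^(?P<name>[A-Za-z0-9._-]+(?:\[[^\]]+\])?)==(?P<version>\S+)$")


def relax_pins(lines: list[str]) -> list[str]:
    """Turn exact `==` pins into `>=` lower bounds; leave other lines as they are."""
    relaxed = []
    for line in lines:
        match = _EXACT_PIN.match(line.strip())
        if match:
            relaxed.append(f"{match['name']}>={match['version']}")
        else:
            relaxed.append(line)
    return relaxed
```

For example, relax_pins(["pydantic==2.11.0"]) gives ["pydantic>=2.11.0"]. Lower bounds let the resolver pick a mutually compatible set, at the cost of builds that are less reproducible than with exact pins.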
Single-container MCP. The MongoDB MCP server is a Node.js binary; our agent is Python. We solved this by installing Node.js 22 and the MCP server globally inside the same Docker image and spawning it as a stdio subprocess per agent call. Each invocation gets its own MCP session — stateless and Cloud Run-friendly.
iOS sheet sequencing. SwiftUI silently drops a second sheet presented while the first is still dismissing. Triggering item confirmation inside onChange of a still-animating camera sheet caused results to disappear. The fix was moving recognition into the onDismiss callback so the confirmation sheet only fires after the capture sheet is fully gone.
Accomplishments that we're proud of
- A genuinely useful app — not a demo toy. It handles real receipts, real cabinets, real expiry tracking.
- The MCP integration works in production inside a single Docker container, with zero sidecar infrastructure.
- The intake agent checks each item against an exact-match index in $O(1)$ time before calling Gemini, so repeat items (e.g. restocking milk) never incur an unnecessary LLM call; only truly novel items go to Gemini.
- An iOS app with CloudKit family sharing, so the whole household shares one live inventory.
What we learned
MCP is more than a protocol — it's an architectural pattern. Wrapping a database behind a tool interface forces you to think about what your agent actually needs from the data layer, rather than writing ad-hoc queries everywhere. It makes the agent database-agnostic by design.
Multi-step agents don't need to be complex to be useful. Dobby's intake agent is six steps: fetch cabinets → build exact-match index → split items into matched and unmatched → call Gemini only for unmatched → merge results → return plan. Each step is simple. Together they produce results that feel intelligent — because they are.
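The six steps can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the contents of intake_agent.py: the record shape ({"name", "cabinet"}), the normalize helper, and the fetch_inventory and ask_llm callables are hypothetical stand-ins for the MCP query and the Gemini call.

```python
from typing import Callable


def normalize(name: str) -> str:
    """Case- and whitespace-insensitive key used for exact matching."""
    return " ".join(name.lower().split())


def plan_intake(
    items: list[str],
    fetch_inventory: Callable[[], list[dict]],
    ask_llm: Callable[[list[str], list[str]], dict[str, str]],
) -> dict[str, str]:
    # 1. Fetch existing records (assumed shape: {"name": ..., "cabinet": ...}).
    inventory = fetch_inventory()

    # 2. Build the exact-match index: normalized item name -> cabinet.
    index = {normalize(rec["name"]): rec["cabinet"] for rec in inventory}

    # 3. Split incoming items into matched and unmatched (dict lookup, O(1) each).
    plan: dict[str, str] = {}
    unmatched: list[str] = []
    for item in items:
        cabinet = index.get(normalize(item))
        if cabinet is not None:
            plan[item] = cabinet
        else:
            unmatched.append(item)

    # 4. Call the LLM only when at least one item is genuinely new.
    cabinets = sorted(set(index.values()))
    suggestions = ask_llm(unmatched, cabinets) if unmatched else {}

    # 5. Merge the LLM's suggestions with the exact matches.
    plan.update(suggestions)

    # 6. Return the plan (item -> cabinet) for the user to confirm.
    return plan
```

With an inventory that already contains milk, a call such as plan_intake(["milk"], fetch_inventory, ask_llm) returns without invoking ask_llm at all, which is the saving on repeat items described earlier.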
What's next for Dobby
- Proactive alerts — push notifications when items are about to expire, or when stock of a tracked essential drops to zero
- Shopping list generation — the Discovery agent suggests what to restock based on consumption patterns
- Coupon recommendations — automatically search and surface relevant coupons for frequently purchased items, so Dobby doesn't just track what you have — it helps you save on what you buy next
- Multi-home support — manage inventory across a primary home, a vacation property, or a storage unit
