Inspiration
Redesigning a room in India usually means either hiring a professional interior designer (₹50,000–₹5,00,000+) or experimenting blindly and making expensive mistakes.
We wanted to make professional-quality design advice accessible to anyone with a smartphone — simply by talking. No forms, no typing, and no knowledge of design terminology required.
What it Does
J's Room AI is a live, voice-first AI interior designer.
Users can show their room using a camera or photo, have a natural voice conversation about their style and budget, and receive a photorealistic redesign of their actual room.
The AI then provides a shopping list of real products from Indian retailers, including actual prices and purchase links.
How We Built It
The system is built on Google Gemini using two models:
- Gemini 2.5 Flash (Native Audio): Handles real-time bidirectional voice conversations as the core AI agent, using the Live API with autonomous tool calling.
- Imagen 4.0: Generates photorealistic redesigns while preserving the room's original layout.
Tech Stack
- Frontend: Next.js 14, React 18, TypeScript, Tailwind CSS
- Voice streaming: Gemini Live API via WebSockets directly from the browser
AI tools (called autonomously by the agent):
- Room analysis
- Image generation
- Product search (SerpAPI)
- Shopping list generation
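As a rough illustration of how the four tools might be wired to the agent, the sketch below routes a tool call to a registered handler and always returns a result the model can read, even when the tool is unknown or throws. The `ToolCall` and `ToolResult` shapes, the `dispatchToolCall` helper, and the tool names in the usage note are hypothetical and are not taken from the project's code or from the Gemini SDK.

```typescript
// Hypothetical shapes for illustration; the real Live API message format differs.
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  id: string;
  name: string;
  response: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<Record<string, unknown>>;

// Look up the handler by name and run it. Errors are returned as data so the
// agent can apologise or retry instead of the voice session crashing.
async function dispatchToolCall(
  call: ToolCall,
  handlers: Map<string, ToolHandler>,
): Promise<ToolResult> {
  const handler = handlers.get(call.name);
  if (!handler) {
    return { id: call.id, name: call.name, response: { error: `Unknown tool: ${call.name}` } };
  }
  try {
    return { id: call.id, name: call.name, response: await handler(call.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: call.id, name: call.name, response: { error: message } };
  }
}
```

A registry for this app would hold one entry per tool, for example `analyze_room`, `generate_design`, `search_products`, and `build_shopping_list` (assumed names), with the result sent back to the model over the same WebSocket.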
The application is deployed on Google Cloud Run with an automated deployment script.
Challenges We Ran Into
Live API Tool Calling: Ensuring Gemini reliably triggered tools during live voice conversations required careful prompt engineering and WebSocket event handling.
Streaming Transcripts: Audio transcripts arrive in small fragments. We built a merge system with timing windows to combine fragments into clean chat bubbles without losing text.
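A minimal sketch of the timing-window idea, not the project's actual merge code: a fragment is appended to the last bubble when it comes from the same speaker and arrives within a window of the previous fragment, and otherwise starts a new bubble. The type names, the 1500 ms default, and the assumption that fragments carry their own spacing are all illustrative.

```typescript
type Role = "user" | "assistant";

interface TranscriptFragment {
  role: Role;
  text: string;
  timestampMs: number;
}

interface ChatBubble {
  role: Role;
  text: string;
  startMs: number;
  lastUpdateMs: number;
}

// Assumed window; the right value depends on how fragments actually arrive.
const MERGE_WINDOW_MS = 1500;

// Returns a new bubble list; the input array is never mutated, which keeps
// the function easy to use from React state updates.
function mergeFragment(
  bubbles: ChatBubble[],
  fragment: TranscriptFragment,
  windowMs: number = MERGE_WINDOW_MS,
): ChatBubble[] {
  const last = bubbles[bubbles.length - 1];
  if (
    last &&
    last.role === fragment.role &&
    fragment.timestampMs - last.lastUpdateMs <= windowMs
  ) {
    // Same speaker, close in time: extend the current bubble. Text is
    // concatenated as-is so nothing is dropped.
    const merged: ChatBubble = {
      ...last,
      text: last.text + fragment.text,
      lastUpdateMs: fragment.timestampMs,
    };
    return [...bubbles.slice(0, -1), merged];
  }
  // Different speaker or a long pause: start a new bubble.
  return [
    ...bubbles,
    {
      role: fragment.role,
      text: fragment.text.trimStart(),
      startMs: fragment.timestampMs,
      lastUpdateMs: fragment.timestampMs,
    },
  ];
}
```

Feeding each incoming fragment through `mergeFragment` with a reducer keeps the chat state a pure function of the fragments received so far.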
Layout Preservation: Getting Imagen to redesign a room without moving or removing furniture required a two-step pipeline:
- Gemini analyzes the image and inventories all objects and their positions.
- Gemini generates a detailed prompt for Imagen to produce the redesign.
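The two steps above can be sketched roughly as follows. The model calls are passed in as functions (`analyzeRoom`, `generateImage`) because their real signatures are not shown here, and a plain string template stands in for the prompt that Gemini writes in the actual pipeline. All names and types are hypothetical.

```typescript
interface RoomObject {
  name: string;
  position: string;
}

interface DesignBrief {
  style: string;
}

// Stand-in for the Gemini-written prompt: list every inventoried object with
// its position and tell the image model to leave the layout alone.
function buildRedesignPrompt(inventory: RoomObject[], brief: DesignBrief): string {
  const layout = inventory
    .map((obj, i) => `${i + 1}. ${obj.name} (${obj.position})`)
    .join("\n");
  return [
    `Redesign this room in a ${brief.style} style.`,
    "Keep every object below in its current position; do not add, move, or remove furniture:",
    layout,
  ].join("\n");
}

// Hypothetical wrappers around the vision and image models.
type AnalyzeRoom = (photo: Uint8Array) => Promise<RoomObject[]>;
type GenerateImage = (prompt: string) => Promise<Uint8Array>;

async function redesignRoom(
  photo: Uint8Array,
  brief: DesignBrief,
  analyzeRoom: AnalyzeRoom,
  generateImage: GenerateImage,
): Promise<Uint8Array> {
  const inventory = await analyzeRoom(photo); // step 1: inventory objects and positions
  const prompt = buildRedesignPrompt(inventory, brief); // step 2: layout-aware prompt
  return generateImage(prompt);
}
```

Injecting the two model calls keeps the layout logic testable without network access.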
Cloud Run Deployment: We had to containerize the Next.js application using standalone output and make sure environment variables flowed correctly through the Docker build process.
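One way to catch a variable that never made it into the container is a small check at server startup. The sketch below is an assumption about how such a check could look, not the project's code, and the variable names are placeholders. Note that Next.js inlines `NEXT_PUBLIC_` variables at build time, so those must be present during `docker build`, not only at runtime.

```typescript
// Placeholder names; substitute whatever the deployment actually requires.
const REQUIRED_ENV_VARS: readonly string[] = ["GEMINI_API_KEY", "SERPAPI_API_KEY"];

// Returns the names of required variables that are unset or blank.
// The environment is passed in so the function can be tested without
// touching the real process environment.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

At startup the server could call `findMissingEnvVars(process.env)` and fail fast with the list of missing names, which tends to be easier to debug than a failed API call deep inside a voice session.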
Accomplishments We're Proud Of
Truly Voice-First Experience: The entire design consultation happens through natural speech, with no buttons to press and no typing required.
Autonomous AI Agent: The AI independently decides when to analyze the room, generate designs, search products, or compile a shopping list.
Real Products, Real Prices: Recommendations come from actual Indian retailers with INR pricing and direct purchase links, not hallucinated suggestions.
Production Deployment: The application is live on Google Cloud Run, with a one-command automated redeployment script.
What We Learned
- Gemini's Live API with native audio generation feels far more natural than traditional text-to-speech pipelines. Users can even interrupt the AI mid-sentence, and it adapts smoothly.
- The two-step generation approach (vision analysis → image generation) is essential for preserving room layout. Without the analysis step, image generation tends to completely reimagine the room.
- Autonomous tool calling through voice interaction creates a much more natural conversational experience.
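- Next.js standalone output mode is crucial when deploying to Docker/Cloud Run: it traces the files and dependencies the server needs and copies them into a self-contained output folder, so the container image includes everything required to run.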
What's Next for J's Room AI
- Regional language support — Hindi, Tamil, Telugu for broader accessibility across India
- WhatsApp integration — Send a room photo and receive voice design advice
- AR overlays — View generated designs overlaid on the live camera feed
- Retailer partnerships — Direct “Add to Cart” integration with Flipkart, Amazon India, and Pepperfry
- Multi-room projects — Design entire homes with consistent styles across rooms
Built With
- docker
- gemini-live-api
- google-cloud-run
- google-gemini-2.5-flash
- imagen-4.0
- indexeddb
- next.js-14
- react-18
- serpapi
- tailwind-css
- typescript
- unsplash-api
- web-audio-api