Inspiration
The architectural design industry has long been restricted by traditional CAD software that requires years of training and expensive licenses. This creates a significant accessibility barrier for homeowners, small businesses, and aspiring designers who want to participate in shaping their own spaces.
We watched friends struggle to communicate renovation ideas to architects, often relying on rough sketches that failed to convey their actual vision. That frustration led to a simple but powerful question:
What if you could design a building as naturally as describing it to a friend?
That question became DesignForge, a platform that democratizes architectural design by accepting input in its most natural forms: voice, sketches, photos, and text.
Inspired by the rapid evolution of multimodal AI and by the combination of Azure's cognitive services with generative AI, we envisioned a future where a contractor could photograph a plot of land, describe an idea aloud, and receive a professional 3D model within seconds, without opening traditional CAD software.
What it does
DesignForge is an AI-powered architectural design platform that transforms natural human input into professional 3D CAD models using a structured multi-agent system.
Core Capabilities
Multimodal Input Processing
- Text-to-CAD: Natural language descriptions into structured 3D buildings
- Sketch-to-CAD: Hand-drawn floor plans into spatially accurate models
- Photo-to-CAD: Images or layouts into architectural structures
- Voice-to-CAD: Speech transcription into real-time 3D generation
AI-Powered Generation Pipeline
Our system uses a 4-agent orchestration architecture inspired by real architectural workflows:
- Interpreter Agent: Extracts spatial intent and constraints from multimodal input
- Architect Agent: Designs dimensionally accurate 3D structures
- Validator Agent: Enforces structural and logical constraints
- Refinement Agent: Optimizes layout aesthetics and usability
Because each agent has a narrow task and the Validator Agent checks the Architect Agent's output, this modular design reduces hallucinated geometry and improves reliability.
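The four-stage flow can be sketched as a chain of typed stages, where each agent receives only the previous agent's output. This is an illustration under stated assumptions: the `DesignSpec`, `Room`, and `Agent` shapes are hypothetical, and the stub agents below stand in for the model calls the real system would make.

```typescript
// Hypothetical data contract passed between agents (not the actual DesignForge schema).
interface Room {
  name: string;
  widthM: number;
  depthM: number;
}

interface DesignSpec {
  rooms: Room[];
  notes: string[];
}

// Each agent is an async function from one spec to the next.
type Agent = (spec: DesignSpec) => Promise<DesignSpec>;

// Runs the agents in order; each stage sees only the previous stage's output.
async function runPipeline(input: DesignSpec, agents: Agent[]): Promise<DesignSpec> {
  let spec = input;
  for (const agent of agents) {
    spec = await agent(spec);
  }
  return spec;
}

// Stub agents standing in for model calls (assumptions for illustration only).
const interpreter: Agent = async (spec) => ({
  ...spec,
  notes: [...spec.notes, "interpreted"],
});

const architect: Agent = async (spec) => ({
  ...spec,
  // Propose a default room when the input contains none.
  rooms: spec.rooms.length > 0 ? spec.rooms : [{ name: "living room", widthM: 5, depthM: 4 }],
  notes: [...spec.notes, "designed"],
});

const validator: Agent = async (spec) => ({
  ...spec,
  // Drop rooms with non-positive dimensions.
  rooms: spec.rooms.filter((r) => r.widthM > 0 && r.depthM > 0),
  notes: [...spec.notes, "validated"],
});

const refiner: Agent = async (spec) => ({
  ...spec,
  notes: [...spec.notes, "refined"],
});
```

Calling `runPipeline(spec, [interpreter, architect, validator, refiner])` resolves to the spec produced by the last stage. Keeping the contract in one shared type is one way to make each handoff explicit.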
AI Interior Designer
DesignForge includes an automated interior designer capable of generating stylistically consistent interiors in:
- Modern
- Industrial
- Minimalist
Furniture placement is optimized using spatial scoring:
$$ P^{*} = \arg\max_{P} \left( A_{accessibility}(P) + A_{aesthetics}(P) - C_{collision}(P) \right) $$
where \(P\) ranges over candidate furniture placements, \(A_{accessibility}\) measures movement efficiency, \(A_{aesthetics}\) evaluates visual balance, and \(C_{collision}\) penalizes object overlap.
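As a minimal sketch of this scoring rule, the code below picks the best of several candidate placements whose component scores have already been computed. The `Placement` shape and the example numbers are hypothetical; how each component is actually measured is not covered here.

```typescript
// Hypothetical candidate placement with precomputed component scores.
interface Placement {
  id: string;
  accessibility: number; // A_accessibility: higher is better
  aesthetics: number;    // A_aesthetics: higher is better
  collision: number;     // C_collision: overlap penalty, higher is worse
}

// Score used by the arg max: accessibility + aesthetics - collision.
function score(p: Placement): number {
  return p.accessibility + p.aesthetics - p.collision;
}

// Returns the highest-scoring candidate, or undefined for an empty list.
function bestPlacement(candidates: Placement[]): Placement | undefined {
  let best: Placement | undefined;
  for (const c of candidates) {
    if (best === undefined || score(c) > score(best)) {
      best = c;
    }
  }
  return best;
}

// Illustrative candidates (made-up numbers).
const sofaCandidates: Placement[] = [
  { id: "sofa-north-wall", accessibility: 0.8, aesthetics: 0.6, collision: 0.0 },
  { id: "sofa-center", accessibility: 0.4, aesthetics: 0.9, collision: 0.5 },
];
```

Here `bestPlacement(sofaCandidates)` selects `sofa-north-wall`, because the centered sofa's aesthetic gain is outweighed by its collision penalty.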
AR Visualization
- GLB export for AR compatibility
- QR-based mobile viewing
- Real-world scale projection
- First-person walkthrough mode
Users can visualize their generated design directly within their physical space.
Real-Time Collaboration
- Multi-user project management
- Version control and history tracking
- Team invitations via email
- Project-level discussions
- Activity logging and auditing
How we built it
Frontend Stack
- Next.js 15 (App Router) with React 19 and TypeScript
- Three.js with React Three Fiber for rendering
- Shadcn/UI component system
- React Hook Form with Zod validation
- Framer Motion for animations
Backend Architecture
Serverless APIs deployed on Vercel:
- /api/cad-generator
- /api/multimodal-processor
- /api/speech-to-text
- /api/interior-design
- /api/projects/*
Database layer:
- Prisma ORM
- PostgreSQL (Supabase)
- Clerk JWT authentication
AI/ML Pipeline
Input Analysis
- Azure Computer Vision for spatial extraction
- Azure Speech Services (>95% transcription accuracy)
- Custom sketch geometry parser
Generation
- GPT-4o produces structured CAD JSON
- Temperature tuning to balance consistency against creativity
Validation
- Hard spatial constraints (minimum room sizes, window ratios)
- Structural sanity checks
- Automatic correction of invalid geometries
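A validation pass of this kind might look like the sketch below, which checks one room's dimensions and records every correction it makes. The `RoomDims` shape, the 2 m minimum, and the choice to flip negative values rather than reject them are all assumptions for illustration, not the project's actual rules or any building-code value.

```typescript
// Hypothetical room shape emitted by the generation step.
interface RoomDims {
  name: string;
  widthM: number;
  depthM: number;
}

interface ValidationResult {
  room: RoomDims;
  corrections: string[];
}

// Assumed minimum side length in meters (illustrative only).
const MIN_SIDE_M = 2;

// Treats non-finite values as missing, flips negative values, then clamps to the minimum.
function correctSide(label: string, value: number, corrections: string[]): number {
  const magnitude = Number.isFinite(value) ? Math.abs(value) : 0;
  const corrected = Math.max(magnitude, MIN_SIDE_M);
  if (corrected !== value) {
    corrections.push(`${label}: ${value} -> ${corrected}`);
  }
  return corrected;
}

// Returns a corrected copy of the room plus a log of what changed.
function validateRoom(room: RoomDims): ValidationResult {
  const corrections: string[] = [];
  const widthM = correctSide("widthM", room.widthM, corrections);
  const depthM = correctSide("depthM", room.depthM, corrections);
  return { room: { ...room, widthM, depthM }, corrections };
}
```

Returning the correction log alongside the fixed geometry keeps automatic repairs auditable instead of silent.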
Challenges we ran into
Multimodal Input Fusion
Combining sketch geometry, photo analysis, and voice transcripts into a single coherent AI prompt required a weighted fusion strategy. We prioritized explicit geometry (sketch) over inferred layouts (photo) and descriptive modifiers (speech).
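One simple way to express that priority order is to keep, for each field, the observation from the highest-ranked source. The sketch below assumes this per-field rule; only the ordering (sketch over photo over speech) comes from the description above, while the numeric weights, the `Observation` shape, and the field names are hypothetical.

```typescript
type Source = "sketch" | "photo" | "speech";

// Assumed priority weights; only their relative order is taken from the text.
const SOURCE_WEIGHT: Record<Source, number> = { sketch: 3, photo: 2, speech: 1 };

// Hypothetical normalized observation extracted from one input modality.
interface Observation {
  source: Source;
  field: string; // e.g. "kitchen.widthM"
  value: string;
}

// For each field, keeps the observation from the highest-weighted source.
function fuse(observations: Observation[]): Map<string, Observation> {
  const fused = new Map<string, Observation>();
  for (const obs of observations) {
    const current = fused.get(obs.field);
    if (current === undefined || SOURCE_WEIGHT[obs.source] > SOURCE_WEIGHT[current.source]) {
      fused.set(obs.field, obs);
    }
  }
  return fused;
}
```

With a photo reporting `kitchen.widthM` as `4.2` and a sketch reporting `4.0`, the fused result keeps the sketch value, while a style mentioned only in speech passes through unchanged.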
Speech Format Mismatch
Most browsers record microphone audio as WebM/Opus, while our Azure Speech integration expected WAV. We implemented client-side WAV conversion using the Web Audio API, which enabled a seamless speech-to-CAD flow.
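The encoding half of that conversion can be sketched as a pure function that wraps mono PCM samples in a 44-byte WAV header. This is not the project's actual code: it assumes the samples were already decoded in the browser (for example with `AudioContext.decodeAudioData` followed by `AudioBuffer.getChannelData(0)`), and the mono, 16-bit layout is an assumption about what the speech endpoint accepts.

```typescript
// Encodes mono Float32 samples (range -1..1) as a 16-bit PCM WAV file.
function encodeWav(samples: Float32Array, sampleRate: number): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(44 + i * bytesPerSample, scaled, true);
  }
  return buffer;
}
```

The returned `ArrayBuffer` can be wrapped in a `Blob` with type `audio/wav` before upload. Resampling to the rate the speech service expects is a separate step not shown here.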
Serverless Database Connection Limits
Vercel functions created excessive database connections, causing failures. By enabling PgBouncer pooling with connection limits, we reduced database errors from approximately 40% to under 0.1%.
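With Prisma, pooled connections are usually configured through the connection string: Prisma documents `pgbouncer=true` and `connection_limit` as URL parameters. The helper below is a hypothetical sketch of building such a URL. The host, port, and credentials in the example are placeholders, and a limit of 1 per function instance is an assumption rather than the project's confirmed setting.

```typescript
// Appends Prisma's PgBouncer-related query parameters to a PostgreSQL connection URL.
// Does not deduplicate parameters that are already present.
function withPooling(databaseUrl: string, connectionLimit: number): string {
  const separator = databaseUrl.includes("?") ? "&" : "?";
  return `${databaseUrl}${separator}pgbouncer=true&connection_limit=${connectionLimit}`;
}

// Placeholder URL for illustration only.
const pooledUrl = withPooling("postgresql://user:pass@db.example.com:6543/postgres", 1);
```

A low per-instance limit matters in serverless deployments because many short-lived function instances can otherwise each open their own set of connections.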
AI Hallucination
The model occasionally generated impossible structures (e.g., negative dimensions). The Validator Agent enforced strict geometric constraints and automatically corrected invalid outputs.
Three.js Performance
Large models caused frame drops below 30 FPS. We implemented geometry merging, level-of-detail rendering, and frustum culling to achieve stable 60 FPS even with complex models.
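The level-of-detail idea can be illustrated without any rendering library: given the camera distance, pick the coarsest mesh variant whose threshold has been reached. The sketch below is a plain TypeScript stand-in for that selection logic, not the Three.js API; the `LodLevel` shape and the distance thresholds are hypothetical.

```typescript
// Hypothetical description of one detail level for a model.
interface LodLevel {
  name: string;
  minDistance: number; // camera distance at which this level becomes active
}

// Picks the last level (by ascending distance) whose threshold the camera has reached.
function pickLodLevel(levels: LodLevel[], cameraDistance: number): LodLevel | undefined {
  const sorted = [...levels].sort((a, b) => a.minDistance - b.minDistance);
  let chosen: LodLevel | undefined;
  for (const level of sorted) {
    if (cameraDistance >= level.minDistance) {
      chosen = level;
    }
  }
  return chosen;
}

// Illustrative thresholds in scene units (made-up values).
const buildingLevels: LodLevel[] = [
  { name: "high", minDistance: 0 },
  { name: "medium", minDistance: 20 },
  { name: "low", minDistance: 60 },
];
```

Nearby objects keep full geometry while distant ones fall back to cheaper meshes, which is where most of the frame-time savings in this technique come from.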
Accomplishments that we're proud of
- A fully integrated multimodal CAD pipeline
- A structured 4-agent AI orchestration system
- Real-time streaming generation with progress tracking
- AR-ready export pipeline
- Production-grade serverless architecture
- Version-controlled collaborative design environment
Performance Metrics
| Metric | Value |
|---|---|
| Average CAD generation | 8.5s |
| Photo-to-CAD | 12s |
| Sketch analysis accuracy | 92% |
| Speech transcription accuracy | 96% |
| Database latency (p95) | <100ms |
| Render performance | 60 FPS |
What we learned
- Multi-agent systems require strict contracts to prevent context drift
- Validation layers are essential in generative architectural systems
- Serverless architectures demand intelligent connection pooling
- Showing real-time progress improves perceived performance more than raw speed
- Type-safe APIs significantly reduce runtime failures
What's next for DesignForge
Short-term
- Smart cost estimation engine
- Enhanced building code compliance checks
- Expanded furniture and asset library
Medium-term
- Structural load simulation
- Energy efficiency optimization
- AI-generated materials
Long-term
- BIM export (Revit, IFC compatibility)
- Plugin ecosystem
- VR walkthrough integration
- AI-powered construction project management
Our Vision
DesignForge represents a future where architectural creativity is accessible to everyone — not just trained CAD professionals.
By handling the complexity of technical modeling, we allow users to focus on imagination and intent.
We believe democratizing design tools amplifies human creativity rather than replacing it.
DesignForge is our step toward making architectural design natural, intuitive, and universally accessible.
Built With
- azure-computer-vision
- azure-openai-(gpt-4o)
- azure-speech-services
- clerk-authentication
- cognitive
- framer-motion
- glb/obj/stl-exporters
- lucide-icons
- microsoft
- next.js-15
- pgbouncer
- postgresql-(supabase)
- prisma-orm
- qr-code-generator
- radix-ui
- react-19
- react-hook-form
- react-three-fiber
- services
- shadcn/ui
- speech
- tailwind-css
- three.js
- typescript
- vercel
- web-audio-api
- zod