Inspiration

The architectural design industry has long been restricted by traditional CAD software that requires years of training and expensive licenses. This creates a significant accessibility barrier for homeowners, small businesses, and aspiring designers who want to participate in shaping their own spaces.

We watched friends struggle to communicate renovation ideas to architects, often relying on rough sketches that failed to convey their actual vision. That frustration led to a simple but powerful question:

What if you could design a building as naturally as describing it to a friend?

That question became DesignForge — a platform that democratizes architectural design by accepting input in the most natural forms: voice, sketches, photos, and text.

Inspired by the rapid evolution of multimodal AI systems and the power of Azure's cognitive services combined with generative AI, we envisioned a future where a contractor could photograph a plot of land, describe their idea verbally, and instantly receive a professional 3D model — without opening traditional CAD software.

What it does

DesignForge is an AI-powered architectural design platform that transforms natural human input into professional 3D CAD models using a structured multi-agent system.

Core Capabilities

Multimodal Input Processing

  • Text-to-CAD: Natural language descriptions into structured 3D buildings
  • Sketch-to-CAD: Hand-drawn floor plans into spatially accurate models
  • Photo-to-CAD: Images or layouts into architectural structures
  • Voice-to-CAD: Speech transcription into real-time 3D generation

AI-Powered Generation Pipeline

Our system uses a 4-agent orchestration architecture inspired by real architectural workflows:

  • Interpreter Agent: Extracts spatial intent and constraints from multimodal input
  • Architect Agent: Designs dimensionally accurate 3D structures
  • Validator Agent: Enforces structural and logical constraints
  • Refinement Agent: Optimizes layout aesthetics and usability

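This modular design gives each agent a narrow task and lets the Validator Agent catch invalid output, such as negative dimensions, before it reaches the user. In practice this reduces hallucinations in the final model and improves reliability.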
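
As a rough illustration of the orchestration described above, the four agents can be thought of as functions applied in sequence to a shared design object. This is a minimal sketch, not the project's actual code: the `Design` and `Room` shapes, the `runPipeline` helper, and the stub agents are all hypothetical, and the real agents would call GPT-4o instead of returning fixed values.

```typescript
// Hypothetical shapes; the project's real CAD JSON schema is not shown here.
interface Room { name: string; width: number; depth: number; height: number }
interface Design { rooms: Room[]; notes: string[] }

// Each agent takes the current design and returns an updated copy.
type Agent = (design: Design) => Promise<Design>;

// Stub agents standing in for the LLM-backed steps.
const interpreter: Agent = async (d) => ({ ...d, notes: [...d.notes, "interpreted"] });

// The stub architect deliberately emits a negative depth to mimic a hallucination.
const architect: Agent = async (d) => ({
  rooms: [...d.rooms, { name: "living room", width: 5, depth: -4, height: 2.7 }],
  notes: [...d.notes, "designed"],
});

// The stub validator repairs impossible values by taking absolute dimensions.
const validator: Agent = async (d) => ({
  rooms: d.rooms.map((r) => ({
    ...r,
    width: Math.abs(r.width),
    depth: Math.abs(r.depth),
    height: Math.abs(r.height),
  })),
  notes: [...d.notes, "validated"],
});

const refiner: Agent = async (d) => ({ ...d, notes: [...d.notes, "refined"] });

// Run the agents in order, passing each output to the next agent.
async function runPipeline(agents: Agent[], initial: Design): Promise<Design> {
  let current = initial;
  for (const agent of agents) {
    current = await agent(current);
  }
  return current;
}
```

Calling `runPipeline([interpreter, architect, validator, refiner], { rooms: [], notes: [] })` walks the design through all four stages. Keeping every agent behind the same `Agent` signature is one way to enforce a strict contract between stages.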

AI Interior Designer

DesignForge includes an automated interior designer capable of generating stylistically consistent interiors in:

  • Modern
  • Industrial
  • Minimalist

Furniture placement is optimized using spatial scoring:

$$ P^{*} = \arg\max_{P} \left( A_{accessibility}(P) + A_{aesthetics}(P) - C_{collision}(P) \right) $$

where \(P\) ranges over candidate placements, \(A_{accessibility}\) measures movement efficiency, \(A_{aesthetics}\) evaluates visual balance, and \(C_{collision}\) penalizes object overlap.
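
The scoring rule above can be sketched as a search over candidate placements. The example below is an assumption-heavy illustration: the `Rect` and `Candidate` types, the use of 2D overlap area as the collision penalty, and the precomputed accessibility and aesthetics values are all hypothetical stand-ins for whatever the real scorer computes.

```typescript
// Axis-aligned footprint on the floor plan, in metres (hypothetical type).
interface Rect { x: number; y: number; w: number; h: number }

// A candidate placement with precomputed accessibility and aesthetics terms.
interface Candidate { id: string; rect: Rect; accessibility: number; aesthetics: number }

// Overlapping area of two rectangles; 0 when they do not intersect.
function overlapArea(a: Rect, b: Rect): number {
  const dx = Math.min(a.x + a.w, b.x + b.w) - Math.max(a.x, b.x);
  const dy = Math.min(a.y + a.h, b.y + b.h) - Math.max(a.y, b.y);
  return dx > 0 && dy > 0 ? dx * dy : 0;
}

// Score = accessibility + aesthetics - collision, with collision taken
// here as the total overlap area with existing objects.
function score(c: Candidate, obstacles: Rect[]): number {
  const collision = obstacles.reduce((sum, o) => sum + overlapArea(c.rect, o), 0);
  return c.accessibility + c.aesthetics - collision;
}

// Return the highest-scoring candidate, or undefined for an empty list.
function bestPlacement(candidates: Candidate[], obstacles: Rect[]): Candidate | undefined {
  let best: Candidate | undefined;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const s = score(c, obstacles);
    if (s > bestScore) {
      best = c;
      bestScore = s;
    }
  }
  return best;
}
```

A candidate that looks attractive on its own can still lose once its overlap with an existing object is subtracted, which is the behaviour the collision term is meant to produce.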

AR Visualization

  • GLB export for AR compatibility
  • QR-based mobile viewing
  • Real-world scale projection
  • First-person walkthrough mode

Users can visualize their generated design directly within their physical space.

Real-Time Collaboration

  • Multi-user project management
  • Version control and history tracking
  • Team invitations via email
  • Project-level discussions
  • Activity logging and auditing

How we built it

Frontend Stack

  • Next.js 15 (App Router) with React 19 and TypeScript
  • Three.js with React Three Fiber for rendering
  • Shadcn/UI component system
  • React Hook Form with Zod validation
  • Framer Motion for animations

Backend Architecture

Serverless APIs deployed on Vercel:

  • /api/cad-generator
  • /api/multimodal-processor
  • /api/speech-to-text
  • /api/interior-design
  • /api/projects/*

Database layer:

  • Prisma ORM
  • PostgreSQL (Supabase)
  • Clerk JWT authentication

AI/ML Pipeline

Input Analysis

  • Azure Computer Vision for spatial extraction
  • Azure Speech Services (>95% transcription accuracy)
  • Custom sketch geometry parser

Generation

  • GPT-4o produces structured CAD JSON
  • Temperature tuning for consistency and creativity

Validation

  • Hard spatial constraints (minimum room sizes, window ratios)
  • Structural sanity checks
  • Automatic correction of invalid geometries
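
The validation steps listed above could look roughly like the following. This is a sketch under stated assumptions: the `Room` shape, the threshold constants, and the correction strategies (flipping signs, uniform scaling, raising window area) are hypothetical choices for illustration, and the thresholds are not real building-code values.

```typescript
// Hypothetical room record; dimensions in metres, window area in square metres.
interface Room { name: string; width: number; depth: number; windowArea: number }
interface ValidationResult { room: Room; corrections: string[] }

// Assumed thresholds for illustration only.
const MIN_ROOM_AREA_M2 = 6;
const MIN_WINDOW_RATIO = 0.1;

function validateRoom(input: Room): ValidationResult {
  const room: Room = { ...input };
  const corrections: string[] = [];

  // Negative dimensions are treated as sign errors and flipped.
  if (room.width < 0 || room.depth < 0) {
    room.width = Math.abs(room.width);
    room.depth = Math.abs(room.depth);
    corrections.push("flipped negative dimension");
  }

  // Rooms below the minimum area are scaled up uniformly.
  const area = room.width * room.depth;
  if (area < MIN_ROOM_AREA_M2) {
    if (area === 0) {
      room.width = Math.sqrt(MIN_ROOM_AREA_M2);
      room.depth = Math.sqrt(MIN_ROOM_AREA_M2);
    } else {
      const scale = Math.sqrt(MIN_ROOM_AREA_M2 / area);
      room.width *= scale;
      room.depth *= scale;
    }
    corrections.push("enlarged room to minimum area");
  }

  // Window area is raised to the minimum share of the floor area.
  const minWindow = MIN_WINDOW_RATIO * room.width * room.depth;
  if (room.windowArea < minWindow) {
    room.windowArea = minWindow;
    corrections.push("increased window area");
  }

  return { room, corrections };
}
```

Returning the list of corrections alongside the repaired room makes it possible to show the user what was changed, or to feed the corrections back to the generating agent.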

Challenges we ran into

Multimodal Input Fusion

Combining sketch geometry, photo analysis, and voice transcripts into a single coherent AI prompt required a weighted fusion strategy. We prioritized explicit geometry (sketch) over inferred layouts (photo) and descriptive modifiers (speech).
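
One simple way to express such a fusion strategy is a weighted average over the estimates each input source provides for the same field. The sketch below is only an illustration: the `Estimate` type, the field names, and the weights (sketch 0.6, photo 0.2, speech 0.2) are assumptions, chosen so that sketch input dominates as described above.

```typescript
type Source = "sketch" | "photo" | "speech";

// One numeric estimate for a named field, e.g. "kitchen.width" in metres.
interface Estimate { source: Source; field: string; value: number }

// Assumed weights: explicit sketch geometry outranks the other two sources.
const WEIGHTS: Record<Source, number> = { sketch: 0.6, photo: 0.2, speech: 0.2 };

// Combine estimates per field using a weight-normalized average, so a field
// reported by a single source keeps that source's value unchanged.
function fuse(estimates: Estimate[]): Map<string, number> {
  const sums = new Map<string, { weighted: number; weight: number }>();
  for (const e of estimates) {
    const w = WEIGHTS[e.source];
    const acc = sums.get(e.field) ?? { weighted: 0, weight: 0 };
    acc.weighted += w * e.value;
    acc.weight += w;
    sums.set(e.field, acc);
  }
  const fused = new Map<string, number>();
  sums.forEach((acc, field) => {
    fused.set(field, acc.weighted / acc.weight);
  });
  return fused;
}
```

With a sketch estimate of 4 m and a photo estimate of 5 m for the same wall, the fused value lands at 4.25 m, much closer to the sketch.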

Speech Format Mismatch

Most browsers record WebM/Opus through MediaRecorder, while our Azure Speech integration expects WAV (PCM) audio. We implemented client-side WAV conversion using the Web Audio API, enabling a seamless speech-to-CAD flow.
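
The core of a client-side WAV conversion is writing a 44-byte RIFF header followed by 16-bit PCM samples. The function below is a minimal sketch of that step, assuming mono audio already available as a `Float32Array`; in a browser those samples would typically come from `AudioContext.decodeAudioData` and `AudioBuffer.getChannelData(0)`. The name `encodeWav` is hypothetical and this is not necessarily how the project implements it.

```typescript
// Encode mono float samples in [-1, 1] as a 16-bit PCM WAV file.
function encodeWav(samples: Float32Array, sampleRate: number): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample and scale it to the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

The resulting `ArrayBuffer` can be wrapped in a `Blob` with type `audio/wav` and posted to the speech endpoint.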

Serverless Database Connection Limits

Vercel functions created excessive database connections, causing failures. By enabling PgBouncer pooling with connection limits, we reduced database errors from approximately 40% to under 0.1%.
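
With Prisma, pooling through PgBouncer is usually switched on in the connection string via the `pgbouncer=true` and `connection_limit` query parameters. The helper below is a small sketch of building such a URL; the function name `withPooling` and the example host, port, and credentials are placeholders, not the project's actual configuration.

```typescript
// Append Prisma's PgBouncer-related query parameters to a connection string.
function withPooling(databaseUrl: string, connectionLimit: number): string {
  const separator = databaseUrl.indexOf("?") === -1 ? "?" : "&";
  return `${databaseUrl}${separator}pgbouncer=true&connection_limit=${connectionLimit}`;
}

// Placeholder URL for illustration; real credentials belong in environment variables.
const pooledUrl = withPooling("postgresql://user:password@db.example.com:6543/postgres", 1);
```

A limit of 1 per function instance is a common choice for serverless deployments, since each invocation handles one request at a time and the pooler multiplexes connections across instances.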

AI Hallucination

The model occasionally generated impossible structures (e.g., negative dimensions). The Validator Agent enforced strict geometric constraints and automatically corrected invalid outputs.

Three.js Performance

Large models caused frame drops below 30 FPS. We implemented geometry merging, level-of-detail rendering, and frustum culling to achieve stable 60 FPS even with complex models.
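
Level-of-detail rendering comes down to picking a cheaper mesh as the camera moves away. Three.js offers a built-in `LOD` object for this; the dependency-free sketch below only shows the selection logic, with hypothetical names (`LodLevel`, `pickLod`) and made-up distance thresholds.

```typescript
// A detail level that becomes active once the camera is at least minDistance away.
interface LodLevel { minDistance: number; label: string }

// Pick the level with the largest minDistance that does not exceed the camera distance.
// Levels may be passed in any order; returns undefined if none applies.
function pickLod(levels: LodLevel[], distance: number): LodLevel | undefined {
  let chosen: LodLevel | undefined;
  for (const level of levels) {
    const applies = level.minDistance <= distance;
    if (applies && (chosen === undefined || level.minDistance > chosen.minDistance)) {
      chosen = level;
    }
  }
  return chosen;
}

// Assumed thresholds in scene units, for illustration only.
const buildingLevels: LodLevel[] = [
  { minDistance: 0, label: "high" },
  { minDistance: 20, label: "medium" },
  { minDistance: 60, label: "low" },
];
```

For example, `pickLod(buildingLevels, 25)` selects the "medium" level, so distant buildings are drawn with fewer triangles while nearby ones keep full detail.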

Accomplishments that we're proud of

  • A fully integrated multimodal CAD pipeline
  • A structured 4-agent AI orchestration system
  • Real-time streaming generation with progress tracking
  • AR-ready export pipeline
  • Production-grade serverless architecture
  • Version-controlled collaborative design environment

Performance Metrics

| Metric | Value |
| --- | --- |
| Average CAD generation | 8.5 s |
| Photo-to-CAD | 12 s |
| Sketch analysis accuracy | 92% |
| Speech transcription accuracy | 96% |
| Database latency (p95) | < 100 ms |
| Render performance | 60 FPS |

What we learned

  • Multi-agent systems require strict contracts to prevent context drift
  • Validation layers are essential in generative architectural systems
  • Serverless architectures demand intelligent connection pooling
  • Showing real-time progress improves perceived performance more than raw speed
  • Type-safe APIs significantly reduce runtime failures

What's next for DesignForge

Short-term

  • Smart cost estimation engine
  • Enhanced building code compliance checks
  • Expanded furniture and asset library

Medium-term

  • Structural load simulation
  • Energy efficiency optimization
  • AI-generated materials

Long-term

  • BIM export (Revit, IFC compatibility)
  • Plugin ecosystem
  • VR walkthrough integration
  • AI-powered construction project management

Our Vision

DesignForge represents a future where architectural creativity is accessible to everyone — not just trained CAD professionals.

By handling the complexity of technical modeling, we allow users to focus on imagination and intent.

We believe democratizing design tools amplifies human creativity rather than replacing it.

DesignForge is our step toward making architectural design natural, intuitive, and universally accessible.

Built With

  • azure-computer-vision
  • azure-openai-(gpt-4o)
  • azure-speech-services
  • clerk-authentication
  • cognitive
  • framer-motion
  • glb/obj/stl-exporters
  • lucide-icons
  • microsoft
  • next.js-15
  • pgbouncer
  • postgresql-(supabase)
  • prisma-orm
  • qr-code-generator
  • radix-ui
  • react-19
  • react-hook-form
  • react-three-fiber
  • services
  • shadcn/ui
  • speech
  • tailwind-css
  • three.js
  • typescript
  • vercel
  • web-audio-api
  • zod