Inspiration

The genesis of CodeCanvas AI came from a universal frustration shared by developers and designers: the "Translation Gap." We realized that engineers spend countless hours manually translating whiteboard sketches, screenshots, and architecture diagrams into actual code. The creative process is fluid, but the implementation is rigid and tedious. We asked ourselves: what if the whiteboard itself was the IDE?

We wanted to build a bridge where a messy hand-drawn sketch could instantly become a production-ready React component, or where a system topology diagram could auto-generate its own Docker infrastructure. We were inspired by the multi-modal capabilities of Gemini 1.5 Pro and Gemini 3, seeing them not just as chatbots, but as visual reasoning engines capable of understanding intent from pixels.

What it does

CodeCanvas AI is a multi-modal development assistant that transforms visual inputs into executable code across the full stack:

  • Frontend Synthesis (UI Mode): Upload a wireframe or sketch, and it generates a fully styled React + Tailwind CSS interface with a live, interactive preview.
  • Visual Replica (Reverse Engineering): Upload a screenshot of any website (e.g., Airbnb, Spotify), and it reconstructs the code structure, creating a pixel-perfect React replica.
  • DevOps Architect Mode: Draw a system architecture diagram (boxes and arrows), and it generates the corresponding docker-compose.yml files and Mermaid graphs to spin up the infrastructure.
  • The Refinement Studio: A conversational agent that lets you iterate on the generated code using natural language (e.g., "Scale the worker service to 3 replicas" or "Change the brand color to Spotify Green") while maintaining the state of the project.

How we built it

We built CodeCanvas using a modern, edge-ready stack designed for speed and resilience.

The Core Brain (Gemini Models):

  • We used Gemini 3 Pro Vision for high-fidelity image analysis (understanding nuanced UI sketches).
  • We implemented a Fallback Mechanism that
automatically switches to Gemini 2.5 Flash if the primary model is overloaded (handling 503 Service Unavailable errors gracefully).
  • We used distinct "System Personas" for different modes. For example, the DevOps Agent is prompted to be a strict Infrastructure Engineer, while the Refinement Agent acts as a Frontend Specialist.

Frontend & Rendering:

  • Built with React 18 and Vite for a snappy developer experience.
  • We engineered a custom Sandboxed Preview Engine that renders the AI-generated code in real time within a secure iframe, allowing users to interact with their "dreamed" interface immediately.
  • State Management: We used React state to track the "conversation history" of the code, allowing the AI to understand context (e.g., knowing what "make it blue" refers to).

The "Sanitization Layer":

LLMs often wrap code in Markdown backticks or add conversational filler ("Here is your code!"). We wrote a robust parsing utility in geminiService.ts that regex-matches code blocks and strips JSON wrapping to ensure the output is always executable.

Challenges we ran into

  • The "Over-Polite" AI: Early versions of the model were too conversational. It would return JSON wrapped in text like "Sure, here is the JSON you asked for...", or wrap the code in a JSON array (["code"]), which broke our parsers and caused syntax errors. We had to implement a Smart Unwrapper to detect and extract pure code from mixed responses.
  • Handling Model Overloads: During testing, we encountered 503 Service Unavailable errors from the Gemini 3 API. Instead of letting the app crash, we built a robust error-handling layer that seamlessly downgrades the request to the Gemini 2.5 Flash model, ensuring the user always gets a result.
  • Visual Accuracy vs. Code Quality: There was a trade-off between making the code look exactly like the sketch and writing clean code.
We tuned our prompts to prioritize semantic HTML and accessibility (a11y) over pure pixel-matching.

Accomplishments that we're proud of

  • Context-Aware Refinement: We are incredibly proud of the Refinement Studio. It doesn't just regenerate the whole file; it understands the specific component you want to change. If you ask it to "Swap Postgres for MongoDB," it intelligently updates the environment variables (e.g., MONGO_URI), service definitions, and dependency links in the docker-compose.yml file without breaking the rest of the stack.
  • Resilient Architecture: Proving that our fallback system works in real time was a huge win. Seeing the console log switch from gemini-3 to gemini-2.5-flash automatically was a "eureka" moment for production readiness.
  • Visual Replica Accuracy: The model's ability to identify specific UI patterns (like recognizing a "Card" component or a "Hero Section") and map them to the correct Tailwind classes is remarkably accurate.

What we learned

  • Prompt Engineering is Logic Programming: We learned that writing a prompt is like writing a function. You need to define types, constraints, and edge cases (e.g., "Return ONLY raw code, no explanations").
  • The Power of Multi-Modality: Images are an incredibly high-bandwidth input method. Describing a UI in text takes paragraphs; sketching it takes seconds. AI bridges that compression gap.
  • Stateful AI Interactions: Maintaining a "memory" of the code's evolution is crucial. The AI needs to know the previous version of the code to make a meaningful edit.

What's next for CodeCanvas AI

  • Direct GitHub Integration: One-click push to create a repository from the generated code.
  • Figma Plugin: Bringing the CodeCanvas engine directly into design tools to streamline the designer-to-developer handoff.
  • Self-Healing Code: Implementing an agent that runs the generated code, detects runtime errors, and auto-corrects them before showing the result to the user.
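To make the "Sanitization Layer" and "Smart Unwrapper" ideas concrete, here is a minimal sketch of how such a utility could work. This is an illustration only, not the project's actual geminiService.ts code; the function name unwrapModelOutput and the exact regex are our own assumptions:

```typescript
// Hypothetical sketch of a "Smart Unwrapper": extract pure code from a
// model response that may contain conversational filler, Markdown fences,
// or a JSON array wrapper. Not the actual geminiService.ts implementation.
function unwrapModelOutput(raw: string): string {
  // Case 1: code inside a Markdown fence (```lang ... ```).
  const fence = raw.match(/```[a-zA-Z]*\n([\s\S]*?)```/);
  if (fence) return fence[1].trim();

  // Case 2: code wrapped in a JSON array like ["code"].
  const trimmed = raw.trim();
  if (trimmed.startsWith("[")) {
    try {
      const parsed = JSON.parse(trimmed);
      if (Array.isArray(parsed) && typeof parsed[0] === "string") {
        return parsed[0].trim();
      }
    } catch {
      // Not valid JSON after all; fall through and treat as plain text.
    }
  }

  // Case 3: assume the response is already raw code.
  return trimmed;
}
```

Feeding this a "Sure, here is your code!" reply that contains a fenced block yields only the fenced contents, so the preview engine never sees the filler.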
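The 503 fallback from Gemini 3 to Gemini 2.5 Flash could be sketched roughly as below. The callModel parameter stands in for the real Google Generative AI SDK request, and the "does the error mention 503" heuristic is an assumption for illustration, not the project's exact logic:

```typescript
// Hedged sketch of the primary-to-fallback model switch. `callModel` is a
// stand-in for the real SDK call; only overload-style (503) failures
// trigger the downgrade, so genuine bugs still surface as errors.
type ModelCall = (model: string, prompt: string) => Promise<string>;

async function generateWithFallback(
  callModel: ModelCall,
  prompt: string,
  primary = "gemini-3-pro-vision",
  fallback = "gemini-2.5-flash",
): Promise<string> {
  try {
    return await callModel(primary, prompt);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (!message.includes("503")) throw err; // not an overload: rethrow
    console.warn(`${primary} overloaded, retrying with ${fallback}`);
    return callModel(fallback, prompt);
  }
}
```

The console.warn line is what makes the gemini-3 to gemini-2.5-flash switch visible in the browser console during an outage.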

Built With

  • docker
  • docker-compose
  • express.js
  • fastapi
  • google-gemini-2.5-flash
  • google-gemini-3-pro-vision
  • google-generative-ai-sdk
  • javascript
  • lucide-react
  • mermaid.js
  • mongodb
  • postgresql
  • prisma
  • react-18
  • tailwind-css
  • typescript
  • vercel
  • vite