Variantlab
About the Project
Variantlab is a cutting-edge, real-time React and TSX component development environment that marries the power of AI with an intuitive visual canvas. It allows developers to create, edit, and manage UI components instantly, offering dynamic previews and AI-driven code generation. With the recent integration of the Gemini Live API, Variantlab now offers hands-free voice control, transforming how users interact with their development workflow.
Inspiration
The primary inspiration behind Variantlab was to bridge the gap between ideation and implementation in UI development. We wanted to create a tool that not only accelerated the prototyping process but also made it more natural and conversational. Drawing from the potential of large language models, we envisioned a world where developers could simply describe a component or even show an image, and the code would materialize, ready for immediate iteration. The introduction of voice control further pushes this boundary, aiming for a truly seamless, hands-free creative flow.
What it does
Variantlab offers a comprehensive set of features designed to enhance component development:
- Real-time Code Editor & Live Preview: Write React/TSX code in an integrated editor and see your components render instantly on the canvas.
- AI-Powered Code Generation: Leverage the Google Gemini API to generate or modify components using natural language prompts, supporting both text and image inputs.
- Canvas-based Component Management: Organize, move, and visualize all your components as interactive, draggable nodes on a flexible canvas.
- "Vary" for Rapid Iteration: Easily create variations (forks) of any component's state, allowing for quick experimentation with different design and code alternatives without losing previous work.
- Hands-Free Voice Control (New!): Control the entire application using natural voice commands via the Gemini Live API. This includes:
  - Creating new components.
  - Deleting components by their title.
  - Creating variations of existing components.
  - Opening chat or code panels for specific components.
  - Sending code modification prompts directly to the active chat.
  - Receiving spoken confirmations from the AI assistant.
- Virtual File System (VFS): Components are managed within an in-browser VFS, providing a familiar file structure for editing.
- On-the-fly Bundling: `esbuild-wasm` compiles and bundles your TSX code directly in the browser for instant feedback.
- Theming: Toggle between light and dark modes for a comfortable coding environment.
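The "Vary" feature above can be sketched as a simple fork of a component's state. This is an illustrative model only; the `ComponentNode` shape and `forkComponent` helper are assumptions, not Variantlab's actual API:

```typescript
// Hypothetical sketch: forking a component node so experiments never
// overwrite earlier work. The original node is left untouched.
interface ComponentNode {
  id: string;
  title: string;
  code: string;        // TSX source held in the virtual file system
  parentId?: string;   // link back to the node this variant was forked from
}

let nextId = 0;
const newId = () => `node-${nextId++}`;

/** Create an independent copy of a node; the source is not modified. */
function forkComponent(source: ComponentNode): ComponentNode {
  return {
    ...source,
    id: newId(),
    title: `${source.title} (variant)`,
    parentId: source.id,
  };
}

const button: ComponentNode = {
  id: newId(),
  title: "Button",
  code: "export const Button = () => <button>Go</button>;",
};
const variant = forkComponent(button);
// variant points back at button via parentId; editing variant.code
// leaves button.code intact, so no previous work is lost.
```

Keeping `parentId` on each fork is what makes a tree of variations cheap to walk when comparing alternatives.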
How we built it
Variantlab is built with a modern web stack and leverages powerful AI services:
- Frontend Framework: React and TypeScript for a robust and type-safe user interface.
- Styling: Tailwind CSS for utility-first styling, ensuring a clean and responsive design.
- AI Integration: The core AI capabilities are powered by the `@google/genai` SDK, specifically:
  - `ai.models.generateContent` for text-to-code and image-to-code generation.
  - `ai.live.connect` for real-time voice interaction, including audio input/output and function calling.
- In-browser Bundling: `esbuild-wasm` is utilized for client-side compilation of TSX files, enabling real-time code execution and previews within the browser sandbox. A custom `esbuild` plugin handles VFS resolution.
- 3D Graphics: The initial "Welcome Component" demonstrates 3D rendering using `Three.js` directly, without additional React wrappers, showcasing flexibility.
- Web Audio API: Essential for handling real-time audio input from the microphone and playing back AI-generated speech responses in the voice control feature.
- Canvas Interaction: Custom React hooks and state management handle node dragging, panning, zooming, and panel resizing.
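The core job of the custom VFS resolution mentioned above is mapping relative imports between in-memory files, the way an `esbuild` plugin's resolve hook would. A minimal sketch, assuming a `Map`-based VFS and an illustrative `resolveVfsImport` helper (not the project's actual code):

```typescript
// Hypothetical in-memory VFS: path -> file contents.
const vfs = new Map<string, string>([
  ["/App.tsx", "import { Button } from './components/Button';"],
  ["/components/Button.tsx", "export const Button = () => null;"],
]);

/** Resolve a relative import against the importing file's directory. */
function resolveVfsImport(importPath: string, importer: string): string | undefined {
  const dir = importer.slice(0, importer.lastIndexOf("/")); // directory of the importer
  const segments = [...dir.split("/"), ...importPath.split("/")];
  const resolved: string[] = [];
  for (const seg of segments) {
    if (seg === "" || seg === ".") continue; // skip empty and same-dir segments
    if (seg === "..") resolved.pop();        // step up one directory
    else resolved.push(seg);
  }
  const path = "/" + resolved.join("/");
  // Probe the path as-is, then with the extensions a bundler would try.
  for (const candidate of [path, `${path}.tsx`, `${path}.ts`]) {
    if (vfs.has(candidate)) return candidate;
  }
  return undefined;
}

// resolveVfsImport("./components/Button", "/App.tsx") → "/components/Button.tsx"
```

In a real plugin this logic would sit inside `onResolve`/`onLoad` callbacks so `esbuild-wasm` reads sources from the VFS instead of the network.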
Challenges we ran into
Developing Variantlab presented several exciting challenges:
- Real-time In-Browser Compilation: Integrating `esbuild-wasm` to work with a dynamic virtual file system and resolving module imports correctly within the browser environment was complex.
- Gemini Live API Integration: Managing the full lifecycle of a real-time audio session (connecting, streaming, receiving messages, handling disconnections) and ensuring smooth, low-latency audio playback was a significant undertaking.
- Robust Function Calling: Defining clear `FunctionDeclaration` objects and implementing the logic to parse Gemini's `FunctionCall` responses, execute corresponding application actions, and send tool responses back to the model required careful design.
- Synchronized State Management: Keeping the UI, VFS, chat history, and AI models in sync across various user interactions (typing, AI generation, undo/redo, voice commands) was crucial.
- Error Handling in Live Previews: Implementing a resilient `ErrorBoundary` and handling `esbuild` compilation errors gracefully to prevent the entire application from crashing.
- Natural Language to UI Actions: Translating ambiguous voice commands into precise application actions required thoughtful prompt engineering and function naming for the AI assistant.
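The function-calling round trip described above (parse the model's call, run an app action, return a tool response) can be sketched as a dispatch table. The `{ name, args }` shape mirrors the SDK's function-call payloads, but the handler names and result strings here are illustrative assumptions:

```typescript
// Hypothetical shape of a model-issued function call.
interface FunctionCall {
  name: string;
  args: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Map each declared tool name to the app action it triggers.
const handlers: Record<string, ToolHandler> = {
  create_component: (args) => `created "${args.title}"`,
  delete_component: (args) => `deleted "${args.title}"`,
  vary_component: (args) => `forked "${args.title}"`,
};

/** Dispatch a call and build the tool response to send back to the model. */
function handleFunctionCall(call: FunctionCall): { name: string; response: { result: string } } {
  const handler = handlers[call.name];
  const result = handler ? handler(call.args) : `unknown tool: ${call.name}`;
  return { name: call.name, response: { result } };
}

// handleFunctionCall({ name: "create_component", args: { title: "Navbar" } })
//   → { name: "create_component", response: { result: 'created "Navbar"' } }
```

Returning a result string even for unknown tool names keeps the conversation alive instead of silently dropping a model request.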
Accomplishments that we're proud of
We are particularly proud of:
- Seamless Voice Control: The ability to entirely control the application's core functionalities (component creation, deletion, modification, panel management) using natural voice commands, providing an unparalleled hands-free development experience.
- Iterative AI-Powered Development: The "Vary" feature combined with AI code generation allows for extremely fast iteration and exploration of design ideas.
- Interactive Canvas: A highly responsive and intuitive canvas where developers can visually arrange and manage their components.
- Robust Client-Side Tooling: Successfully integrating `esbuild-wasm` and `Three.js` to run complex development tools directly in the browser.
- Clean and Modern UI/UX: A thoughtfully designed interface that prioritizes developer experience and aesthetics.
What we learned
Through building Variantlab, we gained deep insights into:
- The Power of Multimodal AI: How integrating different AI capabilities (text generation, image understanding, real-time voice) can create truly novel and productive user experiences.
- Web Audio API and Real-time Streaming: The complexities and best practices for working with browser audio streams for low-latency, real-time communication.
- Effective Function Calling with LLMs: Strategies for defining clear tool interfaces and handling the back-and-forth between an LLM and an application to achieve complex control flows.
- Browser-based Development Environments: The opportunities and constraints of building powerful development tools that run entirely in the browser.
- User Intent vs. Explicit Command: Designing AI interactions that can interpret user intent while also allowing for precise, explicit commands when needed.
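One concrete step behind the audio-streaming lessons above: browsers capture microphone audio as Float32 samples in the range -1..1, while real-time speech APIs typically expect 16-bit PCM. A minimal conversion sketch (the helper name is ours; capture and transport are assumed to happen elsewhere):

```typescript
// Convert Web Audio Float32 samples (-1..1) to signed 16-bit PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to avoid overflow
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;        // scale to the int16 range
  }
  return out;
}

// floatTo16BitPCM(new Float32Array([0, 1, -1]))
//   → Int16Array [0, 32767, -32768]
```

Clamping before scaling matters: without it, a sample slightly above 1.0 wraps around to a large negative value and produces an audible click.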
What's next for Variantlab
We envision several exciting directions for Variantlab:
- Enhanced Conversational Context: Improving the AI's ability to maintain context across longer conversations, enabling more complex multi-step voice commands.
- Visual-to-Code Interaction: Allowing users to directly manipulate components on the canvas (e.g., resizing, repositioning) with the AI instantly updating the corresponding code.
- Component Library Integration: Tools to import and export components to common libraries (e.g., Storybook, Material UI, Shadcn UI).
- Code Refactoring & Optimization: Voice commands to refactor code, optimize performance, or ensure accessibility compliance.
- Collaborative Features: Enabling real-time, multi-user collaboration on the canvas and in code editing.
- More Advanced 3D Capabilities: Deeper integration of 3D modeling and animation features, potentially even AI-generated 3D assets.
- User-defined Tools: Allowing users to register their own custom functions or scripts that the AI can invoke, extending Variantlab's capabilities.