About SVG Mint 🎨

💡 Inspiration

The inspiration for SVG Mint came from a simple observation: while AI can generate stunning images, vector graphics remain the domain of specialized design tools that require significant expertise. I asked myself: "What if creating and editing vector graphics could be as simple as having a conversation?"

Traditional vector editors like Adobe Illustrator or Figma are powerful but have steep learning curves. Meanwhile, AI image generators produce raster outputs that can't be easily edited or scaled. I saw an opportunity to bridge this gap using Google's Gemini AI—creating a tool that generates semantic, structured SVG code that's both AI-editable and human-readable.

The key insight was treating SVG generation not as an image problem, but as a code generation problem. By leveraging Gemini's multimodal capabilities and code understanding, I could create vectors with meaningful IDs and hierarchical structure, enabling precise, intent-preserving edits.

What I Learned

Building SVG Mint taught me invaluable lessons across multiple domains:

1. Prompt Engineering for Structured Output

I discovered that generating valid, semantic SVG requires extremely precise prompting. Unlike natural language tasks, SVG generation demands:

Geometric precision: Coordinates, paths, and transforms must be mathematically correct
Semantic naming: Every element needs a meaningful id for future editing
Hierarchical thinking: Proper use of <g> groups and <defs> for reusability

My prompts evolved from simple descriptions to detailed construction blueprints. For example, instead of "create a balloon", I learned to prompt:

"Construct a balloon using a radial gradient. 1) Define a <radialGradient> centered at 30% 30%. 2) Create a vertical ellipse (rx=150, ry=180) with id='balloon-body'. 3) Add a triangle knot at the bottom. 4) Draw a bezier curve string flowing down in an S-shape."

This level of specificity increased my success rate from ~60% to ~95%.

2. Multi-Model Strategy

I learned that different AI models excel at different tasks:

Gemini 2.5 Pro: Superior understanding for complex compositions and image-to-SVG conversion

Implementing a dynamic model selection strategy based on prompt complexity improved both speed and quality. The system analyzes prompt length, image attachments, and edit scope to choose the optimal model.
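
The selection logic described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual selector: the `EditRequest` type, the tier labels `"flash"` and `"pro"`, and both thresholds are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical tier labels; real Gemini model IDs would be substituted here.
FAST_MODEL = "flash"
STRONG_MODEL = "pro"


@dataclass
class EditRequest:
    prompt: str
    has_image: bool = False
    # Number of elements the edit touches (assumed measure of "edit scope").
    affected_element_count: int = 0


def select_model(
    request: EditRequest,
    long_prompt_chars: int = 400,   # assumed threshold
    broad_edit_elements: int = 20,  # assumed threshold
) -> str:
    """Pick a model tier from prompt length, image attachment, and edit scope."""
    if request.has_image:
        # Image-to-SVG conversion goes to the stronger model.
        return STRONG_MODEL
    if len(request.prompt) > long_prompt_chars:
        # Long prompts are treated as complex compositions.
        return STRONG_MODEL
    if request.affected_element_count > broad_edit_elements:
        # Edits that touch many elements need more context handling.
        return STRONG_MODEL
    # Short, narrow requests favor speed.
    return FAST_MODEL
```

Keeping the thresholds as parameters makes it easy to tune the speed/quality trade-off without touching the routing logic.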

3. State Management Complexity

Managing application state for a real-time editor proved challenging. I needed to handle:

Undo/redo history: Storing SVG snapshots without memory bloat
Preview mode: Showing AI edits before committing
Multi-selection: Tracking locked, hidden, and selected elements
Persistence: LocalStorage for session recovery

I learned to use React's useEffect hooks strategically and implement efficient state updates to prevent unnecessary re-renders.
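
To illustrate the undo/redo point, here is a language-agnostic sketch of a bounded snapshot history, written in Python for brevity even though the real editor state lives in React. The `SvgHistory` class and its `max_snapshots` cap are hypothetical names for this example; the idea is simply that capping the stack and skipping no-op commits keeps memory use predictable.

```python
from collections import deque


class SvgHistory:
    """Bounded undo/redo stack of SVG snapshots."""

    def __init__(self, initial_svg: str, max_snapshots: int = 50):
        # deque(maxlen=...) silently drops the oldest snapshot when full.
        self._undo = deque([initial_svg], maxlen=max_snapshots)
        self._redo = []

    @property
    def current(self) -> str:
        return self._undo[-1]

    def commit(self, svg: str) -> None:
        if svg == self.current:
            return  # Skip no-op edits so identical snapshots are not stored twice.
        self._undo.append(svg)
        self._redo.clear()  # A new edit invalidates the redo branch.

    def undo(self) -> str:
        # Always keep at least one snapshot as the current state.
        if len(self._undo) > 1:
            self._redo.append(self._undo.pop())
        return self.current

    def redo(self) -> str:
        if self._redo:
            self._undo.append(self._redo.pop())
        return self.current
```

Only approved edits would call `commit`, which matches the preview-then-commit flow: discarded previews never enter the history.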

4. SVG Manipulation at Scale

Working with SVG as both a visual format and XML structure taught me:

DOM parsing: Using DOMParser to safely manipulate SVG elements
Transform mathematics: Computing rotation, scale, and translation matrices
Layer ordering: Understanding SVG's painter's model, where later elements in document order paint on top, in place of a CSS-style z-index
Export formats: Converting SVG to PNG, JPG, and PDF using canvas rendering

The mathematical foundation for transforms was particularly enlightening. For rotation around a center point $(c_x, c_y)$ by angle $\theta$:

$$ \text{transform} = \text{translate}(c_x, c_y) \cdot \text{rotate}(\theta) \cdot \text{translate}(-c_x, -c_y) $$
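
The composition can be checked numerically. The sketch below uses SVG's `matrix(a b c d e f)` convention and multiplies the three matrices in the order shown; the helper names (`multiply`, `rotate_about`, `apply`) are made up for this example.

```python
import math

# SVG matrix(a b c d e f) represents [[a, c, e], [b, d, f], [0, 0, 1]].
Matrix = tuple[float, float, float, float, float, float]


def multiply(m1: Matrix, m2: Matrix) -> Matrix:
    """Return m1 * m2, i.e. apply m2 to a point first, then m1."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + c1 * b2,
        b1 * a2 + d1 * b2,
        a1 * c2 + c1 * d2,
        b1 * c2 + d1 * d2,
        a1 * e2 + c1 * f2 + e1,
        b1 * e2 + d1 * f2 + f1,
    )


def translate(tx: float, ty: float) -> Matrix:
    return (1.0, 0.0, 0.0, 1.0, tx, ty)


def rotate(angle_deg: float) -> Matrix:
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta), -math.sin(theta), math.cos(theta), 0.0, 0.0)


def rotate_about(angle_deg: float, cx: float, cy: float) -> Matrix:
    """translate(cx, cy) * rotate(angle) * translate(-cx, -cy)."""
    return multiply(multiply(translate(cx, cy), rotate(angle_deg)), translate(-cx, -cy))


def apply(m: Matrix, x: float, y: float) -> tuple[float, float]:
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)
```

With a 90° rotation about (100, 100), the center maps to itself and the point (110, 100) maps to approximately (100, 110), which is a clockwise turn on screen because SVG's y-axis points down. This is the same transform that SVG's `rotate(angle cx cy)` shorthand expresses.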

🛠️ How I Built It

Architecture Overview

SVG Mint follows a clean client-server architecture:

┌─────────────────────────────────────────────────────────┐
│                     Frontend (React)                     │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐  │
│  │ SVG Preview │  │ Layer Tree   │  │ Manual Controls│  │
│  │  Component  │  │  Component   │  │   Component    │  │
│  └─────────────┘  └──────────────┘  └───────────────┘  │
│         │                 │                  │           │
│         └─────────────────┴──────────────────┘           │
│                           │                              │
│                    ┌──────▼──────┐                       │
│                    │   App.jsx   │                       │
│                    │ (State Mgmt)│                       │
│                    └──────┬──────┘                       │
└───────────────────────────┼──────────────────────────────┘
                            │ HTTP/REST
┌───────────────────────────▼──────────────────────────────┐
│                  Backend (FastAPI)                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │   Generate   │  │     Edit     │  │   Cleanup    │  │
│  │   Endpoint   │  │   Endpoint   │  │   Endpoint   │  │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  │
│         └──────────────────┴──────────────────┘          │
│                           │                              │
│                  ┌────────▼─────────┐                    │
│                  │  SVG Pipeline    │                    │
│                  │  (Validation &   │                    │
│                  │   Processing)    │                    │
│                  └────────┬─────────┘                    │
│                           │                              │
│                  ┌────────▼─────────┐                    │
│                  │  Gemini Client   │                    │
│                  │  (2.0 Flash +    │                    │
│                  │   1.5 Pro)       │                    │
│                  └──────────────────┘                    │
└──────────────────────────────────────────────────────────┘

Frontend Stack

React 19: Component-based UI with hooks for state management
Vite: Lightning-fast development server and build tool
Tailwind CSS 4: Utility-first styling with custom animations
Lucide React: Beautiful, consistent icon system
Axios: HTTP client for API communication

Backend Stack

FastAPI: Modern Python web framework with automatic OpenAPI docs
Google Gemini AI: Multimodal LLM for SVG generation and editing
Pydantic: Data validation and settings management
Python 3.11+: Type hints and modern async/await patterns

Key Technical Implementations

1. Semantic SVG Generation Pipeline

The generation pipeline ensures every SVG is structured and editable:

from typing import Optional

def generate_svg(prompt: str, image: Optional[bytes] = None):
    # 1. Construct detailed system prompt
    system_prompt = build_generation_prompt()

    # 2. Select optimal model (Flash vs Pro)
    model = select_model(prompt, has_image=bool(image))

    # 3. Generate SVG with Gemini
    response = gemini_client.generate(
        prompt=prompt,
        system=system_prompt,
        image=image,
        model=model
    )

    # 4. Validate and sanitize output
    svg_code = extract_svg(response.text)
    validated_svg = svg_guard.validate(svg_code)

    # 5. Ensure semantic structure
    semantic_svg = ensure_ids_and_groups(validated_svg)

    return semantic_svg
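
Step 4 depends on pulling the SVG out of a model response that may wrap it in prose or a Markdown fence. A minimal sketch of what an `extract_svg` helper might look like is shown here; the regex approach is an assumption for illustration, not necessarily how the project implements it.

```python
import re

# Matches from the first <svg ...> to the first </svg>, across newlines.
# Limitation: a nested <svg> inside the root would cut the match short.
_SVG_PATTERN = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(response_text: str) -> str:
    """Return the first <svg> element found in a model response."""
    match = _SVG_PATTERN.search(response_text)
    if match is None:
        raise ValueError("No <svg> element found in model response")
    return match.group(0).strip()
```

Raising on a missing match lets the caller retry generation or surface an error, which is safer than passing an empty string into the validation step.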

2. Intent-Preserving Edit System

The edit system maintains the original design intent while applying changes:

const handleEdit = async (instruction, imageFile = null) => {
  // 1. Create preview mode (non-destructive)
  setPreviewBaseSvg(svgCode);

  // 2. Send context to AI
  const result = await editSvg(
    svgCode,              // Current SVG state
    instruction,          // User's edit request
    primarySelectedId,    // Focused element (if any)
    imageFile            // Reference image (optional)
  );

  // 3. Show preview with before/after toggle
  setPreviewSvg(result.svg_code);

  // 4. User can approve or discard
  // Only approved edits are committed to history
};

3. Manual Controls with Transform Math

For manual edits, I compute transforms while preserving element structure:

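function updateElementRotation(svgCode, elementId, angle) {
  const parser = new DOMParser();
  const doc = parser.parseFromString(svgCode, 'image/svg+xml');
  const element = doc.getElementById(elementId);

  // getBBox() only works on rendered elements, so measure the on-screen copy
  // (assumes the preview renders this SVG inline with the same IDs)
  const rendered = document.getElementById(elementId);
  if (!element || !rendered) return svgCode; // Unknown ID: leave the SVG unchanged

  // Get bounding box center
  const bbox = rendered.getBBox();
  const cx = bbox.x + bbox.width / 2;
  const cy = bbox.y + bbox.height / 2;

  // Compute rotation transform around center
  // (note: this replaces any transform already set on the element)
  const transform = `rotate(${angle} ${cx} ${cy})`;
  element.setAttribute('transform', transform);

  return new XMLSerializer().serializeToString(doc);
}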

4. Multi-Format Export System

I support SVG, PNG, JPG, and PDF exports:

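import { saveAs } from 'file-saver'; // Assumed source of saveAs

// Raster export only: canvas.toBlob cannot produce PDF
export function downloadImage(svgCode, filename, format) {
  const canvas = document.createElement('canvas');
  const ctx = canvas.getContext('2d');

  // Parse SVG dimensions, falling back to the viewBox when width/height are absent
  const parser = new DOMParser();
  const svgDoc = parser.parseFromString(svgCode, 'image/svg+xml');
  const svgEl = svgDoc.documentElement;
  const viewBox = (svgEl.getAttribute('viewBox') || '').split(/[\s,]+/).map(Number);
  const width = parseFloat(svgEl.getAttribute('width')) || viewBox[2] || 300;
  const height = parseFloat(svgEl.getAttribute('height')) || viewBox[3] || 150;

  // Set canvas size (2x for retina)
  canvas.width = width * 2;
  canvas.height = height * 2;

  // toBlob expects a real MIME type: "jpg" must become "image/jpeg"
  const mimeType = format === 'jpg' ? 'image/jpeg' : `image/${format}`;

  // Render SVG to canvas
  const img = new Image();
  const svgBlob = new Blob([svgCode], { type: 'image/svg+xml' });
  const url = URL.createObjectURL(svgBlob);

  img.onload = () => {
    if (mimeType === 'image/jpeg') {
      // JPEG has no alpha channel, so paint a white background first
      ctx.fillStyle = '#ffffff';
      ctx.fillRect(0, 0, canvas.width, canvas.height);
    }
    ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
    URL.revokeObjectURL(url); // Release the temporary object URL
    canvas.toBlob((imageBlob) => {
      saveAs(imageBlob, filename);
    }, mimeType, 0.95);
  };

  img.src = url;
}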

🚧 Challenges I Faced

1. SVG Validation and Security

Challenge: AI-generated code can contain invalid XML, malicious scripts, or broken references.

Solution: I built a multi-layer validation pipeline:

XML parsing to catch syntax errors
Whitelist-based element filtering (no <script>, <iframe>, etc.)
Attribute sanitization to prevent XSS attacks
Automatic fixing of common issues (missing viewBox, invalid transforms)
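
The first three layers can be sketched with Python's standard library. This is an illustrative example under stated assumptions, not the project's `svg_guard`: the `sanitize_svg` name and the short allowlist are made up here, and because the `xml.etree.ElementTree` documentation warns that the module is not secure against maliciously constructed data, production code would likely use a hardened parser.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Illustrative allowlist; a real one would cover more of the SVG spec.
ALLOWED_TAGS = {
    "svg", "g", "defs", "path", "rect", "circle", "ellipse", "line",
    "polyline", "polygon", "text", "linearGradient", "radialGradient", "stop",
}


def _local(name: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def sanitize_svg(svg_code: str) -> str:
    ET.register_namespace("", SVG_NS)  # Serialize without an 'ns0:' prefix.
    root = ET.fromstring(svg_code)  # Raises ET.ParseError on malformed XML.
    if _local(root.tag) != "svg":
        raise ValueError("Root element is not <svg>")

    # Layer 2: remove every element whose tag is not on the allowlist.
    for parent in list(root.iter()):
        for child in list(parent):
            if _local(child.tag) not in ALLOWED_TAGS:
                parent.remove(child)  # Drops <script>, <iframe>, <a>, ...

    # Layer 3: strip event handlers and javascript: URLs from attributes.
    for element in root.iter():
        for attr, value in list(element.attrib.items()):
            is_handler = _local(attr).lower().startswith("on")
            is_js_url = value.strip().lower().startswith("javascript:")
            if is_handler or is_js_url:
                del element.attrib[attr]

    return ET.tostring(root, encoding="unicode")
```

An allowlist fails closed: an element the sanitizer has never heard of is removed by default, which is the safer behavior for model-generated markup.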

2. Consistent Semantic Naming

Challenge: Gemini would sometimes generate SVGs without IDs or with generic names like rect1, path2.

Solution: I implemented a post-processing semantic labeling system:

Analyze SVG structure and visual hierarchy
Use Gemini to suggest meaningful names based on context
Automatically group related elements (e.g., all parts of a "tree" into <g id="tree">)

This increased editability by 3x in user testing.
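
Before asking a model for better names, the post-processor first has to find the elements that need them. A rough sketch of that detection step follows; the `find_unlabeled_elements` helper and the pattern for "generic" IDs are assumptions chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# IDs like "rect1" or "path_2": a bare tag-like word plus an optional number.
_GENERIC_ID = re.compile(
    r"^(svg|g|path|rect|circle|ellipse|line|polyline|polygon|text|group|shape|layer)"
    r"[-_]?\d*$",
    re.IGNORECASE,
)


def find_unlabeled_elements(svg_code: str) -> list[tuple[str, Optional[str]]]:
    """Return (tag, id) pairs for elements whose id is missing or generic."""
    root = ET.fromstring(svg_code)
    flagged = []
    for element in root.iter():
        if element is root:
            continue  # The root <svg> does not need a semantic id.
        tag = element.tag.rsplit("}", 1)[-1]  # Drop the namespace prefix.
        element_id = element.get("id")
        if element_id is None or _GENERIC_ID.match(element_id):
            flagged.append((tag, element_id))
    return flagged
```

The flagged list could then be sent to the model along with the surrounding markup, so it only has to name the elements that actually lack meaningful IDs.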

3. Preview Mode Race Conditions

Challenge: Users could trigger multiple edits before the first completed, causing state conflicts.

Solution: I implemented a preview lock system:

Disable edit input while preview is active
Queue subsequent requests instead of dropping them
Clear preview state on user cancellation

4. Performance with Large SVGs

Challenge: Complex SVGs (1000+ elements) caused UI lag during manipulation.

Solution:

Virtualized layer tree: Only render visible layers
Debounced updates: Batch DOM changes during drag operations
Web Workers: Offload SVG parsing and transformation to background threads

Latency for typical operations dropped from ~200ms to ~15ms.

5. Cross-Browser Export Compatibility

Challenge: Canvas-based export produced different results across browsers.

Solution:

Normalize SVG before rendering (explicit dimensions, embedded fonts)
Use foreignObject fallback for unsupported features
Test suite covering Chrome, Firefox, Safari, and Edge

🎯 What's Next

SVG Mint is just the beginning. Future enhancements I'm excited about:

Real-time collaboration: Multiple users editing the same SVG
Animation timeline: Visual keyframe editor for SVG animations
Component library: Reusable design system elements
Version control: Git-like branching for design iterations
API access: Programmatic SVG generation for developers

🏆 Conclusion

Building SVG Mint taught me that the future of design tools isn't about replacing human creativity; it's about amplifying it. By combining AI's generative power with structured, semantic output, I created a tool that's both powerful and accessible.

I'm proud to have built something that makes vector graphics creation as simple as describing what you want, while maintaining the precision and editability that professionals demand.

SVG Mint: Turn words into vectors with AI.

Built with ❤️ using Google Gemini AI