GeminAI — AI-Powered 2D to 3D Jewelry Reconstruction

Transforming flat jewelry images into production-ready 3D models — instantly, accurately, and interactively.


Inspiration

Generative AI has made it effortless to design stunning 2D jewelry concepts — but converting those flat designs into accurate, manufacturable 3D models has remained a slow, manual, and expensive process.

We were inspired to build a solution that could automate the entire pipeline from image to render-ready model, enabling designers and retailers to iterate at the speed of imagination.


What It Does

GeminAI is an AI-powered jewelry reconstruction system that converts 2D jewelry images into production-ready 3D models using a segmentation-driven, component-based pipeline. It supports a wide range of jewelry types and components, including:

  • Diamonds & Gemstones
  • Metal Bands & Prongs
  • Settings & Decorative Elements
  • Pearls & Custom Components

The system processes images through specialized agents and outputs:

Output                → Description
Segmented Components  → AI-separated metal and gemstone layers
3D Mesh (.OBJ)        → Accurate base metal structure reconstructed in 3D
Parametric JSON Scene → Structured, editable representation of all components
Live Cost Estimate    → Dynamic pricing based on materials and gemstones
Exportable Models     → OBJ, GLB, STL, and PNG export support
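
To make the parametric JSON scene output more concrete, here is a minimal sketch of what such a scene could look like. The field names (kind, asset, material, position, scale, base_mesh) and the file paths are assumptions chosen for illustration; the actual GeminAI schema is not described in this writeup.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Component:
    """One editable part of the piece (field names are illustrative)."""
    kind: str                          # e.g. "gemstone", "prong", "band"
    asset: str                         # hypothetical path to a prebuilt OBJ asset
    material: str                      # e.g. "diamond", "18k_gold"
    position: tuple = (0.0, 0.0, 0.0)  # (x, y, z) in scene units
    scale: float = 1.0


@dataclass
class Scene:
    """Whole piece: one reconstructed base mesh plus placed components."""
    base_mesh: str                     # hypothetical path to the base metal .OBJ
    metal: str
    components: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so the result is plain JSON.
        return json.dumps(asdict(self), indent=2)


scene = Scene(
    base_mesh="meshes/ring_base.obj",
    metal="18k_gold",
    components=[
        Component("gemstone", "library/round_brilliant.obj", "diamond",
                  position=(0.0, 1.2, 0.0), scale=0.8),
    ],
)
print(scene.to_json())
```

Because every component is a separate record, a renderer can swap a material or move a stone by changing one entry rather than regenerating the mesh.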

It is built for multiple real-world use cases:

  • Jewelry Design Studios
  • Custom Jewelry Retailers
  • Online Visualization Platforms
  • Manufacturing & Prototyping Pipelines

How We Built It

GeminAI uses a segmentation-driven, component-based AI pipeline designed to minimize latency and maximize editability.

Frontend

  • React.js with Three.js / React Three Fiber and Drei for interactive 3D rendering and real-time customization

Backend

  • Flask (Python) APIs orchestrating the segmentation, reconstruction, and positioning agents

AI Models

Model                     → Role
Gemini Vision API         → Image understanding and component classification
Hunyuan 2.0               → 2D-to-3D base metal mesh generation
Custom Segmentation Agent → Metal and gemstone layer separation
Positioning Agent         → Centroid-based gemstone coordinate computation

Integrations

  • Vector Database — RAG-based caching for fast retrieval of previously generated meshes
  • OBJ Component Library — Prebuilt reusable 3D assets for gemstones, bands, prongs, and settings
  • Parametric JSON — Scene format enabling real-time, non-destructive design edits
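
The mesh caching idea can be sketched as a nearest-neighbor lookup over image embeddings: if a new image is close enough to one seen before, the stored mesh is reused instead of running 2D-to-3D generation again. This is only an illustration under stated assumptions. The MeshCache class, the 0.95 threshold, the toy 3-dimensional embeddings, and the file paths are all hypothetical, and a real vector database would replace the linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MeshCache:
    """Toy in-memory stand-in for a vector database of generated meshes."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed similarity cutoff for a cache hit
        self.entries = []           # list of (embedding, mesh_path)

    def add(self, embedding, mesh_path):
        self.entries.append((list(embedding), mesh_path))

    def lookup(self, embedding):
        """Return the best-matching mesh path, or None if nothing is close enough."""
        best_path, best_score = None, self.threshold
        for stored, path in self.entries:
            score = cosine_similarity(embedding, stored)
            if score >= best_score:
                best_path, best_score = path, score
        return best_path


cache = MeshCache(threshold=0.95)
cache.add([0.9, 0.1, 0.0], "meshes/solitaire_band.obj")

hit = cache.lookup([0.88, 0.12, 0.01])  # near-duplicate structure: reuse the mesh
miss = cache.lookup([0.0, 0.2, 0.9])    # unrelated structure: generate a new mesh
```

On a miss, the caller would run mesh generation and then call add() so the next similar image becomes a hit.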

Each stage of the pipeline is handled by a dedicated agent to ensure modular, reliable, and scalable processing.


Pipeline — Step by Step

1. Image Input         → User uploads a 2D jewelry image
2. AI Segmentation     → Metal structure and gemstones are separated
3. Base Metal Extract  → Gemstones removed to isolate pure metal layer
4. 2D → 3D Conversion  → Hunyuan 2.0 generates a 3D mesh (.OBJ)
5. Component Mapping   → Gemstones matched to prebuilt asset library
6. Position Estimation → Centroid-based (x, y, z) coordinates computed
7. JSON Scene Build    → All components structured into parametric scene
8. Render & Preview    → Three.js assembles and renders the final model
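
Step 6 can be illustrated with a small sketch: take the binary mask of one segmented gemstone, average the coordinates of its foreground pixels to get a centroid, then map that pixel position into scene coordinates. The function names, the scene_width scaling, and the fixed depth value are assumptions made for this example. A single 2D image does not determine z, so how GeminAI actually derives it is not shown here.

```python
def mask_centroid(mask):
    """Return the (row, col) centroid of the truthy pixels in a 2D mask."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        raise ValueError("mask contains no foreground pixels")
    return row_sum / total, col_sum / total


def to_scene_coords(centroid, image_size, scene_width=2.0, depth=0.0):
    """Map a pixel centroid to (x, y, z) with the origin at the image center.

    scene_width (scene units spanned by the image width) and depth are
    illustrative assumptions, not values taken from the project.
    """
    row, col = centroid
    height, width = image_size
    units_per_pixel = scene_width / width
    x = (col - (width - 1) / 2) * units_per_pixel
    y = ((height - 1) / 2 - row) * units_per_pixel  # image rows grow downward
    return x, y, depth


# A 5x5 mask with one 2x2 gemstone blob in the upper right.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
centroid = mask_centroid(mask)
position = to_scene_coords(centroid, image_size=(5, 5))
print(centroid, position)
```

The resulting position is what a component entry in the JSON scene would store for that gemstone.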

Agent-Based Architecture

Agent                → Responsibility
Segmentation Agent   → Extracts and separates image components
Reconstruction Agent → Handles base metal 3D mesh generation
Positioning Agent    → Computes accurate gemstone placement
Cost Agent           → Calculates real-time pricing by material and component
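
As a rough sketch of what the Cost Agent's calculation could look like, the example below prices a design as metal weight times a per-gram rate plus gemstone weight times a per-carat rate. The price tables hold placeholder numbers rather than market data, and the function name and input shape are assumptions for illustration.

```python
# Placeholder rates for illustration only; these are not real market prices.
METAL_PRICE_PER_GRAM = {"18k_gold": 60.0, "platinum": 35.0, "silver": 1.0}
GEM_PRICE_PER_CARAT = {"diamond": 4000.0, "sapphire": 1200.0, "pearl": 150.0}


def estimate_cost(metal, metal_grams, gemstones):
    """Return a cost breakdown for one design.

    gemstones is a list of dicts with "material" and "carats" keys
    (a hypothetical shape, mirroring entries in a parametric scene).
    """
    metal_cost = METAL_PRICE_PER_GRAM[metal] * metal_grams
    gem_cost = sum(
        GEM_PRICE_PER_CARAT[gem["material"]] * gem["carats"] for gem in gemstones
    )
    return {
        "metal": round(metal_cost, 2),
        "gemstones": round(gem_cost, 2),
        "total": round(metal_cost + gem_cost, 2),
    }


with_diamond = estimate_cost("18k_gold", 4.0, [{"material": "diamond", "carats": 0.5}])
with_sapphire = estimate_cost("18k_gold", 4.0, [{"material": "sapphire", "carats": 0.5}])
print(with_diamond["total"], with_sapphire["total"])
```

Since the inputs are the same fields a design edit would change, the estimate can be recomputed on every edit without touching the 3D model.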

Challenges We Ran Into

  • Generating accurate 3D geometry from flat, often ambiguous 2D jewelry images
  • Computing precise gemstone placement from segmented image layers without manual input
  • Reducing end-to-end latency for complex designs to make the tool practical for real workflows
  • Building a reusable component system flexible enough to cover the diversity of jewelry designs
  • Maintaining structural integrity during mesh reconstruction across varied metal shapes

Accomplishments We Are Proud Of

  • Built a fully automated image-to-3D pipeline without requiring manual CAD intervention
  • Achieved ~12× lower latency on repeat structures (roughly 1 minute down to about 5 seconds) using RAG-based mesh caching
  • Developed centroid-based placement for accurate, agent-driven gemstone positioning
  • Implemented real-time customization — material, shape, size, and position edits without full model regeneration
  • Delivered live cost estimation directly tied to design choices
  • Created a parametric JSON scene format enabling modular, version-controlled design workflows
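
The "edit without full regeneration" idea can be sketched as a pure function over the scene data: an edit changes one component's fields and leaves the base mesh reference alone, so nothing upstream has to rerun. The scene layout, the "id" key, and the apply_edit helper are hypothetical and exist only for this example.

```python
import copy


def apply_edit(scene, component_id, **changes):
    """Return a new scene with one component updated; the input is not mutated."""
    updated = copy.deepcopy(scene)
    for component in updated["components"]:
        if component["id"] == component_id:
            component.update(changes)
            return updated
    raise KeyError(f"no component with id {component_id!r}")


original = {
    "base_mesh": "meshes/ring_base.obj",  # hypothetical path
    "components": [
        {"id": "stone-1", "material": "diamond", "scale": 0.8,
         "position": [0.0, 1.2, 0.0]},
    ],
}

edited = apply_edit(original, "stone-1", material="sapphire", scale=1.0)
print(edited["components"][0])
```

Keeping the original scene untouched also makes each edit easy to diff or store as a version.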

What We Learned

  • Treating jewelry as a single mesh is a dead end — component-based reconstruction is essential for editability
  • RAG-based caching dramatically changes what's practical: the difference between a 1-minute render and a 5-second render is the difference between a demo and a product
  • Segmentation quality upstream determines everything downstream — garbage in, garbage mesh out
  • Real-world jewelry design tools require speed and precision together, not a trade-off between them
  • Parametric representations are far more valuable than static outputs for iterative creative workflows

What's Next for GeminAI

  • Improve support for complex and overlapping geometry in intricate designs
  • Expand the component library with more gemstone cuts, chain types, pendants, and earrings
  • Enhance segmentation accuracy for tightly packed or layered components
  • Build an immersive AR/VR try-on experience for retail environments
  • Add smart budget recommendations that auto-suggest alternative gemstones or metals based on price targets
  • Improve the parametric JSON scene format for higher reconstruction fidelity
  • Explore real-time collaborative design for studio workflows

Tech Stack

Frontend         → React.js, Three.js / React Three Fiber, Drei
Backend          → Flask (Python)
AI & Vision      → Gemini Vision API, Hunyuan 2.0
Image Processing → PIL, Custom Segmentation Model
3D Assets        → OBJ Component Library, Parametric JSON Scene
Vector Database  → RAG-Based Caching for Mesh Retrieval
Output Formats   → OBJ, GLB, STL, PNG

Built for designers who think in images and need to ship in 3D.
