GeminAI — AI-Powered 2D to 3D Jewelry Reconstruction
Transforming flat jewelry images into production-ready 3D models — instantly, accurately, and interactively.
Inspiration
Generative AI has made it effortless to design stunning 2D jewelry concepts — but converting those flat designs into accurate, manufacturable 3D models has remained a slow, manual, and expensive process.
We were inspired to build a solution that could automate the entire pipeline from image to render-ready model, enabling designers and retailers to iterate at the speed of imagination.
What It Does
GeminAI is an AI-powered jewelry reconstruction system that converts 2D jewelry images into production-ready 3D models using a segmentation-driven, component-based pipeline. It supports a wide range of jewelry types and components including:
- Diamonds & Gemstones
- Metal Bands & Prongs
- Settings & Decorative Elements
- Pearls & Custom Components
The system processes images through specialized agents and outputs:
| Output | Description |
|---|---|
| Segmented Components | AI-separated metal and gemstone layers |
| 3D Mesh (.OBJ) | Accurate base metal structure reconstructed in 3D |
| Parametric JSON Scene | Structured, editable representation of all components |
| Live Cost Estimate | Dynamic pricing based on materials and gemstones |
| Exportable Models | OBJ, GLB, STL, and PNG export support |
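As a rough illustration of what a parametric scene could look like, the sketch below builds one in Python and serializes it to JSON. The field names (`base_mesh`, `kind`, `asset`, `position`, `scale`) and the file paths are assumptions made for this example, not GeminAI's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Component:
    # All field names are illustrative, not the project's real schema.
    id: str
    kind: str        # e.g. "gemstone", "band", "prong"
    asset: str       # hypothetical path to a prebuilt OBJ asset
    material: str
    position: tuple = (0.0, 0.0, 0.0)
    scale: float = 1.0


@dataclass
class Scene:
    base_mesh: str                                   # reconstructed metal mesh
    components: list = field(default_factory=list)   # editable parts on top of it

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses; tuples become JSON arrays.
        return json.dumps(asdict(self), indent=2)


scene = Scene(base_mesh="ring_base.obj")
scene.components.append(
    Component("stone-1", "gemstone", "assets/round_cut.obj", "diamond", (0.0, 1.2, 0.0))
)
print(scene.to_json())
```

Keeping each gemstone or setting as its own entry, separate from the base mesh, is what would let an editor change one part without touching the rest.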
It is built for multiple real-world use cases:
- Jewelry Design Studios
- Custom Jewelry Retailers
- Online Visualization Platforms
- Manufacturing & Prototyping Pipelines
How We Built It
GeminAI uses a segmentation-driven, component-based AI pipeline designed to minimize latency and maximize editability.
Frontend
- React.js with Three.js / React Three Fiber and Drei for interactive 3D rendering and real-time customization
Backend
- Flask (Python) APIs orchestrating the segmentation, reconstruction, and positioning agents
AI Models
| Model | Role |
|---|---|
| Gemini Vision API | Image understanding and component classification |
| Hunyuan 2.0 | 2D-to-3D base metal mesh generation |
| Custom Segmentation Agent | Metal and gemstone layer separation |
| Positioning Agent | Centroid-based gemstone coordinate computation |
Integrations
- Vector Database — RAG-based caching for fast retrieval of previously generated meshes
- OBJ Component Library — Prebuilt reusable 3D assets for gemstones, bands, prongs, and settings
- Parametric JSON — Scene format enabling real-time, non-destructive design edits
Each stage of the pipeline is handled by a dedicated agent to ensure modular, reliable, and scalable processing.
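The mesh cache mentioned above can be pictured as a nearest-neighbor lookup over image embeddings. The following is a minimal in-memory sketch, assuming each base-metal image has already been turned into an embedding vector. `MeshCache`, `get_mesh`, the 0.95 similarity threshold, and the `generate` callback are all hypothetical stand-ins for the real vector database and reconstruction call.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MeshCache:
    """Toy stand-in for a vector database keyed by image embeddings."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold   # assumed cutoff for "same structure"
        self.entries = []            # list of (embedding, mesh_path)

    def add(self, embedding, mesh_path):
        self.entries.append((list(embedding), mesh_path))

    def lookup(self, embedding):
        """Return the best-matching mesh path, or None below the threshold."""
        best_path, best_score = None, self.threshold
        for stored, path in self.entries:
            score = cosine(embedding, stored)
            if score >= best_score:
                best_path, best_score = path, score
        return best_path


def get_mesh(embedding, cache, generate):
    """Reuse a cached mesh when possible; otherwise generate and store one."""
    cached = cache.lookup(embedding)
    if cached is not None:
        return cached, True          # cache hit: no reconstruction needed
    mesh_path = generate()           # slow path: run 2D-to-3D generation
    cache.add(embedding, mesh_path)
    return mesh_path, False
```

On a hit, the expensive generation step is skipped entirely, which is where a latency gain on repeat structures would come from.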
Pipeline — Step by Step
1. Image Input → User uploads a 2D jewelry image
2. AI Segmentation → Metal structure and gemstones are separated
3. Base Metal Extraction → Gemstones are removed to isolate the pure metal layer
4. 2D → 3D Conversion → Hunyuan 2.0 generates a 3D mesh (.OBJ)
5. Component Mapping → Gemstones matched to prebuilt asset library
6. Position Estimation → Centroid-based (x, y, z) coordinates computed
7. JSON Scene Build → All components are structured into a parametric scene
8. Render & Preview → Three.js assembles and renders the final model
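Step 6 can be sketched as follows: take the binary mask of one segmented gemstone, compute the centroid of its foreground pixels, and map that pixel position into normalized scene coordinates. This is an assumption-laden illustration, not the project's implementation. In particular, the normalization to the range [-1, 1] is a choice made for this example, and because a single 2D mask carries no depth, `z` is simply passed in by the caller.

```python
def mask_centroid(mask):
    """Return the (col, row) centroid of truthy pixels in a 2D mask."""
    total = sum_x = sum_y = 0
    for row_idx, row in enumerate(mask):
        for col_idx, value in enumerate(row):
            if value:
                total += 1
                sum_x += col_idx
                sum_y += row_idx
    if total == 0:
        raise ValueError("mask contains no foreground pixels")
    return sum_x / total, sum_y / total


def to_scene_coords(centroid, width, height, z=0.0):
    """Map a pixel centroid to (x, y, z) with x and y in [-1, 1].

    Pixel centers sit at +0.5; image rows grow downward, so y is flipped.
    z is an assumed input (for example, a height sampled from the mesh).
    """
    cx, cy = centroid
    x = (cx + 0.5) / width * 2.0 - 1.0
    y = 1.0 - (cy + 0.5) / height * 2.0
    return (x, y, z)


# A 4x4 mask with a 2x2 gemstone blob in the middle.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(to_scene_coords(mask_centroid(mask), width=4, height=4))  # (0.0, 0.0, 0.0)
```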
Agent-Based Architecture
| Agent | Responsibility |
|---|---|
| Segmentation Agent | Extracts and separates image components |
| Reconstruction Agent | Handles base metal 3D mesh generation |
| Positioning Agent | Computes accurate gemstone placement |
| Cost Agent | Calculates real-time pricing by material and component |
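To make the Cost Agent's role concrete, here is a minimal sketch of pricing by material and component. The price tables hold placeholder numbers, and the scene layout (`metal`, `gemstones`, `grams`, `carats`) is assumed for the example. None of it reflects real market data or the project's actual pricing logic.

```python
# Placeholder prices for illustration only; not real market data.
METAL_PRICE_PER_GRAM = {"gold_18k": 60.0, "silver": 1.0}
GEM_PRICE_PER_CARAT = {"diamond": 4000.0, "sapphire": 800.0}


def estimate_cost(scene):
    """Sum metal cost (by weight) and gemstone cost (by carat)."""
    metal = scene["metal"]
    total = METAL_PRICE_PER_GRAM[metal["material"]] * metal["grams"]
    for gem in scene["gemstones"]:
        total += GEM_PRICE_PER_CARAT[gem["material"]] * gem["carats"]
    return round(total, 2)


ring = {
    "metal": {"material": "gold_18k", "grams": 5.0},
    "gemstones": [{"material": "diamond", "carats": 0.5}],
}
print(estimate_cost(ring))  # 2300.0

# Swapping the stone changes the estimate without touching anything else.
ring["gemstones"][0]["material"] = "sapphire"
print(estimate_cost(ring))  # 700.0
```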
Challenges We Ran Into
- Generating accurate 3D geometry from flat, often ambiguous 2D jewelry images
- Computing precise gemstone placement from segmented image layers without manual input
- Reducing end-to-end latency for complex designs to make the tool practical for real workflows
- Building a reusable component system flexible enough to cover the diversity of jewelry designs
- Maintaining structural integrity during mesh reconstruction across varied metal shapes
Accomplishments We Are Proud Of
- Built a fully automated image-to-3D pipeline without requiring manual CAD intervention
- Achieved ~12× latency reduction on repeat structures using RAG-based mesh caching
- Developed centroid-based placement for accurate, agent-driven gemstone positioning
- Implemented real-time customization — material, shape, size, and position edits without full model regeneration
- Delivered live cost estimation directly tied to design choices
- Created a parametric JSON scene format enabling modular, version-controlled design workflows
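Editing without full regeneration can be pictured as changing one entry in the scene while the base mesh reference stays the same. The sketch below is hypothetical: `apply_edit`, the `EDITABLE_FIELDS` set, and the dictionary layout are assumptions chosen to mirror the edit types listed above (material, shape, size, position).

```python
import copy

# Assumed set of fields a user may change on a component.
EDITABLE_FIELDS = {"material", "shape", "scale", "position"}


def apply_edit(scene, component_id, **changes):
    """Return a new scene with one component updated; the input is untouched."""
    unknown = set(changes) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    updated = copy.deepcopy(scene)   # non-destructive: keep the original version
    for component in updated["components"]:
        if component["id"] == component_id:
            component.update(changes)
            return updated
    raise KeyError(component_id)


scene = {
    "base_mesh": "ring_base.obj",
    "components": [{"id": "stone-1", "material": "diamond", "scale": 1.0}],
}
edited = apply_edit(scene, "stone-1", material="ruby", scale=1.2)
```

Because the edit returns a new scene and leaves the old one intact, each version can be stored or diffed as plain JSON.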
What We Learned
- Treating jewelry as a single mesh is a dead end — component-based reconstruction is essential for editability
- RAG-based caching dramatically changes what's practical: the difference between a 1-minute render and a 5-second one is the difference between a demo and a product
- Segmentation quality upstream determines everything downstream — garbage in, garbage mesh out
- Real-world jewelry design tools require speed and precision together, not a trade-off between them
- Parametric representations are far more valuable than static outputs for iterative creative workflows
What's Next for GeminAI
- [ ] Improve support for complex and overlapping geometry in intricate designs
- [ ] Expand the component library with more gemstone cuts, chain types, pendants, and earrings
- [ ] Enhance segmentation accuracy for tightly packed or layered components
- [ ] Build an immersive AR/VR try-on experience for retail environments
- [ ] Add smart budget recommendations — auto-suggest alternative gemstones or metals based on price targets
- [ ] Improve the parametric JSON scene format for higher reconstruction fidelity
- [ ] Explore real-time collaborative design for studio workflows
Tech Stack
| Layer | Technologies |
|---|---|
| Frontend | React.js, Three.js / React Three Fiber, Drei |
| Backend | Flask (Python) |
| AI & Vision | Gemini Vision API, Hunyuan 2.0 |
| Image Processing | PIL, Custom Segmentation Model |
| 3D Assets | OBJ Component Library, Parametric JSON Scene |
| Vector Database | RAG-Based Caching for Mesh Retrieval |
| Output Formats | OBJ, GLB, STL, PNG |
Built for designers who think in images and need to ship in 3D.