Material Alch3mist
Inspiration
The Material-Alch3mist project was inspired by the need to create 3D assets rich in textures directly from textual descriptions. We wanted to explore how generative models could be combined in a modular pipeline to produce highly detailed, stylized 3D meshes efficiently. The goal was to bridge the gap between creative ideas and 3D content creation, making the process accessible to artists and designers while leveraging cutting-edge diffusion and texture enhancement techniques.
What it does
Material-Alch3mist is an end-to-end text-to-mesh pipeline that converts text prompts into richly textured 3D models. It uses a three-stage approach:
- Generate base 2D images from text using FLUX.1 [dev].
- Enhance textures, materials, and colors using a specialized LoRA-trained FLUX.1-Kontext [dev] model.
- Convert the textured 2D images into multiview 3D meshes using TRELLIS while preserving fine details and stylistic coherence.
The pipeline is modular, allowing independent adjustments at each stage and fast experimentation with different textures, LoRAs, and model variants.
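The modularity described above can be illustrated with a minimal sketch. This is not the project's actual code: the `Stage` and `Pipeline` classes and the three placeholder functions are hypothetical stand-ins for the FLUX.1 [dev], FLUX.1-Kontext [dev], and TRELLIS stages, and they only pass dictionaries along so the structure is easy to see.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Stage:
    """One named step of the pipeline (hypothetical helper)."""
    name: str
    run: Callable[[Any], Any]


@dataclass
class Pipeline:
    """Runs stages in order, feeding each output into the next stage."""
    stages: List[Stage] = field(default_factory=list)

    def swap(self, name: str, run: Callable[[Any], Any]) -> None:
        """Replace one stage's implementation without touching the others."""
        for i, stage in enumerate(self.stages):
            if stage.name == name:
                self.stages[i] = Stage(name, run)
                return
        raise KeyError(f"no stage named {name!r}")

    def __call__(self, prompt: str) -> Tuple[Any, List[str]]:
        artifact: Any = prompt
        trace: List[str] = []
        for stage in self.stages:
            artifact = stage.run(artifact)
            trace.append(stage.name)
        return artifact, trace


# Placeholder stage functions; the real stages would call the models.
def generate_image(prompt: str) -> dict:
    return {"prompt": prompt, "kind": "image"}


def enhance_texture(image: dict) -> dict:
    return {**image, "kind": "enhanced_image"}


def image_to_mesh(image: dict) -> dict:
    return {**image, "kind": "mesh"}


pipeline = Pipeline([
    Stage("text_to_image", generate_image),
    Stage("texture_enhance", enhance_texture),
    Stage("image_to_mesh", image_to_mesh),
])

result, trace = pipeline("a mossy stone golem")
print(trace, result["kind"])
```

Calling `pipeline.swap("texture_enhance", other_fn)` would exchange only the enhancement stage, which is the kind of independent adjustment the staged design is meant to allow.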
How we built it
We leveraged three main models:
- FLUX.1 [dev]: Generates the initial 2D images from text prompts.
- FLUX.1-Kontext [dev]: Enhances the textures, materials, and colors of these images, using a LoRA adapter trained with the trigger word /Alch3mist/.
- TRELLIS: Converts the enhanced 2D images into 3D meshes.
The pipeline was orchestrated using ComfyUI, with workflows available as JSON configurations for easy reproduction. Images were preprocessed with Python and PIL to ensure a uniform RGBA format. We used OSTRIS for lightweight model management and provided an optional web interface with OWUI. Training and generation were accelerated using high-end GPUs (RTX A6000, A100). The entire system can be run in a Docker container for reproducibility.
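As a rough illustration of how a ComfyUI JSON workflow could be driven from a script, the sketch below patches a prompt into a workflow and builds the HTTP request that would queue it. Several things here are assumptions and not taken from the project: the workflow is assumed to be in ComfyUI's API export format (a mapping of node id to `class_type` and `inputs`), the node id `"6"` and its `text` input are hypothetical, the server address is the common local default, and prepending the trigger word to the prompt is only one plausible way to activate the LoRA. The request is built but not sent.

```python
import copy
import json
import urllib.request

TRIGGER_WORD = "/Alch3mist/"  # trigger word named in the write-up


def with_trigger(prompt: str, trigger: str = TRIGGER_WORD) -> str:
    """Prepend the LoRA trigger word unless the prompt already contains it."""
    prompt = prompt.strip()
    return prompt if trigger in prompt else f"{trigger} {prompt}"


def build_queue_request(workflow: dict, node_id: str, prompt: str,
                        server: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Return a POST request for ComfyUI's /prompt endpoint (not sent here).

    `node_id` is the id of the text-encoding node in the exported workflow;
    which id that is depends on the workflow file, so it is an assumption.
    """
    patched = copy.deepcopy(workflow)  # leave the loaded workflow untouched
    patched[node_id]["inputs"]["text"] = with_trigger(prompt)
    body = json.dumps({"prompt": patched}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Tiny stand-in for a workflow exported from ComfyUI in API format.
workflow = {"6": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}}}
request = build_queue_request(workflow, "6", "rusted copper kettle")
print(request.full_url)
# To actually queue the job against a running server:
# urllib.request.urlopen(request)
```

Keeping the payload construction separate from the network call makes it easy to inspect what would be sent before a GPU job is queued.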
Challenges we ran into
- Texture fidelity: Ensuring that textures generated in 2D were faithfully preserved when converted to 3D meshes.
- Efficient LoRA training: Training lightweight LoRAs quickly while maintaining high-quality texture generation.
- Modularity vs complexity: Keeping each stage independent while making the full pipeline seamless for end-to-end generation.
- Dataset creation: Developing a specialized dataset to fine-tune FLUX.1-Kontext [dev] for accurate texture, material, and color handling.
Accomplishments that we're proud of
- Successfully created a modular, end-to-end text-to-3D pipeline.
- Developed a specialized LoRA adapter that enhances textures with minimal training data.
- Built a reproducible workflow with ComfyUI JSON configurations.
- Created a specialized dataset for texture-enhanced prompt interpretation.
What we learned
- Modular pipelines allow flexibility without retraining large models.
- LoRA adapters can efficiently specialize models for texture enhancement with minimal resources.
- Diffusion-based text-to-mesh generation benefits from staged workflows: text → 2D → enhanced texture → 3D mesh.
- Preprocessing, dataset quality, and consistent formats (RGBA) are critical for maintaining fidelity across stages.
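The RGBA normalization mentioned above might look something like the following sketch using Pillow. The function names, the accepted file extensions, and the choice to write PNG output are assumptions made for illustration; the project's actual preprocessing script may differ.

```python
from pathlib import Path
from typing import List

from PIL import Image

# Extensions treated as input images (an assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def to_rgba(image: Image.Image) -> Image.Image:
    """Return a copy of the image in RGBA mode, whatever its original mode."""
    return image.convert("RGBA")


def normalize_folder(src_dir: str, dst_dir: str) -> List[Path]:
    """Convert every supported image in src_dir to an RGBA PNG in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files such as captions
        with Image.open(path) as img:
            out_path = dst / f"{path.stem}.png"  # PNG keeps the alpha channel
            to_rgba(img).save(out_path)
            written.append(out_path)
    return written
```

A call such as `normalize_folder("raw_images", "rgba_images")` (both folder names hypothetical) would leave the originals untouched and write the converted copies to the second folder, so every later stage sees the same channel layout.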
What's next for Material Alch3mist
- Developing a web interface for users to submit prompts and view results in real time.
- Expanding the dataset to include more diverse textures and materials.
- Experimenting with new LoRA variants for specialized artistic styles.
- Optimizing mesh generation fidelity and exploring animation-ready outputs for interactive 3D applications.


