Inspiration

When vibe coding with our software agent, users often create small 3D games and simulators using three.js. However, the quality of the generated apps is severely limited by the quality of the 3D assets that an LLM can write directly.

What it does

Allows LLMs to generate high-quality 3D assets via tool calls.

How we built it

Connects an image model (OpenAI gpt-image-1) and a 3D model (Tencent Hunyuan3D-2) to LLMs via an MCP server. The server uses the image model to create a storyboard of the item, generating front, back, and side views from the text prompt provided by the LLM. The 3D model then combines these views into a mesh, which is returned as a three.js-compatible model.
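A minimal sketch of that two-stage pipeline. The function names and return shapes here are illustrative stand-ins for the actual gpt-image-1 and Hunyuan3D-2 calls behind the MCP tool, not real API signatures:

```python
# Hypothetical sketch: text prompt -> multi-view storyboard -> mesh.
# generate_view() and views_to_mesh() are stubs standing in for the
# image-model and 3D-model calls; the real server invokes hosted models.

def generate_view(prompt: str, view: str) -> bytes:
    """Stub for an image-model call rendering one storyboard view."""
    return f"{view} view of {prompt}".encode()

def views_to_mesh(views: dict[str, bytes]) -> dict:
    """Stub for the 3D model that fuses the views into a single mesh."""
    return {"format": "glb", "views_used": sorted(views)}

def text_to_asset(prompt: str) -> dict:
    """Full pipeline as exposed to the LLM via an MCP tool call."""
    views = {v: generate_view(prompt, v) for v in ("front", "back", "side")}
    return views_to_mesh(views)

asset = text_to_asset("low-poly pirate ship")
print(asset["views_used"])  # → ['back', 'front', 'side']
```

The storyboard step exists because multi-view conditioning gives the 3D model far more to work with than a single text prompt (see Challenges below).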

Challenges we ran into

Straight text-to-3D returns poor results, presumably because the latest 3D models do not incorporate world knowledge in the same fashion as the multimodal models underlying state-of-the-art image generation.

Returning models in a way that vibe-coded outputs can consume turned out to be very agent-specific.

What's next for Three-MCP

Look for deep integration of the concept into our software coding agents at launch: agentopia.ai.

Built With

gpt-image-1, Hunyuan3D-2, MCP, three.js
