Make it real. Text-to-3D genAI with Poe and fal.ai

SDXL image of "a low poly 3d model of a baby monkey on a white background"
resulting 3D model from fal.ai's TripoSR

Inspiration

Text (1D) and image (2D) generative AI have taken the world by storm, but 3D generative AI is still in its infancy. Combining the Poe image generation base bots with the fal.ai image-to-mesh endpoints makes it possible to build a 3D generative AI bot on Poe quickly. The new per-message monetization scheme lets bot creators cover their external API costs, opening up a new dimension for generative AI chatbots on Poe.

What it does

https://poe.com/text-to-3D takes a natural language prompt and first generates a 2D image of it using SDXL. Once the user confirms that they want a 3D model, the model is created using fal.ai and rendered on neThing.xyz.

How we built it

First, made a Poe image generation bot running SDXL
Then, connected the generated image to the fal.ai TripoSR image-to-3D mesh API endpoint
Finally, rendered the resulting file in 3D using neThing.xyz
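
The image-to-mesh step above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the endpoint id `fal-ai/triposr`, the `image_url` argument name, and the `model_mesh.url` response shape are all assumptions to verify against fal.ai's documentation. The `submit` callable is injected so the sketch runs without network access; in a real bot it might wrap the fal Python client, which is also an assumption here.

```python
from typing import Any, Callable, Dict

# Assumed endpoint id; confirm against fal.ai's documentation.
TRIPOSR_ENDPOINT = "fal-ai/triposr"

# A callable that sends (endpoint, arguments) to the API and returns its JSON.
Submit = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def image_to_mesh_url(image_url: str, submit: Submit) -> str:
    """Send an image URL to an image-to-mesh endpoint and return the mesh URL."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) image URL, got {image_url!r}")
    result = submit(TRIPOSR_ENDPOINT, {"image_url": image_url})
    # Assumed response shape: {"model_mesh": {"url": "..."}}.
    try:
        return result["model_mesh"]["url"]
    except (KeyError, TypeError) as exc:
        raise RuntimeError(f"unexpected response: {result!r}") from exc


def fake_submit(endpoint: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stand-in for the real API call, used only for illustration."""
    return {"model_mesh": {"url": "https://example.com/mesh.glb"}}


if __name__ == "__main__":
    print(image_to_mesh_url("https://example.com/monkey.png", fake_submit))
```

Keeping the API call behind a plain callable makes it easy to swap the stub for the real client and to test the bot logic without spending API credits.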

Challenges we ran into

Having a newborn baby at home
Compatibility problems between the fal and Poe Python packages
A known Poe API error prevents server bots from immediately accessing the images generated by image bots
The maximum resolution of the fal.ai TripoSR endpoint is very low (256), so the resulting files are low-poly
There seems to be an image scaling issue in Poe on iOS when a text response is mixed with an image response
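
For the image-access delay listed above, one generic workaround is to poll with exponential backoff until the image becomes available. The sketch below is an assumption about how that could look, not a confirmed fix for the Poe error; `fetch_with_retry`, its parameters, and the convention that the fetcher returns `None` when the image is not ready are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], Optional[T]],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it returns a non-None value, doubling the wait each time."""
    delay = base_delay
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    raise TimeoutError(f"resource not available after {attempts} attempts")
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting.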

Accomplishments that we're proud of

Making a new 3D generative AI Poe chatbot from scratch in a few hours.

What we learned

Keep the scope small.
Work backwards from the submission.
Work step-by-step. Most of the code I wrote turned out to be unnecessary, because I had assumed a function was needed when it actually wasn't.

What's next

Fixing the glTF to USDZ conversion so that iOS users can preview their creations in augmented reality (AR)
Integrating the fal.ai text-to-mesh workflow as an option for https://poe.com/MakeItReal
Adding an option to go directly from image to 3D
Improving the image generation prompt engineering so that SDXL output is better suited to the text-to-image-to-3D pipeline
Removing suggested replies, which don't make sense in the context of the bot

Built With

fal.ai
modal
poe
python

Updates

Raymond Weitekamp started this project — Apr 06, 2024 06:01 PM EDT

Leave feedback in the comments!
