Inspiration
During COVID, we spent a lot of time playing Minecraft, but we weren’t great builders. To help ourselves, we started sketching our ideas on paper and then trying to recreate them block by block in the game. We ended up with stacks of drawings of houses, towers, and fantasy builds, which we used as rough blueprints. This process inspired BlockPrint, a way to turn drawings, images, and even voice into real Minecraft structures generated live in your world.
What it does
BlockPrint is an AI-powered system that converts images and voice commands into buildable Minecraft structures and constructs them live, in game. Users can either upload an image (digitally drawn or hand-sketched) or provide an audio description (prerecorded or live). BlockPrint then creates a structured Minecraft blueprint and builds it live on the server.
How we built it
BlockPrint is a full-stack AI and game-integration application. For the frontend, we used React, Vite, and TypeScript, with a drag-and-drop image upload interface, voice input through ElevenLabs, and a blueprint visualization UI that lets users preview the build. For the backend, we used FastAPI for the APIs, VisionAI to convert images into structured blueprint JSON, and ElevenLabs speech-to-text to convert spoken descriptions into build prompts.
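To make the blueprint step concrete, here is a minimal sketch of how a structured blueprint JSON could be turned into Minecraft commands. The schema (a `blocks` list with `x`, `y`, `z`, and `block` fields) and the names `Block`, `parse_blueprint`, and `to_setblock_commands` are assumptions made for illustration, not our actual format. Only the `setblock` command syntax is standard Minecraft: Java Edition.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One block in a blueprint, positioned relative to the build origin."""
    x: int
    y: int
    z: int
    block_id: str


def parse_blueprint(raw: str) -> list[Block]:
    """Parse blueprint JSON (hypothetical schema) into Block records."""
    data = json.loads(raw)
    return [Block(b["x"], b["y"], b["z"], b["block"]) for b in data["blocks"]]


def to_setblock_commands(blocks: list[Block], origin: tuple[int, int, int]) -> list[str]:
    """Translate relative blocks into absolute `setblock` commands.

    Blocks are sorted by height first so the structure appears to rise
    from the ground up when the commands are sent one at a time.
    """
    ox, oy, oz = origin
    ordered = sorted(blocks, key=lambda b: (b.y, b.x, b.z))
    return [
        f"setblock {ox + b.x} {oy + b.y} {oz + b.z} {b.block_id}"
        for b in ordered
    ]


example = """
{"blocks": [
  {"x": 0, "y": 1, "z": 0, "block": "minecraft:oak_planks"},
  {"x": 0, "y": 0, "z": 0, "block": "minecraft:cobblestone"}
]}
"""
commands = to_setblock_commands(parse_blueprint(example), origin=(100, 64, -20))
```

Each resulting string could then be sent to the server over a remote console connection such as RCON. Sending them with a short delay between commands is one way to get the "watch it build" effect.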
Challenges we ran into
One of the biggest challenges we ran into was the slanted roof. The LLM had a hard time distinguishing it from a regular gable roof, and getting it to tell the two apart while also switching the block structure from stair blocks to slabs was difficult. The final challenge with the slanted roof was the block details themselves. For example, we originally used a mix of regular planks and slabs, rather than stairs, to get the lower roof angle. However, the planks and slabs were generated in different colors, leading to an unpleasant final product. The solution was ultimately to use only slabs, which always generated the same block and gave a consistent color across the whole roof.
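The slab-only approach can be sketched roughly as follows. This is an illustration under stated assumptions, not our generator: the function name `slab_shed_roof`, its parameters, and the default `minecraft:oak_slab` are hypothetical. It relies on the Java Edition `fill` command and the slab `type` block state (`bottom` or `top`), alternating the two so the roof rises half a block per row, which is a shallower pitch than stairs give.

```python
def slab_shed_roof(
    x0: int,
    y0: int,
    z0: int,
    width: int,
    depth: int,
    slab: str = "minecraft:oak_slab",
) -> list[str]:
    """Generate `fill` commands for a low-pitch slanted roof made only of slabs.

    The roof spans `width` blocks along X and climbs along Z for `depth` rows.
    Even rows use bottom slabs and odd rows use top slabs at the same Y level,
    so every two rows the roof gains one full block of height.
    """
    commands = []
    for row in range(depth):
        y = y0 + row // 2                      # one full block up every two rows
        half = "bottom" if row % 2 == 0 else "top"
        z = z0 + row
        commands.append(
            f"fill {x0} {y} {z} {x0 + width - 1} {y} {z} {slab}[type={half}]"
        )
    return commands


roof = slab_shed_roof(0, 70, 0, width=5, depth=4)
```

Because every row uses the same slab ID, the whole roof shares a single block type, which is what avoids the mismatched plank and slab colors.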
Accomplishments that we're proud of
We're most proud of successfully creating a complete end-to-end system where an image or voice command becomes a real Minecraft structure, the structure builds automatically, and users can watch the build happen live. We were also able to combine computer vision, voice input, and procedural generation in a single project.
What we learned
We learned how to connect AI models to real-time interactive systems. We gained experience with VisionAI and prompt engineering, voice AI integration using ElevenLabs, and backend architecture using FastAPI. Most importantly, we learned how to turn AI output into something tangible and interactive.
What's next for BlockPrint
We would like to expand BlockPrint into a full AI world-building platform. This could include supporting full 3D builds from images taken at different angles, generating entire villages or cities, and adding in-game AI assistants that build alongside players.
Built With
- elevenlabs
- fastapi
- gemini
- rcon
- react
- typescript