🪄 VisionForge: 3D-Magic-Pencil
Imagine pointing your Spectacles at a doodle, saying "Shazam", and instantly seeing it transform into a 3D object anchored in your real world.
VisionForge is an AI-powered AR experience that converts sketches and voice commands into immersive 3D assets — in seconds.
✨ Inspiration
I've always been fascinated by Harry Potter and other magic-filled worlds — where you draw in the air or wave a wand and it comes to life.
As an AR + AI enthusiast, I wanted to bring that same sense of wonder into reality — turning sketches into living 3D objects with just a word.
🪄 What It Does
- 🎨 Sketch-to-3D: Point your Spectacles at a doodle and say the magic word.
- 🗣 Voice-activated: Trigger the scan hands-free with a voice command (default: "Shazam").
- 🧊 Instant 3D Generation: AI creates a textured model from your drawing.
- ⚓ Anchored in Reality: See it placed right in your world using AR.
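Under the hood, these features chain into a single pipeline. Here is a minimal TypeScript sketch of that flow; the helper names (`describeSketch`, `generateModel`, `placeInWorld`) are hypothetical stand-ins for the project code, and each step is sketched individually in the next section.

```typescript
// Hypothetical helper signatures; each step is sketched in "How We Built It".
declare function describeSketch(frameBase64: string): Promise<string>; // doodle -> text prompt
declare function generateModel(prompt: string): Promise<string>;       // text prompt -> model URL
declare function placeInWorld(modelUrl: string): Promise<void>;        // model -> anchored in AR

// Fired when the wake word ("Shazam" by default) is heard.
async function onMagicWord(frameBase64: string): Promise<void> {
  const prompt = await describeSketch(frameBase64);
  const modelUrl = await generateModel(prompt);
  await placeInWorld(modelUrl);
}
```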
🛠 How We Built It
- Computer Vision + AI: Used the OpenAI Vision API to process camera frames and generate 3D-friendly prompts.
- Text-to-3D Conversion: Integrated Meshy + Snap3D to generate models in real time.
- AR Placement: Leveraged Lens Studio + Instant World Hit Test to anchor assets in space.
- Voice Control: Added speech recognition to trigger the entire process hands-free.
- UX Magic: Built a custom edge-fade masking effect for a smooth, immersive AR experience.

Hedged code sketches of each of these steps follow.
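For the vision step, the idea is to send a camera frame to a multimodal model and ask for a short, 3D-friendly description. Below is a plain-TypeScript sketch against OpenAI's chat completions API; inside Lens Studio the request would go through Snap's remote service modules instead, and `OPENAI_API_KEY` and the exact instruction text are placeholders.

```typescript
declare const OPENAI_API_KEY: string; // placeholder credential

async function describeSketch(frameBase64: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{
        role: "user",
        content: [
          { type: "text",
            text: "Describe only the hand-drawn doodle in this photo, ignoring the background, as a short prompt for a text-to-3D generator." },
          { type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${frameBase64}` } },
        ],
      }],
    }),
  });
  const json = await res.json();
  return json.choices[0].message.content; // e.g. "a cartoon rocket with three fins"
}
```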
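For text-to-3D, Meshy exposes a create-then-poll REST API. A hedged sketch follows; the endpoint paths and field names follow Meshy's public docs at the time of writing and may differ, and `MESHY_API_KEY` is a placeholder.

```typescript
declare const MESHY_API_KEY: string; // placeholder credential

async function generateModel(prompt: string): Promise<string> {
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${MESHY_API_KEY}`,
  };
  // Kick off a preview-quality generation task (faster than a full refine pass).
  const create = await fetch("https://api.meshy.ai/openapi/v2/text-to-3d", {
    method: "POST",
    headers,
    body: JSON.stringify({ mode: "preview", prompt }),
  });
  const { result: taskId } = await create.json();

  // Poll until the task finishes, then return the GLB download URL.
  for (;;) {
    const poll = await fetch(`https://api.meshy.ai/openapi/v2/text-to-3d/${taskId}`, { headers });
    const task = await poll.json();
    if (task.status === "SUCCEEDED") return task.model_urls.glb;
    if (task.status === "FAILED") throw new Error("Meshy task failed");
    await new Promise((r) => setTimeout(r, 2000)); // wait 2 s between polls
  }
}
```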
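For placement, Lens Studio's World Query API provides hit-test sessions against real-world surfaces. This sketch is based on Snap's documented `WorldQueryModule`; exact names can shift between Lens Studio versions, so treat it as an outline rather than drop-in code.

```typescript
// Lens Studio script sketch; relies on Lens Studio globals, not plain TypeScript.
const worldQuery = require("LensStudio:WorldQueryModule");

const options = HitTestSessionOptions.create();
options.filter = true; // smooth the raw hit results
const session = worldQuery.createHitTestSession(options);

// Cast a ray (e.g. forward from the camera) and drop the model at the hit point.
function placeAt(model: SceneObject, rayStart: vec3, rayEnd: vec3): void {
  session.hitTest(rayStart, rayEnd, (result) => {
    if (result === null) return; // no surface found; retry next frame
    model.getTransform().setWorldPosition(result.position);
    // result.normal could be used here to align the model to the surface.
  });
}
```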
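For the voice trigger, Lens Studio's VoiceML module streams ASR transcripts that can be matched against a wake word. A sketch based on the documented VoiceML API; `onMagicWord` and `captureFrame` are the hypothetical hooks from the flow sketch above.

```typescript
// Lens Studio script sketch using the VoiceML module.
// @input Asset.VoiceMLModule voiceML
declare function onMagicWord(frameBase64: string): Promise<void>; // from the flow sketch
declare function captureFrame(): string; // hypothetical frame grabber (base64 JPEG)

const MAGIC_WORD = "shazam"; // default wake word

const options = VoiceML.ListeningOptions.create();
options.shouldReturnAsrTranscription = true; // we match on the raw transcript

script.voiceML.onListeningEnabled.add(() => script.voiceML.startListening(options));
script.voiceML.onListeningUpdate.add((eventData) => {
  if (!eventData.isFinalTranscription) return; // wait for a settled transcript
  if (eventData.transcription.toLowerCase().includes(MAGIC_WORD)) {
    onMagicWord(captureFrame()); // kick off sketch -> prompt -> 3D -> placement
  }
});
```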
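The edge-fade effect itself lives in a Lens Studio material, but the underlying idea is just an alpha ramp from the frame border inward. An illustrative sketch, where `fadeWidth` is an assumed tuning value:

```typescript
// Alpha is 0 at the nearest border and ramps to 1 over fadeWidth, in UV space.
function edgeFadeAlpha(u: number, v: number, fadeWidth: number = 0.15): number {
  const edgeDistance = Math.min(u, 1 - u, v, 1 - v); // distance to nearest border
  return Math.max(0, Math.min(1, edgeDistance / fadeWidth));
}
```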
⚔️ Challenges I Ran Into
- Understanding the codebase for Spectacles and Lens Studio.
- As a Python user, working with TypeScript and JavaScript was quite complex (even with help from AI tools).
- Balancing speed and quality of 3D generation.
- Crafting precise prompts so the AI focuses on the doodle and not the background (see the example prompt after this list).
- Ensuring anchoring accuracy so assets spawn exactly where expected.
- Making voice recognition work well in noisy hackathon venues.
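To make the prompt-focus challenge concrete, explicitly constraining the vision model to the doodle helped. This is an illustrative instruction, not the exact prompt that shipped:

```typescript
// Illustrative instruction for the vision step (not the exact production prompt).
const VISION_PROMPT =
  "You are given a photo containing a hand-drawn doodle. " +
  "Describe ONLY the doodle, as a single physical object. " +
  "Ignore the paper, table, hands, and background entirely. " +
  "Reply with one short sentence suitable for a text-to-3D generator.";
```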
🏆 Accomplishments I'm Proud Of
- Built a working end-to-end pipeline: sketch → AI prompt → 3D model → AR placement.
- Achieved real-time model streaming for a seamless experience.
- Designed a magical, playful UX that makes people smile.
- Successfully integrated multiple APIs and SDKs into one smooth workflow.
📚 What We Learned
- The power of prompt engineering in controlling AI outputs.
- Optimizing real-time AR performance without compromising immersion.
- How technically rich and engaging the AR and VR domain is.
- How to orchestrate complex pipelines between Vision, Meshy, and Lens Studio.
- That a touch of magic (voice triggers, edge-fade effects) makes tech feel human.
- How effective AI tools like Codex, Cursor, and Cline can be for agile software development.
🚀 What's Next for VisionForge: 3D-Magic-Pencil
- 🎨 Texture Generation: Full-color, photorealistic models.
- 🌍 Multiplayer Mode: Let multiple users share and interact with the same AR objects.
- ✍️ Gesture-based Drawing: Draw in mid-air — no paper needed.
- 📚 Model Library: Save and share your creations with the community.
- 📱 Mobile Support: Extend to smartphones and tablets.
📜 License
This project is licensed under the MIT License.
© 2025 VisionForge Team
Built With
- ai
- gemini
- javascript
- meshy3d
- openai
- python
- snap
- snap3d
- snapchat
- typescript
- vscode
