About the Project
https://bolt.new/~/github-oa7ktcwj-zfvgffvh
https://www.npmjs.com/package/@mediapipe/tasks-vision
https://github.com/ShuhongChen/bizarre-pose-estimator
Inspiration
The inspiration for Glossless came from the frustration of working with complex 3D pose estimation tools that required extensive technical knowledge. As someone interested in both AI and creative applications, I wanted to bridge the gap between powerful pose detection technology and accessible design tools. The idea was to create something that felt more like a creative studio than a technical interface, hence the bold, brutalist design that makes pose editing feel approachable and fun.
What it does
Glossless is a professional pose editor that transforms static images into interactive 3D mannequins. Users can upload photos of people or illustrated characters, and the app automatically detects poses using AI, then converts them into editable 3D models. The app features:
- A hybrid AI pipeline that intelligently routes photos to MediaPipe for real-world accuracy and illustrations to a custom Modal API for specialized artist-centric results.
- Dual-view editing with synchronized 2D and 3D manipulation.
- Real-time lighting controls with spotlight and ambient lighting modes.
- Cloud project storage with user authentication.
- Export capabilities for use in other applications.
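As a rough illustration of the routing idea in the hybrid pipeline feature above, here is a minimal TypeScript sketch. Every name in it (`ImageKind`, `PoseBackend`, `chooseBackend`, `detectPose`) is hypothetical and not taken from the Glossless codebase. It also assumes the image kind is already known, for example from a user toggle or an upstream classifier; how Glossless actually makes that decision is not described here.

```typescript
// All names here (ImageKind, PoseBackend, chooseBackend, detectPose) are
// hypothetical and used for illustration only.
type ImageKind = "photo" | "illustration";
type PoseBackend = "mediapipe" | "modal";

interface Keypoint {
  name: string;
  x: number;
  y: number;
  z: number;
}

// A detector takes an image in whatever form the caller uses and resolves to keypoints.
type Detector<TImage> = (image: TImage) => Promise<Keypoint[]>;

// Assumption: the image kind is already known (user toggle or upstream classifier).
function chooseBackend(kind: ImageKind): PoseBackend {
  switch (kind) {
    case "photo":
      return "mediapipe"; // client-side detection
    case "illustration":
      return "modal"; // remote API call
  }
}

// Dispatch to whichever detector matches the chosen backend.
async function detectPose<TImage>(
  image: TImage,
  kind: ImageKind,
  detectors: Record<PoseBackend, Detector<TImage>>,
): Promise<Keypoint[]> {
  return detectors[chooseBackend(kind)](image);
}
```

Passing the detectors in as a record keeps the routing decision separate from the two detection implementations, so either one can be swapped or mocked without touching the other.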
How we built it
Glossless is built with modern web technologies:
- Frontend Framework: React with TypeScript
- 3D Rendering: Three.js with React Three Fiber
- UI Framework: Tailwind CSS with a custom brutalist design system
- Photo Pose Detection: The latest MediaPipe Tasks Vision API (@mediapipe/tasks-vision) for robust, client-side pose detection.
- Illustration Pose Detection: A custom-built Modal API deploying the Bizarre Pose Estimator and VideoPose3D lifter, creating a specialized 2D-to-3D pipeline for illustrated characters.
- 2D-to-3D Pose Lifting: VideoPose3D neural network for converting 2D keypoints into full 3D poses with temporal consistency.
- Backend: Supabase
- Build Tool: Vite
Challenges we ran into
- Coordinate System Complexity: The biggest challenge was managing three different coordinate systems (image pixels, 2D canvas, and 3D world space) and ensuring smooth transformations between them.
- MediaPipe API Evolution: Our project began on an older MediaPipe API. During development, we navigated a significant breaking change, migrating our entire detection logic from the deprecated @mediapipe/pose to the modern @mediapipe/tasks-vision library. This required a full architectural rewrite of our detection service but resulted in a more stable and future-proof system.
- MLOps & Dependency Hell: Deploying our custom Python models to Modal required a deep dive into dependency management, pinning specific versions of PyTorch, NumPy, and other libraries to resolve conflicts between legacy research code and a modern cloud environment.
- Performance Optimization: Ensuring smooth 60fps rendering with complex lighting and shadows required careful optimization of Three.js rendering and React re-renders.
- Real-time Synchronization: Keeping the 2D and 3D views perfectly synchronized while allowing independent manipulation required careful state management and efficient rendering.
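To make the coordinate-system challenge above more concrete, here is a minimal TypeScript sketch of the three spaces and the conversions between them. It is not the Glossless implementation. It assumes the image is drawn into the canvas with an aspect-preserving "contain" fit and centered, that the 3D world origin sits at the canvas center with y pointing up, and that a hypothetical `worldHeight` parameter says how many world units the canvas height spans. All function and type names are made up for the example.

```typescript
interface Size { width: number; height: number; }
interface Point2 { x: number; y: number; }
interface Point3 { x: number; y: number; z: number; }

// Scale and offsets for drawing an image centered inside a canvas while
// preserving its aspect ratio ("contain" fit). This is an assumed layout.
function containFit(image: Size, canvas: Size) {
  const scale = Math.min(canvas.width / image.width, canvas.height / image.height);
  return {
    scale,
    offsetX: (canvas.width - image.width * scale) / 2,
    offsetY: (canvas.height - image.height * scale) / 2,
  };
}

// Image pixels -> canvas pixels.
function imageToCanvas(p: Point2, image: Size, canvas: Size): Point2 {
  const { scale, offsetX, offsetY } = containFit(image, canvas);
  return { x: p.x * scale + offsetX, y: p.y * scale + offsetY };
}

// Canvas pixels -> image pixels (inverse of imageToCanvas).
function canvasToImage(p: Point2, image: Size, canvas: Size): Point2 {
  const { scale, offsetX, offsetY } = containFit(image, canvas);
  return { x: (p.x - offsetX) / scale, y: (p.y - offsetY) / scale };
}

// Canvas pixels -> world units: origin at the canvas center, y flipped so
// that up is positive. Depth is supplied separately, defaulting to 0.
function canvasToWorld(p: Point2, canvas: Size, worldHeight: number, z = 0): Point3 {
  const unitsPerPixel = worldHeight / canvas.height;
  return {
    x: (p.x - canvas.width / 2) * unitsPerPixel,
    y: (canvas.height / 2 - p.y) * unitsPerPixel,
    z,
  };
}
```

For example, with a 200x100 image in a 400x400 canvas, the image center (100, 50) maps to the canvas center (200, 200), which in turn maps to the world origin. Keeping each conversion as a small pure function with an explicit inverse makes it easier to check that a point survives a round trip between views.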
Accomplishments that we're proud of
- A truly hybrid AI system that can intelligently process both real-world photos and artistic illustrations, using the best model for each task.
- Seamless dual-view editing that feels natural and intuitive.
- A professional-grade lighting system with real-time shadows and multiple lighting modes.
- A robust coordinate transformation system that handles any image size accurately.
- Full-stack implementation with cloud storage and user authentication.
What we learned
This project taught us the intricacies of 3D graphics programming, the importance of stable APIs, and how to architect around dependency conflicts. We gained deep experience with React Three Fiber and learned how to optimize WebGL performance for real-time applications. The integration of multiple AI services showed us how to build robust pipelines that handle different data formats and error conditions gracefully.
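One common way to handle differing data formats and error conditions without crashing the editor is to validate every detector response and return a tagged result instead of throwing. The TypeScript sketch below illustrates that pattern only. The response shape, the `PoseKeypoint` and `DetectionResult` types, and the `parseKeypoints` function are all assumptions made for the example, not the actual Glossless or Modal API formats.

```typescript
// Hypothetical unified keypoint shape used by the rest of the app.
interface PoseKeypoint {
  x: number;
  y: number;
  z: number;
  confidence: number;
}

// Tagged result: callers must check `ok` before touching the keypoints.
type DetectionResult =
  | { ok: true; keypoints: PoseKeypoint[] }
  | { ok: false; reason: string };

// Validate a loosely typed detector response and normalize it.
// Assumption: a response is an array of objects with numeric x and y,
// and optional z and confidence fields.
function parseKeypoints(raw: unknown): DetectionResult {
  if (!Array.isArray(raw) || raw.length === 0) {
    return { ok: false, reason: "no keypoints in response" };
  }
  const keypoints: PoseKeypoint[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) {
      return { ok: false, reason: "keypoint is not an object" };
    }
    const { x, y, z, confidence } = item as Record<string, unknown>;
    if (
      typeof x !== "number" || typeof y !== "number" ||
      !Number.isFinite(x) || !Number.isFinite(y)
    ) {
      return { ok: false, reason: "keypoint has non-numeric coordinates" };
    }
    keypoints.push({
      x,
      y,
      z: typeof z === "number" ? z : 0, // missing depth defaults to 0
      confidence: typeof confidence === "number" ? confidence : 1,
    });
  }
  return { ok: true, keypoints };
}
```

With this shape, the UI can show a friendly message for `ok: false` and only build the 3D mannequin when a complete, well-formed set of keypoints is available.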
What's next for Glossless
- Animation timeline for creating pose sequences and keyframe animation.
- Advanced export formats including FBX, OBJ, and animation data.
- Collaborative editing allowing multiple users to work on poses together.
- AI pose suggestions that recommend natural pose variations.
- Integration with popular 3D software like Blender and Maya.
Built With
- bolt
- detectron2
- mediapipe
- modal
- python
- r3f
- react
- react-three-fiber
- supabase
- tailwind
- three.js
- typescript
- videopose3d
- vite