Inspiration
Small e-commerce sellers and businesses can face difficulties in getting professional 3D product models, as they can cost thousands of dollars and require specialized expertise. We were further inspired by Amazon's AR View feature, and aimed to lower the cost for this kind of feature and streamline it further.
What it does
3Define can convert a 2D product photo into an AR-ready 3D model through a web interface. Users can either upload an image or take a photo using the device camera. The platform offers optional AI-powered image enhancement and modification via Google Gemini to optimize photos for 3D reconstruction. After capturing a product photo, users receive an interactive GLB file that they can rotate, zoom, and view in augmented reality using their device.
How we built it
Within our backend, FastAPI powers our REST API, handling file uploads and AI calls. We used Python's fal_client to interface with the Trellis 2 model for hosting the 3D generation infrastructure. Within our frontend, we used React and TypeScript with Vite for fast module replacement, Tailwind CSS for a responsive design, and Google's model-viewer for 3D rendering. We implemented device camera capture by using the getUserMedia API with proper error handling for permission states, plus the AR.js library for mobile AR handoff. We used a Dockerized architecture to deploy the application.
Challenges we ran into
GPU Challenges: Trellis 2 requires 24 GB of VRAM minimum, and it was difficult to acquire this type of cloud compute. AWS Bedrock and Sagemaker did not have available GPUs suitable for this task, as well as Vultr, however we did not have access to higher-level GPUs with the provided credits. As such, we had to use fal.ai. AR Capability: Initially, our AR functionality could not run on iOS devices at all, and Android devices required CORS workarounds. We had to research and implement different tools and methods for AR functionality. After some trial and error, we were able to settle on using AR.js with proper fallback strategies to support both platforms, as well as mobile functionality. Additionally, getting camera permissions to work consistently across iOS and Android browsers required testing and error handling.
Accomplishments that we're proud of
-Cross-Platform AR: We were able to successfully implement AR handoff that works on both iOS and Android devices, as well as mobile devices. -Upload/capture photo -> Gemini Enhancement -> Trellis 2 Generation -> Downloadable GLB Result -Shipped both input paths: drag-and-drop upload and live camera capture with solid permission/error handling. -Integrated FastAPI backend services for both fal.ai (image hosting + 3D generation) and Gemini image enhancement (with fallback/error handling). -Delivered interactive 3D viewing in-browser (rotate/zoom) with WebGL detection, load-timeout protection, and static fallback viewer. -Implemented in-app AR viewing with live camera background and touch gestures (rotate, pan, scale), plus model download. -Implemented in-app AR viewing with live camera background and touch gestures (rotate, pan, scale), plus model download. -Deployed and hosted the app infrastructure on AWS Lightsail, giving you a practical cloud deployment path for both frontend and backend.
What we learned
We learned how to deploy and manage cloud services like AWS Lightsail, implement cross-platform AR experiences, build interactive 3D rendering workflows in the browser, and use AI-powered image modification to improve inputs for 3D model generation.
What's next for 3Define
Next, we plan to scale the product to make generation faster and more reliable, improve model quality while reducing turnaround time, refine the UI/UX for a smoother user flow, and pursue funding to support stronger infrastructure and continued development.
Log in or sign up for Devpost to join the conversation.