Inspiration

What you see on the web is often not what you actually get. Online listings rely on flat images that hide depth, scale, and real proportions. In second-hand markets, this becomes a bigger issue because there is no easy way to know whether an item will truly fit in your space. Buyers are left guessing about size, layout, and compatibility. FlipFind was created to reduce that uncertainty and make online browsing feel closer to real life.

What It Does

FlipFind collects listings of furniture and home decor from platforms like Facebook Marketplace. The images are processed through the SAM3 pipeline to generate interactive 3D models. Users can then place these reconstructed objects into their own environment using augmented reality. This allows them to see how an item looks, how large it actually is, and how it fits within their home before committing to a purchase. Instead of imagining the outcome, users can visualize it directly.
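The flow above can be sketched as a simple staged pipeline. The stage functions below are stubs standing in for the real SAM3 segmentation and 3D reconstruction models, and all names (`Listing`, `segment_object`, `reconstruct`) are illustrative, not the actual FlipFind code; only the data flow from listing images to a placeable mesh is shown.

```python
# Hedged sketch of the listing-to-AR pipeline: marketplace images are
# segmented (SAM3 stand-in), then turned into a 3D mesh for AR placement.
from dataclasses import dataclass


@dataclass
class Listing:
    url: str                 # source marketplace URL
    image_paths: list        # photos attached to the listing


@dataclass
class Model3D:
    listing_url: str
    mesh_path: str           # e.g. a .glb file the AR viewer can load


def segment_object(image_path: str) -> str:
    """Stub for SAM3: isolate the furniture item from its background."""
    return image_path.replace(".jpg", "_mask.png")


def reconstruct(listing: Listing) -> Model3D:
    """Stub: turn segmented 2D views into a placeable 3D mesh."""
    masks = [segment_object(p) for p in listing.image_paths]
    assert masks, "a real pipeline needs at least one view"
    # A real pipeline would feed the masked views to a reconstruction model.
    return Model3D(listing_url=listing.url, mesh_path=listing.url + ".glb")
```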

How We Built It

The application runs on a Flask backend that serves HTML and CSS for the frontend experience. Built-in phone sensors are used to generate depth maps and enable real-time AR placement. Computer vision models handle 3D reconstruction from 2D images, while browser-based rendering manages spatial interaction. Modal is used to power scalable inference and distributed workloads. This system architecture balances computational demands with responsiveness.
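A minimal sketch of how such a Flask backend might look, assuming an in-memory listing store and a JSON endpoint that points the AR frontend at a reconstructed model. The routes, data, and file names here are placeholders, not the project's actual API.

```python
# Minimal Flask backend sketch: serves listing metadata, including a URL
# to the reconstructed 3D model that the browser-based AR viewer loads.
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder store; in practice these records would come from scraped
# marketplace listings plus the reconstruction pipeline's output.
LISTINGS = {
    "1": {"title": "Mid-century armchair", "model_url": "/models/1.glb"},
}


@app.route("/api/listings/<listing_id>")
def get_listing(listing_id):
    """Return listing metadata for the frontend, or 404 if unknown."""
    listing = LISTINGS.get(listing_id)
    if listing is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(listing)
```

In this split, the heavy reconstruction work stays server-side while the browser only fetches metadata and model files, which keeps the AR experience responsive.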

Challenges We Ran Into

AR compatibility varied across devices, which made consistent deployment difficult. Apple’s ARKit offered limited customization for certain spatial interactions. Generating reliable depth purely from vision models required extensive experimentation and tuning. Multi-GPU training introduced coordination complexity, and inference throughput needed optimization. Asynchronous function calls were implemented to better utilize compute resources and improve performance.
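The asynchronous pattern mentioned above can be illustrated with `asyncio`: instead of reconstructing listings one at a time, the calls are dispatched concurrently so remote workers stay busy. Here `fake_reconstruct` is a stand-in for a remote inference call (such as awaiting a Modal function); the names and delay are illustrative.

```python
# Sketch of overlapping inference calls to improve throughput.
import asyncio


async def fake_reconstruct(listing_id: str) -> str:
    # Simulates a remote inference call; a real system would await
    # a GPU-backed function instead of sleeping.
    await asyncio.sleep(0.01)
    return f"{listing_id}.glb"


async def reconstruct_all(listing_ids):
    # gather() overlaps the awaits, so total latency approaches the
    # slowest single call rather than the sum of all calls.
    return await asyncio.gather(
        *(fake_reconstruct(i) for i in listing_ids)
    )


results = asyncio.run(reconstruct_all(["a", "b", "c"]))
```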

Accomplishments We’re Proud Of

We achieved accurate object placement in AR within a web-based environment and built a custom AR interface and user experience from the ground up. We integrated depth mapping to improve object scaling and spatial alignment, implemented real-time 3D reconstruction from 2D images, and deployed scalable inference through multi-GPU asynchronous workflows in Modal.

What We Learned

AR functionality can be effectively incorporated into modern web applications. Building interactive 3D viewers in the browser requires careful coordination between rendering and sensor data. Reconstructing 3D objects from 2D images demands both strong vision models and performance optimization. Scaling machine learning systems with Modal requires thoughtful infrastructure design. Most importantly, spatial computing can meaningfully improve trust and clarity in online marketplaces.

What’s Next for FlipFind

Future development will expand beyond Facebook Marketplace and integrate additional data sources. A wider variety of items will be supported so that more objects can be scanned and visualized. The long-term goal is to create a fully 3D second-hand marketplace for anything that can be captured and reconstructed. FlipFind aims to make spatial commerce practical, accessible, and scalable.
