Inspiration
The idea for Weyfaro came from a universal frustration: the "Blue Dot" problem. We have mapped the entire world with GPS, yet the moment we step inside a mall, a hospital, or a large office complex, we are completely blind.
We noticed that existing solutions were fundamentally flawed. Building managers don't want to install thousands of dollars worth of Bluetooth beacons (hardware friction), and visitors absolutely refuse to download a 100MB dedicated app just to find a single meeting room (software friction).
We wanted to build a solution that respected the user's time and the building manager's budget. We asked ourselves: Can we use the camera everyone already has in their pocket to navigate, without requiring any new infrastructure?
What it does
Weyfaro is a "Zero-Infrastructure" Indoor Navigation System that runs entirely in the browser.
It turns a simple video walk-through into a digital floor plan. A user walks through their space recording a video, uploads it to Weyfaro, and our AI pipeline reconstructs the 2D map automatically.
It provides instant "blue dot" localization without an app download. Another user scans a QR code, snaps a photo of their current location, and Weyfaro calculates their exact position and draws a red path to their destination.
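Under the hood, drawing that red path reduces to a classic shortest-path search over the generated map. As an illustration (a minimal sketch, not our exact code), a breadth-first search over the 2D occupancy grid is enough; `grid`, `start`, and `goal` are assumed inputs:

```python
from collections import deque

def shortest_path(grid, start, goal):
    """BFS over a 2D occupancy grid (0 = free, 1 = wall).

    Returns the list of (row, col) cells from start to goal,
    or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent pointers back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```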
How we built it
We built Weyfaro on a modern, serverless stack designed to offload heavy Computer Vision tasks to the cloud while keeping the user interface snappy.
- Frontend: We used Next.js 14 (App Router) hosted on Vercel to ensure the app is accessible via a simple URL.
- Data & Storage: Supabase acts as our backend backbone. We used Supabase Postgres for storing map metadata/POIs, and Supabase Storage to handle the large raw video files and processed occupancy grids.
- AI & Compute: Modal is our engine room. We deploy serverless GPU functions that spin up on demand to run our heavy computer vision pipeline.
- The CV Pipeline: We utilized Depth Anything 3 to extract dense depth maps from monocular video. These are fed into a Visual Odometry algorithm to track camera movement and project the 3D world down into a 2D Occupancy Grid.
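To make the projection step concrete, here is a minimal numpy sketch of how one depth map plus an odometry pose can be flattened into the shared occupancy grid. It is an illustration rather than our exact pipeline: the intrinsics, the y-up world convention, and the height thresholds are assumptions, and the real system fuses many frames with additional filtering.

```python
import numpy as np

def depth_to_occupancy(depth, pose, intrinsics, grid, origin, cell_size=0.05):
    """Project one depth map into a shared 2D occupancy grid.

    depth:      (H, W) metric depth map from the monocular model
    pose:       4x4 camera-to-world matrix from visual odometry
    intrinsics: (fx, fy, cx, cy) pinhole camera parameters
    grid:       (rows, cols) uint8 occupancy grid, 1 = occupied
    origin:     (x, z) world coordinate of grid cell (0, 0)
    """
    fx, fy, cx, cy = intrinsics
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Back-project every pixel into 3D camera coordinates.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)

    # Move the points into the world frame using the odometry pose.
    pts_world = pts_cam @ pose.T

    # Keep points at roughly obstacle height; drop floor and ceiling.
    mask = (pts_world[:, 1] > 0.2) & (pts_world[:, 1] < 1.8)
    pts = pts_world[mask]

    # Flatten onto the ground plane and mark the grid cells as occupied.
    cols = ((pts[:, 0] - origin[0]) / cell_size).astype(int)
    rows = ((pts[:, 2] - origin[1]) / cell_size).astype(int)
    ok = (rows >= 0) & (rows < grid.shape[0]) & (cols >= 0) & (cols < grid.shape[1])
    grid[rows[ok], cols[ok]] = 1
    return grid
```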
Challenges we ran into
Building a VPS (Visual Positioning System) accessible in a web browser came with significant hurdles:
- The "Cold Boot" Latency: Running heavy AI models on serverless GPUs often incurs a start-up penalty. Optimizing the user's localization request so they didn't have to wait 10+ seconds just to find their location required careful tuning of Modal's
keep_warmstrategies. - Monocular Scale Ambiguity: One of the hardest parts of computer vision is knowing "how big" things are from a single camera. Without LiDAR, the map scale was initially arbitrary. We had to implement a calibration step where the map scale is normalized to ensure the digital path matches reality.
- Data Gravity: Moving large video files between the frontend, storage, and the GPU worker is slow. We learned the hard way that streaming the raw bytes through our web server crashed the app. Moving to direct uploads with Supabase presigned URLs solved this bottleneck (third sketch below).
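For reference, keeping a container warm in Modal looks roughly like this. It is a sketch, not our production config: the GPU type and the `localize` body are placeholders, and the warm-pool knob has been renamed across Modal releases (`keep_warm` in older versions, `min_containers` in newer ones).

```python
import modal

app = modal.App("weyfaro-localizer")
image = modal.Image.debian_slim().pip_install("torch", "numpy")

# keep_warm holds a container (and its loaded model weights) ready,
# so a localization request skips the GPU cold-boot penalty.
@app.function(gpu="A10G", image=image, keep_warm=1)
def localize(photo_bytes: bytes) -> dict:
    # Placeholder: match the photo against the stored map features
    # and return the user's (x, y) position on the occupancy grid.
    ...
```

Holding even one warm container trades a small idle cost for a localization response that feels instant, which mattered far more to users than raw throughput.

The calibration step for scale ambiguity reduces to a single scalar. A minimal sketch, assuming the mapper records one segment of known real-world length (the function and variable names are ours for illustration):

```python
import numpy as np

def calibrate_scale(trajectory, known_distance_m):
    """Normalize an arbitrarily-scaled trajectory to metric units.

    trajectory:       (N, 3) camera positions from monocular odometry
    known_distance_m: real-world length of the recorded segment, e.g.
                      a hallway the mapper measured with a tape.
    """
    # Total path length in the odometry's arbitrary units.
    steps = np.diff(trajectory, axis=0)
    arbitrary_length = np.linalg.norm(steps, axis=1).sum()

    # One scalar maps the whole reconstruction into meters.
    scale = known_distance_m / arbitrary_length
    return trajectory * scale, scale
```

And the upload fix looks roughly like this with supabase-py; the bucket and object names are placeholders, and this is a sketch of the flow rather than our exact code:

```python
import os
from supabase import create_client

supabase = create_client(
    os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"]
)

# Mint a short-lived signed upload URL on the server, so the raw video
# travels from the browser straight to Supabase Storage and never
# passes through our web server's memory.
signed = supabase.storage.from_("raw-videos").create_signed_upload_url(
    "walkthroughs/demo-building.mp4"  # placeholder object path
)
print(signed)  # contains the signed URL and token for the client

# The browser uploads directly with that URL/token (supabase-js:
# uploadToSignedUrl), and the Modal GPU worker later pulls the
# object from storage by the same path.
```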
Accomplishments that we're proud of
- Zero-Hardware Localization: We successfully demonstrated that you don't need expensive beacons or WiFi RTT to find a user indoors—just a photo is enough.
- The "Magic" UX: We are proud of the seamless flow where a user uploads a raw video and sees a generated map appear minutes later. Bridging the gap between a simple web form and a complex GPU pipeline felt like magic when it finally worked.
What we learned
- The Power of Serverless GPUs: We learned how to bridge the gap between traditional web dev (JS/TS) and high-performance Python engineering using Modal. It completely changed how we view deployment.
- Postgres is Versatile: Using Supabase, we learned how to treat the database not just as a text store, but as the central nervous system for a complex file processing pipeline, using Row Level Security to keep data safe.
- User Experience > Technology: The coolest AI is useless if the user has to wait too long. We learned that for consumer-facing AI products, perceived performance is just as important as model accuracy.
What's next for Weyfaro
- 3D Map Navigation: On top of the 2D map, users will be able to get navigation guidance directly from a 3D map.
- Augmented Reality (AR) View: Instead of just on-map guidance, we want to provide real-time navigation by overlaying arrows directly onto the camera feed in the browser.
- Multi-Floor Support: Upgrading the SLAM algorithm to detect staircases and elevators to link multiple maps together.
- Real-Time Crowding: Using the localization queries to generate heatmaps, helping building managers understand high-traffic areas in real-time.
Built With
- huggingface
- nextjs
- postgresql
- python
- supabase