Inspiration
Finding reliable local help shouldn't be a guessing game. Living in the UAE, where everything moves fast and efficiency is paramount, you quickly realize the value of frictionless experiences. After spending time exploring ways to streamline services and eliminate unnecessary bureaucracy, we found it obvious that the everyday local service market was still stuck in the past.
When an AC breaks down in the middle of the summer or a pipe bursts, people don’t have the time or patience to navigate clunky directories, fill out long quote forms, or guess the correct search keywords. They just want to point at the problem and get it fixed.
That frustration sparked the idea for NeryByService. We wanted to build a hyper-local bridge that completely removes the search friction. The goal was simple: let users show or describe their problem naturally using voice, text, or video, and let AI do the heavy lifting of translating that human problem into a structured local service match.
What it does
NeryByService is an AI-powered hyper-local service discovery platform. Instead of forcing users to browse through predetermined categories, our platform allows them to input their problem in whatever way is easiest for them:
- Typing a quick description.
- Recording a voice note.
- Uploading a photo of the broken item.
- Capturing a short video.
Using Amazon Nova's foundation models, the app interprets the multimodal input, understands the root issue, and converts it into a structured service category. It then immediately queries a geospatial database to display the nearest relevant professionals on an interactive map, allowing the user to contact them instantly via phone or WhatsApp.
How we built it
We built NeryByService on a full-stack architecture designed for real-time AI processing and fast location-based lookups.
- Frontend: We used React (Vite) for a fast, responsive user interface, integrating Leaflet to render interactive maps that pinpoint service providers.
- Backend: Node.js and Express handle the routing, API requests, and media processing.
- Database: We used MongoDB with 2dsphere indexing to enable fast geospatial queries. To determine the hyper-local matches, the underlying system relies on spherical geometry, specifically the Haversine formula, which gives the great-circle distance $d$ between two points with latitudes $\theta_1, \theta_2$ and longitudes $\lambda_1, \lambda_2$ (in radians) on a sphere of radius $r$ (here, the Earth):
$$d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\theta_2 - \theta_1}{2}\right) + \cos(\theta_1)\cos(\theta_2)\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$
- AI Layer (Amazon Nova): This is the brain of the platform.
  - Nova 2 Lite acts as the core reasoning engine, parsing text descriptions into structured JSON payloads (Service Title, Category, Keywords) for our database.
  - Nova 2 Sonic processes voice notes, providing highly accurate speech-to-text transcription.
  - Nova Multimodal Embeddings analyzes user-uploaded images to visually detect problems (e.g., identifying a "leaking pipe" from a photo).
  - FFmpeg is used on the backend to extract keyframes and audio tracks from user videos, passing the visual data to the Multimodal model and the audio to Sonic, before combining the context in Nova 2 Lite for a final query.
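The video-splitting step can be sketched as two small helpers that build FFmpeg argument lists. This is a minimal illustration, not our actual pipeline code: the function names, the 16 kHz mono WAV output, and the fixed-rate frame sampling (a simpler stand-in for true keyframe selection) are all assumptions made for the example.

```typescript
// Hypothetical helpers: build FFmpeg argument lists for splitting one upload
// into an audio track and a handful of still frames. Paths, sample rate and
// frame rate are illustrative assumptions, not values from the project.

/** Arguments that drop the video stream and write mono 16 kHz PCM audio. */
function audioExtractionArgs(inputPath: string, outputWavPath: string): string[] {
  return [
    "-y",                   // overwrite the output file without prompting
    "-i", inputPath,        // input video
    "-vn",                  // ignore the video stream
    "-acodec", "pcm_s16le", // 16-bit PCM audio
    "-ar", "16000",         // 16 kHz sample rate
    "-ac", "1",             // mono
    outputWavPath,
  ];
}

/** Arguments that sample frames at a fixed rate, capped at maxFrames images. */
function frameExtractionArgs(
  inputPath: string,
  outputPattern: string, // e.g. "frame_%03d.jpg"
  framesPerSecond: number,
  maxFrames: number,
): string[] {
  if (framesPerSecond <= 0 || maxFrames <= 0) {
    throw new RangeError("framesPerSecond and maxFrames must be positive");
  }
  return [
    "-y",
    "-i", inputPath,
    "-vf", `fps=${framesPerSecond}`, // sample N frames per second of video
    "-frames:v", String(maxFrames),  // stop after maxFrames output frames
    outputPattern,
  ];
}
```

On a Node.js backend, each argument list could be handed to `spawn("ffmpeg", args)` from `node:child_process`, with the two jobs run in parallel to keep latency down.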
Challenges we ran into
- Orchestrating the Multimodal Pipeline: Handling video uploads was particularly tricky. We had to build a reliable backend pipeline using FFmpeg to separate a single video into audio tracks (for Nova 2 Sonic) and image frames (for Nova Multimodal), and then synthesize both outputs into one cohesive prompt for Nova 2 Lite without causing high latency for the user.
- Structuring AI Outputs: Foundation models are creative by nature, but our database requires strict formatting. Engineering the right prompts for Nova 2 Lite to consistently return clean, query-ready JSON data (without markdown wrapping or conversational filler) took significant iteration.
- Geospatial Accuracy: Tuning the MongoDB 2dsphere queries to return results that felt genuinely "hyper-local" required balancing the search radius to ensure users saw enough options without being overwhelmed by providers who were too far away to offer immediate help.
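One way to picture the radius trade-off is a search that starts small and widens only until enough providers are found. The sketch below is illustrative rather than our production code: it implements the Haversine formula directly in TypeScript and runs the widening logic over an in-memory list, and the `location` field name, the radius steps, and all function names are assumptions. The `nearFilter` helper shows the shape of a MongoDB `$near` filter against a 2dsphere index, which takes GeoJSON coordinates in `[longitude, latitude]` order and a `$maxDistance` in meters.

```typescript
interface LatLng { lat: number; lng: number } // degrees

const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters

/** Great-circle distance in meters between two points (Haversine formula). */
function haversineMeters(a: LatLng, b: LatLng): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const sinHalfLat = Math.sin(toRad(b.lat - a.lat) / 2);
  const sinHalfLng = Math.sin(toRad(b.lng - a.lng) / 2);
  const h =
    sinHalfLat * sinHalfLat +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * sinHalfLng * sinHalfLng;
  // Clamp to 1 to guard against floating-point overshoot before asin.
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(Math.min(1, h)));
}

/** Shape of a MongoDB $near filter; the `location` field name is an assumption. */
function nearFilter(user: LatLng, maxDistanceMeters: number) {
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [user.lng, user.lat] },
        $maxDistance: maxDistanceMeters,
      },
    },
  };
}

/**
 * Try each radius in ascending order and stop at the first one that yields
 * at least minResults providers; otherwise fall back to the largest radius.
 */
function searchWithWideningRadius<T extends { location: LatLng }>(
  user: LatLng,
  providers: T[],
  radiiMeters: number[],
  minResults: number,
): { radiusMeters: number; matches: T[] } {
  const radii = radiiMeters.slice().sort((x, y) => x - y);
  let result = { radiusMeters: 0, matches: [] as T[] };
  for (const radius of radii) {
    const matches = providers
      .map((p) => ({ p, d: haversineMeters(user, p.location) }))
      .filter((entry) => entry.d <= radius)
      .sort((x, y) => x.d - y.d) // nearest first
      .map((entry) => entry.p);
    result = { radiusMeters: radius, matches };
    if (matches.length >= minResults) break;
  }
  return result;
}
```

Against a real database, each step of the loop would issue a query with `nearFilter(user, radius)` instead of filtering in memory; `$near` already returns documents sorted from nearest to farthest.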
What we learned
- The Power of Amazon Nova: We learned how incredibly fast and capable the Nova 2 family is. Routing different media types to specialized models (Sonic for audio, Multimodal for vision) and letting Lite act as the "orchestrator" proved to be a highly efficient architecture.
- UX is Everything in AI: We learned that the best AI is invisible. Users don't care that a complex multimodal pipeline is running in the background; they just care that they uploaded a picture of a broken fridge and a repairman's phone number appeared a second later.
- Data Structuring: We deepened our understanding of how to bridge the gap between generative AI outputs and rigid, traditional database queries.
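To make the gap between generative output and strict database input concrete, here is a minimal sketch of a defensive parser placed between the model and the query layer. It is an assumption-laden example, not our exact code: the `ServiceMatch` field names mirror the Service Title, Category, and Keywords payload described above, but their exact spelling and the `parseServiceMatch` function are hypothetical.

```typescript
// Hypothetical shape of the structured payload expected from the model.
interface ServiceMatch {
  serviceTitle: string;
  category: string;
  keywords: string[];
}

/**
 * Extract and validate a ServiceMatch from raw model text.
 * Tolerates markdown code fences and conversational filler around the JSON
 * by parsing only the span from the first "{" to the last "}".
 * Returns null when no valid payload can be recovered.
 */
function parseServiceMatch(raw: string): ServiceMatch | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null; // not valid JSON even after trimming the wrapper text
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const candidate = parsed as Record<string, unknown>;
  const { serviceTitle, category, keywords } = candidate;
  if (typeof serviceTitle !== "string" || serviceTitle.trim() === "") return null;
  if (typeof category !== "string" || category.trim() === "") return null;
  if (!Array.isArray(keywords)) return null;
  if (!keywords.every((k) => typeof k === "string")) return null;

  return {
    serviceTitle: serviceTitle.trim(),
    category: category.trim(),
    keywords: (keywords as string[]).map((k) => k.trim()),
  };
}
```

A `null` result gives the backend a clear signal to retry the prompt or ask the user for more detail instead of sending a malformed query to the database.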
Built With
- amazon-nova
- amazon-web-services
- node.js
- nova
- react.js
- vite
