Inspiration

Traditional service-provider discovery platforms are plagued by "review bloat" and static data. Users spend hours cross-referencing map ratings, social media sentiment, and portfolio photos. There is no "source of truth" that combines spatial data with real-time visual proof of quality.

What it does

OmniScout isn't a directory; it’s a Multimodal Reasoning Engine.

  • Spatial-Temporal Understanding: It doesn't just look at a "4.5 star" rating; it analyzes video walkthroughs of a contractor's past projects to identify "cause and effect" (e.g., "The tiling looks good, but the grout spacing suggests future drainage issues").

  • The Marathon Agent: It performs long-running tasks like calling service providers via API/Web, checking business licenses against government databases, and synthesizing a "Trust Index" based on live data.
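
As a rough illustration of how a "Trust Index" could be synthesized from the signals listed above, here is a minimal sketch. The `Signals` fields, the `WEIGHTS` values, and the `trust_index` function are all hypothetical names and numbers chosen for this example; they are not the scoring actually used by OmniScout.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical inputs gathered by the agent for one provider."""
    star_rating: float     # 0.0-5.0, e.g. from a map listing
    license_valid: bool    # outcome of the business-license check
    visual_quality: float  # 0.0-1.0, e.g. from video analysis
    reachable: bool        # provider responded to the availability check


# Illustrative weights only (assumed for this sketch); they sum to 1.0.
WEIGHTS = {"rating": 0.25, "license": 0.30, "visual": 0.35, "reachable": 0.10}


def _clamp(x: float) -> float:
    """Keep a normalized signal inside [0, 1]."""
    return max(0.0, min(x, 1.0))


def trust_index(s: Signals) -> float:
    """Combine the normalized signals into a 0-100 score."""
    parts = {
        "rating": _clamp(s.star_rating / 5.0),
        "license": 1.0 if s.license_valid else 0.0,
        "visual": _clamp(s.visual_quality),
        "reachable": 1.0 if s.reachable else 0.0,
    }
    return round(100 * sum(WEIGHTS[k] * parts[k] for k in WEIGHTS), 1)


# A 4.5-star provider with weak visual evidence scores well below its rating.
print(trust_index(Signals(4.5, True, 0.4, True)))
```

A weighted sum keeps the score easy to explain to users; a real system would likely need calibrated weights and a way to express missing signals.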

How we built it

OmniScout leverages the Gemini 3 Pro multimodal architecture to transition from search to autonomous verification. We utilize the 1M+ Token Context Window to ingest entire service histories, including years of customer reviews and high-resolution project portfolios, without losing context.

Central to our app is the Gemini Live API for real-time audio synthesis, allowing users to interact with the scout hands-free while on-site. We implemented Thought Signatures, enabling the agent to maintain "Chain of Thought" reasoning during multi-step tool calls—such as comparing a provider's advertised prices against historical invoice data found in community forums.
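
The price comparison mentioned above can be sketched as a small tool function that the agent might call between reasoning steps. This is an assumption-laden example: `price_gap`, its parameters, and the 15% tolerance are hypothetical, and it does not show any Gemini API call.

```python
from statistics import median
from typing import Optional, Tuple


def price_gap(
    advertised: float,
    invoices: list[float],
    tolerance: float = 0.15,  # assumed threshold: flag gaps above 15%
) -> Tuple[Optional[float], bool]:
    """Compare an advertised price with the median of historical invoices.

    Returns (relative_gap, flagged). The gap is positive when customers
    were typically billed more than the advertised price.
    """
    if not invoices or advertised <= 0:
        # Nothing to compare against, so report no gap and do not flag.
        return None, False
    gap = (median(invoices) - advertised) / advertised
    return gap, gap > tolerance


gap, flagged = price_gap(100.0, [110.0, 130.0, 150.0])
print(gap, flagged)
```

The median is used rather than the mean so that a single unusually large invoice does not trigger the flag on its own.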

Furthermore, we leverage Gemini’s Spatial-Temporal Video Understanding. Instead of basic image recognition, the agent analyzes "before and after" videos of service deliveries (like a home renovation or a car repair) to verify structural integrity and aesthetic consistency. By using Thinking Levels, the agent self-corrects its recommendations if it detects a mismatch between a provider's claimed expertise and their visual output history.
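
The self-correction step described above can be pictured as a check that compares what a provider claims with what the video analysis observed. The sketch below is illustrative only: `reconcile`, the per-video scores in the 0-1 range, and both thresholds are assumptions made for this example rather than the project's actual logic.

```python
def reconcile(
    claimed_rating: float,       # 0.0-5.0 rating the provider advertises
    visual_scores: list[float],  # hypothetical 0.0-1.0 score per analyzed video
    max_gap: float = 0.25,       # assumed tolerance between claim and evidence
    min_quality: float = 0.6,    # assumed floor for a recommendation
) -> dict:
    """Flag a mismatch when the claim outruns the observed visual quality."""
    if not visual_scores:
        raise ValueError("at least one visual score is required")
    claimed = claimed_rating / 5.0
    observed = sum(visual_scores) / len(visual_scores)
    mismatch = (claimed - observed) > max_gap
    return {
        "claimed": claimed,
        "observed": observed,
        "mismatch": mismatch,
        # Recommend only when the evidence is good and consistent with the claim.
        "recommend": observed >= min_quality and not mismatch,
    }


print(reconcile(4.5, [0.4, 0.5, 0.3]))
```

In this sketch a 4.5-star provider whose videos average about 0.4 is flagged as a mismatch and not recommended, which mirrors the behavior described in the paragraph above.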

Challenges we ran into

The accuracy of the Auto-Negotiator and the Trust Index was a bottleneck. We resolved the issue by enabling the Maps API for the GCP project attached to our API key.

Accomplishments that we're proud of

The "Marathon" Success: We successfully engineered a long-running autonomous agent that doesn't just "search" but actually audits. Watching the agent independently cross-reference a 10-minute video portfolio against a local business registry was a "eureka" moment.

Multimodal Precision: We moved beyond basic text reviews. We’re proud of our Spatial-Temporal reasoning logic that can identify high-quality craftsmanship in video footage, effectively separating real skill from marketing hype.

Vibe-to-Reality Integration: Implementing the "Paint-to-Edit" feature—where a user can visualize a provider's specific style on their own home—transformed the app from a directory into a decision-making engine.

What we learned

Human-AI Trust: We realized that users don't want more data; they want verified insights. Technical complexity must always be translated into a "human-friendly" trust score to be useful.

Context is King: We learned that with Gemini 3’s 1M token window, "RAG" is just the starting line. The real power lies in long-context reasoning—allowing the AI to spot contradictions in service history over years, not just days.

Thinking Levels Matter: We discovered that forcing the AI to use "Thought Signatures" significantly reduced hallucinations. By letting the agent "think out loud," it self-corrected when a provider's rating didn't match their visual work quality.

What's next for OmniScout

The Live Negotiation Suite: We plan to integrate the Gemini Live API further, allowing OmniScout to conduct initial "vibe-check" calls with providers to verify availability and pricing on behalf of the user.

Community Trust Network: Expanding our "Verified Reviews" to include blockchain-verified service contracts, ensuring that every piece of data the AI analyzes is 100% authentic.

Hyper-Local Expansion: Moving beyond home services into specialized medical and technical fields, where the stakes for "quality transparency" are even higher.
