Inspiration

I kept hitting the same wall with AI agents: ask them to look something up and they hand back confident nonsense fake stats, dead links, claims no source actually backs. I didn't just want an agent that sounds right; I wanted one that has to prove its answers across independent sources before showing me anything. That's NanoGaze.

What it does

You give it a research goal. It plans the searches with Gemini, pulls sources from the web, and runs every one through a filter pipeline: a relevance check drops off-topic results, then two engines score trust and fact-check, and finally a corroboration step confirms each finding across independent sources. You get a confidence-rated brief where every claim is labeled either "confirmed by multiple independent sources" or flagged as single-source and if sources disagree, it shows you the contradiction. The whole thing streams live, and you choose what to save.

How we built it

Gemini 2.5 Flash is the brain (planning queries, scoring relevance and credibility, extracting findings). The agent layer is Google's ADK so it works with Agent Builder. Search is DuckDuckGo. The two engines are plain Python Engine 1 scores trust, Engine 2 validates facts and the corroboration layer counts independent supporting domains per claim. Everything is stored in MongoDB Atlas through the MongoDB MCP server: research sessions, verified/eliminated sources, and a domain-reputation memory that feeds back into scoring so the system gets smarter the more it's used. It's containerized with Docker and deployed on Google Cloud Run.

Challenges we ran into

The MongoDB MCP integration nearly broke me. Writes were silently "succeeding" but nothing saved. I eventually found three stacked issues the server had renamed all its tools to kebab-case, it refused my Atlas mongodb+srv:// string (I had to resolve the DNS myself and build a standard connection string), and it wraps query results in a security envelope my parser couldn't read. None of them threw errors; old leftover data tricked me into thinking it worked. Then once it deployed, Cloud Run still couldn't connect the Atlas IP allowlist didn't include Cloud Run's IPs. I also fought async cleanup crashes on Linux that swallowed successful responses, and a single-threaded server that froze whenever a live research stream was open.

Accomplishments that we're proud of

The corroboration layer. Early on, my "credibility" filter really just measured whether a source looked polished — which isn't truth at all. Adding cross-source corroboration changed it from "this source seems legit" to "this claim is independently confirmed," and the single-source flags and contradiction detection make the uncertainty honest instead of hidden. I'm also proud that the MongoDB integration is real, load-bearing memory the learned reputation genuinely sharpens future runs and that the whole pipeline streams live so you can watch the agent think.

What we learned

A ton about how MCP servers actually work under the hood, and the hard lesson that "no error" doesn't mean "it worked" I learned to check for silent failures instead of trusting a call went through. I learned why "verified" and "true" aren't the same thing, and that real credibility comes from independent agreement, not a single model's confidence. Plus a lot of practical async Python, prompt design for consistent structured output, and deploying a real container to Cloud Run

What's next for NanoGaze-MLOps

Right now it hands you a brief next I want it to act: email the report, push to a notes app, or schedule recurring research on a topic. I'd add claim-level source highlighting, weight corroboration by source independence (not just distinct domains), expand beyond DuckDuckGo, and let the reputation memory drive smarter query planning over time.

Built With

Share this project:

Updates