Nexus: Scaling Investigative Truth with Gemini
The Vision
Imagine every journalist could have the resources of Woodward and Bernstein. In an era of newsroom layoffs and shrinking budgets, exhaustive research is becoming a luxury. Nexus is an AI-powered investigative assistant designed to bridge this gap, scaling impactful journalism by connecting massive context with proprietary, structured insights.
The Story
For nearly a decade, I worked as a journalist covering politics, crime, and community development. Later, I pivoted to product management, where I help B2B and B2C companies with web applications and agentic AI capabilities that capture business value. Nexus is the fusion of these two worlds — using technology to scale the deep, context-heavy research that gives stories their impact.
Nexus pairs Gemini, chosen for its massive context window and deep knowledge of world events, with a custom Neo4j knowledge graph. The graph is seeded with associations mined from a legacy archive of BBC news stories (2005-2006). Mining these archives lets Nexus surface relationships between people, organizations, and companies that are often deeper and more specific than what appears in the general internet data that foundation models were trained on.
The Value Proposition
Nexus represents a future where news organizations connect their collective archives and leverage an agentic toolset to bring deep contextual awareness to everyday reporting.
- Public Utility: Free inquiry into historical data to inform the general public.
- Newsroom Premium Offering: Secure, authenticated access to protected information (like anonymous sources) for internal staff use.
- Business Model: A freemium marketplace where general usage is free, but news outlets license the platform to manage and query their proprietary internal intelligence.
Technical Implementation
The project was executed in three distinct phases:
- Knowledge Mining: An agentic n8n workflow running locally on a Strix Halo system uses Gemini to batch-read 2,300 BBC articles and save the extracted "knowledge triplets" (Subject-Predicate-Object) into a Neo4j database. I consulted Google Gemini 3 Pro heavily throughout, treating it as the Enterprise Architect and Technical Team.
- Interface: A lightweight AI assistant developed via Google AI Studio serves as the primary research frontend.
- The Bridge: The assistant utilizes Function Calling to query the local graph database via a secure Cloudflare tunnel. This demonstrates that private newsroom data can stay on premises while benefiting from cloud-scale generative AI reasoning.
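The mining and bridge steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's n8n workflow or AI Studio code: the `Triplet` shape, the `Entity` label, the `query_graph` tool name, and the JSON-schema-style tool declaration are all assumptions made for the example. The `run_query` callable stands in for whatever reaches the on-premises Neo4j instance through the tunnel.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Triplet:
    """Hypothetical shape for one extracted Subject-Predicate-Object fact."""
    subject: str
    predicate: str
    obj: str


def to_relationship_type(predicate: str) -> str:
    """Normalize a free-text predicate into an UPPER_SNAKE_CASE relationship type."""
    rel = re.sub(r"[^A-Za-z0-9]+", "_", predicate).strip("_").upper()
    if not rel:
        raise ValueError(f"unusable predicate: {predicate!r}")
    return rel


def triplet_to_cypher(t: Triplet) -> tuple[str, dict]:
    """Build an idempotent MERGE statement plus its parameters.

    Cypher cannot parameterize relationship types, so the type is
    sanitized first and then interpolated; node names stay as parameters.
    """
    rel = to_relationship_type(t.predicate)
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:`{rel}`]->(o)"
    )
    return query, {"subject": t.subject, "object": t.obj}


# Assumed tool declaration the assistant could expose for function calling.
QUERY_GRAPH_TOOL = {
    "name": "query_graph",
    "description": "Find entities related to a named person, organization, or company.",
    "parameters": {
        "type": "object",
        "properties": {"entity": {"type": "string"}},
        "required": ["entity"],
    },
}


def handle_function_call(
    name: str, args: dict, run_query: Callable[[str, dict], list]
) -> list:
    """Dispatch a model-issued function call to the graph via run_query."""
    if name != QUERY_GRAPH_TOOL["name"]:
        raise ValueError(f"unknown tool: {name}")
    query = (
        "MATCH (s:Entity {name: $entity})-[r]-(o:Entity) "
        "RETURN type(r) AS relation, o.name AS name LIMIT 25"
    )
    return run_query(query, {"entity": args["entity"]})
```

Keeping `run_query` as an injected callable reflects the split described above: the model only ever sees the tool name and the returned rows, while the database connection itself stays on the local side.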
Key Learnings
- Strict Schema and agentic workflow definition: Constructing a Knowledge Graph with LLMs does work, but the model requires strict instructions on edge definitions to ensure the associations are useful for research. I was not able to fully work through the complexities of building and maintaining a knowledge graph, but I believe agentic workflows can do work that would normally require a staff of librarians.
- Disambiguation: An AI sub-workflow is critical for telling apart entities with similar names (e.g., deciding whether "George Bush" refers to George H.W. Bush or George W. Bush).
- Local-to-Cloud Hybrid: In the agentic AI era, it is important to keep storage and compute local for data privacy, while architecting for cloud inference when practical and local inference when necessary.
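The first two learnings above can be illustrated with a small sketch. It assumes a hypothetical allowlist of edge types (`ALLOWED_PREDICATES`) and a hypothetical alias table (`KNOWN_ALIASES`); neither comes from the project. It only flags an ambiguous mention, where the real system would hand the mention and its article context to the AI sub-workflow.

```python
# Hypothetical schema: the only edge types the graph accepts.
ALLOWED_PREDICATES = {"MEMBER_OF", "WORKS_FOR", "LOCATED_IN"}

# Hypothetical alias table mapping a surface mention to candidate entities.
KNOWN_ALIASES = {
    "George Bush": ["George H.W. Bush", "George W. Bush"],
}


def validate_triplet(subject: str, predicate: str, obj: str) -> list[str]:
    """Return a list of schema violations; an empty list means the triplet is valid."""
    errors = []
    if not subject.strip():
        errors.append("empty subject")
    if not obj.strip():
        errors.append("empty object")
    if predicate not in ALLOWED_PREDICATES:
        errors.append(f"predicate {predicate!r} not in schema")
    return errors


def candidates_for(mention: str) -> list[str]:
    """List the entities a mention could refer to (the mention itself if unknown)."""
    return KNOWN_ALIASES.get(mention, [mention])


def needs_disambiguation(mention: str) -> bool:
    """True when a mention maps to more than one candidate entity."""
    return len(candidates_for(mention)) > 1
```

Rejecting off-schema predicates before they are written keeps the edge vocabulary small enough to query reliably, and flagging ambiguous mentions early avoids merging two different people into one node.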
Scalability & Future
While this proof-of-concept uses local dev environments for speed and to demonstrate a split public/private AI application, the same architecture can also scale in the cloud. The real test lies in the business model: would news outlets pool certain resources normally kept behind paywalls (their news archives) and also pay into a premium service, in order to give their journalists and the general public a unified, searchable, and secure investigative suite?
Built With
- aistudio
- cloud-run
- cloudflare
- docker
- gemini-3-pro
- n8n
- neo4j