Inspiration

Our journey began with a fundamental challenge at the intersection of AI and clinical neuroscience: trust. While deep learning models show incredible promise for diagnosing diseases like Alzheimer's from fMRI scans, their "black box" nature makes them difficult for doctors and researchers to trust. How can we rely on a diagnosis if the AI cannot explain why it made that decision in a way that aligns with established medical knowledge?

This project, Neuro-Compass, was inspired by the vision of creating a trustworthy AI assistant for neuroscientists. We wanted to build a system that not only finds the destination (a diagnosis) but also provides a clear, reliable map of how it got there, unlocking the AI "black box."

What It Does

Neuro-Compass is an autonomous multi-agent system that serves two critical audiences in the fight against Alzheimer's Disease:

1. For Clinicians and Medical Professionals: When equipped with a high-accuracy, clinically validated prediction model, Neuro-Compass acts as a powerful diagnostic support tool. It doesn't just provide a "black box" classification; it generates a clear, evidence-based report that explains why the model made its decision. By linking abstract activations to known brain networks and functions from a knowledge graph, it provides transparent, trustworthy insights that can aid doctors in their diagnostic process.

2. For AI and Neuroscience Researchers: Neuro-Compass is an indispensable explainability (XAI) toolkit for developers of AD prediction models. It allows researchers to look inside their own models to understand:

  • Which brain regions their model is focusing on for a given subject.
  • Where the model looked when a prediction is wrong, and how that differs from the ground truth.
  • How their model's learned patterns compare to established neurobiological knowledge.

The system automates the entire analysis pipeline, from dynamic layer selection and activation mapping to knowledge graph enrichment and final report synthesis:

  • Performs a diagnosis using a pre-trained deep learning model.
  • Autonomously charts the best course for analysis by using a "hypothesize-and-verify" process. An LLM first proposes promising neural layers, and a second LLM call validates this choice against real activation data.
  • Generates visual heatmaps of brain activation and aligns them with a standard anatomical atlas.
  • Delivers semantically rich explanations by querying a biomedical Knowledge Graph to uncover the functional roles and network affiliations of the activated brain regions.
  • Synthesizes all information into a single, comprehensive, publication-ready clinical report.
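
As a rough illustration of the heatmap-to-atlas step listed above, the sketch below ranks atlas regions by mean activation. It assumes the activation map has already been resampled onto the atlas grid (for example with Nilearn) and that the atlas is an integer label volume with 0 as background. The function name `top_regions` and the toy region names are hypothetical, not taken from the Neuro-Compass code.

```python
import numpy as np

def top_regions(activation, atlas_labels, region_names, k=3):
    """Rank atlas regions by mean activation inside each region mask.

    activation:   3D float array (heatmap), assumed already on the atlas grid
    atlas_labels: 3D int array of the same shape, 0 = background
    region_names: dict mapping integer label -> region name
    """
    activation = np.asarray(activation, dtype=float)
    atlas_labels = np.asarray(atlas_labels)
    if activation.shape != atlas_labels.shape:
        raise ValueError("activation and atlas must share the same grid")

    scores = {}
    for label, name in region_names.items():
        mask = atlas_labels == label
        if mask.any():  # skip labels that do not occur in this volume
            scores[name] = float(activation[mask].mean())

    # Highest mean activation first, truncated to the k strongest regions
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Toy 2x2x2 volume with two hypothetical regions
atlas = np.zeros((2, 2, 2), dtype=int)
atlas[0] = 1
atlas[1] = 2
heatmap = np.zeros((2, 2, 2))
heatmap[0] = 0.5
heatmap[1, 0, 0] = 1.0
ranking = top_regions(heatmap, atlas, {1: "Region A", 2: "Region B"})
```

The ranked region names are the kind of output that would then be passed on to the knowledge graph lookup.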

How We Built It

The Neuro-Compass system is built in Python, orchestrating a team of specialist AI agents using the Google Agent Development Kit (ADK). The full architecture is detailed in our paper and diagrams.

Our System Architecture

  1. Tech Stack: The backend relies on PyTorch, Nilearn, Neo4j, and Google's Gemini models as the reasoning engine. The interactive demo is built with Streamlit.

  2. Key Innovation 1: Dynamic Layer Selection: As highlighted in our pitch, the Inference & Validation Agent uses a two-step LLM process. It first inspects the model's static structure to propose layers, then uses real activation statistics (nonzero_ratio) to autonomously validate and select the most informative layer for analysis.

  3. Key Innovation 2: Robust GraphRAG: To ensure reliability, the GraphRAGAgent first performs Entity Linking (using an LLM to match fuzzy region names from the model's output against a canonical list from the database). It then passes this validated list to a fixed, parameterized Cypher query template to fetch knowledge, avoiding the instability of auto-generated queries.
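
To make the verification half of Dynamic Layer Selection more concrete, here is a minimal sketch of computing a nonzero_ratio statistic for candidate layers. In Neuro-Compass the final choice is made by a second LLM call; the deterministic rule used here (drop nearly silent layers, then pick the highest ratio) is only a stand-in, and the threshold, function names, and layer names are assumptions for illustration.

```python
import numpy as np

def nonzero_ratio(activation, eps=1e-6):
    """Fraction of activation values whose magnitude exceeds eps."""
    a = np.asarray(activation, dtype=float)
    if a.size == 0:
        return 0.0
    return float((np.abs(a) > eps).mean())

def select_layer(candidate_activations, min_ratio=0.05):
    """Pick a layer from a dict of {layer_name: activation array}.

    Stand-in for the LLM validation step: layers below min_ratio are
    treated as uninformative, and the highest remaining ratio wins.
    Returns (best_layer_or_None, stats_per_layer).
    """
    stats = {name: nonzero_ratio(act) for name, act in candidate_activations.items()}
    viable = {name: r for name, r in stats.items() if r >= min_ratio}
    if not viable:
        return None, stats
    return max(viable, key=viable.get), stats

# Hypothetical layer names with toy activations
candidates = {
    "conv1": np.array([0.0, 0.0, 0.0, 0.0]),
    "caps2": np.array([0.0, 1.0, 2.0, 0.0]),
    "rnn_out": np.array([1.0, 1.0, 1.0, 0.0]),
}
best, stats = select_layer(candidates)
```

In a PyTorch setting the activation arrays would typically be captured with forward hooks on the layers the first LLM call proposed; the statistics dictionary is what the validating step would inspect.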

Challenges We Ran Into

Every ambitious project faces challenges, and ours was no exception.

  • The Unreliable Query Generator: Our initial attempts with standard Graph RAG chains were unstable. This challenge pushed us to develop the far more robust "Entity Linking + Templated Query" pattern, which became a core strength of our system.

  • Model Accuracy & Data Limitations: The current proof-of-concept utilizes a 4D-CapsNetRNN model. Due to the limited size of the training dataset, its predictive accuracy is not yet at a clinical-grade level. For Neuro-Compass to realize its full potential as a diagnostic aid, it must be paired with a model trained on a larger, more diverse dataset to achieve the highest possible accuracy. Our framework is model-agnostic, ready to support more powerful models in the future.

  • The Local LLM Hurdle (Computational Resources): We are committed to data privacy in medical settings. We invested significant effort in trying to integrate a locally-run 20B model via Ollama but encountered hardware-related Out-of-Memory (OOM) errors. This highlights a real-world constraint and led us to a pragmatic strategy: delivering a stable cloud-based demo while building a clear roadmap for future local deployment on more powerful hardware.

Accomplishments That We're Proud Of

  • Building a complete, end-to-end multi-agent system that truly unlocks the "black box" of a deep learning model.
  • Achieving a neurobiologically plausible result: our system autonomously identified activation patterns in the Default Mode Network (DMN), a network whose disruption is a well-documented feature of Alzheimer's Disease.
  • Authoring a pre-print technical report (An Agent-based Framework...) for submission to bioRxiv, formalizing our methodology and findings.

What We Learned

The biggest takeaway is the power of a hybrid approach: leveraging LLMs for what they do best (language, reasoning, fuzzy matching) while relying on deterministic, structured code for tasks that demand absolute reliability (like database queries). This journey taught us how to build a system that is both intelligent and trustworthy.
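
The hybrid pattern can be sketched as follows: fuzzy matching produces a validated list of region names, and a fixed Cypher template receives that list as a parameter. This is a simplified illustration, not the project's code. It uses Python's `difflib` as a stand-in for the LLM-based entity linker, and the canonical list, node labels, and relationship type in the query are hypothetical.

```python
import difflib

# Hypothetical canonical region names, as would be loaded from the database
CANONICAL_REGIONS = [
    "Precuneus",
    "Posterior Cingulate Cortex",
    "Hippocampus",
    "Angular Gyrus",
]

# Fixed template: only the $names parameter ever changes.
# The Region/Network labels and PART_OF relationship are assumed schema.
REGION_QUERY = (
    "MATCH (r:Region) WHERE r.name IN $names "
    "OPTIONAL MATCH (r)-[:PART_OF]->(n:Network) "
    "RETURN r.name AS region, collect(DISTINCT n.name) AS networks"
)

def link_entities(raw_names, canonical, cutoff=0.6):
    """Map noisy region names onto canonical ones; drop anything unmatched."""
    lookup = {name.lower(): name for name in canonical}
    linked = []
    for raw in raw_names:
        match = difflib.get_close_matches(
            raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
        )
        if match and lookup[match[0]] not in linked:
            linked.append(lookup[match[0]])
    return linked

def build_query(raw_names, canonical=CANONICAL_REGIONS):
    """Return the fixed query text plus its parameter dict."""
    return REGION_QUERY, {"names": link_entities(raw_names, canonical)}

query, params = build_query(["precuneus ", "hippocampus_L"])
```

Because the query text never depends on model output, the worst a bad match can do is return an empty result; with the official Neo4j Python driver the pair could be executed as `session.run(query, params)`.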

What's Next for Neuro-Compass

This hackathon is just the beginning. Neuro-Compass is the foundation for my Master's thesis. The immediate next steps are:

  1. Complete the integration of a local LLM backend via Ollama to ensure full data privacy for clinical applications.
  2. Conduct large-scale quantitative validation on a full clinical dataset.
  3. Expand the system's capabilities to help navigate the complexities of other neurodegenerative diseases.

Built With

  • agent-development-kit
  • graphrag
  • knowledge-graph
  • multi-agents
  • neo4j
  • python