Inspiration

It is fascinating to live in an era where language models have evolved from learning language, the human expression of thought, to reasoning themselves. Yet while significant effort goes into modeling the natural world, far less has gone into modeling the process of science itself.

I believe that even a model with unlimited intelligence cannot solve complex problems like neurodegenerative diseases alone; experiments are still needed. Generative AI and agents can expand the search space, so that the academic community can be creative with experiments that would otherwise be very challenging to run.

Antimatters is our answer to scientific acceleration. It introduces a “Serendipitize” mode inspired by Complexity’s framework: Information, Computation, and Evolution.

What it does

Antimatters enables you to complete research at unprecedented speed, from data collection and literature validation to molecular design. At the core, it runs parallel experiments and derives spatial insights at atomic resolution. Because it is based on physics and first principles, we can treat computational data as if it's a biological assay.

We took on a critical challenge: finding drugs that bind intrinsically disordered proteins (IDPs). Because these proteins lack stable secondary and tertiary structure, they are often considered “undruggable”. We asked Antimatters to dock various small molecules to α-synuclein, an extensively characterized IDP whose aggregation is associated with neuronal death in Parkinson’s disease. The agent identified nuances in the data and literature multimodally, such as binding regions and ligand structure, and called computational tools to run parallel experiments. It not only correctly predicted the relative binding affinities of α-synuclein ligands as measured by solution NMR spectroscopy, but also reproduced the atomic-resolution details of the ligand binding modes.

This “Serendipitize” process continually adds to the knowledge graph, which acts as a structured world model. With that grounding, Gemini 3's capability for spatial reasoning is unlocked at atomic resolution and can be further harnessed to generate novel molecules, whether in 1D or in 3D (with Reversible Compression of Molecular Tokenization).

How we built it

We architected Antimatters on Google's ADK with Gemini 3 to orchestrate two modes: Serendipitize and Planning.

1. Serendipitize Mode: Asynchronous Discovery

This mode implements an autonomous Inform → Compute → Evolve pipeline in which specialized agents collaborate to build a living knowledge base, moving from a research protocol to an experimental matrix to a discovery report.

1. **Research Agent**: Collects data from PED (Protein Ensemble Database) and ChEMBL, then validates it against the literature multimodally, using **Gemini's vision capabilities** to analyze protein structures, binding sites, ligands, and STRING network diagrams. It updates and validates the Research Protocol before passing it to the experimentation stage.

2. **Engineering Agent**: Conducts parallel ensemble docking experiments using a custom Docking server for AutoDock Vina together with RDKit, OpenBabel, and MDTraj, and dynamically spawns parallel subagents.

3. **Evolution Agent**: Derives Structure-Activity Relationships (SAR) and builds a Neo4j knowledge graph that links artifacts as scientific evidence, forming a structured world model as a domain-specific knowledge graph.
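The Compute stage of the Inform → Compute → Evolve flow can be sketched as a bounded fan-out of docking jobs. This is only an illustration under stated assumptions: `DockingJob`, `dock`, `run_matrix`, and `summarize` are hypothetical names, and the score is a placeholder where the real system would call AutoDock Vina through its docking server.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class DockingJob:
    ligand: str     # ligand identifier (hypothetical naming)
    conformer: int  # index of the protein ensemble state


async def dock(job: DockingJob, limit: asyncio.Semaphore):
    """Stand-in for one docking run; a real run would call the docking server."""
    async with limit:
        await asyncio.sleep(0)  # yield control, as an I/O-bound call would
        # Placeholder score only: NOT a physical binding energy.
        score = -float(len(job.ligand) + job.conformer % 3)
        return job, score


async def run_matrix(ligands, n_conformers, max_parallel=4):
    """Compute stage: dock every ligand against every conformer concurrently."""
    limit = asyncio.Semaphore(max_parallel)  # cap the number of simultaneous runs
    jobs = [DockingJob(lig, c) for lig in ligands for c in range(n_conformers)]
    return await asyncio.gather(*(dock(job, limit) for job in jobs))


def summarize(results):
    """Input to the Evolve stage: mean score per ligand across the ensemble."""
    per_ligand = {}
    for job, score in results:
        per_ligand.setdefault(job.ligand, []).append(score)
    return {lig: sum(s) / len(s) for lig, s in per_ligand.items()}


results = asyncio.run(run_matrix(["ligand_a", "ligand_b"], n_conformers=3))
mean_scores = summarize(results)
```

The semaphore is one simple way to keep dynamically spawned workers from oversubscribing the docking backend; the actual agent framework may schedule subagents differently.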

2. Planning Mode: The Visual Thinker-in-the-Loop

Planning mode allows the user to interact with the knowledge graph through a spatial reasoning interface.

  • Spatial Reasoning: Gemini 3 utilizes native spatial understanding and Code Execution to reproduce atomic-resolution details of ligand binding modes.

  • Molecular Generation: Generates novel candidates via SMILES or native 3D coordinates using RCMT (Reversible Compression of Molecular Tokenization).

  • Contextual Retrieval: Directly queries the Neo4j backend to ground every hypothesis in the project's cumulative experimental history.

### Backend Architecture

The backend is modularized via the Model Context Protocol (MCP), ensuring strict tool separation and scalability.

**MCP (Model Context Protocol) Servers**: Structured tool interfaces

*   **PED Server**: Fetches and parses protein ensemble conformations.

*   **Docking Server**: A specialized wrapper for AutoDock Vina with integrated **MDTraj** analysis.

  • Integrations: Bridges to BioContextAI (Google Scholar/EuropePMC) and the ChEMBL database for broad-spectrum chemical knowledge.

  • Data Tier: A hybrid persistence layer utilizing SQLite for workspace session state and Neo4j for entity-relational knowledge.

**Frontend**:

The interface uses the AG-UI protocol for high-performance streaming of research artifacts.

  • Real-time Rendering: Live updates as agents transition from protocols to experimental matrices and final discovery reports.

  • Atomic Visualization: Seamlessly integrates 3Dmol.js and Py3Dmol for interactive molecular inspection.

  • Entity Intelligence: An @ mention system allows for instant entity resolution and autocomplete directly from the knowledge graph.
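The @ mention autocomplete can be pictured as a case-insensitive prefix lookup over entity names. The sketch below is a guess at the idea, not the project's implementation: `EntityIndex` and `complete` are hypothetical names, and the names are held in memory here, whereas the real system resolves them from the knowledge graph.

```python
from bisect import bisect_left


class EntityIndex:
    """Prefix autocomplete over entity names (case-insensitive)."""

    def __init__(self, names):
        # Sort by lowercase key so names sharing a prefix sit next to each other.
        self._entries = sorted((name.lower(), name) for name in set(names))
        self._keys = [key for key, _ in self._entries]

    def complete(self, typed, limit=5):
        prefix = typed.lstrip("@").lower()
        start = bisect_left(self._keys, prefix)  # first key >= prefix
        matches = []
        for key, name in self._entries[start:]:
            if not key.startswith(prefix) or len(matches) >= limit:
                break
            matches.append(name)
        return matches


index = EntityIndex(["Fasudil", "Ligand-23", "Ligand-47", "alpha-synuclein"])
suggestions = index.complete("@lig")
```

A sorted list with binary search keeps each lookup cheap for a few hundred entities; a larger graph might call for a trie or a database-side index instead.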

Challenges we ran into

State Compression: Reduced high-dimensional IDP heterogeneity by clustering 576 ensemble states into 20 via t-SNE/K-means, cutting compute time from 58 hours to 90 minutes (a roughly 97% reduction).
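The clustering step can be illustrated with a plain k-means that picks one representative state per cluster. This is a rough sketch under assumptions: the t-SNE embedding is taken as already computed, the 576 points below are synthetic stand-ins and not ensemble data, and `kmeans` and `representatives` are hypothetical helper names.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on a list of coordinate tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster's centroid in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


def representatives(points, labels, centroids):
    """Index of the member closest to each non-empty cluster's centroid."""
    reps = []
    for c, centre in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps.append(min(members, key=lambda i: math.dist(points[i], centre)))
    return reps


# Synthetic 2-D stand-in for an embedding of 576 ensemble states.
rng = random.Random(42)
embedding = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(576)]
labels, centroids = kmeans(embedding, k=20)
selected = representatives(embedding, labels, centroids)
```

Docking only against the selected indices is what shrinks the experimental matrix; choosing the member nearest each centroid keeps every representative a real conformer and not an averaged structure.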

Adaptive Volumetrics: Engineered a dynamic box sizing algorithm ($Box_{size} = 3.86 \times R_g$) to stabilize AutoDock Vina against unfolded protein volumes, achieving 95% alignment with literature benchmarks.
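As a worked example of the sizing rule, the sketch below computes a radius of gyration from raw coordinates and scales it by 3.86. It assumes an unweighted $R_g$, a cubic box, and centring on the geometric centre, none of which the writeup specifies; `rg_and_centre` and `vina_box` are hypothetical names, and the dictionary keys mirror AutoDock Vina's search-space options.

```python
import math


def rg_and_centre(coords):
    """Unweighted radius of gyration and geometric centre of 3-D coordinates."""
    n = len(coords)
    centre = [sum(c[i] for c in coords) / n for i in range(3)]
    mean_sq = sum(sum((c[i] - centre[i]) ** 2 for i in range(3)) for c in coords) / n
    return math.sqrt(mean_sq), centre


def vina_box(coords, factor=3.86):
    """Cubic search box with edge = factor * R_g (same length unit as coords)."""
    rg, (cx, cy, cz) = rg_and_centre(coords)
    edge = factor * rg
    return {
        "center_x": cx, "center_y": cy, "center_z": cz,
        "size_x": edge, "size_y": edge, "size_z": edge,
    }


# Two-point toy system, coordinates in angstroms: R_g is exactly 1.0 here.
box = vina_box([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

Because the box scales with each conformer's own $R_g$, an extended state gets a larger search volume than a compact one without any hand tuning per structure.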

Reasoning Integrity: Architected a Neo4j provenance layer to tag mdtraj_measured ground truth, eliminating Gemini 3 spatial hallucinations in autonomous discovery reports.
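The provenance idea can be reduced to a filter that lets only measured values through to a report. The sketch below keeps claims in memory and is not the Neo4j layer itself; `Claim`, `TRUSTED`, `reportable`, and the `llm_inferred` tag are hypothetical, and only the `mdtraj_measured` tag comes from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    provenance: str  # "mdtraj_measured" (from the project) or "llm_inferred" (hypothetical)


# Tags that mark a value as computed by a deterministic tool.
TRUSTED = {"mdtraj_measured"}


def reportable(claims):
    """Keep only claims whose provenance tag marks a measured value."""
    return [claim for claim in claims if claim.provenance in TRUSTED]


claims = [
    Claim("contact distance computed from trajectory frames", "mdtraj_measured"),
    Claim("free-text guess about binding geometry", "llm_inferred"),
]
kept = reportable(claims)
```

Filtering on an explicit tag means a spatial statement without a measurement behind it is dropped by construction and does not depend on the model policing itself.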

Accomplishments that we're proud of

Successfully reproduced binding rankings (Ligand-47 > Fasudil > Ligand-23) matching solution NMR spectroscopy. The full run, from data collection to discovery, took 3 hours, as opposed to the weeks or even months such work would otherwise take.

Multi-Agent "Serendipitize": Orchestrated Research, Engineering, and Evolution agents to run parallel docking via custom MCP servers.

HCI Innovation: Built an @ mention system that autocompletes 161 ligands and 86 proteins directly from the Knowledge Graph.

Atomic-Resolution Accuracy: Reproduced specific residue interaction modes (Y125 and Y133) using computational data alone.

Automated Discovery: Built the first end-to-end platform capable of generating novel SMILES with a structural rationale based on Graph-derived SAR.

What we learned

Gemini 3 as an Orchestrator: Gemini 3 Flash efficiently manages high-frequency tool-calling between the PED Server, Docking Server, and ChEMBL, bridging the gap between reasoning and physics.

Graph-to-Spatial Reasoning: Flattened text is insufficient for molecular data. Providing Gemini with a Neo4j Graph structure allowed it to reason about 3D interactions more accurately than using SMILES strings alone.

Physics-Informed Agents: We learned that LLMs should not generate binding scores; they should define the search space. The "intelligence" lies in clustering and SAR interpretation, while the "truth" comes from deterministic engines like AutoDock Vina and MDTraj.

What's next for Antimatters

Custom MCP Registry: Allow users to register external tools (e.g., AlphaFold, Rosetta) via a Model Context Protocol interface.

RCMT Integration: Fully implement Reversible Compression of Molecular Tokenization to generate valid 3D coordinates directly from the agent.

Materials Science Expansion: Apply the ensemble-docking methodology to soft matter systems, such as polymers and liquid crystals, to predict structure-property relationships.

PySR Integration: Trigger Symbolic Regression when dataset size $>20$ ligands to derive mathematical equations for binding free energy:

$$\Delta G = f(\text{MW}, \text{LogP}, \text{aromatic\_count})$$
