Test-Time Program Search for RFdiffusion Enzyme Design

Inspiration

Inspired by the Baker Lab's recent paper “Computational Design of Serine Hydrolases,” we set out to apply a “test-time scaling” paradigm to enzyme design: letting the system spend more compute where it is needed, during both literature research and inference, as a step toward an agentic workflow.

What it does

Our system marries automated literature research with an AlphaGo-style program search to discover novel enzyme scaffolds built around a conserved catalytic triad. First, it combs through current serine hydrolase research, both classical papers and cutting-edge insights, using a web-based deep research module to understand prevailing hypotheses on scaffold geometry and domain organization. Next, it uses those findings to guide a PUCT-inspired exploration of new contig configurations (the contig is the RFdiffusion parameter that carries the motif-scaffolding instructions), balancing proven scaffolds against more radical designs. For each candidate configuration, the pipeline automatically generates a full RFdiffusion script that preserves the triad coordinates while flexibly building the surrounding structure. Because effort is scaled at test time, the system can spend more compute on the candidates where it is likely to help.

How we built it

Tools: We did most of the coding within the Cursor environment, leveraging its inline suggestions and quick code navigation. This gave us a smooth workflow for iterating on our Python scripts and prompts simultaneously.

Research: We started by reading just enough about existing tools and relevant enzyme design papers, particularly from David Baker’s group, to identify the key workflows (e.g., extracting a catalytic triad and building a scaffold around it).

Asynchronous Web Search + Prompting: We combined aiohttp with targeted prompt engineering to pull in external references and parse them in real time. This made our “deep research” module largely hands-off, automatically updating design hypotheses based on each new batch of literature.
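The concurrent fetching step can be pictured with the following minimal sketch. It is not the project's actual module: the names `fetch_all` and `fetch_all_with_aiohttp`, the concurrency limit, and the 20-second timeout are all assumptions made for illustration. The core helper takes any async `fetch` callable, so it can be exercised without network access; the second function shows one way an aiohttp session might be plugged in.

```python
import asyncio


async def fetch_all(urls, fetch, max_concurrency=5):
    """Fetch every URL concurrently, at most `max_concurrency` at a time.

    `fetch` is any async callable taking a URL and returning text.
    Failures are returned as exception objects instead of aborting the batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception as exc:  # keep going if one source fails
                return url, exc

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)


async def fetch_all_with_aiohttp(urls, max_concurrency=5):
    """Hypothetical aiohttp-backed variant sharing one session per batch."""
    import aiohttp  # imported here so fetch_all has no hard dependency

    timeout = aiohttp.ClientTimeout(total=20)  # assumed value
    async with aiohttp.ClientSession(timeout=timeout) as session:

        async def fetch(url):
            async with session.get(url) as response:
                response.raise_for_status()
                return await response.text()

        return await fetch_all(urls, fetch, max_concurrency)
```

Returning exceptions as values means a single unreachable source does not discard the rest of a literature batch; the caller can filter them out before prompting.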

AlphaGo-Style Program Search: We added a PUCT-inspired approach to keep track of which scaffold parameters were most promising, balancing exploitation of validated scaffolds (close to known hydrolases) and exploration of more creative folds.
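As a rough illustration of the selection rule, here is a single-level PUCT sketch over candidate contigs. The score follows the form popularized by AlphaGo Zero, Q + c_puct * P * sqrt(N_parent) / (1 + N_child). Everything else is an assumption for illustration: the `Arm` class, the priors (which could come from the literature module), the reward (for example a folding-confidence metric scaled to [0, 1]), the value of `c_puct`, and the `+ 1` on the parent count that keeps the exploration term nonzero before any visit.

```python
import math
from dataclasses import dataclass


@dataclass
class Arm:
    """One candidate contig configuration (hypothetical structure)."""
    contig: str
    prior: float            # assumed prior belief that this scaffold is promising
    visits: int = 0
    total_reward: float = 0.0

    @property
    def mean_reward(self):
        return self.total_reward / self.visits if self.visits else 0.0


def puct_score(arm, parent_visits, c_puct=1.5):
    """Mean reward plus a prior-weighted exploration bonus."""
    bonus = c_puct * arm.prior * math.sqrt(parent_visits) / (1 + arm.visits)
    return arm.mean_reward + bonus


def select_arm(arms, c_puct=1.5):
    """Pick the arm with the highest PUCT score."""
    parent_visits = sum(a.visits for a in arms) + 1  # +1: assumed smoothing
    return max(arms, key=lambda a: puct_score(a, parent_visits, c_puct))


def update(arm, reward):
    """Record the outcome of one design run for this arm."""
    arm.visits += 1
    arm.total_reward += reward
```

A high-prior arm is tried first, but its bonus shrinks with each visit, so a lower-prior, more unusual configuration is eventually sampled unless the first keeps earning high rewards.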

RFdiffusion Integration: For each candidate scaffold, the system programmatically generated a Python script that called NVIDIA’s RFdiffusion endpoint, preserving the fixed catalytic triad while freely designing the rest. We ran those scripts, collected the outputs, and fed key metrics (e.g., folding confidence) back into the search loop.
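To make the "fixed triad, free scaffold" idea concrete, the sketch below assembles a contig string from motif residues and designable linker ranges. The string format (`lo-hi` for designed segments, `A57-57` for a single fixed residue, `/` as separator) is modeled on publicly documented RFdiffusion contig examples and should be checked against the version in use. The helper names and the residue numbers are placeholders, and how the string is passed to the endpoint is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotifResidue:
    """A residue copied unchanged from the input structure."""
    chain: str
    resnum: int


def build_contig(motif, linkers):
    """Interleave designable linkers with fixed motif residues.

    `linkers` holds len(motif) + 1 (min_len, max_len) pairs: one before the
    first motif residue, one between each pair, and one after the last.
    A linker with max_len == 0 is omitted.
    """
    if len(linkers) != len(motif) + 1:
        raise ValueError("need exactly len(motif) + 1 linker ranges")
    parts = []
    for i, (lo, hi) in enumerate(linkers):
        if lo < 0 or hi < lo:
            raise ValueError(f"invalid linker range: {lo}-{hi}")
        if hi > 0:
            parts.append(f"{lo}-{hi}")
        if i < len(motif):
            res = motif[i]
            parts.append(f"{res.chain}{res.resnum}-{res.resnum}")
    return "/".join(parts)


def length_range(motif, linkers):
    """Smallest and largest total chain length the contig allows."""
    return (sum(lo for lo, _ in linkers) + len(motif),
            sum(hi for _, hi in linkers) + len(motif))


# Placeholder triad positions, for illustration only.
triad = [MotifResidue("A", 57), MotifResidue("A", 102), MotifResidue("A", 195)]
linkers = [(10, 30), (15, 40), (50, 90), (10, 30)]
contig = build_contig(triad, linkers)
# contig == "10-30/A57-57/15-40/A102-102/50-90/A195-195/10-30"
```

Keeping the linker ranges as data rather than as a pre-formatted string lets a search procedure vary them directly and reject configurations whose `length_range` falls outside a target size.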

Test-Time Scaling: We employed an agent architecture that coordinated iterative literature analysis, hypothesis generation, and contig design before final execution. This allowed more complex or uncertain design paths to be examined in depth, so that each step (research, reasoning, and scaffold construction) was informed by the insights gathered before it.
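One simple way to picture "more compute for uncertain paths" is a budget allocator that hands out extra design runs in proportion to how much a hypothesis's scores disagree. This is an illustrative assumption rather than the project's actual scheduling logic: the function name, the use of sample standard deviation as the uncertainty measure, and the `min_runs` floor are all hypothetical.

```python
import statistics


def allocate_budget(scores_by_hypothesis, total_runs, min_runs=1):
    """Split `total_runs` across hypotheses, favoring uncertain ones.

    `scores_by_hypothesis` maps a hypothesis name to the scores seen so far.
    Hypotheses with fewer than two scores are treated as maximally uncertain.
    """
    names = list(scores_by_hypothesis)
    if total_runs < min_runs * len(names):
        raise ValueError("budget too small for the per-hypothesis minimum")

    spreads = {
        name: statistics.stdev(scores) if len(scores) >= 2 else None
        for name, scores in scores_by_hypothesis.items()
    }
    known = [s for s in spreads.values() if s is not None]
    default = max(known, default=1.0)
    weights = {n: (default if s is None else s) for n, s in spreads.items()}
    if sum(weights.values()) == 0:
        weights = {n: 1.0 for n in names}  # no signal: split evenly
    total_weight = sum(weights.values())

    spare = total_runs - min_runs * len(names)
    alloc = {n: min_runs + int(spare * weights[n] / total_weight) for n in names}

    # Hand out runs lost to rounding, most uncertain hypotheses first.
    leftover = total_runs - sum(alloc.values())
    for name in sorted(names, key=lambda n: weights[n], reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc
```

For example, a hypothesis whose runs all score about the same receives only the minimum, while one with widely scattered scores absorbs most of the spare budget.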