- Automate the Bridge: A minimalist entry point for autonomous paper-to-code synthesis via Gemini 3.
- Agent Initialization: The 'Marathon Agent' extracts latent variables and math from the PDF vector space.
- Orchestration Phase: Gemini 3 Flash synthesizes abstract math into Python logic with a live diagnostic log.
- Research Command Center: A high-fidelity dashboard for verified benchmarks and neural audio Theory Maps.
- Math Transparency: A deep-link layer mapping LaTeX equations to synthesized NumPy code blocks.
- Architectural Synthesis: Gemini 2.5 Flash-Image generates 2K technical schematics of algorithm data flow.
- The Verified Source: Modular Python code engineered for mathematical parity and zero-dependency portability.
- Isolated Verification: Real-time WASM monitoring with raw process output and live memory maps.
- Structural Parity: A side-by-side audit ensuring theoretical claims are verified in the code synthesis.
- The Marathon Loop: Visualizing convergence as the agent self-corrects logic through multiple cycles.
Inspiration
Every year, thousands of breakthrough research papers are published, yet most remain trapped in static PDF format. For engineers, getting from a complex LaTeX equation to a verified, working implementation can take days of head-scratching and debugging. We were inspired by the "Action Era" of AI: moving beyond chat to actual execution. We wanted to build a system where the AI doesn't just explain a paper, but proves it understands the methodology by building it from scratch and checking the results against the paper's reported benchmarks.
What it does
ResearchLoop is an autonomous research assistant that handles the entire pipeline of paper implementation:
- Multimodal Extraction: It leverages Gemini 3’s native vision to read academic PDFs, identifying primary algorithms, latent methodology, and reported benchmarks.
- Autonomous Synthesis: It generates complete, modular Python code built on NumPy from the extracted mathematics.
- WASM Execution: It runs the generated code in an isolated, browser-based Python environment (Pyodide), so verification needs no server-side infrastructure.
- Self-Correction Loop: If the code fails or results don't match benchmarks, the agent (acting as a "Marathon Agent") analyzes the traceback, revises its logic, and iterates autonomously until it achieves mathematical parity.
- Theory-Code Mapping: It provides a transparency layer that links specific equations from the paper directly to the corresponding lines of generated code.
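One way to picture the Theory-Code Mapping is as a small record that ties an equation label to a line range in the generated source. The sketch below is illustrative only: `EquationLink`, `code_for`, `links_for_line`, and the softmax sample are hypothetical names and data invented for this example, not ResearchLoop's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EquationLink:
    """Hypothetical record tying one paper equation to a span of generated code."""
    label: str        # label as printed in the paper, e.g. "Eq. 1"
    latex: str        # LaTeX source of the equation
    start_line: int   # first matching code line (1-indexed, inclusive)
    end_line: int     # last matching code line (1-indexed, inclusive)


def code_for(link: EquationLink, source: str) -> str:
    """Return the lines of `source` that the link points at."""
    lines = source.splitlines()
    return "\n".join(lines[link.start_line - 1:link.end_line])


def links_for_line(links: list[EquationLink], line_no: int) -> list[EquationLink]:
    """Reverse lookup: which equations claim a given line of code?"""
    return [link for link in links if link.start_line <= line_no <= link.end_line]


# Illustrative "generated" source: a softmax using only the standard library.
GENERATED = """import math

def softmax(xs):
    shift = max(xs)
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
"""

LINKS = [
    EquationLink("Eq. 1", r"\sigma(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}", 5, 7),
]

print(code_for(LINKS[0], GENERATED))
```

Storing line ranges rather than copied snippets keeps the link valid for display in both directions: from an equation to its code, and from a clicked line back to the equations that cover it.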
How we built it
ResearchLoop is powered by the Gemini 3 family, utilizing high thinking budgets (up to 32k tokens on Pro) to handle dense mathematical reasoning.
- Reasoning Engine: We used the @google/genai SDK to implement "Thought Signatures," allowing the model to "think through" execution errors before attempting a patch.
- Frontend: React 19 with a "Brutalist Academic" aesthetic, focusing on technical transparency and high readability.
- Runtime: Pyodide (WASM) was integrated to allow users to run heavy NumPy-based algorithms directly in their browser without any server-side infrastructure.
- Multimodal Assets: We used gemini-2.5-flash-image for 2K architectural blueprints and gemini-2.5-flash-preview-tts for neural audio theory maps.
Challenges we ran into
The biggest challenge was handling the "Reasoning vs. Context" balance. High-frequency debugging loops consume significant tokens, so we optimized our system instructions to keep Gemini focused on "First Principles" logic repairs. Another hurdle was multimodal parsing: extracting precise logic from multi-column academic layouts required specific prompt engineering to maintain context across complex page structures.
Accomplishments that we're proud of
We are incredibly proud of the Autonomous Convergence Loop. Watching the agent encounter a ValueError in its first attempt, reason about the "shape mismatch" in a matrix operation, and then successfully patch the code in a single iteration without human intervention is the definition of the "Action Era." Achieving mathematical parity in a browser sandbox is a major technical milestone.
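The run, fail, patch cycle described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `run_candidate`, `converge`, and `stub_revise` are hypothetical names, `stub_revise` stands in for the model call with one hard-coded patch, and `exec` is used only for brevity (it is not a sandbox, whereas ResearchLoop runs code inside Pyodide).

```python
import traceback


def run_candidate(source, entry, args):
    """Execute generated source in a fresh namespace; return (result, error_text)."""
    namespace = {}
    try:
        exec(source, namespace)  # NOT isolated; a stand-in for the WASM runtime
        return namespace[entry](*args), None
    except Exception:
        return None, traceback.format_exc()


def matches(result, expected, tol=1e-9):
    """Element-wise comparison against the benchmark values."""
    return len(result) == len(expected) and all(
        abs(r - e) <= tol for r, e in zip(result, expected)
    )


def converge(source, entry, args, expected, revise, max_cycles=5):
    """Run, compare with the benchmark, and ask `revise` for a patch on failure."""
    history = []
    for cycle in range(1, max_cycles + 1):
        result, error = run_candidate(source, entry, args)
        if error is None and matches(result, expected):
            history.append((cycle, "ok"))
            return source, history
        feedback = error or f"result {result} != expected {expected}"
        history.append((cycle, feedback.strip().splitlines()[-1]))
        source = revise(source, feedback)
    raise RuntimeError("no convergence within max_cycles")


# Hypothetical first draft: the shape check compares rows instead of columns.
FIRST_DRAFT = '''
def matvec(matrix, vector):
    if len(matrix) != len(vector):
        raise ValueError("shape mismatch: %dx%d @ %d"
                         % (len(matrix), len(matrix[0]), len(vector)))
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]
'''


def stub_revise(source, feedback):
    """Stand-in for the model: applies one fixed patch when it sees a ValueError."""
    if "ValueError" in feedback:
        return source.replace("len(matrix) != len(vector)",
                              "len(matrix[0]) != len(vector)")
    return source


final_source, history = converge(
    FIRST_DRAFT, "matvec", ([[1, 2, 3], [4, 5, 6]], [1, 0, 1]), [4, 10], stub_revise
)
print(history)
```

In this toy run the first cycle records the `ValueError`, the stub patches the shape check, and the second cycle matches the expected values. Capping the loop with `max_cycles` is one simple way to bound token spend when a repair never converges.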
What we learned
We learned that the Thinking Budget is the most important variable in modern AI orchestration. By giving the model space to "think" before it "codes," the quality of the first-draft implementation increased dramatically. We also realized that transparency—showing the internal state and WASM logs—is crucial for building trust in autonomous agents.
What's next for ResearchLoop
We want to expand ResearchLoop beyond single-file implementations. Our roadmap includes:
- Multi-module Synthesis: Handling papers that require complex project structures and multiple file dependencies.
- Web-Search Grounding: Fully integrating Gemini's Google Search tool to compare implementation results with existing State-of-the-Art (SOTA) benchmarks on the web.
- GPU Acceleration: Leveraging WebGPU to allow the autonomous agent to implement and test larger neural networks directly in the browser environment.
Built With
- browser-localstorage
- fira-code
- google-gemini-3-flash-(gemini-3-flash-preview)
- google/genai-sdk
- marked
- numpy
- pyodide
- python
- react-19
- recharts
- space-grotesk
- tailwind-css
- typescript
- webassembly-(wasm)