## Inspiration
The scientific reproducibility crisis is widespread: over 70% of researchers report having tried and failed to reproduce another scientist's experiments. With millions of papers published annually, the gap between reading research and implementing it remains a major barrier to scientific progress.
We asked: What if AI could bridge this gap instantly?
## What it does
Paper to Code transforms any scientific PDF into a working Jupyter notebook. Upload a research paper, and our Gemini 3-powered agent:
- Analyzes the full paper using multimodal understanding (text + figures + equations)
- Extracts methodology, algorithms, data sources, and dependencies
- Generates production-ready Python code with proper structure
- Validates the code through an agentic self-correction loop
- Outputs an executable notebook + requirements.txt
## How we built it
We leveraged Gemini 3's most powerful features:
| Feature | Usage |
|---------|-------|
| 1M Token Context | Process entire papers without chunking |
| Multimodal Understanding | Analyze figures, diagrams, and equations visually |
| Thinking Levels | Set to high for complex code generation |
| Thought Signatures | Maintain context across multi-step agent workflows |
| Structured Outputs | Extract methodology into precise JSON schemas |
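
The structured extraction step can be pictured roughly as follows. This is a minimal sketch, not the project's actual schema: the `Methodology` fields, `METHODOLOGY_SCHEMA`, and the `parse_methodology` helper are all assumptions chosen to mirror the items listed above (methodology, algorithms, data sources, dependencies). It only shows how a JSON response could be checked against a schema before code generation starts; the model call itself is left out.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Methodology:
    """Hypothetical container for what the analysis step extracts."""
    title: str
    algorithms: list[str] = field(default_factory=list)
    data_sources: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)


# Assumed JSON Schema describing the response we would ask the model for.
METHODOLOGY_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "algorithms": {"type": "array", "items": {"type": "string"}},
        "data_sources": {"type": "array", "items": {"type": "string"}},
        "dependencies": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "algorithms", "data_sources", "dependencies"],
}


def parse_methodology(raw_json: str) -> Methodology:
    """Parse a JSON response and reject payloads missing required keys."""
    payload = json.loads(raw_json)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    missing = [k for k in METHODOLOGY_SCHEMA["required"] if k not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return Methodology(
        title=str(payload["title"]),
        algorithms=[str(a) for a in payload["algorithms"]],
        data_sources=[str(d) for d in payload["data_sources"]],
        dependencies=[str(d) for d in payload["dependencies"]],
    )
```

Rejecting incomplete payloads early keeps the generation step from working with a half-filled description of the paper.
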
Architecture: PDF → Gemini 3 (Analysis) → Structured Extraction → Gemini 3 (Generation) → Self-Validation Loop → Jupyter Notebook + requirements.txt
Tech Stack: Python, Streamlit, PyMuPDF, nbformat, google-genai SDK
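
The final stage of the pipeline, turning generated cells into a notebook plus requirements.txt, might look something like the sketch below. The project uses nbformat for this; to keep the example dependency-free, the sketch instead writes the notebook v4 JSON structure by hand with the standard library. The function names `build_notebook`, `render_requirements`, and `write_outputs`, and the file name `notebook.ipynb`, are assumptions for illustration.

```python
import json
from pathlib import Path


def build_notebook(cells: list[tuple[str, str]]) -> dict:
    """Build a minimal notebook (format v4) dict from (cell_type, source) pairs."""
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells additionally carry an execution count and outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    return {
        "cells": nb_cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def render_requirements(dependencies: list[str]) -> str:
    """Return requirements.txt content: unique package names, sorted, one per line."""
    unique = sorted(set(dependencies))
    return "".join(f"{name}\n" for name in unique)


def write_outputs(out_dir: str, cells: list[tuple[str, str]], dependencies: list[str]) -> None:
    """Write notebook.ipynb and requirements.txt into out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    notebook = build_notebook(cells)
    (target / "notebook.ipynb").write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    (target / "requirements.txt").write_text(render_requirements(dependencies), encoding="utf-8")
```

For example, `write_outputs("out", [("markdown", "# Reproduction"), ("code", "import numpy as np")], ["numpy"])` would produce both files in an `out` directory.
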
## Challenges we faced
- Equation extraction: Scientific papers have complex LaTeX that needed careful prompt engineering
- Code validation: Building a reliable self-correction loop without infinite iterations
- Context management: Balancing detail vs. token usage for long papers
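
One way to keep a self-correction loop from iterating forever is to cap the number of repair attempts and stop early when a repair returns code that was already tried. The sketch below illustrates that idea under simplifying assumptions: validation is only a syntax check via Python's built-in `compile`, and `repair` is a hypothetical callable standing in for the model call that rewrites the code from an error message. The real loop may validate differently.

```python
from typing import Callable, Optional


def check_syntax(code: str) -> Optional[str]:
    """Return None if the code compiles, otherwise a short error message."""
    try:
        compile(code, "<generated>", "exec")
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


def self_correct(
    code: str,
    repair: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[str, bool]:
    """Validate code and request repairs, with a hard cap on attempts.

    Returns the last version of the code and whether it passed validation.
    """
    seen = {code}
    for _ in range(max_attempts):
        error = check_syntax(code)
        if error is None:
            return code, True
        # Hypothetical repair step: in practice this would prompt the model
        # with the failing code and the error message.
        code = repair(code, error)
        if code in seen:
            # The repair produced a version we already tried; stop cycling.
            break
        seen.add(code)
    return code, check_syntax(code) is None
```

Returning the pass/fail flag alongside the code lets the caller decide whether to ship the notebook or surface the remaining error to the user.
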
## What we learned
- Gemini 3's thinking levels dramatically improve code quality when set to high
- Thought signatures are essential for maintaining coherent multi-step reasoning
- Multimodal input (page images + text) catches information that text-only misses
## What's next
- Support for arXiv URL input
- Dataset auto-download integration
- GPU-accelerated code detection and optimization suggestions
- Community library of reproduced papers
## Built With
- gemini3
- googlegeminiapi
- jupyter
- pymupdf
- python
- streamlit