Inspiration

Reading AI/ML research papers is hard not because of the math alone, but because it’s unclear how to turn ideas into working code. Summaries exist, but implementations are often incomplete, inconsistent, or don’t reproduce results. Students guess, instructors manually design assignments, and engineers waste time reverse-engineering papers. We wanted to close this gap between research understanding and executable reality.

What it does

Research Agent turns an AI/ML research paper into:

- A structured understanding of the paper (problem, method, assumptions)
- A verified code workspace where generated code is actually executed in a sandbox
- A step-by-step validation pipeline (environment checks, shape checks, loss checks)
- Reproducible assignments with starter code and grading structure

Instead of just explaining papers, the system proves its understanding by running code.
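To make the validation pipeline concrete, here is a minimal sketch of what shape and loss checks could look like. The function names, signatures, and window size are illustrative assumptions, not the project's actual API:

```python
import numpy as np

def check_shapes(model_fn, input_shape, expected_output_shape):
    """Run a forward pass on dummy data and verify the output shape.

    (Hypothetical helper: model_fn is any callable taking an array.)
    """
    x = np.zeros(input_shape, dtype=np.float32)
    y = model_fn(x)
    assert y.shape == expected_output_shape, (
        f"expected {expected_output_shape}, got {y.shape}"
    )
    return True

def check_loss_decreases(losses, window=5):
    """Verify that the trailing average loss is below the initial average."""
    assert len(losses) >= 2 * window, "not enough loss values recorded"
    start = sum(losses[:window]) / window
    end = sum(losses[-window:]) / window
    assert end < start, f"loss did not decrease: {start:.4f} -> {end:.4f}"
    return True
```

Checks of this kind can run deterministically on tiny dummy inputs, which keeps the sandboxed validation fast and reproducible.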

How we built it

- Frontend: React + TypeScript for a structured, workflow-driven UI (upload paper → workspace → execution status)
- Backend: Python + FastAPI for paper ingestion and orchestration
- Execution layer: Daytona to create isolated, reproducible environments and run generated code safely
- Pipeline design: paper upload → structured extraction → workspace creation → deterministic test execution
- All execution results, logs, and artifacts are stored and surfaced back to the user

The key design choice was that every claim must be backed by execution, not just text.
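A rough sketch of the stage-by-stage orchestration with file-based state, assuming each stage is a callable that returns a JSON-serializable result (the stage names and on-disk layout here are illustrative, not the project's real implementation):

```python
import json
from pathlib import Path
from typing import Callable, Dict

def run_pipeline(paper_text: str, workdir: str,
                 stages: Dict[str, Callable[[str], dict]]) -> dict:
    """Run each stage in order, persisting every result as a JSON artifact.

    Writing one artifact per stage means every claim surfaced to the user
    is backed by a stored execution result, not just generated text.
    """
    out = Path(workdir)
    out.mkdir(parents=True, exist_ok=True)
    state = {"stages": {}}
    for name, stage in stages.items():
        result = stage(paper_text)  # e.g. extraction, workspace setup, tests
        state["stages"][name] = result
        (out / f"{name}.json").write_text(json.dumps(result))
    (out / "state.json").write_text(json.dumps(state))
    return state
```

A file-based state store like this avoids standing up databases or queues while still leaving an auditable trail of artifacts per run.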

Challenges we ran into

- Designing a system that doesn't look like a generic "AI paper summarizer"
- Keeping execution deterministic and lightweight within hackathon scope
- Handling incomplete or ambiguous details in research papers
- Structuring backend state without over-engineering databases or queues
- Balancing simplicity with extensibility for future agent-based workflows

Accomplishments that we're proud of

- Built a working paper-to-execution pipeline, not just a demo UI
- Integrated real sandboxed code execution to verify understanding
- Designed a clean, extensible architecture for future RAG and agent orchestration
- Avoided hallucinated outputs by enforcing executable validation
- Created something usable for students, instructors, and engineers

What we learned

- Execution builds trust in AI systems more than explanations do
- Research papers often omit critical implementation details; systems must surface those gaps, not hide them
- Starting with a minimal, file-based backend keeps iteration fast
- Clear workflows outperform chat-only interfaces for serious technical tools
- Agentic systems are most valuable when paired with hard validation layers

What's next for Research Agent for AI/ML publications

- Add document ingestion + RAG (LandingAI) for citation-grounded explanations
- Introduce agent orchestration (Swarms) for ambiguity resolution and code planning
- Expand assignment generation with autograders and rubrics
- Support partial result reproduction and ablation studies
