Inspiration

Scientific knowledge is expanding exponentially, yet the primary medium for communicating it—the PDF paper—has remained static for decades. Researchers spend countless hours parsing dense text, cross-referencing citations, and manually verifying assumptions. We asked ourselves: What if a research paper could explain itself? What if you could interrogate a paper, challenge its premises, and instantly compare it with conflicting research? With the release of the Gemini 3 model family, we realized the technology finally existed to turn this "Living Scientific Paper" concept into reality.

What it does

Living Scientific Paper is an interactive workbench that transforms static PDFs into dynamic knowledge engines.

  • Structural X-Ray: Instantly extracts and visualizes the logical backbone of a paper—Research Questions, Core Claims, Explicit Assumptions, and Methods.
  • "What If" Reasoning Engine: Users can ask counterfactual questions (e.g., "What happens to the conclusion if the sample size assumption is violated?"). The system uses Gemini 3's advanced reasoning capabilities to trace the logical impact downstream.
  • Assumption Stress-Testing: Automatically identifies implicit assumptions and "stress tests" them to reveal potential weak points in the argument.
  • Deep Comparative Analysis: Select two papers, and the system performs a side-by-side synthesis, highlighting common ground, direct conflicts, and methodological divergences.
  • Context-Aware Explanations: Click on any complex equation or figure, and the system acts as a personalized tutor, explaining the notation and intuition based on your expertise level.
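To make "tracing the logical impact downstream" concrete, here is a minimal sketch of the idea, assuming the paper's logic has already been extracted into a dependency graph. In the project itself the tracing is done by the model; the graph contents, node labels, and the `downstream_impact` helper below are hypothetical and only illustrate what "downstream" means.

```python
from collections import deque

# Hypothetical dependency graph: each key supports the nodes listed in its value.
DEPENDENTS = {
    "A1: sample is representative": ["M1: regression on survey data"],
    "M1: regression on survey data": ["C1: effect is significant"],
    "A2: measurement noise is small": ["C1: effect is significant"],
    "C1: effect is significant": ["C2: policy recommendation"],
    "C2: policy recommendation": [],
}


def downstream_impact(graph, violated):
    """Return every node reachable from `violated`, in breadth-first order."""
    seen = {violated}
    order = []
    queue = deque([violated])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # Which parts of the argument are affected if assumption A1 fails?
    for node in downstream_impact(DEPENDENTS, "A1: sample is representative"):
        print("affected:", node)
```

A plain breadth-first walk like this only says which claims are reachable from a violated assumption; judging how badly each one is weakened is the part left to the model's reasoning.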

How we built it

We built a modern full-stack application centered around the Gemini 3 API:

  • Frontend: Built with Next.js 14, TypeScript, and Tailwind CSS, with Framer Motion for animation. The interface is calm and research-focused, inspired by modern scientific tools, and emphasizes readability, structure, and mathematical clarity.
  • Backend: A FastAPI (Python) server handles document processing and logic.
  • AI Integration:
    • Long-Context Understanding: We leverage the long context window of Gemini 3 Pro/Flash to feed entire papers (text + layout) into the model without fragmentation.
    • Structured Reasoning Outputs: We designed prompts that require Gemini to produce explicit, structured reasoning steps (claims, dependencies, impacts) without free-form verbosity, enabling transparent stress-testing.
    • Model Fallback System: To ensure robustness, we implemented a custom client that dynamically switches between gemini-2.5-pro, flash, and flash-lite based on availability and quota, and updates the UI in real time to show which model performed the task.
  • Persistence: A custom file-based repository system stores analyzed sessions and comparisons, allowing researchers to build a library of "living" papers.
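The model fallback idea can be sketched as follows. This is not the project's actual client: `generate_with_fallback`, `ModelUnavailable`, and the injected `call_model` function are hypothetical names, and no real Gemini SDK call is shown. The model labels are copied from the list above as written, since the full identifiers for the flash variants are not given.

```python
from typing import Callable, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error a client wrapper might raise on quota or availability failures."""


def generate_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, response_text)."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            # Remember why this model failed, then fall through to the next one.
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a real API call: pretends the first model is out of quota."""
    if model == "gemini-2.5-pro":
        raise ModelUnavailable("quota exceeded")
    return f"[{model}] answer to: {prompt}"


if __name__ == "__main__":
    used, text = generate_with_fallback(
        "List the core claims.",
        ["gemini-2.5-pro", "flash", "flash-lite"],
        fake_call,
    )
    print(used, "->", text)
```

Returning the model name alongside the response is what would let a UI show which model performed the task.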

Challenges we faced

  • Hallucination in Logic: Early iterations would sometimes invent claims. We solved this by forcing the model to output citation indices (e.g., [Evidence: Eq. 3]) alongside every logical step.
  • PDF Parsing: Extracting clean text from two-column scientific layouts is notoriously difficult. We optimized a pipeline that feeds raw text to Gemini for structure extraction, which proved far more resilient than traditional regex/OCR methods.
  • UI Complexity: Visualizing a directed acyclic graph (DAG) of logical dependencies in a user-friendly way required multiple iterations of the "Dependency Graph" view.
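One way to make use of the evidence tags described under "Hallucination in Logic" is to check them after generation. The sketch below assumes tags follow the `[Evidence: Eq. 3]` shape quoted above and that a set of anchors actually present in the paper is available. The `unsupported_steps` helper, the anchor set, and the sample steps are hypothetical, and the write-up does not say whether such a check was part of the pipeline.

```python
import re

# Matches tags such as "[Evidence: Eq. 3]" and captures the anchor text ("Eq. 3").
EVIDENCE_TAG = re.compile(r"\[Evidence:\s*([^\]]+?)\s*\]")


def unsupported_steps(steps, known_anchors):
    """Return steps that lack an evidence tag or cite an anchor not found in the paper."""
    flagged = []
    for step in steps:
        cited = EVIDENCE_TAG.findall(step)
        if not cited or any(anchor not in known_anchors for anchor in cited):
            flagged.append(step)
    return flagged


if __name__ == "__main__":
    # Hypothetical model output and hypothetical anchors extracted from the paper.
    steps = [
        "Claim 1 follows from the loss definition [Evidence: Eq. 3]",
        "The effect generalizes to all populations",
        "Claim 2 relies on the ablation [Evidence: Table 9]",
    ]
    anchors = {"Eq. 3", "Table 2"}
    for step in unsupported_steps(steps, anchors):
        print("needs review:", step)
```

A check like this catches missing or dangling citations, but it cannot tell whether the cited equation really supports the step.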

Accomplishments that we're proud of

  • Real-Time Comparative Synthesis: Seeing the system accurately identify a subtle methodological conflict between two dense papers was a "breakthrough moment."
  • The "Living" Feel: The transition from a static file upload to an interactive dashboard where you can click an equation and have it "talk" to you feels like the future of education.
  • Robustness: The seamless fallback system means the app "just works" even when a model is unavailable or its API quota is exhausted.

What we learned

  • Gemini 3 is a reasoner, not just a summarizer. We learned that treating the model as a logic engine (asking it to trace dependencies) yields far higher value than just asking it to "summarize this paper."
  • The importance of UX in AI. Powerful models need intuitive interfaces. Using a "Navigation Rail" and split-screen Comparison View made the AI's output digestible.

What's next for Living Scientific Paper

  • GraphRAG Integration: Connecting the single-paper reasoning into a global knowledge graph of millions of papers.
  • Multimodal Input: Exploring dataset-assisted verification, where users can upload CSV/Excel files to check claims against the underlying data.
  • Collaborative Mode: Multiplayer research sessions where teams can annotate and debate logic trees together.
