PaperBanana Inspiration

As researchers, we noticed a glaring contradiction in the modern scientific workflow: while we have autonomous AI scientists to help write code and analyze data, creating the actual visual assets for papers remains a painful, manual bottleneck.

We realized that existing generative AI tools are designed for artistic creativity, not scientific rigor. They often hallucinate text, ignore strict layout constraints, or fail to adhere to the specific aesthetic standards of top-tier venues like NeurIPS or ICML. We built PaperBanana to close this gap. We wanted to create an AI academic illustration generator that understands the difference between a pretty picture and a correct methodology diagram—a tool that treats scientific illustration with the same rigor as the research itself.

Try it now: 👉 https://paper-banana.ai

What PaperBanana does

PaperBanana is a specialized agentic framework that transforms raw scientific text—such as methodology descriptions, abstract concepts, or datasets—into publication-quality diagrams and plots.

Unlike standard image generators that rely on a single prompt, PaperBanana orchestrates a collaborative team of five specialized AI agents. It completely automates the design process for:

  • Methodology Diagrams: Visualizing Transformer architectures, GAN pipelines, and multi-agent systems.
  • Statistical Plots: Generating bar charts, scatter plots, and line graphs through executable code, so the plotted values come from the data rather than from an image model's guess.
  • Aesthetic Refinement: Polishing rough sketches into professional, vector-style graphics.

Users simply input their research context at paper-banana.ai, and the system delivers high-resolution, academic-standard visuals ready for LaTeX or Word documents.

How we built PaperBanana

We engineered PaperBanana using a Multi-Agent Workflow designed to mimic a professional design studio. The architecture consists of five distinct agents that pass information sequentially and iteratively:

  1. The Retriever: This agent uses RAG (Retrieval-Augmented Generation) to find relevant academic reference examples, ensuring the output matches current publication standards.
  2. The Planner: It analyzes the user's text input to construct a logical layout and structural blueprint of the diagram.
  3. The Stylist: This agent applies specific academic design rules (color palettes, font choices, line weights) to ensure the diagram looks professional, not "cartoony."
  4. The Visualizer:
    • For diagrams, it renders the visual composition.
    • For statistical plots, it generates executable Python Matplotlib code, so every data point is drawn from the supplied values instead of being hallucinated as pixels.
  5. The Critic: Perhaps the most critical component, this agent inspects the generated result against the original source text. It feeds its findings back into the pipeline, which regenerates the image until it meets our quality thresholds.
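
The hand-off between the five agents can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Draft` fields, the `run_pipeline` function, the `max_rounds` limit, and the stub agents are hypothetical stand-ins, not PaperBanana's actual code. It shows only the control flow: retrieve once, then plan, style, and render in a loop until the critic reports no issues.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Draft:
    """State handed from one agent to the next (hypothetical fields)."""
    source_text: str
    references: List[str] = field(default_factory=list)
    layout: str = ""
    style: str = ""
    rendering: str = ""
    feedback: List[str] = field(default_factory=list)


Agent = Callable[[Draft], Draft]        # takes a draft, returns it updated
Critic = Callable[[Draft], List[str]]   # returns issues; empty list = accept


def run_pipeline(source_text: str, retriever: Agent, planner: Agent,
                 stylist: Agent, visualizer: Agent, critic: Critic,
                 max_rounds: int = 3) -> Tuple[Draft, int, List[str]]:
    """Retrieve once, then plan -> style -> render until the critic accepts."""
    draft = retriever(Draft(source_text=source_text))
    issues: List[str] = []
    for round_no in range(1, max_rounds + 1):
        draft = visualizer(stylist(planner(draft)))
        issues = critic(draft)
        if not issues:
            return draft, round_no, issues
        draft.feedback = issues  # the next round sees the critique
    return draft, max_rounds, issues  # gave up; remaining issues are returned


# Stub agents so the loop can be run end to end without any model calls.
def stub_retriever(d: Draft) -> Draft:
    d.references = ["reference figure A"]
    return d


def stub_planner(d: Draft) -> Draft:
    d.layout = "encoder -> decoder"
    if d.feedback:
        d.layout += " (revised: " + "; ".join(d.feedback) + ")"
    return d


def stub_stylist(d: Draft) -> Draft:
    d.style = "muted palette, sans-serif labels"
    return d


def stub_visualizer(d: Draft) -> Draft:
    d.rendering = f"[{d.style}] {d.layout}"
    return d


def stub_critic(d: Draft) -> List[str]:
    # Rejects the first attempt, accepts once the planner has revised.
    return [] if "revised" in d.layout else ["label 'decoder' is misaligned"]
```

With these stubs, `run_pipeline("some method text", stub_retriever, stub_planner, stub_stylist, stub_visualizer, stub_critic)` is rejected once and accepted on the second round. Capping the number of rounds is a design choice of the sketch, so that a critic that never accepts cannot loop forever.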

Challenges we ran into

The biggest challenge in building an AI academic illustration generator was solving the "Hallucination Problem" in charts. Early versions of the model would draw a bar chart that looked nice but had random heights unrelated to the data. We solved this by forcing the model to generate Python code for plots rather than generating pixels directly, so the plotted values are computed from the input data and can be checked against it.
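
To make the code-over-pixels idea concrete, here is a small Python sketch. It is not PaperBanana's implementation: a fixed template stands in for the model-written code, and the names `build_bar_chart_script`, `extract_assigned_literal`, and `script_matches_data` are hypothetical. The point it illustrates is that when a plot exists as a script, the numbers in it can be read back and compared with the source data before anything is rendered.

```python
import ast
from typing import Dict


def build_bar_chart_script(data: Dict[str, float], ylabel: str,
                           out_path: str = "plot.png") -> str:
    """Return a Matplotlib script (as text) that plots `data` as a bar chart."""
    labels = list(data.keys())
    values = list(data.values())
    lines = [
        "import matplotlib",
        'matplotlib.use("Agg")  # render off-screen, no display needed',
        "import matplotlib.pyplot as plt",
        "",
        f"labels = {labels!r}",
        f"values = {values!r}",
        "",
        "fig, ax = plt.subplots(figsize=(4, 3))",
        "ax.bar(labels, values)",
        f"ax.set_ylabel({ylabel!r})",
        "fig.tight_layout()",
        f"fig.savefig({out_path!r}, dpi=300)",
    ]
    return "\n".join(lines) + "\n"


def extract_assigned_literal(script: str, name: str):
    """Read the literal assigned to `name` in the script without running it."""
    for node in ast.parse(script).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    return ast.literal_eval(node.value)
    raise KeyError(name)


def script_matches_data(script: str, data: Dict[str, float]) -> bool:
    """True if the script plots exactly the labels and values in `data`."""
    return (
        extract_assigned_literal(script, "labels") == list(data.keys())
        and extract_assigned_literal(script, "values") == list(data.values())
    )
```

For example, `script_matches_data(build_bar_chart_script({"Baseline": 71.2, "Ours": 78.9}, "Accuracy (%)"), {"Baseline": 71.2, "Ours": 78.9})` holds, while the same script with one value altered fails the check. Running the generated script requires Matplotlib to be installed; the check itself uses only the standard library.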

Another hurdle was "Semantic Consistency." Ensuring that the Planner and Visualizer agreed on complex architectures (like a Transformer model) required fine-tuning the prompt engineering between agents. We had to teach the Critic agent how to be "harsh" enough to catch subtle errors in flow logic, just like a strict peer reviewer would.
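
One way to picture a "harsh" consistency check is to compare two graphs: the one the Planner intended and the one that was actually drawn. The Python sketch below assumes both can be reduced to sets of named components and directed arrows, which is an assumption of this example; how the real Critic inspects a rendered image is not described here, and `DiagramGraph`, `make_graph`, and `find_inconsistencies` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]  # (source component, target component)


@dataclass(frozen=True)
class DiagramGraph:
    nodes: FrozenSet[str]
    edges: FrozenSet[Edge]


def make_graph(nodes: Iterable[str], edges: Iterable[Edge]) -> DiagramGraph:
    return DiagramGraph(frozenset(nodes), frozenset(edges))


def find_inconsistencies(planned: DiagramGraph,
                         rendered: DiagramGraph) -> List[str]:
    """List every way the rendered diagram departs from the plan."""
    issues: List[str] = []
    for name in sorted(planned.nodes - rendered.nodes):
        issues.append(f"missing component: {name}")
    for name in sorted(rendered.nodes - planned.nodes):
        issues.append(f"unplanned component: {name}")
    for src, dst in sorted(planned.edges - rendered.edges):
        if (dst, src) in rendered.edges:
            # The arrow exists but points the wrong way: a flow-logic error.
            issues.append(f"reversed arrow: {src} -> {dst}")
        else:
            issues.append(f"missing arrow: {src} -> {dst}")
    for src, dst in sorted(rendered.edges - planned.edges):
        if (dst, src) not in planned.edges:  # reversals were reported above
            issues.append(f"unplanned arrow: {src} -> {dst}")
    return issues
```

A diagram with the right boxes but an arrow drawn from Decoder to Encoder instead of Encoder to Decoder would be reported as `reversed arrow: Encoder -> Decoder`, the kind of subtle flow error that a looser check based only on component names would miss.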

Accomplishments that we're proud of

We are incredibly proud of the Iterative Self-Critique system. Watching PaperBanana generate a diagram, have the Critic agent reject it because a label was misaligned, and then automatically regenerate a corrected version without human intervention is magical.

We are also proud of the diversity of our output. From cleaning up a user's rough napkin sketch to generating a complex multi-agent framework diagram from scratch, PaperBanana is proving that AI can be a reliable partner in the scientific publishing process.

What we learned

Building PaperBanana taught us that Context is King. A generic "draw a neural network" prompt yields generic results. By implementing the Retriever agent to look at actual academic references, we learned that grounding AI generation in real-world examples significantly improves the professional look and feel of the output. We also learned that splitting tasks among specialized agents yields far better results than asking one massive model to do everything at once.

What's next for PaperBanana: AI Academic Illustration Generator

We are just getting started with automating scientific communication. Our roadmap for PaperBanana includes:

  • LaTeX Integration: Allowing users to export diagrams directly into Overleaf projects with native TikZ code generation.
  • Expanded Gallery: Adding support for more niche scientific domains, such as chemical molecular structures and biological pathway visualizations.
  • Interactive Editing: Giving users the ability to chat with the image to make granular tweaks (e.g., "Make the arrow red" or "Move the legend to the top").

You can try the current version and streamline your research workflow today at paper-banana.ai.

Built With

  • nextjs