Introduction

As a university student, I often have trouble understanding the high-level concepts researchers talk about. Working through multiple research papers and fully understanding them is tough, and I often struggle to reach a truly intuitive understanding. I found myself staring at complex equations, wishing they would move, morph, and explain themselves the way 3Blue1Brown makes mathematics dance.

What if we could instantly turn any static research paper into a high-quality, narrated educational video?

That question is what led me to VisuArXiv, an autonomous research-to-video pipeline. It takes an arXiv link or a PDF, reads it with the depth of a researcher, and produces a 3Blue1Brown-style animation that explains the core concepts.

How it Works

The main application runs through a web interface where you can either search for a paper directly on arXiv or upload your own local PDF. Once you've selected your material and hit the "Generate" button, the system uses Deep Research to read the paper and distil the complex text into visualisable scenes. Then, an agent writes, validates, and executes the actual animation code to render the visualisations on screen. The system directs the visuals, generates the voiceovers, and stitches the final video together for you.

I've also built a cache into the system using Supabase. If you or anyone else has previously requested a paper, the app retrieves the pre-rendered video from the cloud cache, letting you watch it immediately without waiting for generation. Furthermore, the system uses concurrent processing to render animations and generate audio for multiple scenes simultaneously, which shortens the overall generation time.
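The cache-then-generate flow and the per-scene concurrency can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's implementation: an in-memory dictionary stands in for the Supabase cache, the render and voiceover functions are placeholders, and every name here (`Scene`, `get_or_generate_video`, and so on) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Scene:
    index: int
    narration: str


# Stand-in for the cloud cache (the project uses Supabase): paper id -> video path.
VIDEO_CACHE: dict[str, str] = {}
# Records which scenes were actually rendered, to make cache hits observable.
RENDERED: list[int] = []


def render_animation(scene: Scene) -> str:
    """Placeholder for rendering one scene's animation clip."""
    RENDERED.append(scene.index)
    return f"scene_{scene.index}.mp4"


def synthesize_voiceover(scene: Scene) -> str:
    """Placeholder for generating one scene's narration audio."""
    return f"scene_{scene.index}.wav"


def build_scene(scene: Scene) -> tuple[str, str]:
    return render_animation(scene), synthesize_voiceover(scene)


def get_or_generate_video(paper_id: str, scenes: list[Scene]) -> str:
    # 1. Cache hit: return the pre-rendered video immediately.
    cached = VIDEO_CACHE.get(paper_id)
    if cached is not None:
        return cached
    # 2. Cache miss: build all scenes concurrently; map() keeps scene order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        assets = list(pool.map(build_scene, scenes))
    # 3. Stitching is omitted; only the output path is produced here.
    video_path = f"{paper_id}_{len(assets)}_scenes.mp4"
    VIDEO_CACHE[paper_id] = video_path
    return video_path


scenes = [Scene(0, "Intro"), Scene(1, "Core idea"), Scene(2, "Results")]
first = get_or_generate_video("demo-paper", scenes)
second = get_or_generate_video("demo-paper", scenes)  # served from the cache
```

Threads suit this sketch because rendering and text-to-speech would typically wait on subprocesses or network calls. CPU-bound rendering inside the Python process itself would call for a process pool instead.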

The video generated in the demo has audio, but my computer can't record system audio, so the generated video with its voiceover is available here: https://youtu.be/zoz6m18LTK0
