SOREN - Your Concept Mentor

Inspiration

We were drowning in research papers. As CS students, we'd spend hours trying to decode dense academic PDFs, pausing every paragraph to look up concepts, rewinding lectures, and still feeling lost. Meanwhile, 3Blue1Brown videos make complex math intuitive in minutes with beautiful animations.

We thought: What if every research paper could become a 3Blue1Brown video?

That's when we realized that, with modern AI, we could automate the entire pipeline. Not just summarize papers, but truly visualize them with the same mathematical elegance that makes great explainer videos work.

What it does

Soren transforms research papers into animated explainer videos with interactive AI tutoring:

Upload → Video Pipeline:
- Upload any academic PDF (machine learning, physics, mathematics)
- Claude AI analyzes the paper's structure, extracts key concepts, and identifies the most important mathematical ideas
- Automatically generates a narrative script with pedagogical flow
- Creates Manim (3Blue1Brown's animation library) code for visual explanations
- Synthesizes natural voiceover with ElevenLabs
- Renders a complete educational video

Interactive Q&A:

- Watch your generated video
- Pause at any timestamp and ask questions
- AI provides contextual answers using:
  - The original PDF text
  - The Manim animation code (to explain what's shown visually)
  - The current video frame (to reference what you're seeing)
- It's like having an expert tutor who knows exactly what you're confused about

Sample Gallery:

- Browse example videos from real papers (LoRA, Latent Diffusion Models, Schrödinger Bridges)
- See the quality before uploading your own
How we built it

Frontend:

- React + Vite for a fast, responsive UI
- Custom WebGL shader effects (GridScan) for the futuristic aesthetic
- Split-screen video player with real-time chat interface
- React Router for seamless navigation
Backend:
- Flask server handling PDF uploads and API requests
- Multi-stage AI pipeline:
  - Analyzer - Claude extracts concepts, equations, and structure
  - Planner - Claude designs a 12-scene narrative flow
  - Generator - Claude writes production-ready Manim code
  - Renderer - Manim creates animations
  - Narrator - ElevenLabs synthesizes voiceover
- Context-aware Q&A system that indexes PDFs, Manim code, and video metadata
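A staged pipeline like the one above can be wired together as a list of functions that each read and extend a shared state dictionary. The following is a minimal sketch under that assumption, not the actual Soren backend: `Stage`, `run_pipeline`, `STAGES`, and the placeholder lambdas are hypothetical names, and the real stages would call Claude, Manim, and ElevenLabs instead of returning fixed values.

```python
from typing import Callable, Dict, List, Tuple

# Each stage takes the shared state dict and returns the new keys it produced.
Stage = Callable[[Dict[str, object]], Dict[str, object]]


def run_pipeline(stages: List[Tuple[str, Stage]], state: Dict[str, object]) -> Dict[str, object]:
    """Run stages in order, merging each stage's output into the state."""
    state = dict(state)  # do not mutate the caller's dict
    completed: List[str] = []
    for name, stage in stages:
        try:
            state.update(stage(state))
        except Exception as exc:
            # Report which stage failed so the caller can surface it.
            raise RuntimeError(f"stage '{name}' failed: {exc}") from exc
        completed.append(name)
        state["completed"] = list(completed)
    return state


# Placeholder stages (assumptions): the real ones would call external services.
STAGES: List[Tuple[str, Stage]] = [
    ("analyzer", lambda s: {"concepts": ["concept from " + str(s["pdf_text"])[:20]]}),
    ("planner", lambda s: {"scenes": [f"scene {i + 1}" for i in range(12)]}),
    ("generator", lambda s: {"manim_code": "# one Scene class per planned scene"}),
    ("renderer", lambda s: {"video_path": "out/video.mp4"}),
    ("narrator", lambda s: {"audio_path": "out/voiceover.mp3"}),
]

if __name__ == "__main__":
    result = run_pipeline(STAGES, {"pdf_text": "Low-rank adaptation of large models"})
    print(result["completed"])
    print(len(result["scenes"]))
```

Keeping every stage behind the same signature means a failure can be attributed to a single named stage, and the `completed` list gives the frontend something concrete to show while a long run is in progress.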
Key Technologies:
- Claude API (Anthropic) - Paper analysis, script generation, Q&A
- Manim - Mathematical animation engine
- ElevenLabs - Natural voice synthesis
- PyPDF2 - PDF text extraction
- FFmpeg - Video processing
Challenges we ran into
- Manim Code Generation Quality
  Getting Claude to write perfect Manim code was brutal. Early versions would mix incompatible API versions, use deprecated methods, or produce syntax errors. We solved this by:
  - Building a comprehensive knowledge base of Manim best practices
  - Creating "safe templates" for common animation patterns
  - Implementing iterative validation (though we eventually built a "zero-error" generator)
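One way to catch the failure modes listed above before rendering is a static check on the generated source. The sketch below is an assumption about how such a validation pass might look, not the project's actual validator: `validate_manim_code` and `DEPRECATED_NAMES` are hypothetical names, and the deny list holds only a few illustrative names that Manim Community renamed or removed. It uses Python's standard `ast` module, so the generated code is parsed but never executed.

```python
import ast
from typing import List

# Illustrative deny list (assumption): older Manim names and their replacements
# in Manim Community. A real knowledge base would be much larger.
DEPRECATED_NAMES = {
    "ShowCreation": "Create",
    "TextMobject": "Tex",
    "TexMobject": "MathTex",
}


def validate_manim_code(source: str) -> List[str]:
    """Return a list of problems found in generated code (empty means it passed)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in DEPRECATED_NAMES:
            replacement = DEPRECATED_NAMES[node.id]
            problems.append(
                f"line {node.lineno}: '{node.id}' is deprecated, use '{replacement}'"
            )

    # Require at least one class whose base class name ends in 'Scene'.
    has_scene = any(
        isinstance(node, ast.ClassDef)
        and any(isinstance(base, ast.Name) and base.id.endswith("Scene") for base in node.bases)
        for node in ast.walk(tree)
    )
    if not has_scene:
        problems.append("no Scene subclass found")
    return problems


GOOD = (
    "from manim import *\n"
    "class Intro(Scene):\n"
    "    def construct(self):\n"
    "        self.play(Create(Circle()))\n"
)

if __name__ == "__main__":
    print(validate_manim_code(GOOD))
    print(validate_manim_code(GOOD.replace("Create", "ShowCreation")))
```

The list of problems can be fed back to the model as part of a retry prompt, which is one plausible reading of "iterative validation" here.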
- Context Management for Q&A
  The AI needed to understand not just what the paper says, but what the video is showing at each moment. We built a sophisticated context extraction system that correlates:
  - Timestamp → Scene in video
  - Scene → Section of paper
  - Visual elements → Mathematical concepts
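The three correlations above can be pictured as a lookup over a sorted scene index. This is a minimal sketch assuming the pipeline records a start time, paper section, and concept list per scene; `SceneInfo`, `scene_at`, `build_context`, and the sample `SCENES` data (including the section titles) are made up for illustration and are not taken from the Soren codebase.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class SceneInfo:
    start: float          # scene start time in seconds
    name: str             # Manim Scene class name
    paper_section: str    # section of the paper the scene covers
    concepts: List[str]   # concepts visible on screen


def scene_at(scenes: List[SceneInfo], timestamp: float) -> SceneInfo:
    """Return the scene playing at `timestamp`; `scenes` must be sorted by start."""
    starts = [scene.start for scene in scenes]
    index = bisect_right(starts, timestamp) - 1
    return scenes[max(index, 0)]  # clamp timestamps before the first scene


def build_context(scenes: List[SceneInfo], timestamp: float, question: str) -> str:
    """Assemble the text context sent along with the viewer's question."""
    scene = scene_at(scenes, timestamp)
    return "\n".join([
        f"Viewer paused at {timestamp:.1f}s during scene '{scene.name}'.",
        f"Paper section: {scene.paper_section}",
        f"Concepts on screen: {', '.join(scene.concepts)}",
        f"Question: {question}",
    ])


# Made-up index for illustration only.
SCENES = [
    SceneInfo(0.0, "Intro", "1 Introduction", ["fine-tuning cost"]),
    SceneInfo(42.5, "LowRankUpdate", "4.1 Low-Rank Update", ["rank", "weight update"]),
    SceneInfo(90.0, "Results", "5 Experiments", ["benchmark comparison"]),
]

if __name__ == "__main__":
    print(build_context(SCENES, 60.0, "Why is the update low rank?"))
```

A binary search keeps the lookup cheap even for long videos, and the resulting text block can be combined with the relevant PDF excerpt and Manim code before it is sent to the model.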
- Video-Frontend Integration
  Getting videos to load properly across different formats, quality levels, and folder structures was a nightmare. We went through multiple iterations:
  - First tried symlinks (didn't work on all systems)
  - Then copying files (storage issues)
  - Finally settled on a clean API endpoint system with proper routing
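The core of such an endpoint is mapping a video id and quality level to a file without depending on symlinks or copies. The sketch below shows only that resolution step, using the standard library; it assumes a `<root>/<video_id>/<quality>.<ext>` folder layout, which is a guess, and `resolve_video` and `ALLOWED_SUFFIXES` are hypothetical names. A Flask route could pass the returned path to `flask.send_file`, or return a 404 when the result is `None`.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Container formats the endpoint is willing to serve (assumption).
ALLOWED_SUFFIXES = (".mp4", ".webm")


def resolve_video(root: Path, video_id: str, quality: str = "720p") -> Optional[Path]:
    """Map a video id and quality to a file under `root`, or None if unavailable."""
    root = root.resolve()
    for suffix in ALLOWED_SUFFIXES:
        candidate = (root / video_id / f"{quality}{suffix}").resolve()
        # Reject ids such as '../secrets' that would escape the media root.
        if root not in candidate.parents:
            return None
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        media_root = Path(tmp)
        (media_root / "lora").mkdir()
        (media_root / "lora" / "720p.mp4").write_bytes(b"")
        print(resolve_video(media_root, "lora"))          # path to 720p.mp4
        print(resolve_video(media_root, "lora", "1080p"))  # None: not rendered
        print(resolve_video(media_root, "../lora"))        # None: escapes root
```

Resolving paths on the server keeps the frontend unaware of folder structure: it only needs an id and a quality level, and the backend decides which file, if any, matches.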
- Real-time Performance
  The full pipeline takes 10-15 minutes for a complete paper. For the demo, we:
  - Pre-generated sample videos
  - Added realistic progress indicators
  - Built a "demo mode" that shows the UI flow instantly while keeping backend integration for Q&A
Accomplishments that we're proud of

- It actually works - We have real videos generated from real research papers with professional-quality animations
- Beautiful UI - The minimalist black/white design with shader effects looks genuinely polished
- Smart Q&A - The context-aware question answering feels magical: it understands what you're looking at and explains accordingly
- Production-Quality Output - Our Manim code generation produces animations that could genuinely be in a 3Blue1Brown video
- Smooth UX - Split-screen layout, proper error handling, loading states, and intuitive navigation

What we learned

Technical:

- LLMs can write production code with the right prompting and constraints
- Context management is everything for good AI interactions
- Frontend performance matters: shader effects need careful optimization
- API design is critical when frontend and backend are separate
Design:
- Less is more: our minimalist design makes the content shine
- Progress indicators and feedback are crucial for AI-powered tools
- Users need to understand what's happening under the hood
AI Engineering:
- Prompt engineering is software engineering
- Building knowledge bases for LLMs is like building compilers
- Multi-agent systems work when each agent has a clear, focused job
- Validation and error handling are 10x more important with AI-generated code
What's next for SOREN

Short-term (Next Month):

- Batch Processing - Upload multiple papers, generate a playlist
- Progress Tracking - Real-time updates as each stage completes
- Customization - Choose animation style, video length, voice
- Video Library - Save and organize your generated videos
Medium-term (3-6 Months):

- Course Builder - Turn entire textbooks into video courses
- Collaboration - Share videos, contribute annotations
- Mobile App - Watch and learn on the go
- Multi-language - Support papers in other languages
Long-term Vision:

- University Partnerships - Deploy for entire CS departments
- Live Papers - Authors upload papers, we auto-generate videos for arXiv
- Interactive Exercises - Quizzes and problems generated from content
- Personalized Learning - AI adapts video complexity to your level
The Dream: Make every research paper as accessible as a 3Blue1Brown video. Democratize cutting-edge knowledge. Turn the incomprehensible into the intuitive.
