Inspiration
Engineering textbooks are dense and intimidating. We realized that true understanding in STEM requires visualization, not just walls of text. We wanted to build a tool that acts as a 24/7 AI tutor—bridging the gap between static syllabus PDFs and intuitive visual understanding.
What it does
LearnVis transforms textbooks into dynamic visual experiences. Upload a PDF, and the platform instantly analyzes it to build an interactive mind map of core concepts. When you select a concept, LearnVis deploys an AI agent to write Python code and generate a high-quality math animation (via Manim) on the fly, turning dry definitions into moving visuals.
How we built it
- Frontend: Next.js and Tailwind CSS for a sleek, responsive UI.
- Backend: FastAPI handles asynchronous tasks and video streaming.
- RAG Pipeline: We use PyPDF2 to chunk PDFs and Gemini embeddings to vectorize them. These are stored in Supabase with pgvector to ensure animations are grounded in the actual textbook.
- Agentic Video Generation: Using LangGraph and Gemini, we built a self-healing pipeline. Gemini generates Manim Python scripts based on RAG context, and the backend executes the code locally.
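To make the RAG step in the list above concrete, here is a minimal sketch in plain Python. It is not our production code: `chunk_text` and `top_k` are hypothetical helpers, the chunk size and overlap are assumed values, and the toy vectors stand in for Gemini embeddings. In the real pipeline the similarity ranking would be done by pgvector inside Supabase rather than in Python.

```python
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows.

    The size and overlap defaults are illustrative assumptions.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, rows, k=3):
    """Return the k chunks whose vectors are most similar to the query.

    rows is a list of (chunk_text, embedding_vector) pairs.
    """
    ranked = sorted(rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The overlap keeps a formula that straddles a chunk boundary intact in at least one chunk, which matters when the retrieved text is later used to ground generated code.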
Challenges we ran into
Our biggest hurdle was LLM hallucination in code generation. Early models generated Manim code with syntax errors, causing rendering failures. To fix this, we engineered a self-healing agentic loop. If the code fails, LangGraph intercepts the error trace and feeds it back to Gemini, allowing the AI to debug its own code and retry autonomously.
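The retry loop described above can be sketched without LangGraph as an ordinary Python loop. This is a simplified illustration under stated assumptions: `run_script` and `generate_with_repair` are hypothetical names, the script is run as plain Python in a subprocess rather than rendered through Manim, and `fake_generate` is a stand-in for the Gemini call that returns broken code first and working code once it receives an error.

```python
import os
import subprocess
import sys
import tempfile


def run_script(code, timeout=60):
    """Run code in a separate Python process and return (ok, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(code)
        path = handle.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return proc.returncode == 0, proc.stderr
    except subprocess.TimeoutExpired:
        return False, "script timed out"
    finally:
        os.unlink(path)


def generate_with_repair(generate, prompt, max_attempts=3):
    """Call generate(prompt, error) until the returned script runs cleanly.

    On each failure the captured error text is passed to the next call,
    so the generator can attempt a fix. Returns (code, attempts_used).
    """
    error = None
    for attempt in range(1, max_attempts + 1):
        code = generate(prompt, error)
        ok, stderr = run_script(code)
        if ok:
            return code, attempt
        error = stderr or "script failed with no error output"
    raise RuntimeError(f"no working script after {max_attempts} attempts:\n{error}")


def fake_generate(prompt, error):
    """Hypothetical stand-in for the LLM: broken first, fixed after feedback."""
    return "print('rendered')" if error else "print("
```

Calling `generate_with_repair(fake_generate, "animate a sine wave")` fails once on the syntax error and succeeds on the second attempt. Capping the number of attempts keeps a script the model cannot repair from looping forever.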
Accomplishments that we're proud of
- Implementing pgvector to ground our AI in the user's specific textbook, preventing generic explanations.
- Engineering a self-repairing LangGraph agent that drastically improved our video generation success rate. Seeing the AI write, fail, debug, and successfully render code autonomously is incredible.
- Streaming real-time NDJSON updates to the Next.js frontend so users can see the agent's thought process.
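As a rough illustration of the NDJSON streaming mentioned above, the sketch below serializes progress events one JSON object per line and parses them back from arbitrarily split network pieces. The function names and the event fields (`stage`, `ok`) are hypothetical, not our actual schema. In a FastAPI backend, a generator like `ndjson_events` could be wrapped in a `StreamingResponse` with an NDJSON media type; that wiring is assumed here and not shown.

```python
import json


def ndjson_events(events):
    """Yield each event as one compact JSON object followed by a newline."""
    for event in events:
        yield json.dumps(event, separators=(",", ":")) + "\n"


def parse_ndjson(pieces):
    """Rebuild events from string pieces that may split lines anywhere.

    Incomplete trailing data stays in the buffer until its newline arrives.
    """
    buffer = ""
    for piece in pieces:
        buffer += piece
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
```

Because every line is a complete JSON document, the frontend can render each agent step as soon as its newline arrives instead of waiting for the whole response.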
What we learned
We learned how to orchestrate multi-agent LLM systems and handle the complexities of the Manim library. We also discovered that RAG isn't just for text—it's incredibly powerful for grounding code generation with specific formulas and logic.
What's next for LearnVis
- AI Voiceovers: Integrating Text-to-Speech (like ElevenLabs) for synchronized audio explanations.
- Interactive Code Editor: Allowing users to tweak the generated Manim code to experiment with animations.
- Auto-Quizzes: Using the RAG pipeline to generate interactive practice problems based on the videos.
Built With
- fastapi
- gemini-api
- langchain
- langgraph
- python
- supabase
- typescript
