Inspiration
- “PadhAI” (padhai = “study” in Hindi) was born from a simple idea: turn any topic into a clear, animated mini‑lecture—instantly.
- Teachers and learners spend too much time storyboarding, animating, and narrating; we wanted studio‑quality explanations without production overhead.
What it does
- Converts a prompt into a time‑aligned lecture video: Manim animations + synchronized narration.
- Uses a shot‑based timeline (scenes → shots with start times) to guarantee narration/visual sync.
- User-friendly UI with chat history, instant replays, and browser TTS.
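A minimal sketch of the shot-based timeline idea, assuming each shot carries a `start` time in seconds and a `narration` string (both key names are hypothetical; the writeup only says that shots have start times). The real frontend schedules browser TTS through the Web Speech API; this Python version only illustrates the scheduling logic of firing each narration cue when playback crosses its start time.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float      # seconds from the beginning of the video
    narration: str    # text handed to the TTS engine at that time


def build_cues(scenes):
    """Flatten scenes -> shots into one list of cues sorted by start time."""
    cues = []
    for scene in scenes:
        for shot in scene["shots"]:
            cues.append(Cue(start=float(shot["start"]), narration=shot["narration"]))
    return sorted(cues, key=lambda cue: cue.start)


def cues_due(cues, previous_time, current_time):
    """Return cues whose start lies in [previous_time, current_time)."""
    return [cue for cue in cues if previous_time <= cue.start < current_time]
```

Calling `cues_due` once per player tick with the previous and current playback times means each cue fires exactly once, and a replay only needs the clock reset to zero.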
How we built it
- Frontend: React + Vite (VideoPlayer, ChatPanel, Sidebar), Web Speech API for TTS scheduling by shot start times.
- Backend: Node.js + Express; calls an OSS LLM via Groq; assembles Manim code; executes Manim; serves generated MP4s.
- Prompting: “Maestro” system prompt returns JSON with manim_header, scenes[shots], and manim_footer.
- Reliability: Code sanitization (quote fixes), camera‑animation shims for Manim version differences, Windows‑safe cleanup.
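The assembly step can be sketched roughly as follows. The keys `manim_header`, `scenes`, `shots`, and `manim_footer` come from the prompt contract above; the per-shot `code` field and the assumption that the header ends with `def construct(self):` (so shot code is indented by eight spaces) are hypothetical choices made for illustration, not the project's confirmed layout.

```python
import json
import textwrap


def assemble_manim_source(response_text, body_indent=" " * 8):
    """Join header, shot code, and footer from the LLM's JSON into one script."""
    plan = json.loads(response_text)
    parts = [plan["manim_header"].rstrip("\n")]
    for scene in plan["scenes"]:
        for shot in scene["shots"]:
            # Strip whatever indentation the model emitted, then re-indent
            # uniformly so the code sits inside construct().
            code = textwrap.dedent(shot["code"]).strip("\n")
            parts.append(textwrap.indent(code, body_indent))
    footer = plan.get("manim_footer", "").strip("\n")
    if footer:
        parts.append(footer)
    return "\n".join(parts) + "\n"
```

Re-indenting every shot from a dedented baseline is one way to get indentation normalization for free, since the model's own leading whitespace never reaches the final file.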
Challenges we ran into
- Manim compatibility: camera.animate not available on some builds; added safe fallbacks and timing preservation.
- JSON → Python hygiene: over‑escaped quotes breaking strings; added robust unescaping and indentation normalization.
- Windows file locks (EBUSY) during cleanup of partial media; added resilient deletion and retries.
- Timeline trust: each shot's animation must last at least as long as its narration; enforced with waits/padding.
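One way the wait/padding rule could work, as a sketch: estimate how long the narration takes to speak, compare it with the shot's animation time, and append a `self.wait(...)` call for the difference. The 150 words-per-minute rate, the 0.25 s margin, and all function names here are assumptions for illustration; the project's actual estimate may differ.

```python
def estimate_narration_seconds(text, words_per_minute=150.0):
    """Rough speaking time from word count (the rate is an assumed default)."""
    return len(text.split()) / words_per_minute * 60.0


def padding_needed(animation_seconds, narration, words_per_minute=150.0, margin=0.25):
    """Extra seconds a shot must hold so narration finishes before the next shot."""
    needed = estimate_narration_seconds(narration, words_per_minute) + margin
    return max(0.0, needed - animation_seconds)


def wait_statement(animation_seconds, narration, indent=" " * 8):
    """Return a Manim wait line to append to the shot, or '' if none is needed."""
    pad = padding_needed(animation_seconds, narration)
    return f"{indent}self.wait({pad:.2f})" if pad > 0 else ""
```

For example, a ten-word narration over a two-second animation is estimated at about four seconds of speech, so the shot gets a wait of roughly 2.25 seconds.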
Accomplishments that we’re proud of
- Shot‑based timeline that keeps narration and visuals deterministically in sync.
- Clean, accessible UX that feels familiar and fast.
- Rendering and narration run fully locally (Manim + browser TTS); hosted services stay within free tiers.
- Defensive backend: sanitization, fallbacks, and portability fixes.
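A hedged sketch of the sanitization idea: only repair code that fails to parse, so valid strings with legitimate escapes are left alone. The function name is hypothetical, and the repair shown (stripping backslashes before quotes) is a blunt illustration rather than the project's exact logic. The project's backend is Node.js; Python is used here because `ast.parse` gives a cheap syntax check for the generated Manim code.

```python
import ast


def sanitize_generated_code(code):
    """Normalize line endings and undo over-escaped quotes when parsing fails."""
    code = code.replace("\r\n", "\n")
    try:
        ast.parse(code)
        return code  # already valid: do not touch legitimate escapes
    except SyntaxError:
        pass
    # Assumed failure mode: quotes arrived as \" or \' outside string literals.
    repaired = code.replace('\\"', '"').replace("\\'", "'")
    ast.parse(repaired)  # raises SyntaxError if the repair was not enough
    return repaired
```

Letting the final `ast.parse` raise keeps the failure visible, so the caller can re-prompt the model instead of handing broken code to Manim.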
What we learned
- LLM‑to‑code is powerful but fragile—small escaping/indent errors can break renders.
- Cross‑version portability in Manim requires careful camera handling.
- Windows dev ergonomics (file locking, CRLF) need explicit handling.
- Clear contracts (timeline schema) make complex AI outputs reliably usable.
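As an illustration of explicit file-lock handling, here is a minimal retry-with-backoff deletion sketch. It is a Python analogue, not the project's Node.js code: in Node a locked file surfaces as EBUSY, while in Python on Windows it typically raises `PermissionError` (an `OSError` subclass). The function name, attempt count, and delays are assumptions.

```python
import os
import time


def remove_with_retries(path, attempts=5, base_delay=0.1,
                        remove=os.remove, sleep=time.sleep):
    """Try to delete a file, backing off exponentially while it is locked."""
    for attempt in range(attempts):
        try:
            remove(path)
            return True
        except FileNotFoundError:
            return True  # already gone, which is the desired end state
        except OSError:
            if attempt == attempts - 1:
                return False  # still locked: give up and let the caller decide
            sleep(base_delay * (2 ** attempt))
    return False
```

Returning `False` instead of raising lets cleanup of partial media stay best-effort, so a stubborn lock does not fail the whole render request.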
What’s next for PadhAI
- Fine-tune OpenAI GPT OSS on Manim code paired with synchronized narration.
- Real TTS streaming and audio–video mux (ffmpeg) for shareable, packaged lectures.
- Template library for common pedagogy patterns (graphs, processes, proofs).
- Sandbox execution (Docker/containers) and security hardening.
- Persistence (saved prompts, edits, versions) and collaboration.
- CI for lint/tests and a one‑click cloud deploy path.
Built With
- chatgpt
- groq
- javascript
- manim
- openai
- react