Inspiration
Growing up and studying engineering at IIIT Nagpur, we've seen firsthand how millions of JEE and NEET aspirants across India struggle to visualize complex physics and mathematics. When we looked at modern AI video generators like Sora or Runway, we saw a fatal flaw for education: diffusion models hallucinate. They guess frame-by-frame, placing pixels with absolutely zero mathematical harmony.
If a student needs to understand Simple Harmonic Motion, generic AI will draw a warped, physically inaccurate pendulum. They need deterministic precision—like the wave equation $\frac{\partial^2 y}{\partial t^2}=c^2\frac{\partial^2 y}{\partial x^2}$ rendered flawlessly. We wanted to build a "Glass Box" engine that doesn't just guess pictures, but understands the underlying math to deliver absolute, 3Blue1Brown-quality precision.
What it does
MentorBoxAI is a "Text-to-Teacher" engine. Instead of generating slow, inaccurate pixels, our AI generates hyper-efficient Manim Python code.
A student types in a topic (e.g., "Bohr's Atomic Model"). Our 6-layer pipeline distills the pedagogical narrative and authors structurally perfect code to animate it. For example, instead of drawing a blurry atom, it procedurally generates the exact physics on screen: $$E_n=-\frac{13.6\text{ eV}}{n^2}$$
The result? A hyper-personalized, mathematically flawless 720p educational video generated in under 3 minutes, at an AWS infrastructure cost of just ~$0.028 per video.
How we built it
We engineered a 100% AWS-native stack to handle the heavy computational demands of vector rendering and multimodal reasoning:
- The AI Brain: We utilized Amazon Bedrock (Nova Pro). Its massive context window allowed us to pass 8,000-character "Golden Few-Shot Examples," giving the model the exact logic needed to write strict Manim syntax.
- The Compute Core: Rendering Manim requires Linux-native libraries (Cairo, Pango, FFmpeg). We deployed our FastAPI backend directly on an AWS EC2 (us-east-1) Ubuntu instance. We utilized IAM Instance Profiles to securely authenticate with Bedrock without managing any API keys.
- The Visual Engine: We integrated Manim CE v0.19, utilizing a custom
ColorfulScenebase class with 22 pre-built animation helpers (phasor animations, particle physics, etc.) that the LLM can call directly.
Challenges we ran into
- The Serverless Trap: We initially considered using AWS Lambda, but Manim renders take 60–180 seconds and require writing
.pyfiles and reading back.mp4outputs. Lambda's 15-minute limits and ephemeral/tmpstorage forced us to architect a much more robust, asynchronous EC2 compute pipeline. - AI Syntax Errors: Out of the box, LLMs hallucinate library methods (like inventing a
ZoomInanimation that doesn't exist in Manim). A single hallucinated parameter crashes the entire render, which is unacceptable for a student waiting for a video.
Accomplishments that we're proud of
We are most proud of our Self-Healing Code Architecture. We built a Layer-6 Validator that automatically corrects 18 common LLM errors (e.g., converting GREY to GRAY, or stripping hallucinated kwargs).
If an error bypasses our static AST check, we run a subprocess smoke test. If it fails, we feed the traceback directly back into Amazon Nova Pro, which auto-patches its own mistake before the user ever sees an error. This pushed our success rate to 95%+, giving us a true "Zero-Crash Guarantee."
We also pioneered a Zero-LaTeX architecture, forcing the LLM to use ASCII Text() constructs instead of MathTex. This made our EC2 rendering pipeline bulletproof and immune to LaTeX compilation crashes.
What we learned
We gained deep, practical experience with the Amazon Bedrock Converse API and how to structure system prompts for complex code-generation tasks. We also learned how to right-size AWS infrastructure—knowing when to rely on the managed reasoning of Nova Pro, and when to utilize the raw compute muscle of an EC2 instance to execute arbitrary Python safely.
What's next for MentorBoxAI
We have already configured the schemas for Amazon S3 (for global CDN video delivery) and Amazon DynamoDB (for persistent job history). Our next major release will integrate these services fully, along with Amazon Polly to generate synchronized, localized voiceovers in Hindi and regional languages, bringing world-class visual education to every student in Bharat.
Built With
- amazon-bedrock
- amazon-dynamodb
- amazon-nova-pro
- amazon-web-services
- aws-ec2
- fastapi
- ffmpeg
- javascript
- manim
- python
- ubuntu
Log in or sign up for Devpost to join the conversation.