MentorBoxAI

Tech Stack
Architecture

Inspiration

Growing up and studying engineering at IIIT Nagpur, we've seen firsthand how millions of JEE and NEET aspirants across India struggle to visualize complex physics and mathematics. When we looked at modern AI video generators like Sora or Runway, we saw a fatal flaw for education: diffusion models hallucinate. They guess frame-by-frame, placing pixels with absolutely zero mathematical harmony.

If a student needs to understand Simple Harmonic Motion, generic AI will draw a warped, physically inaccurate pendulum. They need deterministic precision—like the wave equation $\frac{\partial^2 y}{\partial t^2}=c^2\frac{\partial^2 y}{\partial x^2}$ rendered flawlessly. We wanted to build a "Glass Box" engine that doesn't just guess pictures, but understands the underlying math to deliver absolute, 3Blue1Brown-quality precision.

What it does

MentorBoxAI is a "Text-to-Teacher" engine. Instead of generating slow, inaccurate pixels, our AI generates hyper-efficient Manim Python code.

A student types in a topic (e.g., "Bohr's Atomic Model"). Our 6-layer pipeline distills the pedagogical narrative and authors structurally perfect code to animate it. For example, instead of drawing a blurry atom, it procedurally generates the exact physics on screen: $$E_n=-\frac{13.6\text{ eV}}{n^2}$$

The result? A hyper-personalized, mathematically flawless 720p educational video generated in under 3 minutes, at an AWS infrastructure cost of just ~$0.028 per video.

How we built it

We engineered a 100% AWS-native stack to handle the heavy computational demands of vector rendering and multimodal reasoning:

The AI Brain: We utilized Amazon Bedrock (Nova Pro). Its massive context window allowed us to pass 8,000-character "Golden Few-Shot Examples," giving the model the exact logic needed to write strict Manim syntax.
The Compute Core: Rendering Manim requires Linux-native libraries (Cairo, Pango, FFmpeg). We deployed our FastAPI backend directly on an AWS EC2 (us-east-1) Ubuntu instance. We utilized IAM Instance Profiles to securely authenticate with Bedrock without managing any API keys.
The Visual Engine: We integrated Manim CE v0.19, utilizing a custom ColorfulScene base class with 22 pre-built animation helpers (phasor animations, particle physics, etc.) that the LLM can call directly.

Challenges we ran into

The Serverless Trap: We initially considered using AWS Lambda, but Manim renders take 60–180 seconds and require writing .py files and reading back .mp4 outputs. Lambda's 15-minute limits and ephemeral /tmp storage forced us to architect a much more robust, asynchronous EC2 compute pipeline.
AI Syntax Errors: Out of the box, LLMs hallucinate library methods (like inventing a ZoomIn animation that doesn't exist in Manim). A single hallucinated parameter crashes the entire render, which is unacceptable for a student waiting for a video.

Accomplishments that we're proud of

We are most proud of our Self-Healing Code Architecture. We built a Layer-6 Validator that automatically corrects 18 common LLM errors (e.g., converting GREY to GRAY, or stripping hallucinated kwargs).

If an error bypasses our static AST check, we run a subprocess smoke test. If it fails, we feed the traceback directly back into Amazon Nova Pro, which auto-patches its own mistake before the user ever sees an error. This pushed our success rate to 95%+, giving us a true "Zero-Crash Guarantee."

We also pioneered a Zero-LaTeX architecture, forcing the LLM to use ASCII Text() constructs instead of MathTex. This made our EC2 rendering pipeline bulletproof and immune to LaTeX compilation crashes.

What we learned

We gained deep, practical experience with the Amazon Bedrock Converse API and how to structure system prompts for complex code-generation tasks. We also learned how to right-size AWS infrastructure—knowing when to rely on the managed reasoning of Nova Pro, and when to utilize the raw compute muscle of an EC2 instance to execute arbitrary Python safely.

What's next for MentorBoxAI

We have already configured the schemas for Amazon S3 (for global CDN video delivery) and Amazon DynamoDB (for persistent job history). Our next major release will integrate these services fully, along with Amazon Polly to generate synchronized, localized voiceovers in Hindi and regional languages, bringing world-class visual education to every student in Bharat.

Built With

amazon-bedrock
amazon-dynamodb
amazon-nova-pro
amazon-web-services
aws-ec2
fastapi
ffmpeg
javascript
manim
python
ubuntu

Updates

Ansh Patidar started this project — Mar 10, 2026 05:11 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.