Overview
Growing up, all of our team members enjoyed watching Grant Sanderson’s YouTube channel 3Blue1Brown, which features beautiful animations rendered with manim, a programmatic animation library. While one of manim’s core strengths is its fine-grained control over visuals, that control also creates a steep learning curve for beginners who just want to create simple math visualizations. So we wanted to build an app that makes it easy for anyone to generate manim animations, along with a detailed lesson, for math and physics topics, both for personal exploration and for instruction.
Once a user logs in, they are greeted with a prompt box in which they can ask about anything that they’re curious about!

Then, they are taken to a page with a visualization of the concept along with a summary of the topic, a longer-form “lesson”, and questions.

How we built it
We built our website using the MERN stack, Tailwind CSS, and KaTeX, along with the OpenAI API and manim.

The general flow for the backend is as follows. Upon receiving a query from the user, we make a call to the OpenAI API with our engineered prompt and receive back manim code along with text-based lesson content. We then run manim server-side to generate an mp4 and store it on the server. Finally, the stored video, along with the text-based content (summary notes, exercises, etc.), is sent down to the client for display.
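The flow above can be sketched roughly as follows. This is an illustrative outline, not our exact production code: the prompt text, model choice, file names, and scene name are all assumptions for the sake of example.

```python
import subprocess
from pathlib import Path


def build_messages(topic: str) -> list[dict]:
    """Assemble an engineered prompt for a user query (wording illustrative)."""
    return [
        {"role": "system",
         "content": "You write manim (community edition) code and a short "
                    "text lesson for math and physics topics."},
        {"role": "user",
         "content": f"Create an animation and lesson about: {topic}"},
    ]


def generate_lesson(topic: str) -> str:
    """Call the OpenAI API and return the raw model output."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4", messages=build_messages(topic)
    )
    return response.choices[0].message.content


def render_video(manim_code: str, scene: str = "MainScene") -> None:
    """Write the generated code to disk and render it with manim."""
    Path("generated_scene.py").write_text(manim_code)
    # -qm = medium quality; the mp4 lands under ./media/videos/
    subprocess.run(["manim", "-qm", "generated_scene.py", scene], check=True)
```

The rendered mp4 path and the lesson text are then sent to the client together.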
On the front end, we used Tailwind CSS to style the website and KaTeX to render the LaTeX in the lessons.
Challenges we ran into
One challenge we ran into was prompting GPT to generate the kind of manim code we wanted. Because there isn't much manim usage to reference on the internet, GPT would often hallucinate manim code that failed to compile on our backend. To address this, we improved the prompting and automatically post-corrected common mistakes in the output code. When that wasn’t enough, we simply had GPT regenerate the code.
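As a sketch, the post-correct-then-retry step looked something like this. The specific substitutions are examples of the kind of fixes involved (GPT often emits old manimlib names that the community edition of manim has since renamed), not our full correction table:

```python
import re

# Example substitutions: old manimlib names renamed in manim community edition.
# Illustrative, not exhaustive.
COMMON_FIXES = [
    (r"\bShowCreation\b", "Create"),
    (r"\bTextMobject\b", "Tex"),
    (r"\bTexMobject\b", "MathTex"),
]


def fix_common_mistakes(code: str) -> str:
    """Apply regex-based corrections for known model mistakes."""
    for pattern, replacement in COMMON_FIXES:
        code = re.sub(pattern, replacement, code)
    return code


def generate_valid_code(generate, max_attempts: int = 3):
    """Post-correct generated code and retry while it fails a basic syntax
    check. `generate` is any zero-arg callable returning a candidate string."""
    for _ in range(max_attempts):
        code = fix_common_mistakes(generate())
        try:
            compile(code, "<generated>", "exec")  # cheap syntax check
            return code
        except SyntaxError:
            continue  # fall through and regenerate
    return None
```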
Another challenge we ran into was storing and sending video files, which no team member had prior experience with. For the purposes of the hackathon, we settled on the temporary solution of storing the videos directly on the server and sending them down to the client.
Accomplishments that we're proud of
We’re proud that we were able to get nice visualizations for several use cases. Some of our favorites are pictured & linked below!

- distributive property
- cardioids
- graph neural nets
- Fourier transforms
We were also able to generate very detailed and informative lecture notes in LaTeX. Here’s one about quaternions!

What we learned
GPT is very unfamiliar with manim, so we essentially had to teach it the language and syntax from scratch. We took inspiration from the paper Grammar Prompting for Domain-Specific Language Generation with Large Language Models by Wang et al. (https://arxiv.org/abs/2305.19234). Due to time constraints, we could not implement the paper’s full algorithm for generating highly optimized prompts, but we borrowed its use of Backus-Naur form (BNF) grammars.

We could not find a BNF grammar for manim online (it is highly specialized), and the library is extensive enough that writing one by hand would have been extremely time-intensive. Instead, we prompted GPT-4 to generate a BNF grammar for manim from a set of code examples with attached docstrings explaining how each snippet works. After some manual tweaking, we fed the resulting grammar into another prompt to generate manim code, which dramatically improved performance. We also tried using a plain API spec for manim, but its density of syntax information is lower, so those prompts required more tokens and were less efficient.
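Schematically, the grammar-prompting setup looked like this. The BNF fragment below is a hypothetical illustration of the kind of grammar GPT-4 produced, not the actual grammar we used, and the prompt wording is likewise illustrative:

```python
# A hypothetical fragment of a GPT-4-generated BNF grammar for manim.
MANIM_BNF = """
<scene>     ::= "class" <name> "(Scene):" <construct>
<construct> ::= "def construct(self):" <statement>+
<statement> ::= "self.play(" <animation> ")" | "self.add(" <mobject> ")"
<animation> ::= "Create(" <mobject> ")" | "FadeIn(" <mobject> ")"
<mobject>   ::= "Circle()" | "Square()" | "MathTex(" <string> ")"
"""


def grammar_prompt(topic: str) -> str:
    """Embed the BNF grammar in the code-generation prompt so the model
    stays within known-valid manim constructs."""
    return (
        "Here is a BNF grammar describing valid manim code:\n"
        f"{MANIM_BNF}\n"
        "Using only constructs derivable from this grammar, write a manim "
        f"scene that visualizes: {topic}"
    )
```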
We also took inspiration from the paper Large Language Models are Zero-Shot Reasoners by Kojima et al., which highlights several zero-shot techniques for improving LLM performance. Specifically, we drew on the Instructive category of prompts and asked the LLM to think step by step about which parts of the BNF grammar would be necessary to generate a high-quality manim animation for a given topic.
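A minimal sketch of that instructive prompt, with the grammar passed in as a string (the exact wording here is an assumption, but the "think step by step" trigger phrase is the one studied by Kojima et al.):

```python
def stepwise_messages(topic: str, grammar: str) -> list[dict]:
    """Zero-shot step-by-step prompt in the style of Kojima et al.: the model
    first reasons about which grammar rules it needs, then writes the code."""
    return [
        {"role": "system",
         "content": "You are an expert at writing manim animations."},
        {"role": "user",
         "content": (
             f"Here is a BNF grammar for manim:\n{grammar}\n"
             f"Topic: {topic}\n"
             "Let's think step by step. First list which grammar rules are "
             "needed for a high-quality animation of this topic, then write "
             "the manim code."
         )},
    ]
```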
What's next for Automanim
In the future, we plan to improve the performance of the manim code generation. Currently, the model cannot see what its output looks like, so the animations often contain basic aesthetic errors such as text that isn’t centered or unintended overlapping images.
We would also like to start testing with actual clients. We see the product being used by anyone who wants to better learn or teach math.
Built With
- express.js
- katex
- manim
- mongodb
- next.js
- node.js
- openai
- react