BrainrotJS.com

Inspiration

I strongly believe AI will change education more than any other field over the next 10 years. I think that by 2030, some kindergartners will be submitting research papers. Ever since GPT-2 was released, I have been pushing the limits of what this technology can do in education (most notably with my project smart.wtf).

What it does

Brainrot.js does a lot in the background but abstracts it all away from the user, so we can say it "simply generates a short-form video of celebrities of your choice discussing a topic of your choice". All it needs is a single topic and a choice of two narrators, and within 2 minutes your completely custom video is created!

How I built it

It is quite complicated, but I will try my best to explain it briefly. The interface (https://brainrotjs.com) is built with Next.js, Tailwind, TypeScript, and tRPC.

Then there is the backend... Brace yourself haha. We need something to store users' generation requests in order, and we use a MySQL database for this. A Docker container on an EC2 instance works as a polling function. This is the brains. It invokes the OpenAI API to create the transcript, runs a custom inference model built in PyTorch to generate the subtitles for the video at a word-by-word level, writes them to separate SRT files, and concatenates the audio files into a single audio.mp3. The inference model is served with Flask, and the poller is a JavaScript file that loops indefinitely.

We then merge all of this together in Remotion. The Remotion project is rendered across 200 separate serverless Lambda functions, which render the video in parallel and then concatenate the pieces. Finally, we host the video on S3 and save the link to the database, which the user's client polls while waiting for that link.
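To make the queue-and-poll step a bit more concrete, here is a minimal sketch of one polling iteration. It is not the project's actual code: the `VideoJob` shape, the status values, and the `generate` callback are all hypothetical, and an in-memory array stands in for the MySQL table so the example is self-contained.

```typescript
// Hypothetical job record; the real MySQL schema is not described in this write-up.
type JobStatus = "pending" | "processing" | "done" | "failed";

interface VideoJob {
  id: number;
  topic: string;
  narrators: [string, string];
  status: JobStatus;
  createdAt: number; // epoch milliseconds
  videoUrl?: string;
}

// Pick the oldest pending job so requests are served in submission order,
// and mark it as processing so it is not claimed twice.
function claimNextJob(jobs: VideoJob[]): VideoJob | undefined {
  const pending = jobs
    .filter((job) => job.status === "pending")
    .sort((a, b) => a.createdAt - b.createdAt);
  const next = pending[0];
  if (next) {
    next.status = "processing";
  }
  return next;
}

// One polling iteration: claim a job, run the (assumed) generation pipeline,
// then store the resulting video link so the client can pick it up.
async function pollOnce(
  jobs: VideoJob[],
  generate: (job: VideoJob) => Promise<string>,
): Promise<VideoJob | undefined> {
  const job = claimNextJob(jobs);
  if (!job) {
    return undefined; // nothing to do on this tick
  }
  try {
    job.videoUrl = await generate(job);
    job.status = "done";
  } catch {
    job.status = "failed";
  }
  return job;
}
```

A poller would call `pollOnce` repeatedly with a short sleep between ticks. Against a real database, the claim step would need to be atomic (for example, inside a transaction) so that two workers cannot pick up the same job.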

Challenges I ran into

Many. Weird GLIBC incompatibility issues, and rendering times of 8-15 minutes (which is why I opted for parallelized serverless computing, reducing rendering to 1 minute!). Designing the whole, rather complex architecture was also quite the can of worms. Restricting the Google Custom Search Engine to return only publicly available images was quite hard too, but crucial: this step runs after a lot of expensive, laborious computation such as transcript generation and audio subtitle inference, so it must never become the bottleneck.
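The speed-up from parallel rendering comes from giving each serverless function only a small slice of the video's frames. The sketch below illustrates that idea only; it is not Remotion's internal logic, and `splitFrames` and `FrameRange` are hypothetical names introduced for the example.

```typescript
// Inclusive range of frame indices assigned to one worker.
interface FrameRange {
  start: number;
  end: number;
}

// Split `totalFrames` frames into at most `workers` contiguous ranges
// whose sizes differ by at most one frame.
function splitFrames(totalFrames: number, workers: number): FrameRange[] {
  if (!Number.isInteger(totalFrames) || totalFrames <= 0) {
    throw new RangeError("totalFrames must be a positive integer");
  }
  if (!Number.isInteger(workers) || workers <= 0) {
    throw new RangeError("workers must be a positive integer");
  }
  // Never create more ranges than there are frames.
  const count = Math.min(workers, totalFrames);
  const base = Math.floor(totalFrames / count);
  const remainder = totalFrames % count;

  const ranges: FrameRange[] = [];
  let start = 0;
  for (let i = 0; i < count; i++) {
    // The first `remainder` workers each take one extra frame.
    const size = base + (i < remainder ? 1 : 0);
    ranges.push({ start, end: start + size - 1 });
    start += size;
  }
  return ranges;
}
```

As an illustration, assuming a 60-second video at 30 frames per second (1800 frames, both numbers being assumptions), `splitFrames(1800, 200)` gives each of the 200 workers 9 frames to render.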

Accomplishments that I'm proud of

Brainrot.js will be the first open-source framework for building and rendering entirely AI-generated short-form content.

What I learned

I learned that video rendering sucks and is a real bottleneck! I also learned clever ways to generate subtitles on the fly with a custom inference model, and how to use serverless computing to cut rendering times.
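For readers unfamiliar with word-level subtitles, here is a rough sketch of how per-word timings can be turned into SRT text. The `WordTiming` shape is an assumption about what an alignment model might return, not the actual output format of the inference model used here, and both function names are hypothetical.

```typescript
// Assumed shape of one aligned word: text plus start/end times in seconds.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(seconds: number): string {
  // Work in whole milliseconds to avoid floating-point drift.
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Emit one SRT cue per word: index, time range, text, then a blank line between cues.
function wordsToSrt(words: WordTiming[]): string {
  return words
    .map(
      (w, i) =>
        `${i + 1}\n${formatSrtTime(w.start)} --> ${formatSrtTime(w.end)}\n${w.word}\n`,
    )
    .join("\n");
}
```

Each word becomes its own cue, which is what lets the rendered video highlight words one at a time as they are spoken.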