Inspiration
This was the last thing I expected to build during this hackathon. I had several other projects that I wanted to complete. However, one of the submission requirements was a video showcasing the projects, and there was also a challenge for building in public. I thought, what better way to use ElevenLabs and learn about A/V processing than to create a project for the hackathon?
What it does
VibeOn has several parts, not all of which are fully functional. It lets users record screen captures (with microphone audio and, in some browsers, system sound) and PiP (picture-in-picture) using the computer's camera. If screen recording isn't available, the system offers the option to record with an available camera. Once the user has recorded a video, they can enter a script for ElevenLabs to generate audio for it. The user can also upload outside audio sources and concatenate different audio clips into a single audio track for the video. When ready, the user can merge the audio with the video to create a complete product.
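The fallback from screen capture to camera described above could be sketched roughly as follows. This is an illustration, not VibeOn's actual code: the `CaptureDevices` interface and the `chooseCapture` function are hypothetical names. In a browser you would pass `navigator.mediaDevices`, whose `getDisplayMedia` and `getUserMedia` methods are the standard capture APIs.

```typescript
// Hypothetical minimal view of navigator.mediaDevices so the logic can be
// exercised outside a browser. S stands in for MediaStream.
interface CaptureDevices<S> {
  getDisplayMedia?: (constraints: { video: boolean; audio: boolean }) => Promise<S>;
  getUserMedia?: (constraints: { video: boolean; audio: boolean }) => Promise<S>;
}

type CaptureResult<S> = { source: "screen" | "camera"; stream: S };

// Try screen capture first; fall back to the camera when screen capture is
// missing or the request is rejected (for example, the user cancels the picker).
async function chooseCapture<S>(devices: CaptureDevices<S>): Promise<CaptureResult<S>> {
  if (devices.getDisplayMedia) {
    try {
      // audio: true asks for system/tab sound; only some browsers honor it.
      const stream = await devices.getDisplayMedia({ video: true, audio: true });
      return { source: "screen", stream };
    } catch {
      // Screen capture failed; fall through to the camera.
    }
  }
  if (devices.getUserMedia) {
    const stream = await devices.getUserMedia({ video: true, audio: true });
    return { source: "camera", stream };
  }
  throw new Error("No capture source available");
}
```

In the browser this would be called as `chooseCapture(navigator.mediaDevices)`, and the `source` field tells the UI which mode it ended up in.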
How we built it
I primarily used Bolt.new and prompted the project into existence. Some features worked out of the box, but others would break. The most complicated part was in-browser ffmpeg: not all CDNs have the wasm file available. I also used ChatGPT, Claude, and Cursor. Bolt would become very circular with its changes and recommendations around ffmpeg, and its recommendations and solutions were incorrect. I pulled the system into Cursor to try to work on the issues; however, it couldn't solve them either.
Finally, I had a battle of the AIs. I put the ffmpeg code into ChatGPT and had it recommend corrections, then took those recommendations into Claude Sonnet 4 and had it analyze the code and make its own recommendations. After each iteration, I tested the code in Cursor. I went through many iterations before getting a working version. I then put the working ffmpeg code back into Bolt to continue building the project.
Eventually, the project became very monolithic. I tried to have Bolt refactor the code, but after several failed attempts that almost decimated the whole project, I pulled it into Cursor and had it perform the refactor. Initially, testing within Bolt's WebContainer worked fine, but as the project grew in size and scope, the container would fail. I suspect this was due to memory limits, since A/V processing is very resource intensive.
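One way to cope with the CDN problem mentioned above is to probe a list of candidate URLs for the wasm file and use the first one that responds. The sketch below is only an illustration of that idea, not the code VibeOn ended up with: `firstAvailable` and the `Probe` type are hypothetical, and the probe is injected so the logic can be tested without a network.

```typescript
// Minimal shape of what we need from fetch(): a response with an `ok` flag.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Return the first URL that responds successfully, trying candidates in order.
// Network errors and non-OK responses both count as "not available".
async function firstAvailable(urls: string[], probe: Probe): Promise<string> {
  const failures: string[] = [];
  for (const url of urls) {
    try {
      const res = await probe(url);
      if (res.ok) return url;
      failures.push(`${url}: response not ok`);
    } catch (err) {
      failures.push(`${url}: ${String(err)}`);
    }
  }
  throw new Error(`No candidate URL was reachable:\n${failures.join("\n")}`);
}
```

In the browser, the probe could be something like `(url) => fetch(url, { method: "HEAD" })`, assuming the CDN answers HEAD requests and allows cross-origin access. The URL that comes back is the one to hand to the ffmpeg loader.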
Challenges we ran into
I never intended for this project to take as much time as it did, but given the challenges, it took most of my available hackathon time. My final push was cut short when the free weekend tokens expired 1.5 hours early; buying extra tokens was no longer possible, and the only option offered was to upgrade. I was in the middle of refactoring the system when the tokens expired. That was a major challenge.
Bolt kept getting in its own way, wanting to make changes that I had explicitly told it not to make because it had already gone down that route.
Every browser handles video processing a little differently; what worked in one browser didn't work in another.
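Because recording support varies by browser, one common precaution is to ask the browser which container and codec combinations it can record before starting. The sketch below assumes the check is done with `MediaRecorder.isTypeSupported`, a standard static method; `pickMimeType` and the candidate list are illustrative and not taken from VibeOn.

```typescript
// Candidate formats in order of preference. This list is an illustrative assumption.
const CANDIDATES: string[] = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/webm",
  "video/mp4",
];

// Return the first candidate the browser says it can record, or undefined.
// `isSupported` is injected so the function can run outside a browser.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: string[] = CANDIDATES,
): string | undefined {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return undefined;
}
```

In the browser this would be called as `pickMimeType((t) => MediaRecorder.isTypeSupported(t))`, with the result passed as the `mimeType` option when constructing the `MediaRecorder`.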
I would get something working with Bolt, and then the next change would break something else.
It has definitely been a long road with the project.
A huge challenge that I was able to overcome, via a hack or workaround, involved the WebM format. WebM files are notorious for not saving video duration metadata, which is needed for merging videos. I was able to get around this through a lot of iteration. Since everything runs in the browser, converting the video was too resource intensive. In the video you'll see some NaNs due to this issue.
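NaNs like the ones mentioned above typically appear when a missing or non-finite duration is used in arithmetic or formatting. The sketch below shows one possible guard, not the workaround used in VibeOn: fall back to a duration measured while recording when the reported one is unusable, and show a placeholder rather than "NaN" in the UI. The function names are hypothetical.

```typescript
// Prefer the duration reported by the media element when it is usable;
// otherwise fall back to a duration measured while recording (in seconds).
function resolveDuration(reported: number, measuredSeconds: number): number {
  return isFinite(reported) && reported > 0 ? reported : measuredSeconds;
}

// Format seconds as m:ss, showing a placeholder instead of "NaN:NaN".
function formatDuration(seconds: number): string {
  if (!isFinite(seconds) || seconds < 0) return "--:--";
  const total = Math.floor(seconds);
  const mins = Math.floor(total / 60);
  const secs = total % 60;
  return `${mins}:${secs < 10 ? "0" : ""}${secs}`;
}
```

The measured value could come from timestamps taken when recording starts and stops, for example the difference between two `performance.now()` readings divided by 1000.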
Latest challenge: a deployment issue. The website isn't being displayed, so I might have to manually deploy the site to get it up and running for the judges.
Something else that is interesting: Firefox works better than any of the other browsers. I can merge a video and audio in Firefox, while the same merge fails in Brave and only partially completes in Edge.
Accomplishments that we're proud of
I'm proud that I have developed an easy-to-use audio/video editing project that I can continue to develop and improve.
I'm proud that I built a system and then used it to generate the videos for my project submissions.
What we learned
I learned a lot about audio and video processing, something I knew nothing about prior to this project. I also learned a lot about Bolt's abilities; I believe that I pushed it to its limits.
What's next for VibeOn
I'm going to continue to develop VibeOn by refactoring it to call on an ffmpeg server (I plan on creating one to run on a VPN I have).
I'm going to add more complex video editing capabilities while keeping them simple enough for anyone to use.
I would like to add an LLM that can scan the videos and create a script for ElevenLabs and do a merge from there.