Inspiration
I first heard the idea on a podcast about micro-SaaS ideas, and after seeing the limitations of existing subtitle apps, I was convinced that my app could significantly improve the subtitle generation process.
Luckily, I had already played around with Remotion prior to this hackathon, so I had a rough idea of how to build the app. However, I was still completely new to the Canva SDK and AWS.
How I built it
The frontend uses the App UI Kit components to create a native look and feel. One custom component I am very proud of is my preview component, which provides a 1:1 preview of the video's subtitle style and layout. It's a div with the same width and height as the video, scaled down to fit the app's space. This way, I can show exactly how the subtitles will look inside the video.
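The scaling idea behind the preview can be sketched roughly like this. It is a minimal illustration, not the app's actual code: the names `Size`, `fitPreview`, and `previewStyle` are hypothetical, and I am assuming the div is shrunk with a CSS transform so that subtitle sizes and positions stay in the video's own pixel coordinates.

```typescript
// Hypothetical names; a sketch of the approach, not the app's real component.
interface Size {
  width: number;
  height: number;
}

interface PreviewLayout {
  scale: number; // factor applied to the full-size div
  width: number; // on-screen width after scaling
  height: number; // on-screen height after scaling
}

// Pick the largest uniform scale at which the video-sized div still fits
// inside the available container space.
function fitPreview(video: Size, container: Size): PreviewLayout {
  if (video.width <= 0 || video.height <= 0) {
    throw new RangeError("video dimensions must be positive");
  }
  const scale = Math.min(
    container.width / video.width,
    container.height / video.height,
  );
  return {
    scale,
    width: video.width * scale,
    height: video.height * scale,
  };
}

// Inline style for the preview div: it keeps the video's real pixel size
// and is shrunk visually, so subtitle styles can use the same values that
// the final render uses.
function previewStyle(video: Size, scale: number): Record<string, string> {
  return {
    width: `${video.width}px`,
    height: `${video.height}px`,
    transform: `scale(${scale})`,
    transformOrigin: "top left",
  };
}
```

For a 1920x1080 video in a 480px-wide panel, `fitPreview` gives a scale of 0.25, so a 64px subtitle font is drawn at 16px on screen while the style values themselves stay untouched.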
I am using S3 to save the users' fonts and my PostgreSQL database to store the presets as JSON. The video rendering and audio extraction are done using Remotion, and the transcription is created with Whisper.
To keep the progress display up to date, I save the render progress in Upstash Redis, and the frontend fetches it at a fixed interval.
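A fixed-interval polling loop of this kind could look roughly like the sketch below. The names `pollProgress`, `clampProgress`, and `FetchProgress` are hypothetical, and the fetcher is injected rather than hard-coded because the real endpoint and the shape of the stored progress value are not described here; I am assuming progress is a number between 0 and 1.

```typescript
// Hypothetical names; the real endpoint and payload are assumptions.
type FetchProgress = () => Promise<number>;

// Normalize whatever the backend returns into the range [0, 1].
function clampProgress(value: number): number {
  return Number.isFinite(value) ? Math.min(1, Math.max(0, value)) : 0;
}

// Ask for the progress, report it, wait a fixed interval, repeat.
// Stops once the progress reaches 1.
async function pollProgress(
  fetchProgress: FetchProgress,
  onUpdate: (progress: number) => void,
  intervalMs = 1000,
): Promise<void> {
  for (;;) {
    const progress = clampProgress(await fetchProgress());
    onUpdate(progress);
    if (progress >= 1) {
      return;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In the frontend, `fetchProgress` would wrap a request to an API route that reads the value from Redis, and `onUpdate` would set component state for the progress bar.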
Challenges I ran into
I had some problems trying to extract the audio file from the video. I tried using fluent-ffmpeg but didn't manage to get it running on the server, so I extracted the audio using Remotion.
Another issue, which caused me to redesign my entire backend, was discovering that API routes time out after 30 seconds (which makes sense in hindsight). Although this was frustrating, it pushed me to make the generation process independent of the client. To make this work, I also had to create my own Lambda function that handles the transcript generation outside the 30-second limit. This was challenging for me, as I had never used AWS Lambda before, but I managed to get it running in the end. Remotion also renders on Lambda, but there everything is handled for me.
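The pattern of decoupling a long job from the request that starts it can be sketched as follows. This is only an illustration under assumptions: `startJob`, `JobStatus`, and `InMemoryStore` are hypothetical names, the in-memory map stands in for Redis, and `runWorker` stands in for whatever actually triggers the long-running work (in my case a Lambda function). The point is that the starting call returns a job ID right away, and the client learns about the outcome by reading the store later.

```typescript
// Hypothetical names; the Map stands in for Redis and runWorker for the
// long-running worker. None of this is the app's real code.
type JobState = "queued" | "running" | "done" | "failed";

interface JobStatus {
  state: JobState;
  progress: number; // 0..1
}

class InMemoryStore {
  private jobs = new Map<string, JobStatus>();

  set(jobId: string, status: JobStatus): void {
    this.jobs.set(jobId, status);
  }

  get(jobId: string): JobStatus | undefined {
    return this.jobs.get(jobId);
  }
}

type Worker = (
  jobId: string,
  report: (progress: number) => void,
) => Promise<void>;

let nextJobNumber = 0;

// Registers the job, kicks off the worker without awaiting it, and returns
// immediately, so the caller never waits for the long-running work.
function startJob(store: InMemoryStore, runWorker: Worker): string {
  const jobId = `job-${++nextJobNumber}`;
  store.set(jobId, { state: "queued", progress: 0 });

  void runWorker(jobId, (progress) =>
    store.set(jobId, { state: "running", progress }),
  )
    .then(() => store.set(jobId, { state: "done", progress: 1 }))
    .catch(() => store.set(jobId, { state: "failed", progress: 0 }));

  return jobId;
}
```

In a serverless setup the worker cannot simply keep running inside the API route after the response is sent, which is why the work has to move into a separate function with its own, longer time limit.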
What I learned
- AWS Lambda, S3, Amplify
- Canva's Apps SDK
- More use cases of Redis
What's next for Styled Subtitles
I think the app is already quite usable, but there will probably be a lot of small bugs that I'll need to fix over the next couple of weeks after the app has launched. If I find the time, I will also implement feature requests from users.
Another feature idea I had, which would take a bit more time, is the ability to manually edit the generated subtitles to fix mistakes made by Whisper and to ensure the subtitles are generated exactly the way the user wants.
Built With
- lambda
- nextjs
- postgresql
- react
- redis
- remotion
- s3
- typescript
- whisper