The Story of Truss
Inspiration
The creator economy is huge, over $323B, but most of that value doesn't reach most creators. Roughly 73% of creators make less than $30,000 a year, and the gap usually isn't about talent. It's operational. Building an audience now means running content across Twitch, YouTube, TikTok, and Instagram at once, and the tools each platform gives you only work inside that platform. YouTube's auto-chapters don't help your TikTok clips. Twitch's clip tool doesn't touch your YouTube uploads.
So unless you're already in the top tier of creators who can afford a team, you end up paying for four different subscriptions and running between browser tabs mid-stream to watch chat, check metrics, and grab clips yourself. I wanted to build something that didn't care which platform you were on.
What it does
Truss is a platform-agnostic backend for creator content operations. It watches a live stream's chat and a video's audio/timing data, figures out which moments actually matter (engagement spikes, narrative beats), and turns those moments into ready-to-post vertical clips with captions, without making the creator redo that work separately for every platform they post to.
Concretely: chat across platforms streams into one dashboard instead of several browser tabs. When a clip-worthy moment is detected, the source video gets pulled, cropped to 9:16, captioned, and dropped into S3 automatically. In practice this has cut the post-production time I'd normally spend by around 80% and removed the need for several tools I used to pay for separately.
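To make the 9:16 step concrete, here is a minimal sketch of the crop geometry expressed as an FFmpeg `crop` filter string. It assumes a landscape source and a simple centered crop; the function name `verticalCropFilter` is hypothetical, and the actual pipeline may frame the subject differently.

```typescript
// Build an FFmpeg crop filter (crop=width:height:x:y) that cuts a centered
// 9:16 window out of a landscape frame. Centering is an assumption made for
// this sketch, not a description of how Truss picks the crop region.
function verticalCropFilter(srcWidth: number, srcHeight: number): string {
  // Keep the full height and derive the width from the 9:16 ratio.
  const rawWidth = Math.floor((srcHeight * 9) / 16);
  // Round down to an even width, since many encoders reject odd dimensions.
  const cropWidth = rawWidth - (rawWidth % 2);
  if (cropWidth > srcWidth) {
    throw new Error("source is narrower than 9:16; nothing to crop");
  }
  // Center the window horizontally; y stays 0 because height is unchanged.
  const x = Math.floor((srcWidth - cropWidth) / 2);
  return `crop=${cropWidth}:${srcHeight}:${x}:0`;
}
```

For a 3840x2160 source this yields `crop=1214:2160:1313:0`, which could be passed to FFmpeg's `-vf` option.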
How I built it
Frontend. I built the interface in Next.js with the App Router as a dense "studio control panel" layout, following a dark, hairline-bordered design system, and deployed it on Vercel. Live chat from multiple platforms streams into one panel over Server-Sent Events.
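As an illustration of the Server-Sent Events side, the wire format is small enough to sketch by hand. The `ChatMessage` shape and the `formatSseEvent` helper below are assumptions invented for this example, not the project's actual types.

```typescript
// Hypothetical normalized chat message; every field name here is an assumption.
type ChatMessage = {
  platform: "twitch" | "youtube" | "tiktok" | "instagram";
  user: string;
  text: string;
  sentAt: number; // epoch milliseconds
};

// Serialize one message as a Server-Sent Events frame.
// A frame is a group of "field: value" lines terminated by a blank line.
function formatSseEvent(eventName: string, msg: ChatMessage): string {
  // JSON.stringify escapes newlines inside strings, so the payload always
  // fits on a single "data:" line.
  const data = JSON.stringify(msg);
  return `event: ${eventName}\ndata: ${data}\n\n`;
}
```

On the server, frames like this would be written to a streaming response with `Content-Type: text/event-stream`; in the browser, an `EventSource` listener for the chosen event name would parse `event.data` back into a message.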
AI layer. For analysis, isolated audio/telemetry tracks get sent to a model through the Vercel AI SDK's generateObject with zod schemas, so the model returns structured JSON marking narrative boundaries instead of free text I'd have to parse myself.
Data layer: DynamoDB. Everything (creator config, daily analytics across platforms, live stream state, asset indices) lives in one DynamoDB table using single-table design, with partition and sort keys separating entity types. That means fetching a creator's full operational state is one query, not five separate lookups across tables.
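A minimal sketch of what such a key scheme could look like. The `PK`/`SK` attribute names, the `CREATOR#`, `ANALYTICS#`, `STREAM#`, and `ASSET#` prefixes, and all helper names are assumptions chosen for illustration; the writeup does not specify the real key layout.

```typescript
// Hypothetical single-table key scheme: one partition per creator, with the
// sort key prefix identifying the entity type.
type Key = { PK: string; SK: string };

const creatorPk = (creatorId: string): string => `CREATOR#${creatorId}`;

const configKey = (creatorId: string): Key => ({
  PK: creatorPk(creatorId),
  SK: "CONFIG",
});

// Platform first, then ISO date, so a begins_with condition on the sort key
// can return one platform's daily rows in date order.
const analyticsKey = (creatorId: string, platform: string, isoDate: string): Key => ({
  PK: creatorPk(creatorId),
  SK: `ANALYTICS#${platform}#${isoDate}`,
});

const streamStateKey = (creatorId: string, streamId: string): Key => ({
  PK: creatorPk(creatorId),
  SK: `STREAM#${streamId}`,
});

const assetKey = (creatorId: string, assetId: string): Key => ({
  PK: creatorPk(creatorId),
  SK: `ASSET#${assetId}`,
});

// Plain-object input for a DynamoDB Query that reads the whole partition,
// i.e. every entity type for one creator in a single request.
function creatorStateQuery(tableName: string, creatorId: string) {
  return {
    TableName: tableName,
    KeyConditionExpression: "PK = :pk",
    ExpressionAttributeValues: { ":pk": creatorPk(creatorId) },
  };
}

// Recover the entity type from a returned item's sort key.
function entityType(sk: string): string {
  const i = sk.indexOf("#");
  return i === -1 ? sk : sk.slice(0, i);
}
```

Because every item for a creator shares one partition key, the "one query" read falls out of the key design itself; narrowing to a single entity type only requires adding a sort key prefix condition.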
Processing pipeline. Two pipelines run in parallel:
- Live chat: messages get written to DynamoDB as they arrive. A DynamoDB Stream triggers a Lambda that tracks message velocity against a rolling baseline and flags a timestamp when chat spikes.
- Video: large 4K uploads bypass the edge network entirely. A Next.js Server Action generates an S3 presigned URL and the browser uploads straight to S3. Once a timestamp is flagged by either pipeline, a Step Function triggers a Lambda with a static FFmpeg binary baked in, which pulls the segment into /tmp, crops it to 9:16, adds captions, and writes the result back to S3.
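The chat-velocity check in the live chat pipeline can be sketched as a comparison of each time window against the average of the windows before it. This is an illustrative version only: the window bucketing, the `SpikeOptions` fields, and the thresholds are all assumptions, and it works on pre-counted windows rather than on DynamoDB Stream records.

```typescript
// Tuning knobs for the sketch; names and semantics are assumptions.
type SpikeOptions = {
  baselineWindows: number; // how many prior windows form the rolling baseline
  multiplier: number;      // how far above baseline counts as a spike
  minCount: number;        // floor that stops quiet chats from flagging noise
};

// counts[i] is the number of chat messages in the i-th fixed-size window,
// oldest first. Returns the indices of windows flagged as spikes.
function detectSpikes(counts: number[], opts: SpikeOptions): number[] {
  const spikes: number[] = [];
  counts.forEach((current, i) => {
    // Skip windows that do not yet have a full baseline behind them.
    if (i < opts.baselineWindows) return;
    const prior = counts.slice(i - opts.baselineWindows, i);
    const baseline = prior.reduce((sum, c) => sum + c, 0) / prior.length;
    if (current >= opts.minCount && current > baseline * opts.multiplier) {
      spikes.push(i);
    }
  });
  return spikes;
}
```

One tradeoff in this simple form: a flagged window still feeds into the baseline for the windows that follow it, which raises the bar right after a spike. That suppresses repeated flags for a single burst, but it can also hide a second moment that follows closely.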
Challenges I ran into
The main constraint was serverless limits versus video file size. Multi-gigabyte files don't play nicely with Lambda's time and memory caps, and it's easy to time out if you're not careful about what runs where.
Accomplishments that I'm proud of
Getting the single-table DynamoDB design to actually hold up under three different access patterns (creator config, analytics, live state) without falling back into relational habits took a few false starts, and I'm glad I stuck with it instead of just spinning up multiple tables. I'm also glad the OIDC setup worked cleanly. Vercel assumes AWS IAM roles directly at deploy time, so there are no static AWS access keys sitting in environment variables anywhere in the project. And practically: the 75% cut in Lambda execution time from splitting audio out of the video pipeline was bigger than I expected going in.
What I learned
Most of what stuck with me was about decoupling: separating "figure out what matters in this stream" from "process this video file" so neither one blocks the other. I also spent real time learning DynamoDB single-table design properly instead of bolting relational habits onto a NoSQL table, and picked up Vercel's OIDC integration with AWS along the way, which removed a class of credential management I didn't want to deal with manually.
What's next for Truss
The detection logic right now is tuned mainly around chat velocity; I'd like to add audio cues (laughter, tone shifts, volume spikes) so it can catch moments even in lower-chat-volume streams or in recorded video with no live chat at all. I also want to support direct publishing to each platform's API instead of just producing the final clip, and add a feedback loop where a creator marking a clip as good or bad actually improves future detection for their channel specifically.
Built With
- amazon-dynamodb
- nextjs
- react
- s3
- vercel
- vercelaisdk