The Problem

Every night, millions of people open Twitch or YouTube and spend 20 minutes scrolling through live streams looking for something worth watching. By the time they find it, the best moment has already passed.

Platforms rank streams by total viewer count. That means the same ten huge streamers dominate the top forever, while a 500-viewer streamer having the most exciting moment of their career stays invisible. Viewer count is not a signal of excitement. It is a signal of existing fame.

We needed a better signal. So we built one.


What We Built

StreamPulse is a real-time live stream discovery platform that monitors Twitch, YouTube, and Bilibili simultaneously and surfaces the exact moment something incredible is happening — right now.

The hype score algorithm combines four signals:

  • Viewer spike — how fast the audience is growing relative to that stream's own historical baseline
  • Chat velocity — how fast messages are moving, modeled using power-law scaling from viewer counts
  • Clip rate — how many viewers are saving this moment
  • Donation surges — financial engagement spikes

Each signal is normalized against the stream's own history using standard deviation and z-scores. A 500-viewer streamer having a breakout moment can outscore a 50,000-viewer streamer with a stable audience. When multiple signals fire simultaneously, a synergy multiplier amplifies the score — because that combination reliably indicates a genuinely viral moment.

AWS Bedrock AI analysis triggers automatically when a stream's hype score crosses 60. Claude Haiku analyzes the real stream data — viewer delta, chat velocity, clip rate, hype score, peak reason — and writes a two-sentence explanation of why this stream is exploding right now. Not a template. Real AI on real numbers. The analysis appears on the stream card and detail page within three seconds.

Multi-platform support:

  • Twitch (100 streams via Helix API)
  • YouTube Live (50 streams via Data API v3 with 3-key rotation)
  • Bilibili (100 streams via public API, no auth required — includes a server-side image proxy to handle their CDN's Referer policy)

Anti-doomscroll session tracker. The sidebar includes a session tracker where users set an intention (what they want to watch) and a time budget. The bar counts down in real time. When it hits zero, the product tells you: you found the best moments — now go watch, not scroll. The goal of StreamPulse is to make you leave StreamPulse.


Why DynamoDB

We chose DynamoDB over Aurora because of the access pattern. A real-time leaderboard has an extreme read-to-write imbalance — we write leaderboard data every 15 seconds but potentially serve millions of concurrent readers. DynamoDB's Global Tables give us active-active multi-region writes across us-east-1, eu-west-1, and ap-southeast-1 without a primary-replica bottleneck.

Five tables, all in production:

Table Purpose TTL
SP_GlobalLeaderboard Top 100 streams per category, ranked by hype score 60 seconds
SP_StreamSnapshots Stream state + hype history for cold-start persistence 1 hour
SP_StreamPeaks Historical peak events with time-bucket GSI 7 days
SP_UserSessions User session storage 24 hours
SP_WsConnections WebSocket connection registry 1 hour

State persistence solves a real serverless problem. Vercel functions are stateless — hype history would reset on every cold start. We write the last 12 snapshots and hype history for the top 100 streams to DynamoDB after every ingest cycle. On cold start, we load that state back so hype scores accumulate correctly across function invocations. DynamoDB is the memory of our stateless infrastructure.


Technical Architecture

Ingest pipeline (every 15 seconds):

  1. Fetch all three platforms in parallel (Twitch, YouTube, Bilibili)
  2. Load persisted state from DynamoDB (restores hype history)
  3. Upsert streams into in-memory state map
  4. Compute hype scores using 4-signal algorithm
  5. Detect peaks (hype ≥ 60) → trigger AWS Bedrock
  6. Write leaderboards to DynamoDB (6 categories × top 100)
  7. Write peaks to DynamoDB
  8. Persist top 100 stream states back to DynamoDB

Read path:

  • SSE stream pushes fresh leaderboard to every connected browser every 15 seconds
  • In-memory read for performance; DynamoDB as authoritative fallback on cold start
  • Vercel edge network with maxDuration=300 handles persistent SSE connections

Scale design (beyond MVP):

  • DAX cluster (r5.large × 3) in front of GlobalLeaderboard: absorbs 99% of reads at microsecond latency
  • Lambda fan-out (10 workers × 10,000 streams each): scales ingest to 100,000+ streams
  • DynamoDB on-demand: auto-scales to any write volume
  • Current MVP cost: ~$3/day. At 10M users with DAX: ~$2,400/month ($0.00024/user/month)

AWS Services Used

  • Amazon DynamoDB — Global Tables, on-demand capacity, TTL, GSI-based queries
  • AWS Bedrock — Claude Haiku 4.5 for AI peak analysis
  • AWS IAM — credential management for Vercel → AWS access

What We Learned

Building this taught us something important about real-time systems on serverless infrastructure: statefulness is not a property of the language or framework — it is a property of where you persist. DynamoDB's BatchWrite and BatchGet patterns made it possible to persist and restore hundreds of stream histories in under two seconds, which is what made the hype algorithm actually work across cold starts.

We also learned that real data is messier than simulated data in useful ways. The hype algorithm had to be recalibrated several times because top streamers with huge stable audiences kept scoring lower than smaller streamers with volatile viewership. That recalibration — adding an audience bonus that scales logarithmically with viewer count — produced a leaderboard that feels intuitively correct, which simulated data would never have exposed.


What's Next

  • Lambda fan-out for ingest (10 workers, 100K+ streams)
  • DynamoDB Streams → Lambda for real-time peak detection without polling
  • DAX cluster for microsecond leaderboard reads at scale
  • Real chat velocity from Twitch IRC and YouTube Live Chat API
  • Kick.com integration (pending workaround for Cloudflare IP blocking)

Built With

Share this project:

Updates