Vision by VRAM

Prediction.chat: Transforming Twitch Streams into Betting Arenas

Inspiration

The idea came during a Twitch stream where fans kept saying "bet you can't land that backflip!" We saw untapped potential, and posts from gamers asking "Why no live bets on speedruns?" confirmed it. We wanted to make every stream more exciting, so that creators earn more and viewers stay longer.

What it does

Our VRAM vision module detects key moments in Twitch streams in real time. It identifies specific actions such as backflips and gaming achievements, and it recognizes streamers such as IShowSpeed. This computer vision layer is the verification foundation for the betting platform: it analyzes video frames and returns confidence scores with timestamps.

How we built it

We built VRAM on Amazon Rekognition and Google Vision APIs, with our own services running on AWS EC2. The system uploads video segments to S3, sends them to both vision services in parallel, and combines the results using Amazon Bedrock to reach a final decision. We implemented frame caching and defined custom confidence thresholds for different action types. For streamer recognition, we built a face search system that identifies personalities and reports the timestamps of their appearances.
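The parallel fan-out described above could look roughly like the sketch below. The two detector functions are hypothetical placeholders that return canned labels; in a real pipeline they would wrap the Rekognition and Google Vision calls. The `Detection` type, the function names, and the 0-100 confidence scale are all assumptions made for illustration, not the project's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    source: str        # which service produced the label
    label: str         # e.g. "Jumping"
    confidence: float  # assumed to be normalized to 0-100
    timestamp_ms: int  # offset into the video segment


# Hypothetical stand-ins for the real API wrappers; they return canned data.
def detect_with_rekognition(segment_key: str) -> list[Detection]:
    return [Detection("rekognition", "Sport", 90.0, 1200)]


def detect_with_google_vision(segment_key: str) -> list[Detection]:
    return [Detection("google", "Jumping", 72.0, 1250)]


def analyze_segment(segment_key: str) -> list[Detection]:
    """Run both detectors concurrently and merge their results by time."""
    detectors = (detect_with_rekognition, detect_with_google_vision)
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = [pool.submit(fn, segment_key) for fn in detectors]
        merged = [d for f in futures for d in f.result()]
    return sorted(merged, key=lambda d: d.timestamp_ms)
```

Threads suit this step because each detector spends most of its time waiting on a network response, so the total wait is close to that of the slowest service, not the sum of both.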

Challenges we ran into

Getting accurate event detection was harder than expected. Amazon Rekognition often returns a generic "Sport" label (90% confidence) instead of a specific action such as "Jumping." We solved this by creating two detection categories: strict indicators with a 65% threshold and loose indicators with an 85% threshold. Processing speed was also critical. We reduced API latency from 200 ms to 50 ms through async polling and optimized S3 uploads.
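A minimal sketch of the two-category check follows. The 65% and 85% thresholds come from the description above; the reading that strict indicators are specific action labels and loose indicators are generic ones is an assumption, and the label sets are hypothetical examples for a backflip bet.

```python
# Thresholds from the write-up; which labels go in which set is an assumption.
STRICT_THRESHOLD = 65.0
LOOSE_THRESHOLD = 85.0

# Hypothetical label sets for a "backflip" bet.
STRICT_INDICATORS = {"Jumping", "Flip"}
LOOSE_INDICATORS = {"Sport", "Acrobatic"}


def event_detected(labels: dict[str, float]) -> bool:
    """Return True if any label clears the threshold for its category."""
    for label, confidence in labels.items():
        if label in STRICT_INDICATORS and confidence >= STRICT_THRESHOLD:
            return True
        if label in LOOSE_INDICATORS and confidence >= LOOSE_THRESHOLD:
            return True
    return False
```

Under these assumptions, `event_detected({"Jumping": 70.0})` passes on the strict rule, while `event_detected({"Sport": 80.0})` fails because a generic label has to clear the higher bar.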

Accomplishments that we're proud of

Our vision module detects complex movements with 95% accuracy by combining multiple AI services. We can reliably identify popular streamers such as IShowSpeed, including the exact timestamps of their appearances in videos. The system processes video in near real time and returns verification results in under 5 seconds, which is essential for a responsive betting experience.

What we learned

Computer vision APIs have different strengths: Rekognition excels at general labels, while Google's API is better at catching specific actions. We learned to interpret confidence scores in context instead of applying one fixed threshold across all categories. The key insight was to build a reasoning layer that interprets these signals together, so that no single detection decides the outcome.
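To illustrate what interpreting signals together can mean, here is a deliberately simple stand-in for a reasoning layer. It is not the Bedrock-based approach the project used: it fuses per-service confidences with a noisy-OR rule (which assumes the detectors err independently) and compares the result with a per-category threshold. The category names and threshold values are hypothetical.

```python
from math import prod

# Hypothetical per-category thresholds on the fused score (0-1 scale).
CATEGORY_THRESHOLDS = {"backflip": 0.80, "streamer_face": 0.95}


def fuse_confidences(scores: list[float]) -> float:
    """Noisy-OR: chance that at least one independent detector is right."""
    return 1.0 - prod(1.0 - s for s in scores)


def verify(category: str, scores: list[float]) -> bool:
    """Compare the fused score with the threshold for this category."""
    return fuse_confidences(scores) >= CATEGORY_THRESHOLDS[category]
```

With these numbers, two moderate signals of 0.6 and 0.7 fuse to about 0.88. That is enough to confirm a backflip at the 0.80 threshold but not enough to confirm a streamer's identity at 0.95.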

What's next for Prediction.chat

We're focusing on improving our vision module by training custom models for streaming-specific events that standard APIs miss. We'll expand detection capabilities to game-specific achievements by identifying in-game UI elements. We're also working on reducing verification time to 3 seconds and adding multi-feed support for tournaments. The vision system will eventually integrate with game APIs for even more accurate verification.