Inspiration
Every basketball fan has sat through a full game recording on their phone, pinching and zooming to follow the action. Meanwhile, professional sports media companies spend thousands on dedicated camera operators and editors to produce vertical clips for TikTok, Reels, and Shorts. As basketball (and social media) fanatics, we wanted to explore a way to take landscape broadcast footage straight to the billions of viewers online.
What it does
SnapDunk transforms standard landscape (16:9) basketball footage into dynamic portrait (9:16) video that intelligently follows the action.
Instead of naive center-cropping, it understands the game:
- Detects players and the ball in every frame using computer vision
- Scores each moment’s importance using motion signals like speed, acceleration, and energy spikes
- Labels play states (HIGHLIGHT, ACTION, POST, NORMAL) so the camera behaves differently depending on context
- Tracks a “hero” player (ball-handler) and smoothly pans the frame like a real cameraperson
- Outputs a ready-to-post vertical video uploaded directly to cloud storage
The result is automated sports content that feels professionally produced. You can also export the video as an .mp4 file, customize it in editing software, and publish it online.
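To make the difference from center-cropping concrete, here is a minimal sketch of the crop arithmetic, assuming a full-height 9:16 window that slides horizontally inside the 16:9 frame. The function name `portrait_crop` and its parameters are illustrative and are not taken from the SnapDunk code.

```python
def portrait_crop(frame_w: int, frame_h: int, center_x: float) -> tuple[int, int]:
    """Return the (left, right) pixel columns of a full-height 9:16 crop."""
    # Width of a 9:16 window that uses the full frame height.
    crop_w = frame_h * 9 // 16
    # Center the window on the point the camera should follow.
    left = int(center_x - crop_w / 2)
    # Clamp so the window never leaves the source frame.
    left = max(0, min(left, frame_w - crop_w))
    return left, left + crop_w


# A 1920x1080 frame yields a 607-pixel-wide window.
left, right = portrait_crop(1920, 1080, center_x=960)
```

A naive center crop always passes `center_x = frame_w / 2`. Following the action amounts to feeding in a smoothed target position instead.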
How we built it
We built SnapDunk as a full computer vision and video processing pipeline:
YOLOv8 (Ultralytics) for real-time detection of players and the basketball
A custom importance scoring system combining:
- Ball speed and acceleration
- Player motion
- Vertical ball position
- Motion spikes (delta + jerk) to detect highlights
- Adaptive EMA smoothing to react faster during high-energy moments
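The scoring idea can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the weights, the `spike` threshold, and the two alpha values are made up, and every input signal is assumed to be normalized to the range 0 to 1 beforehand.

```python
def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def raw_importance(ball_speed, ball_accel, player_motion, ball_height,
                   weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted sum of normalized motion signals (weights are assumed)."""
    signals = (ball_speed, ball_accel, player_motion, ball_height)
    return clamp01(sum(w * clamp01(s) for w, s in zip(weights, signals)))


def adaptive_ema(values, slow_alpha=0.1, fast_alpha=0.6, spike=0.2):
    """EMA that switches to a faster alpha when the input jumps away
    from the running average by more than `spike`."""
    smoothed, out = None, []
    for v in values:
        if smoothed is None:
            smoothed = v                      # seed with the first sample
        else:
            alpha = fast_alpha if abs(v - smoothed) > spike else slow_alpha
            smoothed += alpha * (v - smoothed)
        out.append(smoothed)
    return out
```

With a fixed small alpha, a sudden dunk would take many frames to register. Switching to the larger alpha on a spike lets the smoothed score catch up within a frame or two while staying calm during ordinary play.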
A play-state machine with adaptive thresholds and POST cooldown, giving the system narrative awareness
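A minimal sketch of such a state machine is shown below. It assumes that POST is a short hold that follows a HIGHLIGHT for a fixed number of frames; the writeup does not define POST precisely, so this reading, the function name `label_states`, and the threshold values are all assumptions.

```python
def label_states(scores, highlight_thr, action_thr, post_frames=3):
    """Label each frame's importance score with a play state."""
    states, cooldown = [], 0
    for s in scores:
        if s >= highlight_thr:
            state = "HIGHLIGHT"
            cooldown = post_frames        # arm the POST hold
        elif cooldown > 0:
            state = "POST"                # linger after a highlight
            cooldown -= 1
        elif s >= action_thr:
            state = "ACTION"
        else:
            state = "NORMAL"
        states.append(state)
    return states


states = label_states([0.1, 0.9, 0.2, 0.6, 0.6, 0.1],
                      highlight_thr=0.8, action_thr=0.5, post_frames=2)
# states == ["NORMAL", "HIGHLIGHT", "POST", "POST", "ACTION", "NORMAL"]
```

The cooldown keeps the camera from snapping back to neutral framing the instant a highlight's score drops.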
A hero tracking system that:
- Assigns ball possession to a player each frame
- Tracks bounding box centers instead of raw pixels
- Uses adaptive lerp rates, deadbands, and velocity-based lookahead
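Possession assignment could look something like the sketch below, which assumes boxes in `(x1, y1, x2, y2)` pixel format and picks the player whose box center is closest to the ball. The helper names and the `max_dist` cutoff are hypothetical.

```python
import math


def box_center(box):
    """Center of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def assign_possession(ball_xy, player_boxes, max_dist=150.0):
    """Return the index of the player nearest the ball, or None when the
    ball is missing or nobody is within max_dist pixels."""
    if ball_xy is None or not player_boxes:
        return None
    dists = [math.dist(ball_xy, box_center(b)) for b in player_boxes]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best if dists[best] <= max_dist else None
```

Returning `None` for a loose or undetected ball lets the caller fall back to the previous hero instead of jumping to an arbitrary player.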
OpenCV for frame processing and final rendering
A Flask API on Google Cloud Run for scalable processing
Google Cloud Storage for input/output handling
Challenges we ran into
Ball detection is unreliable in real footage
The basketball is small, fast, and often occluded. YOLO frequently misfires or misses frames entirely.
- We filtered detections, selected highest-confidence candidates, and handled missing frames gracefully.
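As a rough illustration of that handling, the sketch below keeps only the highest-confidence ball candidate per frame and holds the last known position across short gaps. The detection tuple format `(x, y, conf)`, the `min_conf` cutoff, and the `max_hold` window are assumptions made for the example.

```python
def pick_ball(detections, min_conf=0.3):
    """detections: list of (x, y, conf). Return the best (x, y) or None."""
    good = [d for d in detections if d[2] >= min_conf]
    if not good:
        return None
    x, y, _ = max(good, key=lambda d: d[2])
    return (x, y)


def fill_gaps(track, max_hold=5):
    """Hold the last known position for up to max_hold missing frames,
    then report None until the ball is detected again."""
    out, last, missing = [], None, 0
    for p in track:
        if p is not None:
            last, missing = p, 0
        else:
            missing += 1
            if missing > max_hold:
                last = None               # gap too long: stop trusting it
        out.append(last)
    return out
```

Holding the position for a few frames covers brief occlusions, while the cap keeps the camera from staring at a stale location after the ball has truly left view.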
Player identity is unstable across frames
YOLO does not provide consistent IDs, and detection order changes every frame.
- We implemented nearest-neighbor matching using spatial proximity instead of index tracking.
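One simple way to do this kind of matching is a greedy nearest-neighbor pass over box centers, sketched below. It is an illustrative version under assumptions (greedy rather than optimal assignment, and an invented `max_dist` gate), not necessarily the matching the project uses.

```python
import math


def match_players(prev_centers, curr_centers, max_dist=80.0):
    """Greedily pair previous and current box centers by distance.
    Returns {prev_index: curr_index}; unmatched players are omitted."""
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev_centers)
        for j, c in enumerate(curr_centers)
    )
    matches, used_prev, used_curr = {}, set(), set()
    for d, i, j in pairs:
        if d > max_dist:
            break                          # remaining pairs are even farther
        if i in used_prev or j in used_curr:
            continue
        matches[i] = j
        used_prev.add(i)
        used_curr.add(j)
    return matches


prev = [(100, 100), (400, 100)]
curr = [(405, 102), (98, 103), (900, 900)]   # detection order has changed
mapping = match_players(prev, curr)           # {0: 1, 1: 0}
```

Because the pairing depends only on positions, it is unaffected by the order in which the detector lists players in a given frame.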
Smooth camera movement without lag
Naively following the ball causes jitter, while heavy smoothing introduces lag.
- We built a multi-layer system with adaptive lerp rates, deadbands, max step sizes, and urgency boosts.
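The layers can be combined in a single per-frame update, roughly as in the sketch below. All parameter values and the `urgency` input (assumed to be a 0 to 1 value derived from the importance score) are illustrative assumptions.

```python
def step_camera(cam_x, target_x, urgency=0.0,
                deadband=20.0, base_lerp=0.08, max_lerp=0.35, max_step=40.0):
    """Advance the crop center one frame toward target_x."""
    error = target_x - cam_x
    # Deadband: ignore small offsets so the frame holds still.
    if abs(error) <= deadband:
        return cam_x
    # Adaptive lerp: blend toward a faster rate as urgency rises.
    urgency = max(0.0, min(1.0, urgency))
    rate = base_lerp + (max_lerp - base_lerp) * urgency
    # Max step: cap per-frame movement to avoid whip pans.
    step = max(-max_step, min(max_step, rate * error))
    return cam_x + step
```

The deadband removes jitter from small detection noise, the urgency-scaled rate reduces lag during fast breaks, and the step cap bounds how far the frame can move between consecutive frames.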
Hero flicker between players
When multiple players are near the ball, focus rapidly switches between them.
- We added temporal consistency rules to preserve the previous hero within a spatial threshold.
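A consistency rule of this kind might be sketched as follows, assuming players carry stable IDs from the matching step and that the previous hero is kept whenever they remain within `keep_radius` pixels of the ball. The names and the radius are hypothetical.

```python
import math


def select_hero(ball_xy, player_centers, prev_hero, keep_radius=120.0):
    """player_centers: {player_id: (x, y)}. Return the hero's id."""
    if not player_centers:
        return None
    # Sticky rule: keep the previous hero while they stay near the ball,
    # even if another player is momentarily a little closer.
    if prev_hero in player_centers:
        if math.dist(ball_xy, player_centers[prev_hero]) <= keep_radius:
            return prev_hero
    # Otherwise fall back to the player nearest the ball.
    return min(player_centers,
               key=lambda pid: math.dist(ball_xy, player_centers[pid]))
```

The hero changes only when the previous one has clearly moved away from the ball, which suppresses frame-to-frame switching in crowded plays.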
Accomplishments that we're proud of
- The camera behavior feels human-like, with smooth tracking and intelligent framing
- Importance scoring captures real basketball moments without hardcoded rules
- Fully automated pipeline from raw video to upload-ready portrait output
- Scalable deployment as a single Cloud Run endpoint
What we learned
- Sports footage is significantly harder than standard CV tasks due to small, fast, cluttered objects
- Good camera movement depends more on what you don’t move (deadbands, holds) than constant tracking
- Adaptive thresholds are critical — different clips require different baselines
- Combining multiple weak motion signals creates strong, reliable behavior
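To illustrate the point about adaptive thresholds, one common approach is to derive them from each clip's own score statistics rather than fixing them globally. The sketch below uses mean plus a multiple of the standard deviation; the multipliers and the function name are assumptions, not values taken from the project.

```python
import statistics


def clip_thresholds(scores, action_k=0.5, highlight_k=1.5):
    """Return (action_thr, highlight_thr) relative to this clip's baseline."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    return mean + action_k * std, mean + highlight_k * std


quiet_clip = [0.2, 0.2, 0.4, 0.4]
busy_clip = [0.6, 0.6, 0.8, 0.8]
quiet_thr = clip_thresholds(quiet_clip)   # roughly (0.35, 0.45)
busy_thr = clip_thresholds(busy_clip)     # roughly (0.75, 0.85)
```

The same relative spike counts as a highlight in both clips, even though their absolute score levels differ.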
What's next for SnapDunk
- Improving ball detection: in some frames the model still fails to pick out the basketball
- Audio analysis to incorporate crowd and commentary excitement
- Expanding to other sports (soccer, football, tennis)
- Automatic highlight reels from HIGHLIGHT + ACTION segments
- Real-time processing for live game feeds
- Custom overlays like scoreboards, player labels, and replay markers

