InStep: Your Personal Dance Coach
## Problem
Learning complex dance routines from a video is fundamentally frustrating. Dancers struggle with two core issues:
- Lack of Objective Feedback: It’s difficult to quantitatively measure and identify where their technique, style, and flow break down, forcing reliance on subjective self-critique or expensive professional coaches.
- Synchronization Barrier: Manually syncing a practice video with a reference video is tedious and time-consuming, a necessary pre-step that introduces friction before practice even begins.

## Solution (InStep)
InStep is an analytical web tool that transforms the way dancers practice. It provides move-by-move, actionable critiques by comparing a user's dance performance directly against a reference video, focusing purely on movement and style, not body type.
Key Features:
- Move-by-Move Analysis: The system isolates the exact timestamps where the user's technique or style differs from the reference.
- Targeted Feedback: At each deviation, a comment appears with a specific critique (e.g., "Fix your posture," "Improve your flow") and tips for improvement.
- Data-Driven Score: Aggregates critiques into a quantifiable Numerical Score for tracking progress.
- Intuitive Video Viewing Page: Features include a scrubber with color-coded sections flagging moments of style deviation and a speed adjuster for detailed review.

## Technical Implementation
Two-part backend pipeline:
### Accurate Synchronization
Before any analysis, the two videos (user practice and reference) must be perfectly aligned. We implemented a robust, non-AI solution to guarantee millisecond-perfect sync:
- Discrete Cross-Correlation with FFT: We extract the audio track from each video and compute their discrete cross-correlation via the Fast Fourier Transform (FFT). The peak of the correlation identifies the exact millisecond offset between the two audio tracks, aligning their beats and rhythms.
- Dynamic Video Padding: To ensure a clean side-by-side display, the system automatically detects which video has a longer intro/outro and pads the shorter video with a black screen until the moment the synchronized dance starts/ends.
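The sync step above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the project's actual code: `estimate_offset` is a hypothetical name, and the returned offset is exactly the duration of black-screen padding the shorter video would need.

```python
import numpy as np

def estimate_offset(ref_audio: np.ndarray, user_audio: np.ndarray, sample_rate: int) -> float:
    """Return the offset (in seconds) by which user_audio starts later than
    ref_audio (negative if it starts earlier), via FFT cross-correlation."""
    la, lu = len(ref_audio), len(user_audio)
    n = la + lu - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, for a fast FFT
    # Cross-correlation in the frequency domain: IFFT(FFT(user) * conj(FFT(ref)))
    corr = np.fft.irfft(
        np.fft.rfft(user_audio, nfft) * np.conj(np.fft.rfft(ref_audio, nfft)), nfft
    )
    # Reorder the circular result into linear lags -(la-1) .. (lu-1)
    corr = np.concatenate((corr[-(la - 1):], corr[:lu]))
    lag = int(np.argmax(corr)) - (la - 1)
    return lag / sample_rate
```

At a 44.1 kHz audio sample rate, a one-sample lag resolution corresponds to roughly 0.02 ms, which is what makes millisecond-level alignment achievable from audio alone.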
### Multimodal Analysis
- Local Pose Analysis (Complete): We leverage MediaPipe Pose for movement analysis. By extracting 33 body landmarks from each pair of synced frames in the user's and reference videos, the system focuses strictly on movement and style, generating critiques from weighted Euclidean distance comparisons between the two poses.
- Bias Mitigation: A core design principle is not to critique based on differences in body type. We mitigate this by normalizing poses to a common centroid and comparing joint angles rather than absolute positions, ensuring fair evaluation across diverse body types.
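A minimal sketch of this comparison under the stated design (centroid normalization, joint angles, weighted Euclidean distance). `ANGLE_TRIPLETS`, `joint_angles`, and `pose_deviation` are hypothetical names; the landmark indices follow MediaPipe Pose's 33-point numbering (e.g., 11-13-15 is the left shoulder-elbow-wrist chain):

```python
import numpy as np

# Hypothetical selection of joint-angle triplets (MediaPipe Pose indices):
# left/right elbow, left/right knee.
ANGLE_TRIPLETS = [(11, 13, 15), (12, 14, 16), (23, 25, 27), (24, 26, 28)]

def joint_angles(pose: np.ndarray) -> np.ndarray:
    """pose: (33, 2) array of landmark (x, y). Returns one angle (radians) per triplet."""
    angles = []
    for a, b, c in ANGLE_TRIPLETS:
        v1, v2 = pose[a] - pose[b], pose[c] - pose[b]
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.array(angles)

def pose_deviation(user_pose: np.ndarray, ref_pose: np.ndarray, weights=None) -> float:
    """Body-size-invariant deviation between two poses."""
    # Centering on the centroid removes translation; comparing angles rather
    # than positions additionally removes scale, so limb length does not matter.
    user = user_pose - user_pose.mean(axis=0)
    ref = ref_pose - ref_pose.mean(axis=0)
    diff = joint_angles(user) - joint_angles(ref)
    w = np.ones(len(diff)) if weights is None else np.asarray(weights)
    return float(np.sqrt(np.sum(w * diff**2)))  # weighted Euclidean distance
```

Because angles are invariant to uniform scaling, a taller dancer and a shorter dancer performing the same move score the same deviation, which is exactly the bias-mitigation property described above.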
- Security: As the project will be pushed to a public repository, all sensitive credentials are secured via best practices (e.g., environment variables) to prevent public exposure. Currently, no external API keys are required since all analysis runs locally.

## Why InStep?
InStep directly addresses a painful user need with a two-pronged, technically elegant solution:
- Technical Depth: The combination of a high-fidelity, non-AI audio-sync method (FFT cross-correlation) and a state-of-the-art pose-analysis pipeline (MediaPipe) showcases both classical signal processing and modern computer vision.
- Clear Value Proposition: It provides a quantifiable and objective dance coaching experience, bridging the gap between practicing hard and practicing smart. It's a personal coach accessible to everyone, everywhere.