Python in the Pipeline

Python plays a crucial role in the backend analysis for Trace AI. After a dance video is uploaded, pose-estimation data is extracted from it using TensorFlow.js. These pose estimates are stored as JSON files containing keypoints that represent human body joints at each frame of the video.

The core of our comparison process is Dynamic Time Warping (DTW), which measures the similarity between two sequences of dance movements. Below, I’ll walk you through how the code processes and compares choreography data.

1. Loading and Normalizing Pose Data

The first step in our analysis is to load the pose estimation data from the JSON files. Each file contains keypoints, such as the coordinates for the shoulder, elbow, wrist, hip, knee, and ankle.

import json

def load_pose_data(file_path):
    """Load pose data from a JSON file."""
    with open(file_path, 'r') as f:
        return json.load(f)

Once we load the data, we normalize the keypoints based on the torso length, which helps standardize the size of the person in the video and eliminates variations caused by different camera distances or body sizes.

import numpy as np

def normalize_keypoints(pose):
    """Normalize keypoints based on torso length and center the pose."""
    keypoints = {kp['name']: np.array([kp['x'], kp['y']]) for kp in pose}

    # Calculate the torso length
    if 'left_shoulder' in keypoints and 'left_hip' in keypoints:
        torso_length = np.linalg.norm(keypoints['left_shoulder'] - keypoints['left_hip'])
    else:
        torso_length = 1.0  # Default to 1 to avoid division by zero
    if torso_length == 0:
        torso_length = 1.0  # Guard against coincident shoulder/hip keypoints

    # Center the pose at the hip center
    if 'left_hip' in keypoints and 'right_hip' in keypoints:
        hip_center = (keypoints['left_hip'] + keypoints['right_hip']) / 2
    else:
        hip_center = np.array([0.5, 0.5])  # Default center

    # Normalize keypoints
    normalized_keypoints = []
    for kp in pose:
        coord = (np.array([kp['x'], kp['y']]) - hip_center) / torso_length
        normalized_keypoints.append({'x': coord[0], 'y': coord[1], 'name': kp['name']})

    return normalized_keypoints

2. Calculating Joint Angles

Next, we compute the angles between key joints, such as the elbows, shoulders, hips, and knees. These joint angles provide crucial information about the dance moves and enable a more detailed comparison between two dancers.

def calculate_joint_angles(pose):
    """Calculate angles between joints."""
    keypoints = {kp['name']: np.array([kp['x'], kp['y']]) for kp in pose}

    angles = {}

    # Example: Left Elbow Angle Calculation
    if all(k in keypoints for k in ['left_shoulder', 'left_elbow', 'left_wrist']):
        v1 = keypoints['left_shoulder'] - keypoints['left_elbow']
        v2 = keypoints['left_wrist'] - keypoints['left_elbow']
        angles['left_elbow_angle'] = angle_between_vectors(v1, v2)

    # Continue for other joints...

    return angles
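The snippet above relies on an `angle_between_vectors` helper that isn't shown here. A minimal sketch of one possible implementation, using the standard arccos-of-dot-product formula and returning degrees:

```python
import numpy as np

def angle_between_vectors(v1, v2):
    """Return the angle between two 2D vectors, in degrees."""
    norm_product = np.linalg.norm(v1) * np.linalg.norm(v2)
    if norm_product == 0:
        return 0.0  # Degenerate case: overlapping keypoints give a zero vector
    cos_angle = np.dot(v1, v2) / norm_product
    # Clamp to [-1, 1] to avoid NaN from floating-point drift
    cos_angle = np.clip(cos_angle, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```

A straight arm then reads as roughly 180°, a fully bent elbow as a small angle.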

These joint angles are then normalized and combined with the keypoint data to create a comprehensive feature vector representing each frame in the video.

3. Feature Extraction

We extract the features from each frame of the video, consisting of normalized keypoints and joint angles. This allows us to represent a dancer's movement in a structured format that can be compared frame-by-frame with another video.

def extract_features(video_data):
    """Extract feature sequence from video pose data."""
    features_sequence = []
    for frame in video_data:
        if not frame:
            continue  # Skip frames where no pose was detected
        pose = frame[0]  # Use the first detected person in the frame
        normalized_pose = normalize_keypoints(pose)
        angles = calculate_joint_angles(normalized_pose)
        feature_vector = []
        for kp in normalized_pose:
            feature_vector.extend([kp['x'], kp['y']])
        for angle in angles.values():
            feature_vector.append(angle / 180.0)  # Normalize angles
        features_sequence.append(feature_vector)
    return features_sequence

4. Comparing Sequences with Dynamic Time Warping (DTW)

To compare two sequences of dance movements, we use Dynamic Time Warping (DTW), which finds the optimal alignment between two time-dependent sequences, even if they have different lengths. DTW helps us handle variations in speed and timing between different dancers performing the same choreography.
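For intuition about what DTW buys us, here is a tiny, self-contained classic DTW on 1-D sequences (our pipeline uses the fastdtw library on full feature vectors instead; this toy version is just for illustration):

```python
import math

def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping between two 1-D sequences."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal cumulative distance aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# The same "move" danced at half speed still aligns with zero cost:
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3]))  # 0.0
```

This is exactly why DTW is a good fit for choreography: two dancers performing identical moves at different tempos still score as identical.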

from fastdtw import fastdtw
from scipy.spatial.distance import euclidean

def compare_sequences(sequence1, sequence2):
    """Compare two sequences using Dynamic Time Warping (DTW)."""
    distance, path = fastdtw(sequence1, sequence2, dist=euclidean)
    normalized_distance = distance / len(path)
    max_normalized_distance = 10.0  # Tunable cap; distances at or above it score 0
    similarity = max(0, 100 - (normalized_distance / max_normalized_distance) * 100)
    return similarity

After comparing the sequences, we return a similarity score, which indicates how much one video resembles another. A high similarity score suggests that one dance might be a copy of the other.
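To make the scoring concrete, the distance-to-percentage mapping used above can be isolated on its own (the 10.0 cap is the same tunable value from the code):

```python
def distance_to_similarity(normalized_distance, max_normalized_distance=10.0):
    """Map a per-step DTW distance to a 0-100 similarity percentage."""
    return max(0, 100 - (normalized_distance / max_normalized_distance) * 100)

# Identical sequences (distance 0) score 100; halfway to the cap scores 50;
# anything at or beyond the cap bottoms out at 0.
print(distance_to_similarity(0.0))   # 100.0
print(distance_to_similarity(5.0))   # 50.0
print(distance_to_similarity(25.0))  # 0
```

The linear mapping means the cap directly controls how forgiving the score is, which is why it needs tuning against real dance data.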

Challenges and Next Steps

One of the main challenges we faced was converting the 2D pose data from TensorFlow.js into a 3D mesh file compatible with Panda3D for more advanced analysis. Although we made significant progress on keypoint-based analysis, we are still working on fully integrating the mesh analysis with Panda3D.

Our next steps include refining the pose data analysis and enhancing the user interface, so dancers and users can easily track choreography authorship and originality. We are also exploring using augmented reality (AR) glasses to bring these 3D dance models to life.

If We Had More Time...

We would definitely have tweaked our GUI to dynamically tag uploaded videos with accurate credits and ownership acknowledgement. This could be done iteratively, comparing every JSON file uploaded to Firebase against every other file (but never against itself).
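That pairwise pass could be sketched with itertools.combinations, which enumerates every unordered pair and naturally skips self-comparisons. This is a hypothetical sketch, not shipped code; the stand-in similarity lambda below would be compare_sequences in the real pipeline:

```python
from itertools import combinations

def compare_all(feature_sets, compare_fn):
    """Score every unordered pair of videos, never a video against itself.

    feature_sets: dict mapping video id -> extracted feature sequence.
    compare_fn: similarity function (compare_sequences in the real pipeline).
    """
    scores = {}
    for id_a, id_b in combinations(feature_sets, 2):
        scores[(id_a, id_b)] = compare_fn(feature_sets[id_a], feature_sets[id_b])
    return scores

# Toy run with sequence-length difference as a stand-in similarity metric:
demo = {'vid1': [1, 2, 3], 'vid2': [1, 2, 3], 'vid3': [9]}
print(compare_all(demo, lambda a, b: 100 - abs(len(a) - len(b))))
```

For n uploads this is n·(n-1)/2 comparisons, so caching feature sequences (rather than re-extracting them per pair) would matter as the library grows.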
