Inspiration

Have you ever noticed how people record videos of events simultaneously, yet rarely share their unique perspectives with others? These videos capture different camera angles, including fun moments and close-to-the-action shots that others may have missed.

What if there was a way to transform these individual recordings into a shared experience?

With the proliferation of recording devices and applications, the need for efficient and effective content enhancement tools is more important than ever. Let's solve that.

Imagine harnessing the power of user-generated content as additional camera angles, allowing everyone to contribute and create a collective, immersive experience. Say hello to a new era of event sharing, where every perspective counts and unforgettable moments are brought to life through a collaborative lens.

What it does

The idea is to synchronize all those videos taken from the crowd and combine them to create new forms of video content like mashups, multiple angles of live performances, voice dubbing, etc.

Specifically, this could solve:

  • Restoration: Sync-o-Matic could help restore degraded or damaged audiovisual content by aligning it with other instances of the same event.
  • Remixing: By aligning multiple instances of the same event, Sync-o-Matic allows for creative remixing of audiovisual content.
  • Remastering: Improve the quality of your content by aligning it with higher-quality instances of the same event.

How it works

Sports teams or stadiums could have a dedicated place for fans to upload their user-generated content (e.g., https://gsw.sync-o-matic.com). Fans upload their videos, and a video player for the recorded event appears with all available camera views, synchronized by timecode. Fans can watch the video and switch camera angles at any time.

Another use case is a video mashup of fans singing a song at the stadium. Sync-o-Matic creates a single video that time-aligns all submissions into one fun experience.

See: https://drive.google.com/file/d/12x7LBKePowCg20NBpA1RdmFrruInS6rq/view?usp=drive_link

How we built it

Sync-o-Matic works by aligning multiple unsynchronized videos via forced time alignment and classification algorithms. It exploits multiple time-based audio and video streams that describe the same audiovisual event, using fingerprinting together with Fast Fourier Transform (FFT) cross-correlation or Sequential Monte Carlo (SMC) samplers.
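To make the FFT-based approach concrete, here is a minimal sketch (not the project's actual code; the function name and signature are hypothetical) of estimating the time offset between two recordings of the same event: cross-correlate the audio tracks via FFT and convert the peak lag to seconds.

```python
import numpy as np

def estimate_offset(a, b, sr):
    """Estimate how many seconds later recording b started relative to
    recording a, via FFT-based cross-correlation of their audio samples."""
    n = len(a) + len(b) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, avoids circular aliasing
    A = np.fft.rfft(a, nfft)
    B = np.fft.rfft(b, nfft)
    xcorr = np.fft.irfft(A * np.conj(B), nfft)
    # Rearrange so lags run from -(len(b)-1) to len(a)-1
    xcorr = np.concatenate((xcorr[-(len(b) - 1):], xcorr[:len(a)]))
    lag = np.argmax(xcorr) - (len(b) - 1)
    return lag / sr
```

A later recording shows up as a positive lag at the correlation peak, which is exactly the offset the reconstruction step needs to line the videos up.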

Challenges we ran into

Mostly time constraints and testing. I came up with this idea last night and coded a rough prototype this morning that calculates time differences. It still needs a front end for video upload and processing. Currently, it's a local processor that takes video inputs and outputs time-alignment data for reconstruction.

Accomplishments that we're proud of

Finding a fast way to calculate the cross-correlation between audio signals without processing the entire file. Processing must be efficient and avoid costly computation.
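One way to avoid correlating full-length files (a sketch of the general idea, not necessarily the exact shortcut used in the prototype) is to run the FFT cross-correlation on decimated audio first, which shrinks the transforms by the decimation factor. This assumes the offset is coarse enough to survive naive sample skipping; a proper implementation would low-pass filter before decimating and refine the estimate at full resolution.

```python
import numpy as np

def coarse_offset(a, b, sr, decimate=8):
    """Coarse time-offset estimate on decimated audio.
    Naive decimation (no anti-alias filter) -- illustration only."""
    a_d, b_d = a[::decimate], b[::decimate]
    n = len(a_d) + len(b_d) - 1
    nfft = 1 << (n - 1).bit_length()
    xc = np.fft.irfft(np.fft.rfft(a_d, nfft) * np.conj(np.fft.rfft(b_d, nfft)), nfft)
    xc = np.concatenate((xc[-(len(b_d) - 1):], xc[:len(a_d)]))
    lag = (np.argmax(xc) - (len(b_d) - 1)) * decimate  # back to original rate
    return lag / sr
```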

What we learned

Several strategies exist for forced alignment. I'm interested in training custom classifiers (Hidden Markov models) with Viterbi forced alignment. That kind of alignment is explicitly aware of the durations of musical notes or phase changes in the audio. These strategies would improve the accuracy and robustness of the processor.
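For context, the Viterbi step at the heart of that approach can be sketched in a few lines (a toy decoder, not something implemented in this project): given per-frame log-likelihoods for each HMM state, it recovers the most likely state path, and a left-to-right transition matrix turns that path into an alignment.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most-likely state path through an HMM (toy sketch).
    log_init: (S,) initial log-probs; log_trans: (S, S) transition
    log-probs; log_emit: (T, S) per-frame emission log-likelihoods."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]           # best log-prob ending in each state
    back = np.zeros((T, S), dtype=int)       # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # scores[i, j]: from i to j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_emit[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```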

What's next for Sync-o-Matic

Release the kraken API.
