IMAGINE FRAMES

Inspiration

Deep fake technology is innovative, but its rapid rise has wasted no time in threatening trust and integrity in digital media. We set out to develop a viable way to detect face-swap deep fakes because of their growing prevalence in fraud, misinformation, and more. The goal is to protect people, institutions, and society from malicious use of AI-generated content.

What it does

Our AI-powered system detects face-swap deep fakes and helps authenticate videos by analyzing subtle inconsistencies that are nearly imperceptible to the human eye. Even high-quality deep fakes can be caught when they look unnatural, are lit inaccurately, or contain pixel-level artifacts. The system is geared towards law enforcement teams, cybersecurity teams, and media professionals who need an accurate, real-time way to confirm video authenticity.

How we built it

We built a hybrid model that combines Convolutional Neural Networks for facial feature detection, Recurrent Neural Networks for temporal analysis, and Frequency Domain Analysis for pixel-level artifacts. We also added Audio-Visual Analysis to detect lip-sync errors and ambient sound inconsistencies. The system is trained on prominent datasets such as FaceForensics++ and DeepFake Detection, and is implemented with TensorFlow, PyTorch, OpenCV, and Librosa.
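
To make the frequency-domain idea concrete, here is a minimal sketch, assuming a single grayscale frame stored as a NumPy array. It measures how much of the frame's spectral energy lies outside a low-frequency disc; face-swap blending can shift this balance. The function name `high_frequency_ratio` and the `radius_fraction` parameter are hypothetical and chosen only for illustration, and this is not necessarily the feature the project itself uses.

```python
import numpy as np


def high_frequency_ratio(gray_frame: np.ndarray, radius_fraction: float = 0.25) -> float:
    """Share of spectral energy outside a low-frequency disc (illustrative feature)."""
    # 2D FFT with the zero-frequency (DC) bin shifted to the centre of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(gray_frame.astype(np.float64)))
    power = np.abs(spectrum) ** 2

    rows, cols = gray_frame.shape
    yy, xx = np.ogrid[:rows, :cols]
    # Distance of every frequency bin from the centre of the shifted spectrum.
    dist = np.hypot(yy - rows // 2, xx - cols // 2)
    # Assumed cutoff: a fraction of the largest radius that fits inside the frame.
    cutoff = radius_fraction * min(rows, cols) / 2

    total = power.sum()
    if total == 0:
        return 0.0  # an all-black frame carries no energy at all
    return float(power[dist > cutoff].sum() / total)


# A flat frame has all its energy at DC; random noise spreads it across frequencies.
flat = np.full((64, 64), 128.0)
noisy = np.random.default_rng(0).standard_normal((64, 64))
print(high_frequency_ratio(flat), high_frequency_ratio(noisy))
```

In a full pipeline, a ratio like this would be one input feature among many rather than a decision on its own, and any threshold would have to be learned from labelled data.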

Challenges we ran into

Accurately detecting high-quality deep fakes was one of the most challenging aspects, since subtle manipulations often slip past traditional detection techniques. Reaching real-time performance without sacrificing accuracy was another, which we addressed by optimizing our models. The next hurdle was combining audio and visual analysis coherently, so that discrepancies in lip sync and background noise are flagged correctly alongside visual artifacts.
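
One simple way to picture a lip-sync check is to compare, frame by frame, how open the mouth is with how loud the audio is. The sketch below assumes both signals have already been extracted and aligned to the same frame rate. The helper names `pearson` and `lip_sync_mismatch` and the 0.3 threshold are hypothetical, and correlation is a stand-in technique for illustration rather than the project's actual audio-visual model.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lip_sync_mismatch(mouth_opening: Sequence[float],
                      audio_energy: Sequence[float],
                      threshold: float = 0.3) -> bool:
    """Flag a clip when mouth movement and audio loudness barely move together."""
    # The threshold is an assumed placeholder, not a tuned value.
    return pearson(mouth_opening, audio_energy) < threshold


# Per-frame mouth opening (arbitrary units) and two candidate audio envelopes.
mouth = [0.1, 0.5, 0.9, 0.4, 0.1, 0.6]
in_sync_audio = [2 * m + 1 for m in mouth]   # rises and falls with the mouth
out_of_sync_audio = [-m for m in mouth]      # moves in the opposite direction
print(lip_sync_mismatch(mouth, in_sync_audio))
print(lip_sync_mismatch(mouth, out_of_sync_audio))
```

A real system would also need to tolerate small time offsets between the audio and video tracks, which this sketch ignores.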

Accomplishments that we're proud of

We're proud to have built a system that detects face-swap deep fakes with good accuracy. Our hybrid model performed well at recognizing even the most subtle manipulations, and our reporting tool gives a detailed account of the abnormalities it finds. The solution is also scalable and can adapt to new deep fake technologies, offering value in the long run.

What we learned

This project taught us that integrating multiple detection techniques makes for a more robust system. Relying on visual analysis alone is not sufficient; audio and temporal analysis are crucial for high detection accuracy. We also learned how to optimize deep learning models for real-time applications, balancing speed and accuracy.
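
The value of combining detectors can be sketched with a simple late-fusion step: each analysis produces a fake probability, and a weighted average gives the final score. The function `fuse_scores`, the detector names, the weights, and the example scores below are all hypothetical values chosen for illustration, not measurements or settings from the project.

```python
from typing import Mapping


def fuse_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-detector fake probabilities, each in [0, 1]."""
    # Only detectors that have both a score and a positive weight take part.
    used = [name for name in scores if weights.get(name, 0.0) > 0]
    if not used:
        raise ValueError("no detector has both a score and a positive weight")
    total_weight = sum(weights[name] for name in used)
    return sum(scores[name] * weights[name] for name in used) / total_weight


# Made-up scores for one clip: the visual detector alone leans towards "real".
scores = {"visual": 0.2, "temporal": 0.8, "audio": 0.9}
weights = {"visual": 0.5, "temporal": 0.25, "audio": 0.25}

visual_only = fuse_scores({"visual": scores["visual"]}, weights)
combined = fuse_scores(scores, weights)
print(visual_only, combined)
```

With these made-up numbers, the visual score on its own falls below 0.5 while the fused score rises above it, which is the kind of case where temporal and audio cues change the outcome.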

What's next for Imagine Frames

Our next step is to extend the system with more biometric analysis, such as detecting micro-expressions and other physiological cues. We also intend to keep updating the system to counter new deep fake techniques as they emerge. Further down the line, we plan to broaden its use to video conferencing and live streaming authentication.