RealFakeNews

Prototype
***BREAKING NEWS***

Inspiration

Print media is boring!

Is print media still a thing?

Who would read walls of text when the information fits into a 10-second TikTok video?

Videos get 1,200% more shares than text and images combined.

The print media industry is now researching how to bring its printed newspapers into the digital century. We think a step ahead and try to take them directly into a world where even digitized text is already old-fashioned.

Today is the era of the video!

What it does

We use state-of-the-art AI technology to understand news articles and automatically transform them into short video clips. These clips can be posted on social networks like TikTok or Instagram to make news attractive to younger generations again.

How we built it

We use Hugging Face Transformers to summarize the articles before they enter the audio-visual processing. Tacotron2 then synthesizes audio from the summarized text for certain speakers (as far as our models go). Finally, we use Wav2Lip to synchronize the speaker's lips with the synthesized audio.
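The three stages above can be sketched as a small pipeline of interchangeable callables. This is a minimal illustration under stated assumptions, not our actual code: `make_clip`, `Clip`, and the stage type names are hypothetical, and the Tacotron2 and Wav2Lip stages are left as parameters because their invocation depends on the checkpoint and repository used. `make_hf_summarizer` assumes a Transformers version that ships the `summarization` pipeline task and downloads that task's default model.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain callable so real models can be swapped in later.
Summarizer = Callable[[str], str]          # article text -> short script
Synthesizer = Callable[[str], bytes]       # script -> audio bytes
LipSyncer = Callable[[bytes, str], bytes]  # audio + speaker id -> video bytes


@dataclass
class Clip:
    script: str
    audio: bytes
    video: bytes


def make_clip(article: str, speaker: str, summarize: Summarizer,
              synthesize: Synthesizer, lip_sync: LipSyncer) -> Clip:
    """Run the stages in order: summarize, synthesize audio, sync lips."""
    script = summarize(article)
    audio = synthesize(script)
    video = lip_sync(audio, speaker)
    return Clip(script=script, audio=audio, video=video)


def make_hf_summarizer(max_length: int = 60) -> Summarizer:
    """Build a summarizer on the Transformers summarization pipeline.

    Assumption: the installed Transformers version supports this task.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    summarizer = pipeline("summarization")  # uses the task's default model

    def summarize(article: str) -> str:
        result = summarizer(article, max_length=max_length, truncation=True)
        return result[0]["summary_text"]

    return summarize
```

Keeping the stages as parameters makes it possible to exercise the pipeline with cheap stand-ins on a machine without a GPU and to plug in the real models only where CUDA is available.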

The processing pipeline is integrated into a FastAPI backend that is triggered from a React frontend. We set up the stack with Docker Compose, which allows us to easily integrate it into existing ecosystems. Unfortunately, despite having Azure credits, we did not have the time to deploy the stack and make the demo accessible to everyone.
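As a rough sketch of how a frontend could trigger the pipeline through FastAPI, the example below queues an article and lets the client poll for the result. Everything named here is an assumption made for illustration: the `/clips` routes, the in-memory `JOBS` table, and the `render` callable (a stand-in for the full text-to-video pipeline that returns a file path) are hypothetical and not taken from our code base.

```python
import uuid
from typing import Callable, Dict

# In-memory job table (assumption); a real deployment needs persistent storage.
JOBS: Dict[str, dict] = {}


def submit_job(article: str) -> str:
    """Register an article for processing and return its job id."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "article": article, "video_path": None}
    return job_id


def run_job(job_id: str, render: Callable[[str], str]) -> None:
    """Run the hypothetical render function and record the video location."""
    job = JOBS[job_id]
    job["status"] = "running"
    try:
        job["video_path"] = render(job["article"])
        job["status"] = "done"
    except Exception as exc:  # keep the failure visible to the frontend
        job["status"] = "failed"
        job["error"] = str(exc)


def create_app(render: Callable[[str], str]):
    """Wire the helpers above into FastAPI routes (route names are assumptions)."""
    from fastapi import BackgroundTasks, Body, FastAPI, HTTPException

    app = FastAPI()

    @app.post("/clips")
    def create_clip(background: BackgroundTasks,
                    article: str = Body(..., embed=True)):
        job_id = submit_job(article)
        # Rendering is slow, so it runs after the response has been sent.
        background.add_task(run_job, job_id, render)
        return {"job_id": job_id}

    @app.get("/clips/{job_id}")
    def get_clip(job_id: str):
        if job_id not in JOBS:
            raise HTTPException(status_code=404, detail="unknown job")
        job = JOBS[job_id]
        return {"status": job["status"], "video_path": job["video_path"]}

    return app
```

With this shape, the React frontend would POST a JSON body such as `{"article": "..."}` to `/clips` and then poll `/clips/{job_id}` until the status is `done` or `failed`.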

Challenges we ran into

Our models only work well with CUDA installed. Unfortunately, only one of our team members had a GPU-ready notebook, so the others had to switch to the cloud for training.
FastAPI spawns multi-threaded apps. Torch creates directories when it is instantiated, so in a multi-threaded setting several threads try to create the same directories at once, which results in an error.
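One common way to avoid this kind of race is to make directory creation idempotent and to load the model exactly once behind a lock. The sketch below shows that general pattern using only the standard library; it is not necessarily how we worked around the problem, and `ensure_dir`, `get_model`, and the `load` callable are hypothetical names.

```python
import os
import threading
from typing import Callable, Optional

_model_lock = threading.Lock()
_model: Optional[object] = None


def ensure_dir(path: str) -> str:
    """Create a directory without failing if another thread created it first."""
    os.makedirs(path, exist_ok=True)
    return path


def get_model(load: Callable[[], object]) -> object:
    """Load the model once; concurrent callers wait and reuse the instance."""
    global _model
    with _model_lock:
        if _model is None:
            # Only the first thread to take the lock runs the loader,
            # so any directories it creates are created a single time.
            _model = load()
    return _model
```

Loading the model once at application startup, before any request threads exist, would sidestep the race in a similar way.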

Accomplishments that we're proud of

Within one weekend, we implemented our own DeepFake model that is capable of creating a real-looking video with synthesized audio and synchronized lips based solely on provided text.
We built an end-to-end system using our knowledge of frontend and backend development as well as machine learning.

What we learned

Established libraries aren't necessarily safe from bugs.
Package version management in Python can be a true nightmare (even though we weren't sleeping).

What's next for RealFakeNews

Provide additional speaker models, including ones for other languages
Integrate text translation
Deploy the application
Make the code more efficient (smells like some late night memory leaks here :D)
Allow users to create their own models within minutes (we implemented a fine-tuning script that still has to be integrated)