"We were inspired by AI text-to-video applications and fascinated by their potential to empower mute individuals to speak again. Our goal is to help deaf-mute people integrate into the workplace.
Our project lets users generate a video presentation from a PowerPoint file and a selfie. Users can also customize the emotion they want to convey during the presentation.
We used Python to extract the comments from the PowerPoint file and fed them to the D-ID API, which returns a video of the presenter. We then generated a video of the slides paced to match the D-ID video, and composited the two inside a highly customizable template using Remotion.
Unfortunately, we did not manage to deploy the project online, partly because of the free-trial limits of the APIs we depend on, so it can only be tested locally.
The pipeline works end to end, even though our team has only two members. We learned to work with AI-generated video, an exciting tool to master.
What's next for VoiceCraft: AI-Enhanced Inclusivity

Our goal is to enable real-time avatar generation for meetings between mute and non-mute individuals, and likewise real-time sign-language generation for deaf individuals."