Inspiration
While chatbots have become widespread with the rise of Artificial Intelligence, face-to-face interaction with AI is still emerging. FaceChat was developed to fill this gap. Originally designed for seniors who find it difficult to read large amounts of text or keep up with fast-changing technology, FaceChat also aims to provide emotional support to people experiencing loneliness in modern life. While still in its early stages, FaceChat explores the potential of a face-to-face conversational AI service in fields like healthcare, customer service, and entertainment.
What it does
FaceChat goes beyond traditional text-based communication by combining emotion-driven text analysis with real-time facial animations. It enables users to interact in a more engaging and immersive way, using their emotions as a key component of the conversation. Whether for personal interactions or professional support, FaceChat aims to offer a new form of AI-driven, human-like communication.
How we built it
FaceChat integrates emotion-driven text analysis with real-time facial animation. User input is analyzed for sentiment and sent to the fine-tuned OpenAI Assistants API. The generated response is converted into natural speech using the ElevenLabs TTS API, and the audio is streamed to Audio2Face, which generates the corresponding facial animation in real time, displayed on a website via WebRTC.
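The first stage of this pipeline can be sketched roughly as follows. This is an illustrative simplification, not the project's actual code: `detect_sentiment` here is a toy keyword matcher standing in for the real sentiment analyzer, and in the real system the resulting prompt is forwarded to the OpenAI Assistants API rather than returned.

```python
# Hypothetical sketch of FaceChat's sentiment-tagging step.
# The real analyzer is a proper model; this keyword lookup is a stand-in.
POSITIVE = {"great", "happy", "love", "thanks"}
NEGATIVE = {"sad", "lonely", "angry", "tired"}

def detect_sentiment(text: str) -> str:
    """Toy stand-in for the real sentiment analysis step."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & NEGATIVE:
        return "negative"
    if words & POSITIVE:
        return "positive"
    return "neutral"

def build_assistant_prompt(user_text: str) -> str:
    """Prefix the detected sentiment so the assistant can adapt its tone."""
    sentiment = detect_sentiment(user_text)
    return f"[user sentiment: {sentiment}] {user_text}"
```

Tagging the input this way lets a single assistant prompt carry both the user's words and their emotional state, so the generated reply (and the voice and facial animation derived from it) can respond to how the user feels, not just what they said.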
Challenges we ran into
One of the main challenges was integrating multiple APIs with limited documentation, which made it difficult to work out how best to connect each service and ensure they interacted smoothly. Managing the flow of real-time data and keeping latency low across both speech generation and facial animation were equally complex tasks.
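The low-latency requirement is why the stages stream into each other instead of running sequentially. The asyncio sketch below illustrates the idea with hypothetical stand-ins: `tts_stream` mimics a streaming TTS response by emitting small chunks, and `animate` consumes them as they arrive, the way Audio2Face consumes audio, so animation can begin before the full utterance is synthesized.

```python
import asyncio

async def tts_stream(text: str, out_q: asyncio.Queue) -> None:
    # Stand-in for a streaming TTS response: emit small audio "chunks".
    for i in range(0, len(text), 4):
        await out_q.put(text[i:i + 4])
    await out_q.put(None)  # end-of-stream sentinel

async def animate(in_q: asyncio.Queue, frames: list) -> None:
    # Stand-in for the animation consumer: process chunks as they arrive,
    # rather than waiting for the whole utterance.
    while (chunk := await in_q.get()) is not None:
        frames.append(chunk)

async def pipeline(text: str) -> str:
    q: asyncio.Queue = asyncio.Queue()
    frames: list = []
    # Producer and consumer run concurrently, overlapping their latency.
    await asyncio.gather(tts_stream(text, q), animate(q, frames))
    return "".join(frames)
```

Overlapping the producer and consumer this way means total latency is dominated by the slowest stage rather than the sum of all stages, which is the property a real-time face needs.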
Accomplishments that we're proud of
We completed FaceChat as a fully integrated project that brings together several technologies: emotion-driven text analysis, natural speech synthesis, and real-time facial animation. Successfully building a connection between the frontend and backend to support real-time communication was a key milestone.
What we learned
Throughout the development of FaceChat, I gained valuable insights into the importance of fundamental computer science principles like data processing, API integration, and asynchronous programming. Additionally, understanding the challenges of real-time systems, such as latency management, was a key takeaway.
What's next for FaceChat: Real-Time Emotion-Driven Text-to-Face Animation
The next steps for FaceChat involve improving the realism of the facial animations. Currently, the facial model appears basic and lacks natural expression variation. I want to enhance the realism by incorporating more nuanced facial expressions and refining the underlying animation models.
Built With
- audio2face
- elevenlabs
- html/css
- javascript
- openai
- python
- webrtc
- websockets