Inspiration
While chatbots have become widespread with the rise of Artificial Intelligence, face-to-face interaction with AI is still emerging. FaceChat was developed to fill this gap. Originally designed for seniors who find it difficult to read large amounts of text or keep up with fast-changing technology, FaceChat also aims to provide emotional support to people experiencing loneliness in modern life. While still in its early stages, FaceChat explores the potential of a face-to-face conversational AI service in fields like healthcare, customer service, and entertainment.
What it does
FaceChat goes beyond traditional text-based communication by combining emotion-driven text analysis with real-time facial animations. It enables users to interact in a more engaging and immersive way, using their emotions as a key component of the conversation. Whether for personal interactions or professional support, FaceChat aims to offer a new form of AI-driven, human-like communication.
How we built it
FaceChat integrates emotion-driven text analysis with real-time facial animation. User input is analyzed for sentiment and sent to the fine-tuned OpenAI Assistants API. The generated response is converted into natural speech using the ElevenLabs TTS API, and the audio is streamed to Audio2Face, which generates the corresponding facial animation in real time, displayed on a website via WebRTC.
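The first stage of this pipeline can be sketched roughly as follows. This is an illustrative simplification, not the project's actual code: `detect_sentiment` here is a toy keyword matcher standing in for the real sentiment analyzer, and in the real system the resulting prompt is forwarded to the OpenAI Assistants API rather than returned.

```python
# Hypothetical sketch of FaceChat's sentiment-tagging step.
# The real analyzer is a proper model; this keyword lookup is a stand-in.
POSITIVE = {"great", "happy", "love", "thanks"}
NEGATIVE = {"sad", "lonely", "angry", "tired"}

def detect_sentiment(text: str) -> str:
    """Toy stand-in for the real sentiment analysis step."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & NEGATIVE:
        return "negative"
    if words & POSITIVE:
        return "positive"
    return "neutral"

def build_assistant_prompt(user_text: str) -> str:
    """Prefix the detected sentiment so the assistant can adapt its tone."""
    sentiment = detect_sentiment(user_text)
    return f"[user sentiment: {sentiment}] {user_text}"
```

Tagging the input this way lets a single assistant prompt carry both the user's words and their emotional state, so the generated reply (and the voice and facial animation derived from it) can respond to how the user feels, not just what they said.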
Challenges we ran into
One of the main challenges was integrating multiple APIs with limited documentation, which made it difficult to work out how best to connect each service and ensure they interacted smoothly. Managing the flow of real-time data and keeping latency low across both speech generation and facial animation were equally complex tasks.
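The low-latency requirement is why the stages stream into each other instead of running sequentially. The asyncio sketch below illustrates the idea with hypothetical stand-ins: `tts_stream` mimics a streaming TTS response by emitting small chunks, and `animate` consumes them as they arrive, the way Audio2Face consumes audio, so animation can begin before the full utterance is synthesized.

```python
import asyncio

async def tts_stream(text: str, out_q: asyncio.Queue) -> None:
    # Stand-in for a streaming TTS response: emit small audio "chunks".
    for i in range(0, len(text), 4):
        await out_q.put(text[i:i + 4])
    await out_q.put(None)  # end-of-stream sentinel

async def animate(in_q: asyncio.Queue, frames: list) -> None:
    # Stand-in for the animation consumer: process chunks as they arrive,
    # rather than waiting for the whole utterance.
    while (chunk := await in_q.get()) is not None:
        frames.append(chunk)

async def pipeline(text: str) -> str:
    q: asyncio.Queue = asyncio.Queue()
    frames: list = []
    # Producer and consumer run concurrently, overlapping their latency.
    await asyncio.gather(tts_stream(text, q), animate(q, frames))
    return "".join(frames)
```

Overlapping the producer and consumer this way means total latency is dominated by the slowest stage rather than the sum of all stages, which is the property a real-time face needs.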
Accomplishments that we're proud of
We completed FaceChat as a fully integrated project that brings together several technologies: emotion-driven text analysis, natural speech synthesis, and real-time facial animation. Successfully building a connection between the frontend and backend to support real-time communication was a key milestone.
What we learned
Throughout the development of FaceChat, I gained valuable insights into the importance of fundamental computer science principles like data processing, API integration, and asynchronous programming. Additionally, understanding the challenges of real-time systems, such as latency management, was a key takeaway.
What's next for FaceChat: Real-Time Emotion-Driven Text-to-Face Animation
The next steps for FaceChat involve improving the realism of the facial animations. Currently, the facial model appears basic and lacks natural expression variation. I want to enhance the realism by incorporating more nuanced facial expressions and refining the underlying animation models.
Built With
- audio2face
- elevenlabs
- html/css
- javascript
- openai
- python
- webrtc
- websockets