Inspiration
Since the passage of the federal Autism CARES Act, autism and communication disorders have been receiving more attention. According to the Centers for Disease Control and Prevention (CDC), approximately one in 46 children in the United States faces communication barriers. We are passionate about mental health awareness and were inspired to develop an emotion detection system: an AI-powered emotional chat tool that facilitates better communication. Many children long for more communication and companionship, and we believe FaceChat.ai can help build those bridges worldwide.
What it does
FaceChat uses AI- and computer-vision-driven emotion analysis of remote video to provide a platform for people facing communication barriers. During virtual conversations, sentiment analysis tells users whether their language might come across as hurtful to the other party, so they can adjust how they communicate.
How we built it
We start with a video stream and build a socket server to power the interactive system. The website front end is built with React.js, with its internal components styled accordingly. After a user finishes a sentence, we call an emotion detection API to analyze the emotional impact on the other party; meanwhile, an audio processing interface converts the speech into text. We combine the output of these two interfaces and feed it into our ChatGPT-based emotion analysis model, which returns feedback to the user. Based on that feedback, users can improve their wording in subsequent conversation.
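To make the flow concrete, here is a minimal sketch of that end-of-sentence pipeline, assuming Node.js with the ws WebSocket package; transcribe, detectEmotion, and getChatFeedback are hypothetical stand-ins for the actual speech-to-text, emotion detection, and ChatGPT calls:

```typescript
import { WebSocketServer, WebSocket } from "ws";

// Hypothetical stand-ins for the real API calls.
async function transcribe(audio: Buffer): Promise<string> {
  return "example transcript"; // placeholder: call the speech-to-text API here
}
async function detectEmotion(frame: Buffer): Promise<string> {
  return "confused"; // placeholder: call the emotion detection API here
}
async function getChatFeedback(text: string, emotion: string): Promise<string> {
  return `Feedback on "${text}" for a ${emotion} listener`; // placeholder: call ChatGPT here
}

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket: WebSocket) => {
  socket.on("message", async (raw) => {
    // The client sends a finished sentence: an audio clip plus the latest video frame.
    const { audio, frame } = JSON.parse(raw.toString()) as {
      audio: string; // base64-encoded audio
      frame: string; // base64-encoded video frame
    };

    // Run speech-to-text and emotion detection concurrently, then join the results.
    const [text, emotion] = await Promise.all([
      transcribe(Buffer.from(audio, "base64")),
      detectEmotion(Buffer.from(frame, "base64")),
    ]);

    // Combine both signals into the ChatGPT-based model and send back its feedback.
    const feedback = await getChatFeedback(text, emotion);
    socket.send(JSON.stringify({ text, emotion, feedback }));
  });
});
```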
Challenges we ran into
We are particularly proud of implementing the in-browser video module. Our team's expertise in WebSockets and the HTTP protocol was crucial, and our biggest ally throughout was Chrome's developer tools, which we used to build the foundational framework of the site. We are also proud of the logic that connects the emotion detection and speech-to-text modules: we use a delay-based method to guarantee that they run in the correct order. A significant challenge was balancing better performance against wasted computing power; after hours of debugging and programming iterations, we refined the data structure to produce a more intuitive output. Another challenge was debugging the large language model. We set several constraints to keep GPT's output from deviating: capping the word count yields the right amount of information, nucleus sampling (top-p) keeps the output on the correct topics, and the frequency penalty and presence penalty keep the text within a reasonable range.
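As a rough illustration of those controls, the sketch below shows how a word-count cap, nucleus sampling, and the two penalties map onto the OpenAI chat completion parameters. It assumes the official openai Node SDK, and the prompt and parameter values are illustrative rather than our exact settings:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function emotionFeedback(transcript: string, emotion: string): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content:
          "You review one sentence from a conversation along with the listener's " +
          "detected emotion, and give brief, kind advice on how the sentence landed.",
      },
      { role: "user", content: `Sentence: "${transcript}"\nListener emotion: ${emotion}` },
    ],
    max_tokens: 120,        // cap the response length so feedback stays concise
    top_p: 0.9,             // nucleus sampling keeps generation on topic
    frequency_penalty: 0.5, // discourage repetitive phrasing
    presence_penalty: 0.3,  // discourage drifting onto new topics
  });
  return completion.choices[0].message.content ?? "";
}
```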
What's next for FaceChat.ai
We are all very excited about where this app can go. We believe the following steps will take it to the next level:
- A remote sign language module for contactless conversion of signing into text.
- Multi-channel video calling.
- Enhanced security features.