AI Therapist for AI

Inspiration

Earlier this year, a tragic news of a boy named Sewell Setzer III shook the country. After engaging in highly-sexual relationship with a fictional character on a AI powered platform, character.ai, Sewell had a conversation that was suggestive of committing suicide before he tragically took his own life with a hand gun. After hearing this news, we realized the danger of unrestrained AI models, which was growing exponentially by the day. In order to address this problem, we created AI Therapist for AI, an AI model that monitors other conversational AI agents to prevent another victim of unregulated AI safety.

What it does

Our AI model performs a preliminary monitoring job on responses provided by other LLM, quantifies the abnormality and potential of harassment in the generated response, and determine the safety of the response. If a response is deemed harmful, that is, suggestive of violence or inappropriate interaction between the user and the bot, it sends out a corrective prompt to the LLM to steer the response towards a moderated, safer version.

How we built it

We used Selenium to automate web scraping. Selenium first initializes to open Chrome, navigate to the Character AI chat page, and scrape the chat message. To get the conversational AI agent’s message, we used scraping logic that locates specific div elements on Character AI’s page using a CSS selector string due to Character. AI issuing a third-party identification every time a user logs in, we orchestrated workflows by sending requests to a webhook hosted on n8n.cloud. Once logged in, the function navigates to the Character AI page and waits for the presence of response elements, continuously looking for new responses, allowing us to have an online, real-time data flow. For real-time output, we used FastAPI to plug in the output of AI Therapist as a user interface before the user. The check-response endpoint receives responses scraped by Selenium, verifies their authenticity with a JWT (JSON Web Token), and implements additional safety checks or processing.

For the model itself, we utilized OpenAI API’s robust resources and prompt-engineered the model to be flexible based on the output of the conversational AI agent’s response by using the Langchain libraries to enable recurring responses. We accounted for the fact that AI Therapist will only directly affect the conversational AI agent and, therefore, engineered our responses to allow the flexibility of the conversational AI agent’s distinct character.

Challenges we ran into

talk about scrapping? We initially tried to train our own AI model, and that was the initial focus of this project. However, due to the limited computation power and time availability in our hands, we failed our attempt to train an AI model. Instead, we have implemented a prototype our intended model by prompt engineering GPT 4o mini model through OpenAI API and having it execute the functions and demonstrate the intended functions of our AI model.

Accomplishments that we're proud of

We're proud that, despite this being our first experience with AI, we were able to identify a real-world problem that we were truly passionate about; so much so that we were committed to giving our best for the last 25 hours and forgetting to sleep in the process.

What we learned

As none of the members on the team majors in CS, and we had to learn 90% of our project from scratch. However, this was an opportunity to grow beyond what we were comfortable withand learn to be resourceful as we scoured through every page on the internet from Github to Perplexity and be bold with our questions with Zhouli, our mentor.

What's next for AI Therapist for AI

Our next step is to gain backend data flow of the Character AI or other conversational AI agents. This would make AI Therapy completely backend and almost an integrated function that would not hinder the user's experience while protecting the them. Furthermore, we are going to train a custom model through stronger computational power on datasets such as suicidal hotline and public therapeutic sessions, which can yield increased accuracy and detection area of our model.