Inspiration
As a student on the cusp of graduating from the University of Pennsylvania, I've been captivated by the transformative potential of artificial intelligence, particularly in the realm of customer service. With the rapid advancement of generative AI, I believe that chatbots will soon become the dominant medium for customer interactions. However, this shift raises a critical question: how can we effectively evaluate and compare the performance of both human and AI agents to ensure the highest quality of service?
What it does
The Customer Service Evaluator is an innovative platform that leverages cutting-edge AI technologies to automate the analysis of customer service chat logs. By harnessing the power of natural language processing and machine learning, this tool enables companies to efficiently assess agent performance, identify areas for improvement, and make data-driven decisions to elevate the customer experience. It provides a holistic view of agent performance through a custom formula that combines multiple metrics, such as sentiment scores, response times, and resolution rates, into a single, easy-to-understand Overall Score.
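The Overall Score described above can be sketched as a weighted combination of normalized metrics. The metric names, weights, and normalization below are illustrative assumptions for the sketch, not the platform's actual formula:

```python
def overall_score(sentiment, response_time_s, resolved,
                  weights=(0.5, 0.3, 0.2), max_response_time_s=300):
    """Combine per-chat metrics into a single 0-100 Overall Score.

    Assumptions (hypothetical, for illustration only):
      - sentiment is already normalized to [0, 1]
      - faster responses score higher, capped at max_response_time_s
      - resolution is binary (resolved or not)
    """
    speed = max(0.0, 1.0 - response_time_s / max_response_time_s)
    resolution = 1.0 if resolved else 0.0
    w_sent, w_speed, w_res = weights
    return 100 * (w_sent * sentiment + w_speed * speed + w_res * resolution)
```

With the default weights, a chat with 0.8 sentiment, a 60-second average response, and a successful resolution would score 84.0. Keeping every metric in [0, 1] before weighting makes the score easy to interpret and the weights easy to tune per company.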
How I built it
As a backend developer with prior experience working with AI APIs, I tackled the complex task of building sophisticated NLP models for sentiment analysis and performance metric extraction. I experimented with various techniques, from feature engineering to model training, to create a robust and accurate system. For the frontend, I crafted an interactive dashboard that empowers users to explore chat data, visualize sentiment trends, and uncover valuable insights at a glance.
Challenges I ran into
One of the key challenges I faced was stepping out of my comfort zone as a backend developer and diving into the world of UI/UX design. Through countless iterations and a steep learning curve, I created an intuitive and visually appealing frontend. I also ran into an unexpected finding during development: when I tuned the sentiment analysis model on a dataset of over 500 texts and their associated sentiments, the base Gemini 1.0 model outperformed the tuned model by a significant margin, scoring 34% higher in accuracy. This shows how capable these base models already are, and that tuning isn't always necessary.
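The base-vs-tuned comparison boils down to measuring accuracy as the fraction of predicted sentiment labels that match the ground truth. A minimal sketch of that evaluation, using a tiny made-up label set for illustration rather than the real 500-text dataset or actual Gemini outputs:

```python
def accuracy(predictions, labels):
    """Fraction of predicted sentiment labels that match the ground truth."""
    assert len(predictions) == len(labels) and labels
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical predictions on a toy evaluation set (not real model output).
labels      = ["positive", "negative", "neutral", "positive"]
base_preds  = ["positive", "negative", "neutral", "negative"]
tuned_preds = ["positive", "negative", "positive", "negative"]

base_acc = accuracy(base_preds, labels)    # 0.75 on this toy set
tuned_acc = accuracy(tuned_preds, labels)  # 0.50 on this toy set
```

Running both models over the same held-out labeled set and comparing the two accuracy numbers is what surfaced the surprising gap in the base model's favor.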
Accomplishments that I'm proud of
I am excited about the potential impact of the Customer Service Evaluator in revolutionizing customer service quality assurance for thousands of companies worldwide. With the enormous size of the customer service industry and the growth of AI-powered chatbots, having an efficient way to evaluate both types of representatives will only grow more important.
What I learned
Through this project, I gained valuable lessons about the importance of thorough testing and validation when working with AI models. I also learned the significance of creating intuitive and visually appealing user interfaces to present insights in a clear and actionable way. Moreover, I discovered the vast potential of AI technologies to transform businesses and improve people's lives across various industries and use cases.
What's next for Customer Service Evaluator - For Humans and AI Chatbots
Looking ahead, I envision expanding the platform's capabilities to include real-time analysis and suggestions that help agents (whether human or AI) improve during a chat. Additionally, since many customer service conversations occur over the phone, I'd like to incorporate a voice-to-text model so the platform can analyze voice recordings. Also, since this project currently requires a strict format for chat-log input, I plan to relax this to accept more loosely formatted files. Lastly, a much larger system should be built to store information about each chat and hold multiple chats per agent, enabling more comprehensive analytics for both individual agents and the company as a whole.
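To make the "strict format" constraint concrete, here is a sketch of a validating chat-log parser. The line format shown (a timestamp, a speaker tag, and a message) is a hypothetical stand-in, not the platform's actual input spec:

```python
import re

# Hypothetical strict format: each line is "[HH:MM:SS] SPEAKER: message",
# where SPEAKER is either AGENT or CUSTOMER. The real format may differ.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+(AGENT|CUSTOMER):\s+(.*)$")

def parse_chat_log(text):
    """Parse a chat log into (timestamp, speaker, message) tuples,
    rejecting any line that breaks the strict format."""
    turns = []
    for i, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank separator lines
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"line {i} does not match the expected format: {line!r}")
        turns.append(m.groups())
    return turns
```

A strict regex like this is what makes loosely formatted files fail today; relaxing the input would mean replacing the hard `ValueError` with more forgiving heuristics for detecting speakers and timestamps.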