Inspiration

I was excited about attending my first in-person hackathon, so whenever I could, I answered questions other students posted on Discord about the event. Most of the time, though, the answers were publicly available and had very likely been asked and answered before. Knowing that the organizers were busy getting this amazing event ready for us, I could imagine how taxing it must have been to respond to so many questions and requests.

The phenomenon of redundant questions and answers is not unique to ShellHacks. As a student, I see daily how common it is, regardless of the platform. On Discord, scrolling through dozens and dozens of posts is cumbersome, and keyword searches are not ideal either. Consequently, many people find it easier to just post the question without digging too hard to see whether it has already been asked.

After arriving at the hackathon, it struck me as a shame that there was no tool or mechanism for automating responses to some of the more routine questions and requests on Discord. Such a tool would benefit the Discord community generally and give moderators and organizers more time for responsibilities that have a greater impact on the community.

What it does

Seeing this real-world need, I built a Discord FAQ AI Bot that monitors chats 24/7 and uses a trained machine learning model to predict whether a new post is a question or request that has routinely been posted on Discord in the past. If the confidence score indicates that the post is indeed a common one, the bot replies directly to the original poster and points them to a (mock) FAQ document via a link.
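The decision step described above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the `Prediction` type, the `"faq"` label name, the 0.8 threshold, and the placeholder FAQ URL are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values for illustration; the real bot's link and cutoff are not stated.
FAQ_URL = "https://example.com/faq"
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    """A label and confidence score, as a classifier endpoint might return them."""
    label: str
    confidence: float


def build_reply(author: str, prediction: Prediction,
                threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return a reply pointing to the FAQ, or None if the bot should stay quiet."""
    # Only respond when the model says "common question" AND is confident enough.
    if prediction.label != "faq" or prediction.confidence < threshold:
        return None
    return (f"Hi {author}, this looks like a common question. "
            f"The FAQ may already cover it: {FAQ_URL}")
```

Returning `None` for low-confidence predictions keeps the bot silent when it is unsure, which seems preferable to replying to posts that are not routine questions.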

How I built it

I created a new Discord server for SmellHacks 2022 (not a typo), which is taking place this weekend on Earth-515 from the multiverse. (In this parallel Earth, deodorant was never invented.) I used the $25 credit generously provided to students by Google to build, train, and deploy a machine learning model using Google Cloud Vertex AI. After downloading historical chat data from Discord, I cleaned, preprocessed, and labeled it, then used the Vertex AI console to upload the text datasets and train on them using single feature classification. The platform's AutoML capabilities spared me from having to determine which type of model was most suitable, as it did this for me. Vertex AI also simplified the process of creating an endpoint from which the Discord bot can obtain a prediction and confidence score for every new post. I hosted the AI Bot on a Google Compute Engine VM instance, where it persistently monitors the Discord server.
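As a rough idea of what the cleaning and labeling step might look like, here is a small sketch using only the Python standard library. The specific cleaning rules (dropping Discord mention tokens and URLs, collapsing whitespace, removing duplicates) and the two-column `text,label` CSV layout are assumptions for illustration; the exact import format Vertex AI expects should be checked against its documentation.

```python
import csv
import io
import re

# Discord encodes user, role, and channel mentions as tokens like <@123>, <@&123>, <#123>.
MENTION_RE = re.compile(r"<[@#][!&]?\d+>")
URL_RE = re.compile(r"https?://\S+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Strip mention tokens and URLs, then collapse runs of whitespace."""
    text = MENTION_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def to_labeled_csv(rows) -> str:
    """Turn (raw_text, label) pairs into CSV text, skipping empty and duplicate posts."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for raw_text, label in rows:
        cleaned = clean_post(raw_text)
        key = cleaned.lower()  # treat posts differing only in case as duplicates
        if not cleaned or key in seen:
            continue
        seen.add(key)
        writer.writerow([cleaned, label])
    return buffer.getvalue()
```

For example, `to_labeled_csv([("When is lunch?", "faq"), ("when is lunch?", "faq")])` would keep only the first of the two posts.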

Anyone with a Discord account - even from our Earth - can join this server here and test it out live.

Challenges I ran into

I had never created a Discord bot (or any bot, for that matter). Although I have coded machine learning algorithms from the ground up as part of my university coursework, I had never used a tool like Google Cloud Vertex AI to develop a practical machine-learning-based solution. Consequently, I was not confident about how feasible this would be in a compressed weekend. Although my preliminary research revealed other Discord bots that purportedly serve a similar function, I did not see one that was actively deployed and used.

Because the inspiration was personal to me and I wanted to learn the entire process from end to end, I opted to work alone on this project. This meant toiling for many hours in a sleep-deprived state while in person at the hackathon.

Other challenges included cleaning and preprocessing the data and the lengthy time (on average four hours) it took to train a model on it. I managed to build and train a second model that actually classified Discord posts by topic area, but I was unsatisfied with its accuracy in one of the topic areas and felt there was not enough time to train new ones.

Accomplishments that I'm proud of

I did not work on this project for the competitive aspect. Instead, I was just happy to have the opportunity to familiarize myself a little with some of Google Cloud's vast offerings and build a tool that eventually might actually be put into production, all while taking advantage of many of the enriching activities made available to us this weekend.

What I learned

I learned that with robust tools like Google Cloud Vertex AI, one need not be intimidated by the prospect of building, training, and deploying production-ready machine learning models. (Although I sort of already knew this, I also learned that you can get a lot accomplished in the company of other like-minded, energized people who are passionate about the same technologies as you are.)

What's next for the Discord FAQ AI Bot

The natural progression is to keep refining a model that predicts the topic area of a Discord post, factor in context from adjacent posts, and perhaps even attempt to respond to the post directly with an answer (as opposed to directing the original poster to an FAQ document). With some more refinement and testing, who knows? We might be seeing the Discord FAQ AI Bot at a Discord server near you.
