We have both been fans of machine learning for a long time. We have known each other since 8th grade and have stayed connected since then largely through Discord. With quarantine over the last few months, Discord has become a big part of how people communicate, so we wanted to create a fun toy that can liven up any server of close friends by letting them clone one of their own members as a chatbot.
What it does
We have a wiki and a README on our GitHub explaining most of the project. In short, it lets you train a chatbot on a .csv log of Discord message history and select a user for the bot to learn to mimic. Neither inexperienced users nor data scientists need to supply any parameters besides the data, but the model's hyperparameters are fully customizable, and we encourage altering the source code to adjust the model to specific needs.
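To make the data step concrete, here is a minimal sketch of how a message log could be turned into prompt/reply training pairs for one selected user. It is an illustration under assumptions, not the project's actual code: the column names `author` and `content`, the helper names `load_messages` and `build_pairs`, and the rule "a reply is any message by the target user that directly follows someone else's message" are all hypothetical.

```python
import csv
import io


def load_messages(csv_file):
    """Read (author, content) tuples from a CSV file object, skipping empty messages.

    Assumes hypothetical column names 'author' and 'content'.
    """
    reader = csv.DictReader(csv_file)
    messages = []
    for row in reader:
        text = (row.get("content") or "").strip()
        if text:
            messages.append((row["author"], text))
    return messages


def build_pairs(messages, target_user):
    """Pair each message by target_user with the message directly before it.

    Only messages that follow a different author are kept, so the pair
    reads as (what someone said, how the target user answered).
    """
    pairs = []
    for (prev_author, prev_text), (author, text) in zip(messages, messages[1:]):
        if author == target_user and prev_author != target_user:
            pairs.append((prev_text, text))
    return pairs


# A tiny in-memory log standing in for an exported .csv file.
sample = io.StringIO(
    "author,content\n"
    "alice,anyone up for a game tonight?\n"
    "bob,sure what time\n"
    "alice,8pm works\n"
    "bob,see you then\n"
)
messages = load_messages(sample)
pairs = build_pairs(messages, "bob")
```

With this sample log, `pairs` holds two prompt/reply tuples for `bob`. A real export would be opened with `open(path, newline="", encoding="utf-8")` instead of `io.StringIO`.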
How we built it
Most of the project's development was done on Google Colab, but for presentation we created a display copy and a clean GitHub repository with documentation for new users. The model is an open-source (license-free) question/answer model that we adapted both to take our datasets as input and to send its output to a Discord bot, which we coded using Discord's simple API.
Challenges we ran into
NLP is a notoriously finicky area of machine learning, and we had no shortage of bugs in our implementation of the model. Although Colab made training relatively quick by NLP/ML standards, the models we trained turned out to be more like silly toys than genuinely interesting chatbots. Given that, we focused more on the experience of using our code and less on any particular model we could generate.
Accomplishments that we're proud of
We found that the data analysis features we added were really interesting for examining your own chat history. Additionally, the simplicity of the Discord API made 'installing' our model into a usable chatbot easy and satisfying.
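The specific analyses are not listed here, so the following is only a hypothetical example of the kind of chat-history statistic such a feature might compute: message counts and average message length per user. The function name `message_stats` and the input shape, a list of (author, text) tuples, are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def message_stats(messages):
    """Return per-author message counts and average words per message.

    `messages` is a list of (author, text) tuples (an assumed shape).
    """
    counts = Counter(author for author, _ in messages)
    word_totals = defaultdict(int)
    for author, text in messages:
        word_totals[author] += len(text.split())
    return {
        author: {"messages": n, "avg_words": word_totals[author] / n}
        for author, n in counts.items()
    }


# Small made-up chat history for demonstration.
history = [
    ("alice", "anyone up for a game tonight?"),
    ("bob", "sure what time"),
    ("alice", "8pm works"),
    ("bob", "see you then"),
]
stats = message_stats(history)
```

The resulting dictionary can be sorted or plotted to see who talks the most and who writes the longest messages.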
What we learned
We learned a lot about machine learning and about pipelining components together with this project. As coders who have both worked on many ML models before, it was a new experience to build them into a usable application instead of just training them for fun. It was fascinating to see the way the model linked different words together, and it was satisfying to create a program that others could use with ease.
What's next for Personalized Discord Chatbot
This project is aimed at the user, specifically the amateur who wants to have a little fun in their Discord server, but it is also perfectly capable of being run as a serious model by expert data scientists. The next step would be to add more customization functionality; notably, one issue we ran into was being unable to use GPUs for training. Still, this is more of a community tool to us than a model we created.