Created by Casey Culbertson, Anish Konanki, Nelson Sun
The Language Generator project was pretty cool, and the three of us all wanted to make a Discord bot. So, why not do both at once?
What it does
MimikBot is a Discord bot that attempts to mimic conversation in a Discord text channel. It analyzes roughly the past 50 messages in a channel, makes its best guess at each word's part of speech, records the structure of each sentence, and then mixes and matches words by part of speech to hopefully arrive at a comprehensible result! Oftentimes the result is not very coherent, but occasionally MimikBot will drop pure gold.
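To make the idea concrete, here is a minimal sketch of the tag, record, and remix loop described above. It is not MimikBot's actual code: the tiny POS_DICT, the tag names, and the functions tag, learn, and generate are all hypothetical, and unknown words are simply assumed to be nouns.

```python
import random
from collections import defaultdict

# Hypothetical toy dictionary: word -> most likely part of speech.
POS_DICT = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN", "bot": "NOUN",
    "sees": "VERB", "likes": "VERB",
    "funny": "ADJ", "small": "ADJ",
}

def tag(sentence):
    """Return (word, part_of_speech) pairs; unknown words default to NOUN (an assumption)."""
    return [(word, POS_DICT.get(word, "NOUN")) for word in sentence.lower().split()]

def learn(messages):
    """Record each message's sentence structure and the words seen per part of speech."""
    structures = []
    words_by_pos = defaultdict(list)
    for message in messages:
        tagged = tag(message)
        if not tagged:
            continue  # skip empty messages
        structures.append([pos for _, pos in tagged])
        for word, pos in tagged:
            words_by_pos[pos].append(word)
    return structures, words_by_pos

def generate(structures, words_by_pos, rng=random):
    """Pick a recorded structure and fill every slot with a random word of that type."""
    structure = rng.choice(structures)
    return " ".join(rng.choice(words_by_pos[pos]) for pos in structure)
```

Calling learn on a list of recent messages and then generate on the result yields a sentence that reuses a real sentence's structure with shuffled words, which is where both the nonsense and the occasional gold come from.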
How we built it
Like many Discord bots, MimikBot is written in Python. Internally, it uses a dictionary that tells the bot which part of speech each word most likely is. MimikBot borrows a lot of its infrastructure from the Language Generator project, specifically for generating sentences. The main work of this project, however, was wiring the bot up to commands and reading channel messages to update the internal data structures.
Challenges we ran into
One of the biggest struggles for this project was compiling a good dictionary for our bot to use. We needed the most common part of speech for each word, since determining precisely which part of speech a word takes in a given sentence is way beyond the scope of this project. Finding a dictionary that listed the parts of speech of the most common words, sorted by frequency, was extremely difficult. We initially tried to ignore the "sorted by frequency" aspect, but that caused the text classifier to label disproportionately many words as adverbs and conjunctions, among other issues. Eventually we found a dictionary that had everything we needed, but along the way we went through several dictionaries (one over 150k words long!) and even hit a paywall. We also struggled with words that have multiple parts of speech. Fitting such words into the grammar was a big pain, because a word tagged as one part of speech could be used as a different one in another sentence, which would not look or sound right at all.
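The "most common part of speech" idea can be sketched as follows. This is an illustration only: the (word, part of speech, frequency) entry format and the build_pos_dict function are assumptions, not the format of the dictionary we actually ended up using.

```python
from collections import defaultdict

def build_pos_dict(entries):
    """Map each word to its most frequent part of speech.

    entries: iterable of (word, part_of_speech, frequency) tuples.
    This layout is hypothetical; a real dictionary file would need its own parser.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for word, pos, freq in entries:
        counts[word.lower()][pos] += freq
    # For every word, keep only the part of speech with the highest total frequency.
    return {word: max(by_pos, key=by_pos.get) for word, by_pos in counts.items()}

# Made-up sample frequencies, for illustration only.
sample_entries = [
    ("run", "VERB", 900),
    ("run", "NOUN", 150),
    ("well", "ADV", 700),
    ("well", "NOUN", 40),
    ("Well", "INTJ", 100),
]
pos_dict = build_pos_dict(sample_entries)
```

Without the frequency column there is no principled way to choose between "run" as a verb and "run" as a noun, which is roughly the skew we saw when we ignored it.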
Accomplishments that we're proud of
We enjoyed integrating the Language Generator into a Discord bot. We all especially enjoyed running the bot and watching its outputs, which were funny sentences much like the Language Generator's. We also enjoyed reworking parts of the code for the bot, and we all felt extremely satisfied seeing it work and output crazy sentences like "oh wait I see how you awesome these you literally searched for a plain-text words way something sec that I was found literally for a are of 10k know."
What we learned
We learned a whole lot about Python. The fact that Python and Java share a lot of similarities in syntax really helped us pick up the basics of what we needed to know, especially when it came to using the Discord API. As we learned the hard way, the English language is extremely difficult to deal with. We also learned a lot about code optimization and handling repetitive code efficiently. Furthermore, we learned a lot of transferable skills, like being able to adapt on the fly and to communicate effectively about what needs to be done.
What's next for MimikBot
Probably not a whole lot, but potential improvements would be to improve the dictionary file (the bot is only as good as its dictionary!) and to give admins more control over how the bot is triggered - right now, the bot has the potential to be incredibly spammy.
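One way the trigger control could work is a per-channel cooldown that an admin configures. The sketch below is only an idea, not something MimikBot implements: the ChannelCooldown class, its parameters, and the 30-second interval are all hypothetical, and hooking it into the bot's message handler is left out.

```python
import time

class ChannelCooldown:
    """Allow at most one bot reply per channel every min_interval_seconds."""

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self.min_interval = min_interval_seconds
        self.clock = clock  # injectable so the behavior can be tested without waiting
        self._last_reply = {}  # channel id -> time of the last allowed reply

    def allow(self, channel_id):
        """Return True and start a new cooldown if the channel may get a reply now."""
        now = self.clock()
        last = self._last_reply.get(channel_id)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_reply[channel_id] = now
        return True

# Hypothetical setting: at most one reply per channel every 30 seconds.
cooldown = ChannelCooldown(min_interval_seconds=30)
```

The bot would call cooldown.allow(channel_id) before posting and stay quiet whenever it returns False.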