Sometimes we just want to know if we're the asshole in a situation.
Or we want to convince someone that they are the asshole.
Automated asshole detection can help assholes around the world find the cure to being an asshole.
It can also just be cathartic to irrefutably show to someone that they are an asshole, since AI said so.
What it does
Automated asshole detection.
Describe your situation and ask "AITA?" to find out if you are an asshole.
How I built it
- I used psaw and Pushshift.io to scrape Reddit posts and comments from /r/AmITheAsshole;
- I used pandas to clean, merge and automatically generate training examples from the data;
- I used AllenNLP and PyTorch to fine-tune a BERT language model on asshole classification;
- I used AllenNLP to serve the final model through a simple frontend interface;
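The step of automatically generating training examples could look roughly like the sketch below. It is not the project's actual code: it uses plain Python instead of pandas to stay dependency-free, and it assumes that labels come from the verdict acronyms commenters write (YTA, NTA, ESH, NAH, INFO) and that each post takes the majority verdict of its comments. The names `extract_verdict` and `label_posts` are hypothetical.

```python
import re
from collections import Counter

# Verdict acronyms commonly written in AITA comments (assumed label set).
VERDICTS = ("YTA", "NTA", "ESH", "NAH", "INFO")
VERDICT_RE = re.compile(r"\b(" + "|".join(VERDICTS) + r")\b")


def extract_verdict(comment_body):
    """Return the first verdict acronym found in a comment, or None."""
    match = VERDICT_RE.search(comment_body)
    return match.group(1) if match else None


def label_posts(comments):
    """Assign each post the most common verdict among its comments.

    `comments` is an iterable of (post_id, comment_body) pairs.
    Posts whose comments contain no verdict are left out of the result.
    """
    votes = {}
    for post_id, body in comments:
        verdict = extract_verdict(body)
        if verdict is not None:
            votes.setdefault(post_id, Counter())[verdict] += 1
    return {
        post_id: counts.most_common(1)[0][0]
        for post_id, counts in votes.items()
    }


# Made-up comments, for illustration only.
comments = [
    ("p1", "YTA, you should apologise."),
    ("p1", "Definitely YTA here."),
    ("p1", "NTA in my opinion."),
    ("p2", "NAH, this is just a misunderstanding."),
    ("p3", "I have no opinion on this."),
]
labels = label_posts(comments)
```

Here `labels` maps each post id to a single verdict string, which can then be joined back to the post text to form a training example.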
Challenges I ran into
- Training the model was slow (~30 min/epoch on a Titan X GPU) because I couldn't use a batch size greater than 4 without running out of GPU VRAM. This made experimentation and hyper-parameter search difficult, as models took a very long time to converge;
- I had to truncate my input text to 512 tokens to avoid running out of memory; this likely has a negative impact on model performance;
- My initial target labels (5-class multilabel examples) were too difficult for my model to predict accurately, so I converted the 5-class scheme to a simpler binary classification task;
- Scraping the data was time-consuming as there are millions of comments in 2019 alone, so instead I used a subset of 500k comments;
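Two of the workarounds above, collapsing the five verdicts into a binary label and truncating long posts, can be sketched as follows. The exact mapping used in the project is not stated, so the one here (YTA and ESH count as "asshole", NTA and NAH as "not", INFO is dropped) is an assumption. Whitespace splitting is only a stand-in for BERT's WordPiece tokenizer, which would produce more tokens for the same text. `to_binary` and `truncate` are hypothetical helper names.

```python
MAX_TOKENS = 512  # truncation limit mentioned above

# Assumed mapping from the 5-class scheme to a binary label.
# INFO ("not enough information") has no entry and is therefore dropped.
BINARY_LABEL = {"YTA": 1, "ESH": 1, "NTA": 0, "NAH": 0}


def to_binary(verdict):
    """Map a 5-class verdict to 1 (asshole), 0 (not), or None (dropped)."""
    return BINARY_LABEL.get(verdict)


def truncate(text, max_tokens=MAX_TOKENS):
    """Keep only the first `max_tokens` whitespace-separated tokens."""
    tokens = text.split()
    return " ".join(tokens[:max_tokens])


# Made-up example: a 600-word post is cut down to 512 tokens.
long_post = " ".join(["word"] * 600)
short_post = truncate(long_post)
```

Truncating from the front keeps the start of the post and discards the end, so any detail that only appears late in a long post is lost to the model.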
What I learned
How to use AllenNLP, a great library for this kind of project!
What's next for ImplementAITA
- Train a generative language model to explain why you are or are not an asshole;
- Use attention to visualize why you are or are not an asshole;
- Train a model with a better architecture, better hyper-parameters, and more training data to improve classification accuracy;