Named-entity recognition (NER) is a sub-task of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as proper names, organizations, etc.
My team was inspired by the idea of using artificial intelligence to build a named-entity recognition system that detects proper names.
Our idea was to create an application that makes tagging people easier. We tag people a lot on social media, and typing @ and then searching for the right person takes time, so we thought: why not automate this process?
What it does
The user types in some text, and the trained model goes through each word and detects all the proper names in it. It then outputs the detected names concatenated together. We also built a Discord bot and a Web App on top of this model.
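The step from per-word predictions to a list of names can be sketched as follows. This is a minimal illustration, not our exact code: it assumes the model emits one BIO-style tag per word ("B-PER" for the first word of a name, "I-PER" for a continuation, "O" otherwise), and the function name `extract_names` and the sample sentence are made up for the example.

```python
from typing import List, Tuple


def extract_names(tagged: List[Tuple[str, str]]) -> List[str]:
    """Merge word-level BIO tags into full names (assumed tag scheme)."""
    names: List[str] = []
    current: List[str] = []  # words of the name being assembled
    for word, tag in tagged:
        if tag == "B-PER":
            # A new name starts; flush the previous one if there is one.
            if current:
                names.append(" ".join(current))
            current = [word]
        elif tag == "I-PER" and current:
            # Continuation of the name that is currently open.
            current.append(word)
        else:
            # Any other tag (or a stray I-PER) closes the open name.
            if current:
                names.append(" ".join(current))
            current = []
    if current:
        names.append(" ".join(current))
    return names


# Hypothetical model output for one sentence.
tagged = [
    ("Yesterday", "O"),
    ("Ada", "B-PER"),
    ("Lovelace", "I-PER"),
    ("met", "O"),
    ("Alan", "B-PER"),
]
names = extract_names(tagged)
print(", ".join(names))  # Ada Lovelace, Alan
```

Grouping consecutive tags keeps multi-word names together instead of returning every tagged word separately.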
The Web Application visualizes the process of analyzing the text and detecting the proper names.
The Discord bot tags users in your server: put the magic word $tag before your sentence, and the bot will automatically tag every proper name it can find in that server.
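As a rough sketch of the tagging step, the snippet below turns detected names into Discord mentions. It assumes the names have already been extracted by the model and that the bot has a lookup from lowercase display names to user IDs; `build_reply`, the `members` mapping, and the IDs are hypothetical. The `<@USER_ID>` form is how Discord encodes a user mention in message text.

```python
import re
from typing import Dict, List, Optional

MAGIC_WORD = "$tag"


def build_reply(message: str, names: List[str], members: Dict[str, int]) -> Optional[str]:
    """Return the message with matching names replaced by mentions.

    Returns None when the message does not start with the magic word.
    Names with no matching server member are left as plain text.
    """
    if not message.startswith(MAGIC_WORD):
        return None
    reply = message[len(MAGIC_WORD):].strip()
    for name in names:
        user_id = members.get(name.lower())
        if user_id is None:
            continue  # nobody in the server goes by this name
        # Whole-word match so "Sam" does not rewrite part of "Samantha".
        reply = re.sub(r"\b" + re.escape(name) + r"\b", f"<@{user_id}>", reply)
    return reply


members = {"sarah": 1001, "omar": 1002}  # made-up IDs for illustration
text = "$tag Sarah and Omar should review this with Lina"
print(build_reply(text, ["Sarah", "Omar", "Lina"], members))
# <@1001> and <@1002> should review this with Lina
```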
How we built it
First, we used the transformers library and PyTorch to build a BERT-based transfer-learning model that detects proper names. We set up a training pipeline around a well-annotated dataset and trained the model.
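One detail that any BERT-based NER pipeline has to deal with is that the tokenizer splits words into sub-word pieces, while the annotations are per word. A common approach, shown here as an assumption rather than a description of our exact pipeline, is to label only the first piece of each word and give the other positions the label -100, which PyTorch's cross-entropy loss ignores by default. The `word_ids` list mimics what a fast tokenizer's `word_ids()` method returns; the function name and label numbers are made up for the example.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Expand word-level labels to token-level labels."""
    aligned: List[int] = []
    previous: Optional[int] = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] belong to no word.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First sub-word piece of a word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Remaining pieces of the same word are ignored by the loss.
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Three words where the second one was split into two pieces.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [1, 2, 0]  # hypothetical label ids, one per word
print(align_labels(word_ids, word_labels))  # [-100, 1, 2, -100, 0, -100]
```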
After that, we used BentoML to serve the model in an optimized way.
Then we built a simple Web Application with Flask and basic HTML and CSS, connected it to our served model, and got a working Web App!
We then wanted to go further and create a Discord bot. We used the Discord Python library to build a bot that uses our served AI model to automatically tag people when a message starts with the magic word $tag.
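The visualization side of the Web App can be illustrated with a small helper that wraps each detected name in a `<mark>` element before the text is placed into the page. This is only a sketch using the standard library: the function name `highlight_names` and the choice of `<mark>` are assumptions, and the call to the served model is left out.

```python
import html
import re
from typing import List


def highlight_names(text: str, names: List[str]) -> str:
    """Return HTML-escaped text with every detected name wrapped in <mark>."""
    escaped = html.escape(text)
    if not names:
        return escaped
    # Longest names first so "Ada Lovelace" is preferred over "Ada".
    ordered = sorted(set(names), key=lambda n: (-len(n), n))
    # Names are escaped the same way as the text so both sides match.
    alternatives = "|".join(re.escape(html.escape(n)) for n in ordered)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", escaped)


page_fragment = highlight_names("Ada Lovelace & Alan met", ["Ada Lovelace", "Alan"])
print(page_fragment)
# <mark>Ada Lovelace</mark> &amp; <mark>Alan</mark> met
```

Escaping the user's text before inserting markup keeps typed-in HTML from being rendered by the browser.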
Challenges we ran into
The most challenging part of this project was deploying it. Because our model was large and computationally demanding, we had trouble building the Docker image, and uploading such a large image took a long time. The other challenge was time: we had a limited window to complete the project, and our teammates were in different time zones. Still, this challenge was constructive: we managed our time well and completed the project before the deadline.
Accomplishments that we're proud of
We were very excited about this idea! From the moment we came up with it while brainstorming, my teammates and I were motivated to make it work in such a short time. We are proud that we completed the project and created both a Web App and a Discord bot that incorporate the Auto-Tagger feature. We are also proud that we got to use BentoML and become familiar with this framework. It made serving the model a lot easier and reasonably optimized.
What we learned
We learned a lot from this experience:
- Building a training pipeline for Named-Entity Recognition
- Using the BentoML framework to serve models
- Creating and deploying a Discord bot
- Managing time and making the most of it
What's next for Auto-Tagger
For Auto-Tagger, we think the next step is to integrate our Discord bot with multiple Discord servers. Beyond that, we might extend Auto-Tagger to detect usernames as well as proper names. That would be a big step forward for the project, although gathering the dataset might take a lot of work, and we are really excited to make Auto-Tagger an even more powerful tool for tagging your friends.