Title:
Entity Recognition in Online Fashion Listings
Who:
- Adrien Chavarot (achavaro)
- Walter Zhang (xzhan180)
Introduction:
We are implementing a model from a paper that describes a new approach to extract structured data at the sentence-level. We chose this paper because it engages in the non-trivial task of entity recognition on natural language that may not follow normal grammatical conventions. The dataset we collected is like this too, which is one of the reasons why this paper interests us.
We are performing Named Entity Recognition (NER), a mix of identification and classification.
Related Work:
Named Entity Recognition is a widely-used pre-processing step in many NLP applications, so there is much literature about this. However, most approaches to NER rely on either ‘external resources’ or text normalization, to normalize grammatically and syntactically inconsistent text. However, this paper in particular stands out because it does not rely on any of this.
Most of the other papers make use of a pre-trained BiLSTM network for word embeddings, with a Conditional Random Field at the end.
There has also been some use of sent2vec, and even phonetic representations of words to infer meaning beyond just how they are written
List of known code implementations:
None yet.
(https://medium.com/mysuperai/what-is-named-entity-recognition-ner-and-how-can-i-use-it-2b68cf6f545d)
To summarize the above article, Named Entity Recognition models work in two steps: First, detect a named entity, then categorize the entity. The type of training data is very important to the success of a model at any given task. In the words of the article: “Train your model on Victorian gothic literature, and it will probably struggle to navigate Twitter.” NER is useful to give an overview understanding of any dataset, and filter by certain parameters.
Data:
The dataset that we used is a custom dataset that Adrien obtained through web scraping a popular second-hand clothing website, Grailed. 700k listings are provided.
Entity extraction is interesting in this case because the listings are created by many different users, so there is no standardization in style across the dataset. Therefore, a naïve approach of looking for certain keywords is insufficient.
No significant pre-processing will need to be done on the data itself. Each description contains tags about category, name, color, size, designers, seller location, and condition. These can be used as the ground truth labels.
Methodology:
Our model will follow the architecture described in the paper:

We think the hardest part of this model to implement will be the BiLSTM modules, as we have only seen a one-direction LSTM previously.
Metrics:
After training, we will run the model on the testing data and compare the model’s outputs against the ground truth. We will do this with both the data described in the Data section and a standard WNUT dataset used by the paper.
The models in the paper are compared using the F1 score, which is calculated using the number of true positives, false positives, and false negatives. This metric is more suitable than accuracy to measure the performance on a data set whose distribution of labels is uneven, as the false positives and negatives are weighed more. The authors of the paper sought to compare the performance of their model against “state-of-the-art” models on standard datasets for predicting named entities.
Thus, we will also be using the F1 metric to measure the performance of our mode
Our base line will be to implement the layers for processing test input in the figure above and have a working model that can accurately predict some labels. The target will be to achieve a similar score on the data set as described in the paper, and the stretch goal will be to implement the multimodal feature that adds an image analysis component.
Ethics:
- Why is Deep Learning a good approach to this problem?
Deep learning is a good approach to this problem because it is a problem that can be feasibly solved, but is extremely repetitious and therefore not worth doing by humans. A good approach to see if a deep learning can be good at a specific problem is to see if a human can do it. If a human can’t then it is hard to expect a DL model to. A human could feasibly do this specific task, but an algorithmic approach would not work well here.
- What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?
The dataset will have to be manually inspected. This is because we have a concern with it, which is that all the data points have been created by random internet users. One negation to this concern is that unlike with generative models, which can cause real harm by generating content similar to harmful data they were trained on, entity extraction models do not create any new data, they just allow it to be more efficiently organized and tagged.
Another concern is with how the data was collected. Web scraping is not widely accepted within the web community, and some consider it to be damaging to the website that is being scraped. However, our rationale is that a lot of DL datasets have been created from scraping, and the Grailed company stands to gain from our research on how to more efficiently categorize fashion listings. Therefore, we do not see ourselves working against their interests.
Division of Labor:
Adrien will be responsible for Data Gathering and cleaning.
Adrien and Walter will both share responsibility for the code implementation of the model and the subsequent write-up
Built With
- python
- tensorflow
Log in or sign up for Devpost to join the conversation.