To get hands-on experience applying NLP to a real-world problem.
What it does
How I built it
Using Python, PyTorch, a pre-trained BERT model from Hugging Face, and scikit-learn.
Challenges I ran into
- I don't speak German, so getting a handle on the corpus was tricky
- There wasn't enough time to properly pre-process the data for some tasks (e.g. extracting named entities, handling stop words, etc.)
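
For illustration, the kind of stop-word handling mentioned above could look roughly like the sketch below. This is not the code from the project: the stop-word list is a tiny hypothetical sample, and the function names `tokenize` and `remove_stop_words` are made up for this example. A real pipeline would use a fuller German stop-word list.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch only).
GERMAN_STOP_WORDS = {"der", "die", "das", "und", "ist", "in", "ein", "eine", "nicht", "zu"}


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    In Python 3, \\w matches Unicode letters, so umlauts and ß are kept.
    """
    return re.findall(r"\w+", text.lower())


def remove_stop_words(text, stop_words=GERMAN_STOP_WORDS):
    """Return the tokens of `text` that are not in `stop_words`."""
    return [token for token in tokenize(text) if token not in stop_words]


if __name__ == "__main__":
    sentence = "Der Hund ist nicht in der Küche und die Katze schläft."
    print(remove_stop_words(sentence))
```

Lowercasing is a simplification here: German capitalizes nouns, so a named-entity step would probably want to keep the original casing and run before this filter.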
Accomplishments that I'm proud of
- Managed to fine-tune BERT by following descriptions found on the Internet, with reasonably good results
What I learned
- NLP is a lot trickier in the real world than on well-defined, well-understood datasets
- Pre-processing is key
- Large models and lots of compute are required