Get hands-on experience applying NLP to a real-world problem.

What it does

E-mail classification.

How I built it

Using Python, PyTorch, a pre-trained Hugging Face BERT model, and scikit-learn.

Challenges I ran into

  • I don't speak German, so getting a handle on the corpus was tricky
  • Not enough time to properly pre-process the data for some tasks (e.g. extracting named entities, handling stop words)
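
As a rough illustration of the kind of pre-processing there wasn't time for, here is a minimal sketch of stop-word filtering on German text. It is not the code used in the project: the `preprocess` function and the tiny `GERMAN_STOP_WORDS` set are hypothetical and exist only for this example; a real pipeline would use a fuller stop-word list.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
GERMAN_STOP_WORDS = {
    "der", "die", "das", "und", "ist", "ich", "sie", "wir",
    "für", "mit", "ein", "eine",
}


def preprocess(text, stop_words=GERMAN_STOP_WORDS):
    """Lowercase the text, keep runs of letters (including umlauts and ß),
    and drop any token found in the stop-word set."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in stop_words]


print(preprocess("Die Rechnung für das Projekt ist beigefügt."))
# → ['rechnung', 'projekt', 'beigefügt']
```

Filtering like this is mostly relevant for bag-of-words style features; subword models such as BERT are often fed the raw text instead.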

Accomplishments that I'm proud of

  • Managed to fine-tune BERT by following descriptions on the Internet, with reasonably good results

What I learned

  • NLP is a lot trickier in the real world than on well-defined, well-understood datasets
  • Pre-processing is key
  • Large models and lots of compute are required

What's next for skai hackathon

