Inspiration

Detecting email spam is important because it saves people the time and money they would otherwise lose to spam messages.

What it does

The project uses NLP to classify emails as spam or ham (legitimate mail).

How we built it

The following steps were applied to the dataset in this project:

1. Preprocessing of the raw data
2. Feature Engineering
3. Machine Learning model

Preprocessing of the raw data included tokenizing the text into sentences and words. Stemming and lemmatization were then applied to prepare the data for the later steps. All preprocessing was done with the spaCy library for Python; NLTK is an alternative.
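The tokenization part of this step might look roughly like the sketch below. It is an illustration, not the project's actual code: the `preprocess` helper and the sample sentence are hypothetical, and it uses a blank spaCy pipeline so that no trained model has to be downloaded. In spaCy, lemmas come from a trained pipeline (via `token.lemma_`) and there is no built-in stemmer, so stemming and lemmatization are left out here; NLTK provides stemmers such as `PorterStemmer`.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained model required.
nlp = spacy.blank("en")
# Rule-based sentence splitter that ships with spaCy.
nlp.add_pipe("sentencizer")


def preprocess(text):
    """Hypothetical helper: return (sentences, tokens) for one email body."""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents]
    # Lowercased word tokens, dropping punctuation and whitespace tokens.
    tokens = [tok.lower_ for tok in doc if not (tok.is_punct or tok.is_space)]
    return sentences, tokens


sentences, tokens = preprocess("Win a FREE prize now! Call today.")
print(sentences)
print(tokens)
```

With a trained pipeline such as `en_core_web_sm` loaded instead of the blank one, `tok.lemma_` could replace `tok.lower_` to get lemmas.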

Feature Engineering used the 'bag of n-grams' (bag-of-words) method. The text was vectorized with scikit-learn's CountVectorizer.

A Naive Bayes model was then trained to classify the data.
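The classification step could be sketched as follows. The write-up does not say which Naive Bayes variant was used; `MultinomialNB` is assumed here because it is a common choice for word-count features. The training texts and labels are toy data, not the project's dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for this example.
train_texts = [
    "win a free prize now",
    "claim your free cash prize",
    "meeting at noon tomorrow",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]

# Chain the vectorizer and the classifier so raw text goes in, labels come out.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

prediction = model.predict(["free prize waiting for you"])[0]
print(prediction)
```

Wrapping both steps in one pipeline keeps the vocabulary learned at training time and reuses it at prediction time, so unseen words are simply ignored.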

Accomplishments that we're proud of

The model predicted the test set with approximately 99% accuracy and precision.
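For readers unfamiliar with the two metrics, here is a small worked example with spam treated as the positive class. The confusion counts are hypothetical and chosen only to show the arithmetic; they are not this project's results.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all emails that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Fraction of emails flagged as spam that really were spam."""
    return tp / (tp + fp)


# Hypothetical counts: true/false positives and negatives on a test set.
tp, tn, fp, fn = 90, 100, 5, 5

acc = accuracy(tp, tn, fp, fn)   # (90 + 100) / 200
prec = precision(tp, fp)         # 90 / (90 + 5)
print(f"accuracy={acc:.3f} precision={prec:.3f}")
```

High precision matters for a spam filter because a false positive means a legitimate email ends up marked as spam.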
