An attempt to understand the language and word choices scammers use, so that they can be analyzed and used to help prevent scams.

What it does

It rates emails on their likelihood of being a scam, based on the occurrences of words in each email.
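As a rough illustration of rating an email from word occurrences, here is a minimal sketch. It is written in Python for illustration rather than the MATLAB used in the project, and it is not the project's actual method. The function names (`tokenize`, `train_word_counts`, `scam_rating`), the add-one smoothing, and the logistic squashing are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def train_word_counts(emails):
    """Count how often each word occurs across a list of email bodies."""
    counts = Counter()
    for email in emails:
        counts.update(tokenize(email))
    return counts


def scam_rating(email, scam_counts, legit_counts):
    """Return a rating in (0, 1); higher means more scam-like.

    Assumed scoring rule (not from the project): each known word adds the
    log-ratio of its add-one-smoothed frequency in scam vs. legitimate
    emails, and the sum is squashed with a logistic function.
    """
    vocab = set(scam_counts) | set(legit_counts)
    scam_total = sum(scam_counts.values()) + len(vocab)
    legit_total = sum(legit_counts.values()) + len(vocab)
    log_odds = 0.0
    for word in tokenize(email):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_scam = (scam_counts[word] + 1) / scam_total
        p_legit = (legit_counts[word] + 1) / legit_total
        log_odds += math.log(p_scam / p_legit)
    # Clamp so that math.exp cannot overflow on very long emails.
    log_odds = max(-50.0, min(50.0, log_odds))
    return 1 / (1 + math.exp(-log_odds))


# Toy training data, invented for this example.
scam_emails = [
    "You have won a prize, send your bank details now",
    "Urgent: claim your prize money today",
]
legit_emails = [
    "Meeting moved to Tuesday, see the agenda attached",
    "Here are the notes from today's lecture",
]
scam_counts = train_word_counts(scam_emails)
legit_counts = train_word_counts(legit_emails)

print(scam_rating("Claim your prize now", scam_counts, legit_counts))
print(scam_rating("See the lecture notes attached", scam_counts, legit_counts))
```

With this toy data the first email rates above 0.5 and the second below it; an email containing only unseen words rates exactly 0.5.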

How we built it

We built it using MATLAB.

Challenges we ran into

We had difficulty cleaning the initial datasets. Because the data was not clean, we also had a hard time coming up with a method to quantify the likelihood that an email is a scam based on certain words.
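To give a sense of what cleaning email text can involve, here is a small Python sketch (again for illustration, not the MATLAB code from the project). The write-up does not say what made the data unclean, so the specific problems handled here (HTML tags, HTML entities, links, and email addresses) and the `clean_email` helper are assumptions.

```python
import html
import re


def clean_email(raw):
    """Normalize a raw email body into a list of lowercase word tokens.

    The cleaning steps are assumptions chosen for illustration.
    """
    text = re.sub(r"<[^>]+>", " ", raw)             # drop HTML tags
    text = html.unescape(text)                      # decode entities such as &amp;
    text = re.sub(r"https?://\S+", " URL ", text)   # replace links with a placeholder
    text = re.sub(r"\S+@\S+", " EMAILADDR ", text)  # replace addresses with a placeholder
    text = text.lower()
    return re.findall(r"[a-z]+", text)              # keep alphabetic words only


sample = "<p>Dear Friend,&nbsp;WIN $1,000,000 at http://example.com/win NOW!!!</p>"
print(clean_email(sample))
# ['dear', 'friend', 'win', 'at', 'url', 'now']
```

Replacing links and addresses with placeholder tokens keeps the fact that they were present, which may itself be a useful signal, without adding thousands of one-off words to the vocabulary.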

What we learned

We learned a bit about NLP, the bag-of-words model, and using MATLAB.

What's next for ScamRating

Gather larger datasets that include more modern scam emails, and use a database to store and collect this information.
