An attempt to understand the language and word choices scammers use, so that they can be analyzed and used to help prevent scams.
What it does
It rates emails on how likely they are to be scams, based on the occurrences of words in each email.
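As a rough illustration of rating an email by word occurrences, here is a minimal Python sketch. It is not the project's MATLAB code: the word list, the weights, and the names `SCAM_WORD_WEIGHTS`, `tokenize`, and `scam_rating` are all hypothetical, and normalizing by email length is an assumption made for the example.

```python
import re
from collections import Counter

# Hypothetical word weights, chosen only for illustration.
# The project's actual word list and scoring are not described.
SCAM_WORD_WEIGHTS = {
    "urgent": 2.0,
    "winner": 2.0,
    "lottery": 3.0,
    "bank": 1.0,
    "transfer": 1.5,
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def scam_rating(text, weights=SCAM_WORD_WEIGHTS):
    """Return the weighted count of scam words divided by the token count.

    An email with no tokens gets a rating of 0.0.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)  # bag of words: token -> number of occurrences
    score = sum(weight * counts[word] for word, weight in weights.items())
    return score / len(tokens)
```

A higher rating means a larger share of the email consists of heavily weighted words. In practice the weights would be estimated from labeled scam and non-scam emails rather than set by hand.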
How we built it
Challenges we ran into
We had difficulty cleaning the initial datasets. Because the data was not clean, we had a hard time coming up with a method to quantify, from the occurrence of certain words, how likely an email is to be a scam.
What we learned
We learned a bit about NLP, bag of words, and using MATLAB.
What's next for ScamRating
Use larger datasets and more modern scam emails, and use a database to collect and store this information.