Super-Safe

Inspiration

With the pandemic, most people's work has shifted online, and much formal communication now happens over email. Online communication, however, carries a high risk of cybercrime. To steal your data, an attacker might send you a phishing email: a message in which the sender pretends to be someone else in order to obtain your data and misuse it. This is often done by adding malicious links to the email. The attacker clones a legitimate website, email address, and graphic design, then asks the target to act on the deceptive link. Commonly extracted information includes bank account names, credit card numbers, and sensitive company and personal data. The collected data is then misused.

What it does

Today, people paste a link into a checker website to find out whether the uniform resource locator (URL) is a phishing link or the genuine one. This is rather time-consuming. Moreover, to get the URL you have to click on the link and then copy it from your browser. Some techniques can extract a user's data as soon as they click on the link, so links should not be opened directly.

How we built it

We built a program that tells the user whether a link in an email is safe to open. A classifier generated with the Apriori algorithm predicts whether a particular website is a phishing site. We trained the classifier on a database of phishing website URLs. The links from the user's email are then fed automatically into that classifier, which produces an output, and the model displays the result on the user's screen.
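
The flow from email to verdict can be sketched roughly as follows. This is only an illustrative outline using the Python standard library, not the project's actual code: the attribute names (`ip_host`, `has_at`, and so on), the thresholds, and the rules in `SUSPICIOUS_RULES` are all assumptions made up for the example, standing in for whatever patterns the trained classifier really uses. The point it shows is that links can be read out of the email body and checked without ever being opened.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags without ever opening them."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def url_features(url):
    """Map a URL to a set of attribute names (all names are assumptions)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    features = set()
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        features.add("ip_host")          # raw IP address instead of a domain
    if "@" in url:
        features.add("has_at")           # '@' can hide the real destination
    if parsed.scheme != "https":
        features.add("no_https")
    if len(url) > 75:                    # threshold chosen for illustration
        features.add("long_url")
    if host.count(".") >= 4:
        features.add("many_subdomains")
    return features


# Hypothetical attribute combinations, standing in for mined patterns.
SUSPICIOUS_RULES = [
    frozenset({"ip_host", "no_https"}),
    frozenset({"has_at"}),
    frozenset({"many_subdomains", "no_https"}),
]


def is_suspicious(url):
    """Flag a URL if all attributes of any rule are present."""
    features = url_features(url)
    return any(rule <= features for rule in SUSPICIOUS_RULES)


def check_email(html_body):
    """Return {link: flagged?} for every http(s) link in an HTML email body."""
    parser = LinkExtractor()
    parser.feed(html_body)
    return {
        link: is_suspicious(link)
        for link in parser.links
        if urlparse(link).scheme in ("http", "https")
    }
```

Calling `check_email` on an HTML body returns a dictionary from each link to a flag, which a front end could then display next to the message.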

Challenges we ran into

I learned machine learning and data mining from scratch in a limited amount of time.

Accomplishments that we're proud of

Ideating a safety application using multiple computer science concepts.

What we learned

The rise of cybercrime is very threatening, and we have to constantly upgrade our technology to safeguard ourselves. Data mining can be very helpful in dealing with this situation. Data mining, combined with an application built in code, can perform much of the analysis without much user intervention. With frequent pattern mining using the Apriori algorithm, we can find out which attributes occur together to form a phishing URL. The disadvantage of this method is that it takes a lot of time to compute, because it iterates over the dataset again and again. The training data should also be updated constantly so that the classifier stays current and can detect new phishing techniques.
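
To make the frequent pattern mining idea concrete, here is a minimal sketch of the level-wise Apriori procedure in plain Python. It is a teaching example under stated assumptions, not the implementation used in this project: the attribute names and the four toy records are invented, and a real run would use the phishing URL database instead. Note how `support` re-scans every record for each candidate, which is the repeated iteration cost mentioned above.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Full pass over the dataset for every candidate itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single attributes.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Toy records: each set lists attributes observed on one phishing URL
# (invented data, for illustration only).
toy_records = [
    {"ip_host", "no_https", "long_url"},
    {"ip_host", "no_https"},
    {"has_at", "no_https"},
    {"ip_host", "no_https", "has_at"},
]

patterns = apriori(toy_records, min_support=0.5)
for itemset, sup in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

With a minimum support of 0.5, the pair `ip_host` and `no_https` survives because it appears in three of the four toy records, while `long_url` is dropped after the first pass.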