Inspiration

We wanted to find a way to actively and proactively protect users from entering a potentially sketchy site. By leveraging machine learning, we wanted to assess the risk of sites before they are visited and give users reasoning about the potential dangers and red flags our AI model was able to find.

What it does

The site takes a URL that the user wants to visit and uses AI to assess its risk level. Starting from a dataset of previously flagged phishing sites as a reference, the AI then scrapes the site's HTML looking for key identifiers that could indicate a fraudulent site. It weighs signals such as HTTPS vs. HTTP, web results showing the URL was previously marked as phishing, and the contents of the webpage itself to make an informed decision.
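The kind of signals described above can be sketched as a small feature extractor. This is a minimal illustration, not our actual model input pipeline; the function and feature names are hypothetical, and a real check would also inspect the scraped HTML.

```python
from urllib.parse import urlparse

def extract_url_features(url: str) -> dict:
    """Hypothetical feature extractor: derives simple phishing red flags
    from the URL string alone (feature names are illustrative)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "uses_https": parsed.scheme == "https",          # HTTPS vs. HTTP check
        "host_length": len(host),                        # unusually long hosts are a red flag
        "num_subdomains": max(host.count(".") - 1, 0),   # deep subdomain chains are suspicious
        "has_ip_host": host.replace(".", "").isdigit(),  # raw-IP hosts often hide phishing pages
        "has_at_symbol": "@" in url,                     # '@' can disguise the real destination
    }

print(extract_url_features("http://192.168.0.1/login"))
```

Features like these would then be fed to the classifier alongside signals scraped from the page content.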

How we built it

We accomplished this by creating a full-stack web app using React for the front end, Express for the back end, MongoDB as our database, and PyTorch for the AI model. With MongoDB as our database, we have a way to immediately catch previously discovered phishing sites; when a novel site comes in, our AI model picks up the slack and even adds to the database when it finds sites that fail to meet its standards. We then pass this information to the front end along with detailed reasoning on why the user should be wary of the site.
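The cache-first flow described above can be sketched as follows. This is a hedged illustration, not our production backend: `assess_url`, `FakeCollection`, and `model_predict` are hypothetical names, and `FakeCollection` is an in-memory stand-in for a real MongoDB collection (e.g. via pymongo, whose `find_one`/`insert_one` methods it mimics).

```python
class FakeCollection:
    """Minimal in-memory stand-in for a MongoDB collection, for demonstration."""
    def __init__(self):
        self.docs = []
    def find_one(self, query):
        return next((d for d in self.docs if d["url"] == query["url"]), None)
    def insert_one(self, doc):
        self.docs.append(doc)

def assess_url(url, collection, model_predict):
    """Check the database first; run the model only on novel URLs,
    then store the verdict so future lookups skip the model entirely."""
    cached = collection.find_one({"url": url})
    if cached is not None:
        return cached["verdict"], cached["reason"]   # previously flagged site: instant answer
    verdict, reason = model_predict(url)             # novel site: the AI model picks up the slack
    collection.insert_one({"url": url, "verdict": verdict, "reason": reason})
    return verdict, reason
```

With this shape, a second request for the same URL is served straight from the database and the (comparatively slow) model call is skipped.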

Challenges we ran into

We ran into a lot of issues connecting the frontend to the backend, primarily when retrieving the desired information from the database. We were able to get the POST and GET requests to go through, but what we received back was never the intended result. Another problem was fine-tuning the template for the AI; without adjustments, the model would produce a lot of misleading results that could misinform users.

Accomplishments that we're proud of

We managed to properly connect the front end to the back end and get accurate query results by sending user-inputted URLs to the backend for processing and returning the results. We also managed to create a working AI model that can reason about submitted URLs and give accurate readings.

What we learned

We learned a lot about full-stack development on this project, particularly about creating API calls to connect UI components to backend logic and processing. We also experienced the dreaded "it runs on my machine" problem, since we constantly had to debug why something worked on one teammate's system and not another's.

What's next for PhishNet

We still have a lot to accomplish. One goal is to properly send the AI responses to the front end for users to see; as of right now the display works off of an established dataset, and we still need to connect the AI component to the finalized project.
