AI Phishing Website Detection

Inspiration

Phishing attacks are among the most prevalent cybersecurity threats, deceiving users into revealing sensitive information on fraudulent websites. With the rise of AI-driven scams, traditional security measures often fail to keep up. Our team set out to develop an AI-powered phishing website detector to help users stay safe online by identifying malicious sites in real time.

What it does

Our AI Phishing Website Detector analyzes websites before users interact with them. It evaluates URLs, website content, and metadata using machine learning to determine whether a site is legitimate or a phishing attempt. The system can be integrated as a browser extension or an API, providing real-time alerts when a user visits a suspicious website.

How we built it

Data Collection: Trained and tested on the benchmark dataset provided by Prof. Abdelhakim Hannousse and Prof. Salima Yahiouche in the paper "Towards Benchmark Datasets for Machine Learning Based Website Phishing Detection: An Experimental Study".
Feature Engineering: Extracted key indicators such as domain age, SSL certificates, URL patterns, HTML structure, and suspicious keywords.
Model Training: Used dense networks and LSTMs to classify websites with high accuracy.
Deployment: Hosted the model using Flask/FastAPI.
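To give a feel for the URL-pattern part of feature engineering, here is a minimal sketch that uses only the Python standard library. The feature names, the keyword list, and the `extract_url_features` helper are illustrative assumptions, not the exact features from the benchmark dataset or from our pipeline. Domain age, SSL certificate checks, and HTML structure need network lookups and are left out.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list, chosen for illustration only.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update", "signin")


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few lexical features from a URL string (no network access)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(host_is_ip(host)),
        "has_at_symbol": int("@" in url),
        # Counts how many distinct keywords appear anywhere in the URL.
        "keyword_hits": sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS),
    }


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/secure-login/verify.php"))
```

A dictionary like this could be turned into a numeric vector and fed to a classifier. Each feature is only a weak signal by itself, so none of them should be read as proof of phishing.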

Challenges we ran into

Data Imbalance: Phishing websites are significantly outnumbered by legitimate ones, so we had to use weighted loss functions to handle the skewed dataset.
Performance vs. Accuracy: Achieving real-time detection without compromising accuracy or introducing false positives was a key challenge.
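The weighted-loss idea from the data imbalance point can be sketched in plain Python. This is an illustration under assumptions, not our training code: the "balanced" weighting formula (n_samples / (n_classes * class_count)) is one common choice, and the helper names `balanced_class_weights` and `weighted_bce` are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_bce(y_true, y_prob, weights, eps=1e-7):
    """Mean binary cross-entropy with each sample scaled by its class weight."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(y_true)


if __name__ == "__main__":
    # Toy labels: 90 legitimate (0) and 10 phishing (1) samples.
    labels = [0] * 90 + [1] * 10
    weights = balanced_class_weights(labels)
    print(weights)
    # An equally confident mistake costs more on the rare class.
    print(weighted_bce([1], [0.1], weights), weighted_bce([0], [0.9], weights))
```

With these weights, a missed phishing site contributes more to the loss than an equally confident mistake on a legitimate site. A dictionary of this shape can also be passed as the `class_weight` argument of Keras `Model.fit`, which applies the same kind of per-class scaling during training.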

Accomplishments that we're proud of

Achieved high detection accuracy with minimal false positives.
Overcame evasion tactics by leveraging adaptive learning techniques and continuous dataset updates.
Implemented an intuitive and lightweight solution that enhances security without slowing down browsing.

What we learned

Machine learning for cybersecurity, including feature extraction from URLs, HTML content, and metadata.
Real-time detection techniques to ensure fast and accurate classification of phishing websites.
Web scraping and data collection for building a robust dataset of phishing and legitimate websites.

What's next for AI Phishing Website Detection

Integrate NLP to analyze website text and detect phishing attempts based on content semantics.
Develop a graph-based detection system to identify malicious networks and detect phishing sites linked to known scams.
Expand compatibility to work seamlessly with multiple browsers and platforms as extensions or apps.

Built With

flask
html
python
tensorflow

Updates

alexyy2004 Yan started this project — Mar 02, 2025 07:21 AM EST
