Detection of Phishing Websites Using Machine Learning
DOI:
https://doi.org/10.71366/ijwos03032691564
Keywords:
Phishing Detection, Machine Learning, URL Feature Analysis, Random Forest, XGBoost, Cybersecurity, Website Classification
Abstract
Phishing remains one of the most pervasive and damaging forms of cybercrime: fraudulent websites exploit human trust to steal sensitive information. This paper proposes a machine learning framework that automatically detects phishing websites by analyzing URL structure, domain-level attributes, and webpage content features. We evaluate four classification algorithms, namely Random Forest, Support Vector Machine (SVM), Logistic Regression, and XGBoost, on the UCI phishing websites dataset, which contains 11,055 labeled instances. In our experiments, the Random Forest classifier achieves the highest detection accuracy, 97.4%, with a precision of 0.975 and an F1-score of 0.973. Feature importance analysis indicates that URL length, the presence of an IP address in the URL, and abnormal URL patterns are the most discriminative indicators of phishing. The proposed system offers a scalable, real-time solution that outperforms conventional blacklist-based approaches and generalizes well to previously unseen phishing websites.
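To make the URL-level indicators named in the abstract concrete, the following is a minimal sketch of how such features might be computed from a raw URL. It is not the paper's feature extractor: the function names `host_is_ip` and `url_features`, the feature keys, and the use of the "@" symbol and host dot count as stand-ins for "abnormal URL patterns" are all assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> dict:
    """Compute a few illustrative URL features (hypothetical feature set)."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),                # total number of characters
        "has_ip_host": int(host_is_ip(host)),  # IP address used instead of a domain
        "has_at_symbol": int("@" in url),      # assumed proxy for an abnormal pattern
        "num_host_dots": host.count("."),      # rough indicator of subdomain depth
    }


if __name__ == "__main__":
    for example in ("https://example.com/", "http://192.168.0.1/login"):
        print(example, url_features(example))
```

A feature dictionary of this kind could then be assembled into rows of a numeric matrix and passed to any of the four classifiers the abstract lists; the actual feature definitions used in the UCI dataset may differ from these.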
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


