Detection of Phishing Websites Using Machine Learning

Pavithra R; Anitha P

doi:10.71366/ijwos03032691564

Authors

Pavithra R, PG Student, Vellalar College for Women, Erode
Anitha P, Assistant Professor, Vellalar College for Women, Erode

Keywords:

Phishing Detection, Machine Learning, URL Feature Analysis, Random Forest, XGBoost, Cybersecurity, Website Classification

Abstract

Phishing attacks continue to be one of the most pervasive and damaging forms of cybercrime, exploiting human trust to steal sensitive information through fraudulent websites. This paper proposes a machine learning-based framework for the automatic detection of phishing websites by analyzing URL structures, domain-level attributes, and webpage content features. We evaluate four classification algorithms — Random Forest, Support Vector Machine (SVM), Logistic Regression, and XGBoost — on the UCI phishing websites dataset containing 11,055 labeled instances. Our experimental results demonstrate that the Random Forest classifier achieves the highest detection accuracy of 97.4%, with a precision of 0.975 and an F1-score of 0.973. Feature importance analysis reveals that URL length, presence of an IP address in the URL, and abnormal URL patterns are the most discriminative indicators of phishing. The proposed system offers a scalable, real-time solution that outperforms conventional blacklist-based approaches and generalizes well to previously unseen phishing websites.
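As a rough illustration of the URL-level indicators named in the abstract (URL length, an IP address in the URL, and abnormal URL patterns), the following sketch computes a few such features with the Python standard library only. The function name `extract_url_features`, the feature names, and the specific choice of "abnormal pattern" checks (an `@` symbol, subdomain count, scheme) are assumptions made for illustration; they are not taken from the paper and do not reproduce the encoding used in the UCI dataset.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative URL features (hypothetical feature names)."""
    # Add a scheme when missing so urlparse can locate the host part.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""

    # Check whether the host is a literal IP address rather than a domain name.
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False

    return {
        "url_length": len(url),
        "has_ip_address": has_ip,
        # An '@' in a URL can hide the real host behind a trusted-looking prefix.
        "has_at_symbol": "@" in url,
        # Dots in the host beyond the one separating domain and TLD (rough proxy).
        "num_subdomains": 0 if has_ip else max(host.count(".") - 1, 0),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login", "https://www.example.com/"):
        print(example, extract_url_features(example))
```

Feature dictionaries of this kind could be converted to numeric vectors and passed to any of the classifiers the paper evaluates, such as Random Forest; the thresholds and encodings that map raw values to model inputs are left open here because the abstract does not specify them.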