AI-Driven Network Traffic Anomaly Detection: A Comparative Study of Random Forest, XGBoost, and Deep Learning on the CICIDS2017 Dataset

Ankit Ranjan; Sagar Choudhary; Achint Tiwari

Authors

Ankit Ranjan, B.Tech Student, Department of CSE, Quantum University, Roorkee, Uttarakhand, India
Sagar Choudhary, Assistant Professor, Department of Computer Science and Engineering, Quantum University, Roorkee, India
Achint Tiwari, B.Tech Student, Department of CSE, Quantum University, Roorkee, Uttarakhand, India

DOI:

Keywords:

Network Intrusion Detection System (NIDS), Anomaly Detection, Random Forest, XGBoost, Deep Learning, MLP, CICIDS2017, SMOTE, Machine Learning, Cybersecurity.

Abstract

Network intrusion detection is a cornerstone of modern cybersecurity infrastructure. This study presents an end-to-end machine-learning pipeline for binary network traffic anomaly detection based on a synthetic variant of the widely used CICIDS2017 dataset. Three model families were systematically compared: Random Forest (RF), XGBoost (XGB), and a Multi-Layer Perceptron (MLP) deep learning model. The pipeline incorporates stratified train/validation/test splits, median imputation, standard scaling, and the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance (BENIGN: 79.7%, ATTACK: 20.3%). Evaluation on an isolated test set of 7,530 samples demonstrated near-perfect detection across all three models. Random Forest achieved perfect scores (accuracy = 1.0, AUC-ROC = 1.0), XGBoost attained 99.99% accuracy with a false-positive rate of only 0.02%, and the MLP reached 99.54% accuracy with AUC-ROC = 0.9999. The results underscore the practical viability of ensemble methods for real-time IDS deployment and provide a reproducible benchmark for future research.
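
To make the reported metrics concrete, the sketch below shows how accuracy and false-positive rate follow from confusion-matrix counts for the binary BENIGN/ATTACK task. The abstract does not report a confusion matrix, so the counts used here are hypothetical: they are chosen only to sum to the 7,530-sample test set with roughly the stated 79.7% / 20.3% class ratio and a single misclassified benign flow. The class name `ConfusionCounts` and the variable `example` are likewise illustrative and not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for binary detection (ATTACK = positive)."""
    tp: int  # attack flows flagged as attacks
    fp: int  # benign flows flagged as attacks
    tn: int  # benign flows passed as benign
    fn: int  # attack flows that were missed

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self) -> float:
        # Fraction of all flows classified correctly.
        return (self.tp + self.tn) / self.total

    def false_positive_rate(self) -> float:
        # Fraction of benign flows that raised a false alarm.
        return self.fp / (self.fp + self.tn)

    def recall(self) -> float:
        # Fraction of attack flows that were detected.
        return self.tp / (self.tp + self.fn)


# Hypothetical counts (NOT from the paper): 6,001 benign and 1,529 attack
# flows, with exactly one benign flow misclassified as an attack.
example = ConfusionCounts(tp=1529, fp=1, tn=6000, fn=0)

print(f"samples  = {example.total}")
print(f"accuracy = {example.accuracy():.4%}")
print(f"FPR      = {example.false_positive_rate():.4%}")
print(f"recall   = {example.recall():.4%}")
```

Under these assumed counts, one false alarm among about 6,000 benign flows rounds to the same two-decimal figures quoted for XGBoost (99.99% accuracy, 0.02% false-positive rate). At this test-set size, differences between models in the fourth significant digit can therefore correspond to a single flow.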