INTRUSION DETECTION MODEL USING MACHINE LEARNING ALGORITHMS ON NSL-KDD DATASET

Alshamy, Reem; AKCAYOL, MUHAMMET

doi:10.5121/ijcnc.2024.16605

Alshamy R., AKCAYOL M. A.

International Journal of Computer Networks and Communications, vol.16, no.6, pp.75-88, 2024 (Scopus)

Publication Type: Article / Full Article
Volume: 16 Issue: 6
Publication Date: 2024
DOI Number: 10.5121/ijcnc.2024.16605
Journal Name: International Journal of Computer Networks and Communications
Journal Indexes: Scopus
Pages: pp.75-88
Keywords: Intrusion Detection, Machine Learning, Network Security Laboratory (NSL)-KDD dataset, SMOTE
Gazi University Affiliated: Yes

Abstract

Big data, generated by sources such as mobile devices, sensors, and the Internet of Things (IoT), is characterized by volume, velocity, variety, variability, veracity, validity, vulnerability, volatility, visualization, and value. An Intrusion Detection System (IDS) is essential for cybersecurity because it detects intrusions before or after attacks. Traditional software methods struggle to store, manage, and analyze big data, which creates a need for new techniques for effective and rapid intrusion detection in organizations and enterprises. This study introduces the IDS Random Forest (IDS-RF) model for binary and multiclass intrusion detection. The model uses the Synthetic Minority Oversampling TEchnique (SMOTE) to address class imbalance and a Random Forest (RF) classifier to classify attacks in the Network Security Laboratory (NSL)-KDD dataset. In the experiments, we compared the IDS-RF model with Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and Logistic Regression (LR) classifiers in terms of accuracy, precision, recall, F1-score, and training and testing times. The experimental results showed that the IDS-RF model achieved high performance compared with the other classifiers in both binary and multiclass classification. In addition, the proposed model achieved per-class accuracies of 98.69%, 99.72%, 98.93%, 95.13%, and 89% for the Normal, DoS, Probe, U2R, and R2L classes, respectively.
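The abstract does not give implementation details, so the following is only a minimal sketch of the core idea behind SMOTE: a synthetic minority sample is created by interpolating between an existing minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy two-feature data are all assumptions made for illustration; they are not taken from the paper and do not reflect the NSL-KDD feature set or the authors' configuration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 20 majority samples and 4 minority samples.
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[0.0, 10.0], [1.0, 11.0], [2.0, 10.5], [1.5, 12.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a pipeline like the one described, oversampling of this kind would normally be applied to the training split only, so that the test set still reflects the original class distribution; whether the authors did so is not stated in the abstract.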