International Journal of Computer Networks and Communications, vol. 16, no. 6, pp. 75-88, 2024 (Scopus)
Big data, generated by sources such as mobile devices, sensors, and the Internet of Things (IoT), is characterized by volume, velocity, variety, variability, veracity, validity, vulnerability, volatility, visualization, and value. An Intrusion Detection System (IDS) is essential for cybersecurity because it detects intrusions before or after attacks. Traditional software methods struggle to store, manage, and analyze big data, which calls for new techniques for effective and rapid intrusion detection in organizations and enterprises. This study introduces the IDS Random Forest (IDS-RF) model for binary and multiclass intrusion detection. The model uses the Synthetic Minority Oversampling TEchnique (SMOTE) to address class imbalance and a Random Forest (RF) classifier to classify attacks in the Network Security Laboratory (NSL)-KDD dataset. In the experiments, we compared the IDS-RF model with Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and Logistic Regression (LR) classifiers in terms of accuracy, precision, recall, F1-score, and training and testing times. The experimental results showed that the IDS-RF model achieved high performance in binary and multiclass classification compared with the other classifiers. In addition, the proposed model achieved per-class accuracies of 98.69%, 99.72%, 98.93%, 95.13%, and 89% for the Normal, DoS, Probe, U2R, and R2L classes, respectively.
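The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The abstract does not say how SMOTE was implemented, so the function name `smote_oversample`, its parameters, and the toy data below are assumptions made for illustration only; an actual experiment would more likely rely on a library implementation such as the one in imbalanced-learn.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature vectors (one class only).
    k: number of nearest minority neighbours considered for each sample.
    Hypothetical helper for illustration, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: hypothetical 2-feature records, not taken from NSL-KDD.
majority = [[float(x), float(y)] for x in range(10) for y in range(5)]  # 50 samples
minority = [[20.0, 20.0], [21.0, 20.5], [20.5, 22.0], [22.0, 21.0]]     # 4 samples

# Generate enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. In a classification workflow, oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution.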