Abnormal Traffic Prediction and Classification based on Information Big Data
DOI:
https://doi.org/10.54097/hset.v23i.3216
Keywords:
Big Data, Intrusion Detection System, UNSW-NB15, multi-class classification.
Abstract
An Intrusion Detection System (IDS) is a proactive security technique that detects suspicious signals and raises alerts on them. However, traditional intrusion detection methods, developed as a fast way of detecting malicious traffic, have shortcomings such as low accuracy and low efficiency. To characterize the features of different intrusion methods and improve the accuracy of malicious traffic detection, several machine learning models for classifying intrusion methods, such as KNN, Naive Bayes, SVM, and LightGBM, are compared. To further improve accuracy, ensemble models such as Voting and Stacking are also introduced for comparison. Grid Search is used to select the best parameters. Accuracy, precision, recall, and F1 score are used as metrics to evaluate the performance of the different models. The experimental comparison and analysis show that the ensemble learning algorithm based on Stacking achieves the highest accuracy for malicious traffic detection.
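The evaluation metrics and the Voting ensemble named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not state how per-class scores are averaged or which voting rule is used, so macro averaging and hard (majority) voting are assumptions here. The function names `hard_vote` and `macro_metrics`, the three model outputs, and the toy labels are hypothetical.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Majority vote per sample; ties go to the label encountered first."""
    voted = []
    for sample_preds in zip(*predictions_per_model):
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy multi-class labels (illustrative only, not results from the paper).
y_true = ["Normal", "Normal", "DoS", "DoS", "Exploits", "Exploits"]

# Hypothetical predictions from three base classifiers.
model_a = ["Normal", "Normal", "DoS", "Normal", "Exploits", "DoS"]
model_b = ["Normal", "DoS", "DoS", "DoS", "Exploits", "Exploits"]
model_c = ["Normal", "Normal", "Normal", "DoS", "DoS", "Exploits"]

voted = hard_vote([model_a, model_b, model_c])

print("model_a:", macro_metrics(y_true, model_a))
print("voting: ", macro_metrics(y_true, voted))
```

In this toy data each base model makes errors on different samples, so the majority vote corrects them. Stacking differs in that it trains a second-level model on the base models' outputs instead of counting votes. That step is omitted here because it requires a trained learner.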
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







