

Applying an Improved Elephant Herding Optimization Algorithm with Spark-based Parallelization to Feature Selection for Intrusion Detection

Volume 15, Number 6, June 2019, pp. 1600-1610
DOI: 10.23940/ijpe.19.06.p11.16001610

Hui Xu, Qianqian Cao, Heng Fu, and Hongwei Chen

School of Computer Science, Hubei University of Technology, Wuhan, 430068, China

(Submitted on March 20, 2019; Revised on April 7, 2019; Accepted on June 5, 2019)


As intrusion data grows in scale, irrelevant or redundant features in high-dimensional intrusion detection data slow down the intrusion detection algorithm, and its time and space costs increase with the feature dimension. Because the Elephant Herding Optimization (EHO) algorithm shows good classification performance when reducing feature redundancy, this paper introduces it into feature selection for intrusion detection. However, the basic EHO algorithm tends to fall into local optima and lacks strong search ability, which severely limits its classification performance and dimensionality reduction ability. Therefore, an Improved Elephant Herding Optimization (IEHO) algorithm is proposed to search the feature space for the optimal feature subset, so that the number of features is minimized while the classification performance is maximized. As the scale of intrusion data grows, the large amount of redundant information also slows down the improved algorithm, so the algorithm is parallelized to relieve the pressure of single-machine operation. This paper then proposes a Spark-based distributed parallel IEHO algorithm and discusses a feature selection method for intrusion detection based on it. Performing feature selection in a distributed environment improves the running efficiency of the IEHO algorithm, reducing its running time while preserving classification accuracy. For experimental validation, both UCI and KDD CUP99 datasets are used to evaluate the feature selection for intrusion detection.
Compared with the classical PSO, MFO, and EHO algorithms, feature selection with the binary IEHO algorithm improves by 4.16%, 1.42%, and 0.98%, respectively, and the classification performance is also significantly improved. Compared with the stand-alone IEHO algorithm, the Spark-based parallel IEHO algorithm significantly improves the classification efficiency of intrusion feature selection, and the acceleration ratio (speedup) increases by two orders of magnitude.
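
The trade-off described above, fewer features against better classification, is commonly expressed in wrapper-style feature selection as a single weighted fitness value over a binary feature mask. The following is a minimal sketch of that idea and not the authors' formulation: the function names, the weight of 0.99, the leave-one-out 1-nearest-neighbour evaluator, and the toy data are all assumptions chosen for illustration.

```python
from typing import Callable, Sequence


def subset_fitness(mask: Sequence[int],
                   error_rate: Callable[[Sequence[int]], float],
                   weight: float = 0.99) -> float:
    """Score a binary feature mask; lower is better.

    `weight` (an assumed value) balances classification error against
    the fraction of features kept.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot classify anything
    return weight * error_rate(mask) + (1 - weight) * n_selected / len(mask)


def loo_1nn_error(mask, X, y) -> float:
    """Leave-one-out error of a 1-nearest-neighbour classifier that only
    sees the columns switched on in `mask` (a hypothetical stand-in for
    whatever classifier is actually used)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    errors = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        errors += best_label != y[i]
    return errors / len(X)


# Toy data (made up): column 0 separates the classes, columns 1-2 are noise.
X = [[0.0, 5.0, 1.0], [0.1, 1.0, 9.0], [0.2, 8.0, 4.0],
     [1.0, 5.1, 1.1], [1.1, 1.1, 9.1], [0.9, 8.1, 4.1]]
y = [0, 0, 0, 1, 1, 1]


def evaluate(mask):
    return subset_fitness(mask, lambda m: loo_1nn_error(m, X, y))


if __name__ == "__main__":
    for mask in ([1, 0, 0], [1, 1, 1]):
        print(mask, evaluate(mask))
```

On this toy data the single informative feature scores better than the full feature set, which is the behaviour a binary optimizer such as IEHO would exploit when searching over masks. In a distributed setting, the per-mask evaluations are independent of one another, which is what makes them natural candidates for parallel execution.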



