This paper proposes a new approach which combines different classifiers in order to make the best use of each classifier. To build the new model, we evaluate the accuracy and performance (training and testing time) of three classification algorithms: ID3, Naïve Bayes and SVM.
Building models for detecting system attacks based on data mining

JOURNAL OF SCIENCE OF HNUE
Interdisciplinary Science, 2013, Vol. 58, No. 5, pp. 39-46
This paper is available online at http://stdb.hnue.edu.vn

Pham Duy Trung¹, Luong The Dung¹ and Nguyen Duy Hai²
¹ Academy of Cryptography Techniques
² Centre of Information Technology, Hanoi National University of Education

Abstract. With the development of the Internet, network security has become an indispensable factor of computer technology. Intrusion Detection Systems (IDS) play an important role in network security. One aspect which affects the accuracy and performance of an IDS is its classifier. This paper proposes a new approach which combines different classifiers in order to make the best use of each classifier. To build the new model, we evaluate the accuracy and performance (training and testing time) of three classification algorithms: ID3, Naïve Bayes and SVM. Our experimental results using the KDDCup'99 IDS dataset, based on a 10-fold cross validation test, show that against any one particular type of attack, one of the classifiers functions best. The purpose of this study is to enhance the accuracy and performance of IDS against particular types of attacks.

Keywords: Network security, data mining, network computer.

1. Introduction

The Internet pervades almost every aspect of life and business, and the exponential growth of this trend has created a critical need to secure these systems from unauthorized disclosure, transfer, modification or destruction. An Intrusion Detection System (IDS) inspects the activities in a system for suspicious behavior or patterns that may indicate an ongoing system attack or misuse. Recently, as networks have become faster, the need has emerged for security analysis techniques that are able to keep up with the increased network throughput [1].
Due to the large volume of security audit data as well as the complex and dynamic properties of intrusion behaviors, optimizing the performance of IDS has become an important open problem that is receiving more attention from the research community [2].

Received May 25, 2013. Accepted June 30, 2013.
Contact Nguyen Duy Hai, e-mail address: haind@hnue.edu.vn

Besides expert systems, state transition analysis and statistical analysis, data mining has become a popular technique for detecting intrusions [3]. The main reason for using data mining techniques for IDS is that they are capable of handling the enormous volume of existing and newly appearing network data that requires processing. One of the most important data mining techniques for intrusion detection is classification. Classification models can be built using a wide variety of algorithms, which can be grouped into three types: extensions to linear discrimination (e.g., multilayer perceptron and logistic discrimination), decision tree and rule-based methods (e.g., C4.5 or J48, AQ and CART) and density estimators (Naïve Bayes, k-nearest neighbor and LVQ) [4]. A search of the literature shows that a 3-level classification model with the C4.5 algorithm provides a DoS detection rate of almost 100% [5]. Rung Chin Cheng et al. [6] proposed an intrusion detection method using SVM based on rough set theory (RST). They show that an accuracy of 86.79% could be achieved using 41 features, while using a rough set increased the accuracy to 89.13%.

No single data mining algorithm for intrusion detection has been identified as the best. Furthermore, it should be noted that once IDS are more widely used, new properties will have to be taken into consideration, such as large volumes of security audit data and complex and dynamic properties of intrusion behavior. One difficulty encountered in such a study concerns the lack of published objective comparisons between classifiers.
Ideally, classifiers should be tested within the same context, i.e., with the same dataset and using the same feature extraction method. Currently, this is a crucial problem for IDS research based on data mining.

In this paper, we evaluate three data mining algorithms for intrusion detection, Naïve Bayes, J48 and Support Vector Machine (SVM), based on a data mining structure for IDS. In addition, we propose a new approach which combines different classifiers in order to make the best use of each classifier. The purpose of our research is to enhance the accuracy and performance of IDS against particular types of attacks.

2. Content

2.1. The data mining model f ...
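
The text above does not say how the classifiers are combined. One plausible reading of the statement that "against any one particular type of attack, one of the classifiers functions best" is a lookup table that records, for each attack type, which classifier detected it most reliably on held-out data. The Python sketch below illustrates only that reading. The function names, the classifier names, the class labels and the toy predictions are all hypothetical; they are not taken from the paper and are not experimental results.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return, for each true class, the fraction of its records predicted correctly."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def best_classifier_per_class(y_true, predictions):
    """Map each class to the classifier name with the highest recall on it.

    `predictions` maps a classifier name to its list of predicted labels,
    aligned with `y_true`. Ties go to the classifier listed first.
    """
    recalls = {name: per_class_recall(y_true, y_pred)
               for name, y_pred in predictions.items()}
    return {
        label: max(predictions, key=lambda name: recalls[name][label])
        for label in sorted(set(y_true))
    }


# Hypothetical toy data: six held-out records and three classifiers' guesses.
y_true = ["dos", "dos", "probe", "probe", "r2l", "r2l"]
predictions = {
    "tree":  ["dos", "dos", "probe", "normal", "normal", "normal"],
    "bayes": ["dos", "normal", "probe", "probe", "normal", "normal"],
    "svm":   ["dos", "normal", "normal", "probe", "r2l", "r2l"],
}

routing = best_classifier_per_class(y_true, predictions)
print(routing)
```

In a real evaluation the predictions would come from the held-out folds of a cross validation run, so that every classifier is scored on the same records, which is the "same context" condition the introduction asks for.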