Profit agent classification using feature selection eigenvector centrality

Document information:

In this paper we apply a graph-based feature selection method, which identifies the most important nodes, that is, those most strongly interrelated with their neighboring nodes.

Content extracted from the document:
International Journal of Mechanical Engineering and Technology (IJMET)
Volume 10, Issue 3, March 2019, pp. 603–613, Article ID: IJMET_10_03_062
Available online at http://www.iaeme.com/ijmet/issues.asp?JType=IJMET&VType=10&IType=3
ISSN Print: 0976-6340 and ISSN Online: 0976-6359
© IAEME Publication, Scopus Indexed

PROFIT AGENT CLASSIFICATION USING FEATURE SELECTION EIGENVECTOR CENTRALITY

Zidni Nurrobi Agam
Computer Science Department, Bina Nusantara University, Jakarta, Indonesia 11480

Sani M. Isa
Computer Science Department, Bina Nusantara University, Jakarta, Indonesia 11480

Abstract

Classification is a method that groups data into related categories according to their similarities. High-dimensional data can make the classification process suboptimal because it contains large amounts of otherwise meaningless data. In this paper, we try to classify profit agents from PT.XYZ and find the feature that has the greatest impact on profit agents. Feature selection is one of the methods that can optimize a dataset for the classification process. We apply a graph-based feature selection method, which identifies the most important nodes, that is, those most strongly interrelated with their neighboring nodes. Eigenvector centrality estimates the importance of a feature relative to its neighbors; it ranks the central nodes as candidate features for the classification method and finds the best features for classifying the agent data. A Support Vector Machine (SVM) is used to test whether feature selection with Eigenvector Centrality further improves the accuracy of the classification.

Keywords: Classification, Support Vector Machines, Feature Selection, Eigenvalue Centrality, Graph-based.

Cite this Article: Zidni Nurrobi Agam and Sani M. Isa, Profit Agent Classification Using Feature Selection Eigenvector Centrality, International Journal of Mechanical Engineering and Technology (IJMET) 10(3), 2019, pp. 603–613.
http://www.iaeme.com/ijmet/issues.asp?JType=IJMET&VType=10&IType=3

http://www.iaeme.com/IJMET/index.asp    editor@iaeme.com

1. INTRODUCTION

In this era, data is a very important commodity used in almost all existing technologies, which leads researchers to examine more data in order to find hidden patterns that can be used as information. However, as the amount of data grows, much of it is irrelevant or redundant, which lowers the quality of the data.

Feature selection is a method that selects a subset of variables from the input which can efficiently describe the input data while reducing effects from noise or irrelevant variables and still provide good prediction results [1]. Usually, feature selection performs both ranking and subset selection [2][3] to obtain the most relevant or most important features of a dataset. With n denoting the total number of features, the goal of feature selection is to select an optimal subset of I features, where I < n. Processing data with feature selection improves overall prediction because the dataset is reduced to its most informative features.

We apply graph-based feature selection to optimize classification; this method ranks features by Eigenvector Centrality (ECFS). In graph theory, eigenvector centrality measures how much impact a node has on the other nodes in the network. All nodes in the network are assigned relative scores based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes [4]. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The relationships between features (nodes) are therefore measured by the weights of the connections between nodes.
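
The ranking idea described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a small symmetric matrix of non-negative edge weights between four hypothetical features (f1 to f4), with hand-picked values for illustration only; how the paper actually builds its weights is not shown here. The function names eigenvector_centrality and select_top_features are likewise assumptions. Scores are computed by power iteration, then the I highest-scoring features are kept (I < n).

```python
from math import sqrt


def eigenvector_centrality(adjacency, iterations=100, tol=1e-9):
    """Approximate eigenvector centrality by power iteration.

    `adjacency` is a square list of lists of non-negative edge weights
    (assumed symmetric and describing a connected graph).
    """
    n = len(adjacency)
    scores = [1.0 / sqrt(n)] * n  # uniform, unit-length starting vector
    for _ in range(iterations):
        # A node's new score is the weighted sum of its neighbours' scores,
        # so links to high-scoring nodes contribute more than equal links
        # to low-scoring nodes.
        new = [sum(adjacency[i][j] * scores[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in new))
        if norm == 0:
            raise ValueError("graph has no edges")
        new = [v / norm for v in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


def select_top_features(adjacency, names, k):
    """Rank features by centrality and keep the k best (k < number of features)."""
    scores = eigenvector_centrality(adjacency)
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical feature graph: made-up weights, for illustration only.
names = ["f1", "f2", "f3", "f4"]
weights = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.2],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]

print(select_top_features(weights, names, 2))
```

In this toy graph, f4 is only weakly connected to the other features, so it receives the lowest score, while f1 and f2 are retained. In practice, a tested library routine for eigenvector centrality would be a safer choice than hand-written power iteration.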

The feature subset selection problem refers to the task of identifying and selecting a useful subset of attributes to be used to represent patterns from a larger set of often mutually redundant, possibly irrelevant, attributes with different associated measurement costs and/or risks [5]. We try to find the most influential features for predicting the profit agent with ECFS.

Many studies have examined Eigenvector Centrality. Nicholas J. Bryan and Ge Wang [6] studied how music with many features can form a network of patterns between songs, which helps describe patterns of musical influence in sample-based music in a way suitable for musicological analysis [7], and used Eigenvector Centrality to rank influence between music genres. In 2016, Giorgio Roffo and Simone Melzi studied feature ranking via Eigenvector Centrality. They rank features by identifying the most important attributes in an arbitrary set of cues, mapping the problem onto a graph in which the features are the nodes, and assessing the importance of the nodes through an indicator of centrality. To build the graph and the weighted distances between nodes, Roffo and Melzi use the Fisher criterion.

The goal of this paper is to apply Chi-Square and ECFS feature selection and compare the two methods on different datasets. Both feature selection methods are tested on the HCC and profit agent datasets. The tests are validated with K-fold cross-validation to assess the model's ability, then evaluated with a confusion matrix to measure misclassification. Based o ...