A Novel Replica of High Dimensional Data Using Feature Subset Selection Algorithm

M Ravichandran, K Indhumathi, M Priyanka

Abstract


Dimensionality reduction is an important step when working with high-dimensional data. For that purpose, this paper proposes a clustering-based feature subset selection algorithm. The features are clustered according to the class labels, and the relevance of the clustered features is evaluated. The correlation among the relevant clustered features is then evaluated, and a Minimum Spanning Tree (MST) is generated from these correlation values. The MST is used to identify the representatives of each class. Efficiency is measured by the time required to find the feature subset, and effectiveness by the quality of that subset. The proposed algorithm is compared with existing feature selection algorithms such as FCBF, ReliefF, and CFS with respect to four classification algorithms: Naive Bayes, the tree-based C4.5, the instance-based IB1, and the rule-based RIPPER. In this comparison the proposed algorithm performs better in terms of efficiency and accuracy. The results are computed on various types of data sets.
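The pipeline outlined in the abstract (relevance filtering, correlation evaluation, MST construction, representative selection) can be illustrated with a minimal Python sketch. The abstract does not name the correlation measure, the relevance threshold, or the rule for splitting the tree, so the choices below are assumptions made only for illustration: symmetric uncertainty as the measure, a hypothetical `threshold` parameter, and a rule that cuts an MST edge when the two features are less correlated with each other than each is with the class. The function name `select_features` is likewise hypothetical and is not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())


def symmetric_uncertainty(xs, ys):
    """Assumed correlation measure: 2 * I(X;Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))
    return 2 * (hx + hy - hxy) / (hx + hy)


def select_features(columns, labels, threshold=0.1):
    """columns maps a feature name to its list of discrete values."""
    # Step 1: keep features whose relevance to the class exceeds the threshold.
    relevance = {f: symmetric_uncertainty(v, labels) for f, v in columns.items()}
    relevant = [f for f in columns if relevance[f] > threshold]

    # Step 2: pairwise feature correlation; weight 1 - SU puts
    # strongly correlated features close together.
    edges = sorted(
        (1 - symmetric_uncertainty(columns[a], columns[b]), a, b)
        for a, b in combinations(relevant, 2)
    )

    parent = {f: f for f in relevant}

    def find(f):
        # Union-find root lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    # Step 3: Kruskal's algorithm builds the minimum spanning tree.
    mst = []
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            mst.append((w, a, b))

    # Step 4 (assumed rule): keep an MST edge only if the feature-feature
    # correlation is at least the smaller of the two class relevances.
    parent = {f: f for f in relevant}
    for w, a, b in mst:
        if 1 - w >= min(relevance[a], relevance[b]):
            parent[find(a)] = find(b)

    # Step 5: the most class-relevant feature represents each group.
    groups = {}
    for f in relevant:
        groups.setdefault(find(f), []).append(f)
    return sorted(max(members, key=relevance.get) for members in groups.values())


if __name__ == "__main__":
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    data = {"a": list(y), "b": list(y), "noise": [0, 1, 0, 1, 0, 1, 0, 1]}
    print(select_features(data, y))
```

In this toy data set, `noise` is independent of the class and is dropped in step 1, while `a` and `b` are duplicates and fall into the same group, so only one of them is returned. The sketch expects discrete feature values; continuous features would need to be discretized first.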





ISSN: 2307-8162