Feature Subset Selection Method For An Effective Classification Based On Genetic Algorithm


Thesis Type: Postgraduate

Institution Of The Thesis: Gazi Üniversitesi, Fen Bilimleri Enstitüsü, Turkey

Approval Date: 2014

Student: SHİMA AFZALİ VAHED MOGHADDAM

Supervisor: OKTAY YILDIZ

Open Archive Collection: AVESIS Open Access Collection

Abstract:

Classification is one of the important methods commonly used in machine learning and data mining. Over the last decade, feature selection has frequently been used to improve classification performance, and it has been applied in a wide variety of real-world applications such as text mining, bioinformatics, and image analysis. Feature selection can be viewed either as discarding irrelevant attributes that adversely affect performance or as selecting the important attributes of a data set; in both cases the aim is to increase classifier performance. In this study, an effective hybrid feature selection method based on a genetic algorithm is proposed. The method consists of two stages. In the first stage, feature ranking methods are used to create a feature pool. In the second stage, a genetic algorithm is used to select an optimal feature subset with high classification performance. The genetic algorithm was used together with four different classification algorithms. The proposed method was applied to four data sets commonly used in the literature, all taken from the UCI Machine Learning Repository: Wisconsin Diagnostic Breast Cancer (WDBC), Single Photon Emission Computed Tomography (SPECT) Heart, Statlog Heart, and Wisconsin Prognostic Breast Cancer (WPBC). It achieved effective feature selection with classification accuracies of 100%, 91.25%, 96.29%, and 94.8276%, respectively.
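The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the ranking methods, the classifiers, or the genetic algorithm settings, so everything below is an assumption chosen for brevity. The data are synthetic, the ranking score is the absolute difference of class means, the classifier is a nearest-centroid rule, and all function names and parameters (`rank_features`, `genetic_search`, `pop_size`, `mutation_rate`, the 0.01 size penalty) are hypothetical.

```python
import random


def make_data(rng, n=120, n_features=8):
    """Synthetic two-class data (assumption): only features 0 and 1 carry signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.5 * label
        row[1] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def rank_features(X, y):
    """Stage 1: rank features by |mean(class 1) - mean(class 0)|, a simple filter score."""
    ones = [r for r, c in zip(X, y) if c == 1]
    zeros = [r for r, c in zip(X, y) if c == 0]
    scores = [abs(sum(r[j] for r in ones) / len(ones) -
                  sum(r[j] for r in zeros) / len(zeros))
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def accuracy(X, y, features):
    """Nearest-centroid accuracy: fit on even-indexed rows, evaluate on odd-indexed rows."""
    if not features:
        return 0.0
    train, test = range(0, len(X), 2), range(1, len(X), 2)
    centroids = {}
    for c in (0, 1):
        rows = [X[i] for i in train if y[i] == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for i in test:
        dist = {c: sum((X[i][j] - centroids[c][k]) ** 2
                       for k, j in enumerate(features)) for c in (0, 1)}
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(test)


def genetic_search(X, y, pool, rng, pop_size=20, generations=15, mutation_rate=0.1):
    """Stage 2: evolve bit masks over the feature pool (1 = feature kept)."""
    def fitness(mask):
        feats = [f for f, bit in zip(pool, mask) if bit]
        # Small per-feature penalty (assumption) so compact subsets are preferred.
        return accuracy(X, y, feats) - 0.01 * len(feats)

    population = [[rng.randint(0, 1) for _ in pool] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]          # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, len(pool) - 1)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]                # bit-flip mutation
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [f for f, bit in zip(pool, best) if bit], fitness(best)


rng = random.Random(0)
X, y = make_data(rng)
pool = rank_features(X, y)[:5]                        # feature pool from the ranking stage
selected, score = genetic_search(X, y, pool, rng)
print("feature pool:", pool)
print("selected subset:", selected, "penalised fitness:", round(score, 3))
```

In this sketch the ranking stage only narrows the search space, while the genetic algorithm scores whole subsets with a classifier in the loop. To mirror the study more closely, the toy scorer would be replaced by cross-validated accuracy of the chosen classifiers on the UCI data sets.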