FEATURE SELECTION WITH WEIGHTED CONDITIONAL MUTUAL INFORMATION


Celik C., BİLGE H. Ş.

JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY, vol.30, no.4, pp.585-596, 2015 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 30 Issue: 4
  • Publication Date: 2015
  • Journal Name: JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.585-596
  • Keywords: Feature selection, conditional mutual information, maximum dependency, minimum redundancy, INPUT FEATURE-SELECTION, CLASSIFICATION, RELEVANCE
  • Gazi University Affiliated: Yes

Abstract

Processing very large data sets and extracting meaningful information from them is an important topic in data mining. In practice, it is often not known in advance whether the available data are relevant to the problem, and irrelevant data increase the complexity of the resulting model. Dimensionality reduction is therefore applied to the problem parameters to build simpler, lower-cost models. Mutual information approaches based on information theory are commonly used for dimensionality reduction. These approaches aim to select a feature subset whose inputs have minimum redundancy among themselves and maximum dependency with the output. However, the heuristic functions used in the approaches proposed so far to satisfy this condition control the trade-off between dependency and redundancy with a fixed, problem-independent parameter. In this study, a new mutual information approach is proposed. Its heuristic function weights the influence of redundancy on selection by evaluating the relationship between the mutual information of the features with the class and the mutual information of the features among themselves. Similarly, both conditional mutual information and mutual information are calculated for maximum dependency. The proposed heuristic function thus adapts dynamically to a variety of problems. The test results indicate that the proposed approach is successful.
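
The quantities named in the abstract can be illustrated with a minimal Python sketch. It estimates mutual information and conditional mutual information for discrete variables from empirical counts, then runs a greedy selection whose redundancy term is scaled by a fixed weight, i.e. the kind of problem-independent parameter the abstract criticizes. The abstract does not give the paper's weighting function, so it is not reproduced here; the function names, the `beta` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy, in bits, of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def greedy_select(features, target, k, beta=1.0):
    """Greedy forward selection with a fixed redundancy weight `beta`.

    Score of a candidate f given the already selected set S:
        I(f; target) - beta * sum(I(f; s) for s in S)
    `beta` is a fixed, problem-independent parameter (an assumption of this
    sketch); it is NOT the adaptive weighting proposed in the paper.
    """
    selected = []
    remaining = list(features)  # a list keeps tie-breaking deterministic
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance - beta * redundancy

        best = max(remaining, key=score)  # the first maximum wins on ties
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the class is determined by `a` and `b` together.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
target = [2 * ai + bi for ai, bi in zip(a, b)]
features = {
    "a": a,
    "a_copy": list(a),                   # fully redundant with `a`
    "b": b,
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],   # independent of the class
}

chosen = greedy_select(features, target, k=2)
print(chosen)
```

On this toy data, `a_copy` carries as much class information as `a` but adds nothing once `a` is selected, which is what the redundancy term penalizes. The conditional mutual information tells the same story from the dependency side: `conditional_mutual_information(features["a_copy"], target, a)` is zero, while `conditional_mutual_information(b, target, a)` is not. How strongly redundancy should count on a real data set is exactly what a fixed `beta` cannot adapt to.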