An Effective Sample Preparation Method for Diabetes Prediction


Afzali S., YILDIZ O.

INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, vol.15, no.6, pp.968-973, 2018 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15 Issue: 6
  • Publication Date: 2018
  • Journal Name: INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.968-973
  • Keywords: Diabetes, clustering, classification, K-means, SVM, sample preparation, SYSTEM
  • Gazi University Affiliated: Yes

Abstract

Diabetes is a chronic disorder caused by a malfunction in carbohydrate metabolism, and it has become a serious health problem worldwide. Early and correct detection of diabetes can significantly influence the treatment of diabetic patients and thus eliminate the associated side effects. Machine learning is an emerging field of high importance for providing prognosis and a deeper understanding of the classification of diseases such as diabetes. This study proposed a high-precision diagnostic system based on a modified k-means clustering technique. First, noisy, uncertain, and inconsistent data were detected by the new clustering method and removed from the data set. Then, a diabetes prediction model was generated using a Support Vector Machine (SVM). Applying the proposed diagnostic system to the Pima Indians Diabetes (PID) data set resulted in 99.64% classification accuracy with 10-fold cross-validation. The results show that the new system outperforms both SVM alone and the classical k-means algorithm combined with SVM in classification performance and time consumption, and that the proposed approach outperforms previous methods.
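
The sample preparation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how k-means was modified or which criterion marks a sample as noisy, so this sketch makes two assumptions: it uses the classical k-means algorithm as a stand-in, and it drops any sample whose class label disagrees with the majority label of its cluster. The function names `kmeans` and `filter_samples` and the toy data are hypothetical and are not taken from the paper or from the PID data set. The SVM training and 10-fold cross-validation that would follow on the retained samples are not shown.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Classical Lloyd's k-means (a stand-in, not the paper's modified
    variant). Returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    assign = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assign == assign:  # no change, so the clustering converged
            break
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return assign


def filter_samples(points, labels, k=2, seed=0):
    """Assumed removal rule: keep the indices of samples whose label
    matches the majority label of their cluster."""
    assign = kmeans(points, k, seed=seed)
    majority = {
        c: Counter(l for l, a in zip(labels, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    return [i for i, (l, a) in enumerate(zip(labels, assign))
            if l == majority[a]]


# Toy data: two well-separated groups, each with one deliberately
# mislabeled sample (index 4 and index 9).
points = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.9), (0.8, 0.4), (0.3, 0.3),
          (10.0, 10.0), (10.4, 9.7), (9.6, 10.2), (10.1, 10.8), (9.9, 9.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

kept = filter_samples(points, labels, k=2)
clean_points = [points[i] for i in kept]
clean_labels = [labels[i] for i in kept]
```

In this sketch `clean_points` and `clean_labels` stand in for the filtered training set that a classifier would then be fitted on. The majority-label rule is only one plausible reading of "noisy, uncertain and inconsistent data".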