Predicting the survival of heart failure patients in unbalanced data sets / Dengesiz veri setlerinde kalp yetmezliği hastalarının sağkalım tespiti

Turkmenoglu B. K., YILDIZ O.

29th IEEE Conference on Signal Processing and Communications Applications, SIU 2021, Virtual, Istanbul, Turkey, 9 - 11 June 2021

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/siu53274.2021.9477806
  • City: Virtual, Istanbul
  • Country: Turkey
  • Keywords: Classification, Heart failure, Machine learning, Sampling, Survival prediction


© 2021 IEEE. Heart failure is a serious cardiovascular condition that affects the lives of millions of people. Early diagnosis is extremely important for its treatment, and survival analysis of patients provides information that supports early diagnosis and treatment. In this study, a survival analysis of heart failure patients was performed. Using the correlation matrix and Random Forest methods, the features most relevant to death status were determined to be serum creatinine, ejection fraction, and age. Patient follow-up time was excluded because it is not known when the survival analysis is performed. Resampling methods were applied because of the uneven class distribution in the data set. The experiments showed that prediction success is higher when data cleaning is applied together with resampling, and that eliminating the class imbalance in the data set increased the success of the classifier. Oversampling gave the better result with the Random Forest algorithm (84.51%), whereas undersampling gave the higher result with the Extra Trees algorithm (84.58%).
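The abstract does not say which resampling methods were used, so the following is only a minimal sketch of the two general ideas it mentions, assuming plain random oversampling and random undersampling. The function names, the toy rows, and their values are hypothetical and are not taken from the paper; the three columns merely mirror the features the abstract reports as most relevant.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so that every class has as
    many rows as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: (serum_creatinine, ejection_fraction, age).
# Label 1 = death event (minority class), label 0 = survived.
rows = [
    (1.1, 38, 60), (0.9, 45, 52), (1.0, 60, 49), (1.3, 35, 70),
    (2.7, 20, 75), (1.9, 25, 82),
]
labels = [0, 0, 0, 0, 1, 1]

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

In practice, resampling of this kind is usually applied only to the training split, so that the evaluation data keeps the original class distribution; whether the authors did so is not stated in the abstract.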