BANTAO JOURNAL, vol. 20, pp. 39-40, 2022 (ESCI)
Introduction: Chronic kidney disease (CKD) affects more than 10% of the adult population and causes substantial morbidity and
mortality. Genetic kidney disease (GKD) is increasingly recognised as an important cause of CKD and can be difficult to diagnose
and treat accurately. Establishing a genetic diagnosis offers many advantages, including providing prognostic information,
informing targeted surveillance and therapies, preventing inappropriate treatments, informing reproductive decisions, and reducing the
use of invasive diagnostic investigations such as renal biopsy. However, genetic testing is laborious and expensive.
The aim of our study was to develop a novel prediction model for genotype positivity in patients with kidney disease by applying
machine learning algorithms to clinical, laboratory and renal imaging variables that are readily available in daily practice, so that
the model can help decide whether genetic testing is necessary for a given patient.
What is machine learning?
Machine learning (ML) is a type of artificial intelligence (AI) that allows software applications to become more accurate at predicting
outcomes without being explicitly programmed to do so. ML models are being increasingly utilized to improve the accuracy of clinical
risk prediction tools in a variety of clinical settings.
Methods: We constructed ML models using readily available clinical and laboratory data from 140 patients with kidney disease at Gazi
University who had undergone genetic testing. We used a supervised classification approach with the logistic regression algorithm,
implemented with Python's scikit-learn library, whose four-step modeling pattern makes it easy to code a machine learning classifier.
We used 75% of the data for training and the remaining 25% for testing the performance of the ML model. Because the way the data are
split into training and testing sets affects the test results, we repeated the training and testing process 100 times and calculated
the average prediction performance.
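The repeated 75%/25% hold-out procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic stand-ins generated with make_classification, the binary outcome, the feature standardization, the stratified split and the use of accuracy as the performance metric are all assumptions, and repeated_holdout_accuracy is a hypothetical helper name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study data): 112 samples, 67 features,
# binary label. The sizes only mirror the cohort size and parameter count.
X, y = make_classification(
    n_samples=112, n_features=67, n_informative=23, random_state=0
)


def repeated_holdout_accuracy(X, y, n_repeats=100, test_size=0.25):
    """Train and test on n_repeats random splits; return one accuracy per split."""
    scores = []
    for seed in range(n_repeats):
        # A different seed gives a different random 75%/25% partition.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed
        )
        # Instantiate the model (standardization is an added assumption).
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        # Fit on the training portion only.
        model.fit(X_train, y_train)
        # Predict on the held-out portion and score the predictions.
        y_pred = model.predict(X_test)
        scores.append(accuracy_score(y_test, y_pred))
    return np.array(scores)


scores = repeated_holdout_accuracy(X, y)
print(f"mean accuracy over {len(scores)} splits: {scores.mean():.3f} "
      f"(sd {scores.std():.3f})")
```

Averaging over many random splits reduces the dependence of the reported performance on any single partition, which matters for a cohort of this size. Reporting the spread alongside the mean shows how much the estimate varies between splits.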
Results: A total of 140 patients with kidney disease who underwent genetic testing were screened. Sixteen patients were excluded
because of missing data and twelve because genetic testing was incomplete, leaving 112 patients in the final cohort. Of these,
45 patients had a positive result, 40 had a negative result and 27 had a clinically relevant variant of uncertain significance.
We evaluated 67 parameters and identified 23 predictors of genotype positivity in the logistic regression model, ranked in order of
their importance to the model. Our predictor selection algorithm identified some previously known predictors of genotype positivity,
namely family history, parental consanguinity and the presence of polycystic kidneys on ultrasound. The algorithm also identified
several novel important predictors, such as the presence of isolated hematuria, the presence of autoimmune disease, hypokalemia,
hyperlipidemia and the development of end-stage kidney disease.
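The abstract does not specify how predictors were selected or how importance was measured. One common way to rank predictors in a logistic regression model is by the absolute value of the coefficients fitted on standardized features. The sketch below illustrates that approach only. It uses synthetic data and placeholder feature names (feature_0, feature_1, ...), all of which are assumptions and none of which come from the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study data) with placeholder feature names.
X, y = make_classification(
    n_samples=112, n_features=67, n_informative=23, random_state=0
)
names = [f"feature_{i}" for i in range(X.shape[1])]

# Standardize so that coefficient magnitudes are comparable across features.
X_std = StandardScaler().fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X_std, y)

# For a binary outcome, coef_ has shape (1, n_features).
coefs = model.coef_[0]
order = np.argsort(-np.abs(coefs))  # largest absolute coefficient first
ranking = [(names[i], float(coefs[i])) for i in order]

for name, coef in ranking[:5]:
    print(f"{name}: {coef:+.3f}")
```

The sign of each coefficient indicates the direction of the association, and the magnitude is only meaningful for comparison when features are on a common scale.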
Discussion: Our ML models demonstrated a good ability to predict genotype positivity in patients with kidney disease. To our knowledge,
ML models have not previously been applied to the prediction of genetic kidney disease, and this is the first study to apply ML
algorithms to clinical, laboratory and renal imaging data to predict genotype positivity in patients with kidney disease.