INVESTIGATION OF VARIABLES RELATED TO ACHIEVEMENT IN HIGH-STAKES EXAMS USING DATA MINING METHODS

AYŞEGÜL BOZDAĞ KASAP

Thesis Type: Doctorate

Institution: Gazi University, Gazi Faculty of Education, Educational Sciences, Türkiye

Thesis Approval Date: 2024

Thesis Language: Turkish

Student: AYŞEGÜL BOZDAĞ KASAP

Advisor: Dilara Bakan Kalaycioğlu

Abstract:

This study aims to identify the variables related to achievement in high-stakes exams and the educational data mining methods suited to predicting exam performance. To this end, gender, university entrance exam (YGS) scores, secondary education success scores, academic grade point average, high school, university, faculty, undergraduate program, graduation year, and number of exam attempts are used to predict success in the civil servant recruitment exam (KPSS), the graduate education exam (ALES), and the foreign language exam (YDS-English). Following the CRISP-DM process model, the research analyzes longitudinal achievement data for 9,918 examinees across different exam periods. Variables significant for modelling were identified with the SPSS Modeler feature selection node. The importance levels of variables related to KPSS and ALES achievement were examined using C&RT and CHAID decision trees, multiple regression, random forest, k-nearest neighbor, support vector machines, and artificial neural networks. For the classification of YDS foreign language proficiency levels, C5.0, CHAID, C&RT, k-nearest neighbor, multinomial logistic regression, support vector machines, artificial neural networks, and random forest methods were used. The prediction performance of the models was compared using mean absolute error and correlation coefficients, while classification performance was evaluated using accuracy, sensitivity, specificity, precision, and F-score. Although the importance values of the variables differed between predicting KPSS and ALES achievement and classifying YDS language proficiency, the importance levels generally followed similar trends across the models. According to the results, the most important variables for predicting KPSS achievement in most models were the ALES quantitative, ALES verbal, and YGS mathematics tests.
The importance values of the undergraduate program and university variables in predicting KPSS achievement were relatively high, while academic grade point average and secondary education success scores had moderate effects. When the models were examined separately by ALES score type, the KPSS general ability test was consistently the most significant predictor of ALES achievement across all score types. The next most important variable was YGS mathematics for ALES quantitative success and YGS Turkish for ALES verbal success; for ALES equally weighted success, both tests were significant. The study found that KPSS and YGS achievement were highly valid predictors of ALES achievement, and that YGS and ALES achievement were highly valid predictors of KPSS achievement. The results emphasize the importance of mathematical proficiency and reasoning skills for KPSS and ALES quantitative success, and of Turkish language and reasoning skills for ALES verbal success. For YDS-English achievement, the undergraduate field, university, and undergraduate program variables generally had high importance values, while academic grade point average and secondary education success scores had moderate effects. Results from the verbal sections of the other high-stakes exams, namely the YGS Turkish, KPSS general ability, and ALES verbal tests, were significant predictors of YDS-English language proficiency levels. In conclusion, the finding that performance in other exams is the most important variable in predicting individuals' ALES, KPSS, and YDS achievement indicates that individuals' achievement has a holistic structure across exams and provides evidence for the consistency and reliability of high-stakes exams.
Additionally, the importance values of variables such as university type, high school type, faculty, graduation year, and gender were generally low in the models examining KPSS, ALES, and YDS-English achievement. This finding is interpreted as an indicator of the equality and inclusivity of high-stakes exams, and as pointing to their objectivity and impartiality. When the prediction performances of the models were compared, artificial neural networks provided the best performance and the most consistent predictions of KPSS and ALES achievement. In classifying YDS-English language proficiency levels, the random forest model demonstrated the best performance. The study aims to contribute to the field by guiding decisions on identifying variables associated with success in high-stakes exams and on selecting appropriate educational data mining methods.
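
The abstract names the evaluation criteria but does not show how they are computed. The following is a minimal sketch in plain Python of mean absolute error and the Pearson correlation coefficient for score prediction, and of accuracy, sensitivity, specificity, precision, and F-score for one proficiency level treated one-vs-rest. It is not the thesis's SPSS Modeler workflow: the function names, the level labels "A", "B", "C", and all sample values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def one_vs_rest_metrics(actual, predicted, positive):
    """Classification metrics for one level, treating all other levels as negative."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical exam scores and model predictions (illustrative values only).
actual_scores = [70.0, 80.0, 90.0, 60.0]
predicted_scores = [72.0, 78.0, 85.0, 65.0]
mae = mean_absolute_error(actual_scores, predicted_scores)
r = pearson_r(actual_scores, predicted_scores)

# Hypothetical proficiency levels; "A", "B", "C" are placeholder labels.
actual_levels = ["A", "B", "A", "C", "B", "A"]
predicted_levels = ["A", "B", "B", "C", "B", "A"]
metrics_a = one_vs_rest_metrics(actual_levels, predicted_levels, positive="A")
```

A lower mean absolute error and a higher correlation indicate better score prediction. For multi-level classification, the one-vs-rest metrics would be computed once per level and then compared or averaged across models.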