Investigation of the effect of parameter estimation and classification accuracy in mixture IRT models under different conditions



SAATÇİOĞLU F. M., ATAR H. Y.

INTERNATIONAL JOURNAL OF ASSESSMENT TOOLS IN EDUCATION, vol.9, no.4, pp.1013-1029, 2022 (ESCI)

  • Publication Type: Article / Full Article
  • Volume: 9, Issue: 4
  • Publication Date: 2022
  • DOI: 10.21449/ijate.1164590
  • Journal Name: INTERNATIONAL JOURNAL OF ASSESSMENT TOOLS IN EDUCATION
  • Indexed In: Emerging Sources Citation Index (ESCI), ERIC (Education Resources Information Center), TR DİZİN (ULAKBİM)
  • Pages: pp.1013-1029
  • Keywords: Mixture Item Response Theory Models, Maximum Likelihood Estimation, Item Parameter Recovery, Classification Accuracy, Missing Data, Latent Class
  • Affiliated with Gazi University: Yes

Abstract

This study examines item parameter estimation and classification accuracy in mixture item response theory (IRT) models under different conditions. The manipulated variables of the simulation study were the mixture IRT model (Rasch, 2PL, 3PL); sample size (600, 1000); number of items (10, 30); number of latent classes (2, 3); missing data type (complete, missing at random (MAR), and missing not at random (MNAR)); and percentage of missing data (10%, 20%). Data were generated for each of the three mixture IRT models using code written in R. The MplusAutomation package, which automates running Mplus from R, was used to analyze the data. Mean RMSE values were computed for the item difficulty, item discrimination, and guessing parameter estimates. The mean RMSE values for the Mixture Rasch model were lower than those for the Mixture 2PL and Mixture 3PL models. Percentages of classification accuracy were also computed. The Mixture Rasch model with 30 items, 2 classes, a sample size of 1000, and complete data had the highest classification accuracy percentage. Additionally, a factorial ANOVA was used to evaluate the main effects and interaction effects of each factor.
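
The data-generation and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' code (the study used R and Mplus); it is a Python illustration under stated assumptions: abilities are drawn from N(0, 1) in every class, the function names `simulate_mixture_rasch`, `rmse`, and `classification_accuracy` are hypothetical, and the estimation step itself is left out. It only shows how responses could be drawn from a mixture Rasch model and how the two outcome measures, RMSE and percentage of correct classification, are defined.

```python
import math
import random


def simulate_mixture_rasch(n_persons, difficulties_by_class, class_probs, seed=0):
    """Draw dichotomous responses from a mixture Rasch model.

    difficulties_by_class[g][i] is the difficulty of item i in latent class g.
    Returns (responses, classes); classes[p] is the true class of person p.
    """
    rng = random.Random(seed)
    n_classes = len(difficulties_by_class)
    responses, classes = [], []
    for _ in range(n_persons):
        # Assign the person to a latent class according to the mixing proportions.
        g = rng.choices(range(n_classes), weights=class_probs)[0]
        theta = rng.gauss(0.0, 1.0)  # assumption: N(0, 1) ability in every class
        row = []
        for b in difficulties_by_class[g]:
            # Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))
            p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append(1 if rng.random() < p_correct else 0)
        responses.append(row)
        classes.append(g)
    return responses, classes


def rmse(true_values, estimates):
    """Root mean squared error between true and estimated parameters."""
    squared = [(e - t) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))


def classification_accuracy(true_classes, assigned_classes):
    """Percentage of persons assigned to their true latent class.

    Assumes the estimated class labels have already been matched to the
    true labels (mixture models are subject to label switching).
    """
    hits = sum(t == a for t, a in zip(true_classes, assigned_classes))
    return 100.0 * hits / len(true_classes)


# Example condition: 600 persons, 10 items, 2 equally sized classes.
b_class0 = [-1.0 + 0.2 * i for i in range(10)]  # illustrative difficulties
b_class1 = list(reversed(b_class0))             # reversed item ordering in class 1
responses, classes = simulate_mixture_rasch(
    600, [b_class0, b_class1], [0.5, 0.5], seed=1
)
```

In a full replication, the estimates passed to `rmse` and the class assignments passed to `classification_accuracy` would come from fitting the mixture model to each generated data set, with the results averaged over replications within each condition.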