Simultaneous estimation of overall score and subscores using MIRT, HO-IRT and bi-factor model on TIMSS data


Creative Commons License

ERDEMİR A., Yavuz Atar H. Y.

Journal of Measurement and Evaluation in Education and Psychology, vol.11, no.1, pp.61-75, 2020 (ESCI)

  • Publication Type: Article / Full Article
  • Volume: 11 Issue: 1
  • Publication Date: 2020
  • DOI Number: 10.21031/epod.645478
  • Journal Name: Journal of Measurement and Evaluation in Education and Psychology
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus, TR DİZİN (ULAKBİM)
  • Pages: pp.61-75
  • Keywords: TIMSS, subscores, multidimensional item response theory, higher-order item response theory, bi-factor model, ITEM RESPONSE THEORY, COVARIANCE STRUCTURE, INFORMATION, DIMENSIONALITY, DETECT
  • Gazi University Affiliated: Yes

Abstract

© 2020 Association of Measurement and Evaluation in Education and Psychology (EPODDER). All rights reserved.

In educational testing, there is increasing interest in the simultaneous estimation of overall scores and subscores. This study compares the reliability and precision of simultaneously estimated overall scores and subscores under the MIRT, HO-IRT and bi-factor models. TIMSS 2015 mathematics scores were used as the data set. The TIMSS 2015 mathematics test consists of 35 items, four of which are polytomously scored (0-1-2); the remaining items are dichotomously scored (0-1). The four content domains are number (14 items), algebra (9 items), geometry (6 items), and data and chance (6 items). Ability parameters were estimated using the BMIRT software. The results showed that the MIRT and HO-IRT methods performed similarly in terms of precision and reliability for the subscore estimates. The MIRT maximum information method had the smallest standard error of measurement for the overall score estimates. All three methods performed similarly in terms of overall score reliability. The findings suggest that, among the three methods compared, HO-IRT appears to be the better choice for the simultaneous estimation of the overall score and subscores for the TIMSS 2015 data. Recommendations for testing practice and future research are provided.