Medical Teacher, pp. 1-4, 2024 (SCI-Expanded)
Background: Manually creating multiple-choice questions (MCQs) is inefficient. Automatic item generation (AIG) offers a scalable solution, with two main approaches: template-based and non-template-based (AI-driven). Template-based AIG ensures accuracy but requires significant expert input to develop templates. In contrast, AI-driven AIG can generate questions quickly but is prone to inaccuracies. The Hybrid AIG approach combines the strengths of both methods. However, no MCQs have yet been generated using the Hybrid AIG approach, nor has any validity evidence been provided for it.
Methods: We generated MCQs using the Hybrid AIG approach and investigated the validity evidence of these questions by determining whether experts could identify the correct answers. We used a custom ChatGPT to develop an item template, which was then fed into Gazitor, a template-based (non-AI) AIG software. A panel of medical doctors identified the answers.
Results: Of 105 decisions, 101 (96.2%) matched the answer keyed by the software. For all MCQs (100%), the experts reached a consensus on the correct answer. This evidence corresponds to 'Relations to Other Variables' in Messick's validity framework.
Conclusions: The Hybrid AIG approach can enhance the efficiency of MCQ generation while maintaining accuracy. It mitigates concerns about AI hallucinations while still benefiting from AI's speed.