Generative AI vs. human expertise: a comparative analysis of case-based rational pharmacotherapy question generation


Creative Commons License

Güvel M. C., Kıyak Y. S., Doğan Varan H., Sezenöz B., Coşkun Ö., Uluoğlu C.

EUROPEAN JOURNAL OF CLINICAL PHARMACOLOGY, vol.0, 2025 (SCI-Expanded, Scopus)

Abstract

Purpose This study evaluated the performance of three generative AI models (ChatGPT-4o, Gemini 1.5 Advanced Pro, and Claude 3.5 Sonnet) in producing case-based rational pharmacology questions, compared with questions written by expert educators.

Methods Using one-shot prompting, 60 questions (20 per model) on essential hypertension and type 2 diabetes were generated. A multidisciplinary panel categorized the questions by usability (no revisions needed, minor or major revisions required, or unusable). Subsequently, 24 AI-generated and 8 expert-created questions were administered to 103 medical students in a real-world exam setting. Performance metrics, including correct response rate, discrimination index, and identification of nonfunctional distractors, were analyzed.

Results No statistically significant differences were found between AI-generated and expert-created questions, with mean correct response rates surpassing 50% and discrimination indices consistently equal to or above 0.20. Claude produced the highest proportion of error-free items (12/20), whereas ChatGPT exhibited the fewest unusable items (5/20). Expert revisions required approximately one minute per AI-generated question, representing a substantial efficiency gain over manual question preparation. Nonetheless, 19 out of 60 AI-generated questions were deemed unusable, highlighting the necessity of expert oversight.

Conclusion Large language models can profoundly accelerate the development of high-quality assessment questions in medical education. However, expert review remains critical to address lapses in reliability and validity. A hybrid model, integrating AI-driven efficiencies with rigorous expert validation, may offer an optimal approach for enhancing educational outcomes.
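The item metrics named in the Methods (correct response rate, discrimination index, nonfunctional distractors) can be illustrated with a minimal Python sketch. The abstract does not state how the authors computed them, so everything here is an assumption: the function name `item_statistics` is hypothetical, the discrimination index is taken as the difference in correct response rate between the top and bottom 27% of students by total score, and a distractor is called nonfunctional when fewer than 5% of students choose it. Both cutoffs are common conventions in item analysis, not values confirmed by this study.

```python
def item_statistics(responses, key, options="ABCDE", tail=0.27, nfd_cutoff=0.05):
    """Classical item analysis for multiple-choice questions (illustrative only).

    responses: one list per student, holding the option chosen for each item.
    key: the correct option for each item.
    tail, nfd_cutoff: assumed conventions, not taken from the study.
    """
    n_students = len(responses)
    # Total score per student, used to rank students into upper/lower groups.
    totals = [sum(r == k for r, k in zip(row, key)) for row in responses]
    order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    group = max(1, round(tail * n_students))
    upper, lower = order[:group], order[-group:]

    stats = []
    for j, correct in enumerate(key):
        # Correct response rate: share of all students answering correctly.
        rate = sum(row[j] == correct for row in responses) / n_students
        p_upper = sum(responses[i][j] == correct for i in upper) / group
        p_lower = sum(responses[i][j] == correct for i in lower) / group
        # Distractors chosen by fewer than nfd_cutoff of students.
        nonfunctional = [
            o for o in options
            if o != correct
            and sum(row[j] == o for row in responses) / n_students < nfd_cutoff
        ]
        stats.append({
            "correct_rate": rate,
            "discrimination": p_upper - p_lower,
            "nonfunctional": nonfunctional,
        })
    return stats


# Toy data (invented for illustration): 4 students, 2 items, options A-C.
toy_responses = [["A", "B"], ["A", "B"], ["A", "C"], ["B", "C"]]
toy_key = ["A", "B"]
toy_stats = item_statistics(toy_responses, toy_key, options="ABC", tail=0.5)
```

Under these assumptions, an item would meet the thresholds reported in the Results when `correct_rate` exceeds 0.50 and `discrimination` is at least 0.20.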