CROATIAN MEDICAL JOURNAL, vol. 65, no. 2, pp. 93-100, 2024 (SCI-Expanded)
Aim To evaluate the quality of case reports generated by ChatGPT and to assess ChatGPT's ability to peer review medical articles.

Methods This study was conducted from February to April 2023. First, ChatGPT 3.0 was used to generate 15 case reports, which were then peer-reviewed by expert human reviewers. Second, ChatGPT 4.0 was used to peer review 15 published short articles.

Results ChatGPT was capable of generating case reports, but these reports contained inaccuracies, particularly in referencing. The case reports received mixed ratings from peer reviewers, with 33.3% of professionals recommending rejection. The reports' overall merit score was 4.9 ± 1.8 out of 10. ChatGPT's reviewing capabilities were weaker than its text-generation abilities: as a peer reviewer, it failed to recognize major inconsistencies in articles whose content had been substantially altered.

Conclusion Although ChatGPT demonstrated proficiency in generating case reports, its output showed limitations in consistency and accuracy, especially in referencing.