A Multi-Criteria Comparison of Large Language Model Powered Assistants in Pre-Research Studies for the Academia

Akin, MURAT; Batur Sir, GÜL; Karadag, AYYÜCE; Cercioglu, HAKAN

doi:10.1109/access.2025.3586502

Akin M., Batur Sir G. D., Karadag A., Cercioglu H.

IEEE ACCESS, vol. 13, pp. 127086-127099, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume Number: 13
Publication Date: 2025
DOI Number: 10.1109/access.2025.3586502
Journal Name: IEEE ACCESS
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 127086-127099
Affiliated with Gazi University: Yes

Abstract

Large Language Models (LLMs), such as Generative Pre-trained Transformers (GPT), have emerged as powerful tools in academic research and education. They offer capabilities ranging from language understanding to content generation and serve as the foundation for Large Language Model Powered Assistants (LLM-PAs), such as ChatGPT, DeepSeek, and Gemini, which facilitate interactive learning, research support, and intelligent tutoring. This study aims to guide researchers in choosing and ranking LLM-PA alternatives for the preliminary research stage of academic studies. Selecting an appropriate alternative, however, requires considering a large number of distinct criteria. We therefore conducted a multi-criteria comparison of LLM-PAs employed in academic research. The assistants are evaluated on criteria including performance metrics, user experience, ethical issues, and technical constraints. By examining the strengths and limitations of each tool across these dimensions, we aim to provide insight into their performance and suitability for academic applications. In the solution procedure, we first define the criteria and sub-criteria affecting the preferences and rank them using the G1 method. We then evaluate nine commonly used LLM-PAs using the Simple Additive Weighting method. According to the results, Gemini 2.0, Claude 3.7 Sonnet, and ChatGPT-4o are the most preferred tools.
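
The two-stage procedure named in the abstract (G1 for the criteria, then Simple Additive Weighting for the alternatives) can be illustrated with a minimal Python sketch. It assumes the commonly cited formulation of G1 (order relation analysis), in which criteria are sorted from most to least important and an expert supplies the importance ratio between each adjacent pair, and a linear max/min normalization for Simple Additive Weighting. The record does not state the exact formulation used in the paper. The function names, the ratios, the criterion order, the scores, and the tool labels below are all hypothetical and are not taken from the study.

```python
from math import prod


def g1_weights(ratios):
    """Weights from G1-style adjacent importance ratios (assumed formulation).

    ratios[j] = w[j] / w[j + 1], with criteria already sorted from most
    to least important. Returns weights that sum to 1.
    """
    n = len(ratios) + 1
    # Weight of the least important criterion.
    w_last = 1.0 / (1.0 + sum(prod(ratios[j:]) for j in range(len(ratios))))
    weights = [0.0] * n
    weights[-1] = w_last
    # Walk back up the ranking, scaling by each adjacent ratio.
    for j in range(n - 2, -1, -1):
        weights[j] = ratios[j] * weights[j + 1]
    return weights


def saw_scores(matrix, weights, benefit):
    """Simple Additive Weighting with linear normalization.

    matrix[i][j] is the raw score of alternative i on criterion j.
    benefit[j] is True when higher is better, False for cost criteria.
    """
    columns = list(zip(*matrix))
    scores = []
    for row in matrix:
        total = 0.0
        for j, value in enumerate(row):
            if benefit[j]:
                normalized = value / max(columns[j])
            else:
                normalized = min(columns[j]) / value
            total += weights[j] * normalized
        scores.append(total)
    return scores


# Hypothetical illustration only: four criteria in an assumed order
# (performance, user experience, ethical issues, technical constraints).
example_ratios = [1.2, 1.0, 1.4]           # made-up expert judgments
example_weights = g1_weights(example_ratios)

example_tools = ["Tool A", "Tool B", "Tool C"]   # placeholder names
example_matrix = [                               # made-up 1-10 ratings
    [8, 7, 6, 3],
    [7, 9, 8, 5],
    [9, 6, 7, 4],
]
example_benefit = [True, True, True, False]      # last criterion treated as a cost

example_scores = saw_scores(example_matrix, example_weights, example_benefit)
example_ranking = sorted(zip(example_tools, example_scores),
                         key=lambda pair: pair[1], reverse=True)
```

In this sketch `example_ranking` lists the placeholder tools from highest to lowest aggregate score. Other normalization schemes or a hierarchy of sub-criteria weights would change the numbers, so the sketch should be read as an illustration of the technique, not as a reproduction of the paper's results.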