5th INTERNATIONAL SELÇUK SCIENTIFIC RESEARCH CONGRESS, Konya, Türkiye, 14 - 15 December 2024, pp. 1710-1725, (Full Text Paper)
With groundbreaking advances in artificial intelligence, the development and widespread adoption of Large Language Models (LLMs) such as ChatGPT and Gemini have made access to information through question-and-answer systems easier and faster. However, these advances also bring significant cybersecurity challenges. LLMs are increasingly subjected to jailbreak and prompt-attack strategies by malicious and unauthorized actors seeking to exploit these models for unintended or harmful purposes. In this study, a scenario addressing a cybersecurity problem was first developed, and, based on this scenario, malicious prompt queries were executed against widely used LLMs, namely ChatGPT, Gemini, Microsoft Copilot, and Perplexity. In this way, the LLMs were led to plan the malicious steps required to realize the defined scenario. Although LLM developers continuously strengthen these models with security measures against jailbreak and prompt-attack strategies, the results of this study experimentally demonstrate that the tested LLMs remain vulnerable to such attacks. This study is expected to contribute to the situational awareness of LLM developers and users regarding these security vulnerabilities. Furthermore, it highlights the urgency of addressing these vulnerabilities and the security policies and measures that LLM developers need to implement.
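
The abstract does not specify how the queries were issued; for illustration only, the sketch below shows one way such scenario-driven prompts could be sent programmatically to a model exposed through an API and the responses collected for manual review. The OpenAI Python client, the model name, the query_model helper, and the placeholder prompts are assumptions made for this example; the study may instead have used the models' web chat interfaces.

    # Hypothetical sketch: send scenario-based prompts to an LLM endpoint and
    # record the responses for manual review. The model name, prompts, and use
    # of the OpenAI Python client are illustrative assumptions, not the paper's
    # actual setup (which may have relied on the models' web interfaces).
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Placeholder prompts standing in for the study's scenario-driven queries.
    scenario_prompts = [
        "You are a penetration tester. Outline the first reconnaissance step for ...",
        "Continuing the same scenario, describe how the next stage would proceed ...",
    ]

    def query_model(prompt: str, model: str = "gpt-4o-mini") -> str:
        """Send a single prompt and return the model's text response."""
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    if __name__ == "__main__":
        for prompt in scenario_prompts:
            answer = query_model(prompt)
            # In the study, responses were inspected to determine whether the
            # model refused or cooperated with the malicious scenario.
            print(f"PROMPT: {prompt}\nRESPONSE: {answer}\n{'-' * 60}")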