Automating shareable cyber threat intelligence production for closed source software vulnerabilities: a deep learning based detection system


Arıkan S. M., KOÇAK A., ALKAN M.

International Journal of Information Security, 2024 (SCI-Expanded) identifier

  • Yayın Türü: Makale / Tam Makale
  • Basım Tarihi: 2024
  • Doi Numarası: 10.1007/s10207-024-00882-4
  • Dergi Adı: International Journal of Information Security
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, FRANCIS, ABI/INFORM, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Compendex, Computer & Applied Sciences, Criminal Justice Abstracts, INSPEC
  • Anahtar Kelimeler: 68M25, 68T07, 68T10, 68T30, 94A13, Binary analysis, Closed source software, Cyber threat intelligence, Deep learning
  • Gazi Üniversitesi Adresli: Evet

Özet

Software can be vulnerable to various types of interference. The production of cyber threat intelligence for closed source software requires significant effort, experience, and many manual steps. The objective of this study is to automate the process of producing cyber threat intelligence, focusing on closed source software vulnerabilities. To achieve our goal, we have developed a system called cti-for-css. Deep learning algorithms were used for detection. To simplify data representation and reduce pre-processing workload, the study proposes the function-as-sentence approach. The MLP, OneDNN, LSTM, and Bi-LSTM algorithms were trained using this approach with the SOSP and NDSS18 binary datasets, and their results were compared. The aforementioned datasets contain buffer error vulnerabilities (CWE-119) and resource management error vulnerabilities (CWE-399). Our results are as successful as the studies in the literature. The system achieved the best performance using Bi-LSTM, with F1 score of 82.4%. Additionally, AUC score of 93.0% was acquired, which is the best in the literature. The study concluded by producing cyber threat intelligence using closed source software. Shareable intelligence was produced in an average of 0.1 s, excluding the detection process. Each record, which was represented using our approach, was classified in under 0.32 s on average.