Enhancement of Video Anomaly Detection Performance Using Transfer Learning and Fine-Tuning


Creative Commons License

Dilek E., DENER M.

IEEE ACCESS, vol.12, pp.73304-73322, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 12
  • Publication Date: 2024
  • DOI: 10.1109/access.2024.3404553
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.73304-73322
  • Gazi University Affiliated: Yes

Abstract

Surveillance cameras are a common response to the need for security and urban traffic management created by growing city populations. As the number of cameras rises, the recorded video streams accumulate into big data. Because human operators cannot observe, analyze, and interpret this volume of video, analyzing the streams collected from traffic surveillance cameras and automatically detecting unusual or suspicious events, as well as a range of harmful activities, has become crucial. Recent studies have shown that deep learning (DL)-based artificial intelligence (AI) techniques, particularly machine learning (ML) systems, are used in video anomaly detection (VAD) studies. In this study, an efficient frame-level VAD method based on transfer learning (TL) and fine-tuning (FT) is proposed. Anomalies were detected using 20 popular convolutional neural network (CNN)-based DL models: variants of the VGG, Xception, MobileNet, Inception, EfficientNet, ResNet, DenseNet, NASNet, and ConvNeXt base models, trained via the TL and FT approaches. The proposed approach was tested on the CUHK Avenue, UCSD Ped1, and UCSD Ped2 datasets, and model performance was measured with the area under the curve (AUC), accuracy, precision, recall, and F1-score metrics. The highest AUC scores were 100%, 100%, and 98.41% for the UCSD Ped1, UCSD Ped2, and CUHK Avenue datasets, respectively. Experimental results show that, compared with existing techniques in the literature, the proposed method offers state-of-the-art VAD performance.
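The frame-level metrics named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' evaluation code: the function names `roc_auc` and `frame_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. Each frame is assumed to carry a ground-truth label (1 = anomalous, 0 = normal) and a model score where higher means more anomalous.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen anomalous frame
    scores higher than a randomly chosen normal frame (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one frame of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def frame_metrics(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    # Guard against empty denominators (e.g. no frame predicted anomalous).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy data: six frames, three of them anomalous (made-up values).
    toy_labels = [0, 0, 1, 1, 0, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print(frame_metrics(toy_labels, toy_scores))
```

AUC is threshold-free, while accuracy, precision, recall, and F1 depend on the chosen cut-off, which is why the two are computed separately here. The pairwise AUC loop is quadratic in the number of frames, so for full datasets a rank-based implementation from an established library would be the more practical choice.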