Deep Learning Based, a New Model for Video Captioning


Ozer E. G., Karapinar I. N., Busbug S., Turan S., Utku A., AKCAYOL M. A.

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, vol.11, no.3, pp.514-519, 2020 (ESCI)

  • Publication Type: Article / Full Article
  • Volume: 11 Issue: 3
  • Publication Date: 2020
  • Journal Name: INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS
  • Indexed In: Emerging Sources Citation Index (ESCI), Scopus, Index Islamicus, INSPEC
  • Page Numbers: pp.514-519
  • Keywords: Video captioning, CNN, LSTM
  • Gazi University Affiliated: Yes

Abstract

Visually impaired individuals face many difficulties in their daily lives. In this study, a video captioning system has been developed for visually impaired individuals that analyzes events in real-time images and expresses them in meaningful sentences. To better understand the problems that visually impaired individuals experience in daily life, the opinions and suggestions of members of the Altınokta Blind Association (a Turkish organization of blind people) have been collected, so that more realistic solutions to their problems can be produced. In this study, the MSVD dataset, which consists of 1,970 YouTube clips, has been used as the training dataset. First, all clips were muted so that the audio of the clips was not used in the sentence-generation process. CNN and LSTM architectures have been used to generate sentences, and experimental results have been compared using the BLEU-4, METEOR, ROUGE-L, and CIDEr metrics.
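To illustrate the evaluation side, the sketch below shows a simplified sentence-level BLEU-4 score in plain Python: the geometric mean of modified 1- to 4-gram precisions multiplied by a brevity penalty. This is not the authors' evaluation code; published results normally use corpus-level BLEU with multiple references (e.g., the COCO caption evaluation toolkit), and the add-one smoothing here is only to keep a single missing n-gram order from zeroing the score.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(candidate, reference):
    """Simplified sentence-level BLEU-4 against a single reference.

    Geometric mean of modified 1- to 4-gram precisions, with add-one
    smoothing (an illustrative choice, not the standard smoothing),
    times the brevity penalty for short candidates.
    """
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, 5):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        log_prec_sum += math.log((overlap + 1) / (total + 1))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec_sum / 4)
```

With this smoothed definition, an exact match scores 1.0 and a partial overlap such as `bleu4("a man plays guitar", "a man is playing a guitar")` falls strictly between 0 and 1.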