FUSEPOP: A Multi-Modal Fusion with Mutual Information Weighting and Stacked Ensemble for Social Media Popularity Prediction


ŞENCAN Ö. A., ATACAK İ., DOĞRU İ. A., TOKLU S., Bar N., Kılıç K.

Applied Sciences (Switzerland), vol. 16, no. 9, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 16, Issue: 9
  • Publication Date: 2026
  • DOI: 10.3390/app16094160
  • Journal Name: Applied Sciences (Switzerland)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Keywords: mutual information fusion, popularity detection, social media, stacked ensemble
  • Affiliated with Gazi University: Yes

Abstract

Short-form video has become a prominent form of digital media, driven by the growth of social media platforms and the shrinking attention spans of consumers. However, a major obstacle to popularity detection in short-form content is the heterogeneous nature of the data, which combines textual, visual, and metadata components. To tackle this challenge, we propose FUSEPOP, a multi-modal architecture. The framework uses ResNet-50 for visual feature extraction and XLM-RoBERTa for encoding multilingual textual information. FUSEPOP employs a mutual information-based modality weighting mechanism with logarithmic smoothing and a 0.7 weight ceiling to balance the contributions of each input stream. Furthermore, FUSEPOP implements a stacked generalization strategy trained via stratified 5-fold cross-validation, in which a logistic regression meta-learner combines the predictions of random forest, XGBoost, and a neural network-based classifier. Experimental results show that this architecture outperforms benchmark models, achieving an accuracy of 0.980 and an average F1-score of 0.964 on the feature configuration selected for this study, and remains competitive on a literature-aligned alternative configuration. These findings confirm that the proposed model successfully detects popularity in short-form social media content.
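
The abstract names the ingredients of the modality weighting (mutual information scores, logarithmic smoothing, a 0.7 weight ceiling) but not the exact formula. The sketch below is one plausible reading, not the authors' implementation: it assumes the smoothing is `log(1 + MI)`, that weights are normalized to sum to 1, and that any weight above the ceiling is clipped with the excess redistributed proportionally to the remaining modalities. The function names `modality_weights` and `fuse`, the example MI scores, and the scale-then-concatenate fusion step are all hypothetical.

```python
import math


def modality_weights(mi_scores, ceiling=0.7):
    """Map per-modality mutual information scores to fusion weights.

    Assumed scheme (not taken from the paper): log1p smoothing,
    normalization to sum 1, then clipping at `ceiling` with the excess
    shared among the uncapped modalities in proportion to their weights.
    """
    if not mi_scores:
        raise ValueError("at least one modality is required")
    if ceiling * len(mi_scores) < 1.0:
        raise ValueError("ceiling is too low for the weights to sum to 1")

    # Logarithmic smoothing compresses large MI values so that a single
    # dominant modality does not swamp the others.
    smoothed = {m: math.log1p(max(s, 0.0)) for m, s in mi_scores.items()}
    total = sum(smoothed.values())
    if total == 0.0:
        # No modality carries information: fall back to uniform weights.
        return {m: 1.0 / len(mi_scores) for m in mi_scores}
    weights = {m: s / total for m, s in smoothed.items()}

    # Enforce the ceiling, repeating in case redistribution pushes
    # another modality over the limit.
    capped = set()
    while True:
        over = [m for m in weights
                if m not in capped and weights[m] > ceiling + 1e-12]
        if not over:
            return weights
        capped.update(over)
        free = [m for m in weights if m not in capped]
        remaining = 1.0 - ceiling * len(capped)
        free_total = sum(weights[m] for m in free)
        for m in capped:
            weights[m] = ceiling
        for m in free:
            share = weights[m] / free_total if free_total > 0 else 1.0 / len(free)
            weights[m] = remaining * share


def fuse(features, weights):
    """Scale each modality's feature vector by its weight and concatenate.

    Modalities are visited in sorted name order so the layout is stable.
    """
    fused = []
    for m in sorted(features):
        fused.extend(weights[m] * x for x in features[m])
    return fused


# Hypothetical MI scores: text dominates, so its weight hits the ceiling.
weights = modality_weights({"text": 5.0, "visual": 0.2, "metadata": 0.1})
```

With these illustrative scores, the text modality would receive more than 0.7 after normalization, so it is capped at 0.7 and the remaining 0.3 is split between the visual and metadata streams. In a full pipeline the MI scores would be estimated from data (for example, between each modality's features and the popularity label), which this sketch leaves out.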