FUSEPOP: A Multi-Modal Fusion with Mutual Information Weighting and Stacked Ensemble for Social Media Popularity Prediction
Applied Sciences (Switzerland), vol. 16, no. 9, 2026 (SCI-Expanded, Scopus)
- Publication Type: Article / Full Article
- Volume: 16 Issue: 9
- Publication Date: 2026
- DOI: 10.3390/app16094160
- Journal Name: Applied Sciences (Switzerland)
- Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
- Keywords: mutual information fusion, popularity detection, social media, stacked ensemble
- Gazi Üniversitesi Affiliated: Yes
Abstract
Short-form video content has gained importance as a popular form of digital media due to the rising popularity of social media platforms and the decreasing attention spans of consumers. However, a major obstacle to popularity detection in short-form content is the heterogeneous nature of the data, encompassing textual, visual, and metadata components. To tackle this challenge, we propose FUSEPOP, a robust multi-modal architecture. The proposed framework utilizes ResNet-50 for visual feature extraction and XLM-RoBERTa for encoding multilingual textual information. FUSEPOP employs a mutual information-based modality weighting mechanism with logarithmic smoothing and a 0.7 weight ceiling to balance contributions from each input stream. Furthermore, FUSEPOP implements a robust stacked generalization strategy trained via stratified 5-fold cross-validation. This approach utilizes a logistic regression meta-learner to dynamically synthesize predictions from random forest, XGBoost, and a neural network-based classifier. Experimental results show that this architecture significantly outperforms benchmark models, achieving an accuracy of 0.980 and an average F1-score of 0.964 on the feature configuration selected for this study, and remains competitive on a literature-aligned alternative configuration. These findings confirm that the proposed model successfully detects popularity on short-form social media content.
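The abstract names the ingredients of the modality weighting (mutual information scores, logarithmic smoothing, a 0.7 weight ceiling) but not the exact formula. The Python sketch below shows one plausible reading, and every detail in it is an assumption rather than the authors' implementation: the function name `mi_modality_weights`, the use of `log(1 + MI)` as the smoothing, the redistribution of weight above the ceiling to the remaining modalities in proportion to their smoothed scores, and the example scores themselves.

```python
import math


def mi_modality_weights(mi_scores, ceiling=0.7):
    """Map per-modality mutual information scores to fusion weights.

    Hypothetical sketch: smoothing, normalisation, and capping rules are
    assumptions; the abstract only states that logarithmic smoothing and
    a 0.7 ceiling are used.
    """
    if not mi_scores:
        raise ValueError("at least one modality is required")
    if any(score < 0 for score in mi_scores.values()):
        raise ValueError("mutual information scores must be non-negative")
    if ceiling * len(mi_scores) < 1.0:
        raise ValueError("ceiling too low for weights to sum to 1")

    # Assumed smoothing: log(1 + MI) dampens very large scores.
    smoothed = {name: math.log1p(score) for name, score in mi_scores.items()}
    total = sum(smoothed.values())
    if total == 0:
        # No modality is informative: fall back to uniform weights.
        return {name: 1.0 / len(mi_scores) for name in mi_scores}

    weights = {name: value / total for name, value in smoothed.items()}

    # Clip any weight above the ceiling and hand the excess to the
    # uncapped modalities, repeating until no weight exceeds the ceiling.
    capped = set()
    while True:
        over = [n for n in weights
                if n not in capped and weights[n] > ceiling + 1e-12]
        if not over:
            break
        capped.update(over)
        free = [n for n in weights if n not in capped]
        remaining = 1.0 - ceiling * len(capped)
        free_total = sum(smoothed[n] for n in free)
        for n in capped:
            weights[n] = ceiling
        for n in free:
            if free_total > 0:
                weights[n] = remaining * smoothed[n] / free_total
            else:
                weights[n] = remaining / len(free)
    return weights


# Illustrative scores only; these are not values reported in the paper.
example = mi_modality_weights({"text": 0.9, "visual": 0.05, "metadata": 0.02})
```

With these illustrative scores the text stream would otherwise dominate, so it is held at the ceiling and the remaining 0.3 is split between the visual and metadata streams. The resulting weights could then scale each modality's feature vector before concatenation, which is again an assumption about how the weights are consumed downstream.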