Forecasting the future popularity of the anti-vax narrative on Twitter with machine learning

Biri, Ismail; Kucuktas, Ulku; Uysal, Fatih; HARDALAÇ, FIRAT

doi:10.1007/s11227-023-05567-8



Biri I., Kucuktas U. T., Uysal F., HARDALAÇ F.

Journal of Supercomputing, vol.80, no.3, pp.2917-2947, 2024 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 80 Issue: 3
Publication Date: 2024
DOI: 10.1007/s11227-023-05567-8
Journal Name: Journal of Supercomputing
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
Page Numbers: pp.2917-2947
Keywords: Anti-vaccine sentiment, Forecasting, Sentiment analysis, Social media analysis
Affiliated with Gazi University: Yes

Abstract

Social media play a significant role in shaping and spreading societal views, including anti-vaccine sentiments that can undermine public health efforts. Understanding the extent of these views and predicting their future trends is challenging but essential. Social media posts, often semi-structured and laden with irony, are difficult to process with traditional methods. To address this, this study developed a system to monitor the popularity of anti-vaccine misinformation and predict its future direction. A key feature of this research is the creation of a custom dataset. Instead of using a generic sentiment analyzer, Turkish tweets containing the word "vaccine" were collected and categorized to create a specialized dataset. The collected data were analyzed using several advanced deep learning networks, including different BERT architectures, LSTM, and BART. These models were trained on the categorized dataset to classify the remaining tweets. This classification provided a metric indicating the prevalence of anti-vaccine sentiment on social media. The output from the top-performing model was subsequently used to train and test a range of time series forecasting models, which included the naive forecaster, AutoARIMA, AutoETS, Croston's method, polynomial trend forecaster, unobserved components model, and Facebook's Prophet. The goal was to pinpoint the most effective algorithm for predicting the future trends of anti-vaccine sentiment. This research stands out for its dual focus on tracking and predicting public sentiment, providing a potential early warning system for public health authorities. The best results in the classification task were achieved by BERT base, with F1 scores of 0.851, 0.731, 0.779, and 0.720 for the respective classes, indicating its superior ability to capture and classify sentiment in the data.
In the subsequent task of forecasting future trends, Prophet emerged as the top-performing model, with a mean absolute error of 6.01 in predicting anti-vaccine sentiment trends. The use of various deep learning networks for sentiment analysis, different forecasting models for trend prediction, and a custom-made dataset highlights this research's novelty in social media discourse analysis and prediction.
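
The two metrics quoted in the abstract, per-class F1 for the classifier and mean absolute error (MAE) for the forecasters, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the class labels, the toy series, and the helper names (`per_class_f1`, `mean_absolute_error`, `naive_forecast`) are assumptions chosen for illustration, and the naive forecaster here simply repeats the last observed value.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def mean_absolute_error(actual, forecast):
    """Average absolute difference between observed and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def naive_forecast(history, horizon):
    """Baseline forecaster: repeat the last observed value."""
    return [history[-1]] * horizon


# Hypothetical labels and counts, for illustration only.
y_true = ["anti", "anti", "pro", "neutral"]
y_pred = ["anti", "pro", "pro", "neutral"]
f1_scores = per_class_f1(y_true, y_pred)

history = [10, 12, 11]   # e.g. anti-vaccine tweet counts per period (made up)
actual = [13, 9]         # held-out observations (made up)
mae = mean_absolute_error(actual, naive_forecast(history, len(actual)))
```

Reporting F1 per class rather than as one aggregate shows which categories the classifier handles well, and comparing each forecaster's MAE against the naive baseline indicates how much a more complex model actually adds.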