Stance and Sentiment Analysis of Health-related Tweets with Data Augmentation


Küçük D., ARICI N.

Journal of Scientific and Industrial Research, vol. 83, no. 4, pp. 381-391, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 83, Issue: 4
  • Publication Date: 2024
  • DOI: 10.56042/jsir.v83i4.1012
  • Journal Name: Journal of Scientific and Industrial Research
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aquatic Science & Fisheries Abstracts (ASFA), CAB Abstracts, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp. 381-391
  • Keywords: Deep learning, Health informatics, Machine learning, Natural language processing, Public health monitoring
  • Affiliated with Gazi University: Yes

Abstract

Common social media platforms such as Twitter are important sources of up-to-date information for several monitoring purposes, including instant public health monitoring. Large volumes of health-related social media posts (such as tweets on the COVID-19 pandemic) have been produced recently and are ready to be analyzed to facilitate health-related decision making. In this paper, joint stance detection and sentiment analysis is performed on tweets about COVID-19 vaccination in order to showcase the contribution of different machine learning and deep learning techniques equipped with data augmentation. Training and test tweet datasets are compiled and annotated for both stance and sentiment analysis, and the training dataset is then extended using an automatic data augmentation technique. Experiments with different classifiers are performed for automated stance and sentiment analysis, using this extended dataset during training. The data augmentation technique, adopted to cope with the data scarcity problem in machine learning research, leads to better performance in this domain of health-related social media analysis. Comparative evaluations are also performed using a publicly available sentiment analysis tool. The extended dataset and the test dataset, along with the approaches and evaluation results, are significant for health informatics because they facilitate joint estimation of instant community stance and sentiment towards COVID-19 vaccination, which has been an important public health concern. Public health decision-makers can therefore readily benefit from the findings and resources of this study.
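
The abstract does not name the augmentation technique used, so the following is only an illustrative sketch of how a labeled tweet training set might be extended automatically. It assumes a simple random word-swap augmentation, which is not necessarily the paper's method. The function names `augment_by_swap` and `extend_dataset`, the label values, and the example tweets are all hypothetical.

```python
import random


def augment_by_swap(text, n_swaps=1, rng=None):
    """Return a copy of `text` with `n_swaps` random pairs of words exchanged.

    Assumption: word swapping is used here only as a stand-in for an
    automatic augmentation technique; the paper's technique may differ.
    """
    rng = rng or random.Random()
    words = text.split()
    if len(words) < 2:
        return text  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


def extend_dataset(rows, copies=1, seed=0):
    """Return the original rows plus `copies` augmented variants of each.

    Each row is a (text, stance, sentiment) tuple; augmented variants
    keep the stance and sentiment labels of the tweet they derive from.
    """
    rng = random.Random(seed)
    extended = list(rows)
    for text, stance, sentiment in rows:
        for _ in range(copies):
            extended.append((augment_by_swap(text, rng=rng), stance, sentiment))
    return extended


# Hypothetical annotated training tweets (not taken from the paper's dataset).
train = [
    ("vaccines are safe and effective", "favor", "positive"),
    ("i will not take this vaccine", "against", "negative"),
]
extended = extend_dataset(train, copies=2)
```

In a setup like this, only the training set would be extended; the test set stays untouched so that evaluation reflects performance on real, unmodified tweets.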