X (Twitter) Sentiment Analysis Based on Hybrid Approach: An Application for Online Food Ordering


Güneş Y., Arıkan M.

INTERNATIONAL JOURNAL OF INFORMATICS TECHNOLOGIES, vol.18, no.2, pp.143-167, 2025 (Peer-Reviewed Journal)

  • Publication Type: Article / Full Article
  • Volume: 18 Issue: 2
  • Publication Date: 2025
  • DOI: 10.17671/gazibtd.1616709
  • Journal Name: INTERNATIONAL JOURNAL OF INFORMATICS TECHNOLOGIES
  • Journal Indexes: Applied Science & Technology Source, Computer & Applied Sciences, TR DİZİN (ULAKBİM)
  • Pages: pp.143-167
  • Gazi University Affiliated: Yes

Abstract

Sentiment analysis of user opinions on online platforms such as X (formerly known as Twitter) generally relies on dictionary-based approaches and machine learning methods. Recent studies emphasize that hybridizing these approaches improves model performance. In this study, we propose a hybrid classification model for sentiment analysis of texts on food ordering. In addition, we suggest a feature selection method based on aggregating words to address the high-dimensionality problem of text classification. The main problems in this domain are the low number of words with distinctive features, the complexity of interpreting texts in the food ordering field, and the domain dependency of text classification. Combining classification algorithms with a domain lexicon-based approach helps to overcome these difficulties. For this purpose, two domain-specific lexicons, referred to as basic lexicons, are developed using data from online users' opinions: one for sentiment analysis and the other for product-service systems classification. The basic lexicons are then transformed into new lexicons with fewer words, referred to as boosted lexicons, by grouping the words in the basic lexicons and representing each group with a single word. In the hybrid approach, 144 sentiment analysis models are created by combining six classification algorithms, three term weighting methods, and the lexicons. The study used two datasets of 21,039 and 14,389 tweets obtained from X between January 1 and December 31, 2020. The models were trained and tested on the first dataset, and the best models were selected. The second dataset was then analyzed with the selected models, and on that basis we present proposals for the industry.
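The word-aggregation idea behind the boosted lexicons can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word groups, the representative words, and the function names `build_boosted_map` and `boost_tokens` are hypothetical and chosen only for illustration, and the abstract does not say how the groups were actually formed. The sketch only shows how replacing each grouped word with a single representative shrinks the feature vocabulary.

```python
# Hypothetical word groups: each key is a representative word that stands
# for several basic-lexicon words. These entries are illustrative only and
# are not taken from the study's lexicons.
WORD_GROUPS = {
    "late": ["late", "delayed", "slow"],
    "tasty": ["tasty", "delicious", "yummy"],
    "cold": ["cold", "lukewarm"],
}


def build_boosted_map(groups):
    """Invert the groups into a word -> representative lookup table."""
    return {word: rep for rep, words in groups.items() for word in words}


def boost_tokens(tokens, boosted_map):
    """Replace grouped words with their representative; keep other tokens as-is."""
    return [boosted_map.get(token, token) for token in tokens]


BOOSTED_MAP = build_boosted_map(WORD_GROUPS)

# A made-up example text, already lowercased and whitespace-tokenized.
tweet = "delivery was delayed and the pizza arrived lukewarm but delicious".split()
boosted = boost_tokens(tweet, BOOSTED_MAP)

basic_vocab_size = len(BOOSTED_MAP)    # distinct words in the basic lexicon
boosted_vocab_size = len(WORD_GROUPS)  # distinct representatives after grouping
```

In this toy example, eight basic-lexicon words collapse into three representatives, so any term weighting computed afterwards operates on fewer lexicon features.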