Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents

Alnawas, Anwar; Arıcı, NURSAL

doi:10.1145/3278605

Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents

Alnawas A., Arıcı N.

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, cilt.18, sa.3, 2019 (SCI-Expanded, Scopus)

Yayın Türü: Makale / Tam Makale
Cilt numarası: 18 Sayı: 3
Basım Tarihi: 2019
Doi Numarası: 10.1145/3278605
Dergi Adı: ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING
Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Anahtar Kelimeler: Doc2Vec, Iraqi Arabic Dialect, word embedding, sentiments analysis, facebook, MEDIA
Gazi Üniversitesi Adresli: Evet

Özet

Nowadays, social media is used by many people to express their opinions about a variety of topics. Opinion Mining or Sentiment Analysis techniques extract opinions from user generated contents. Over the years, a multitude of Sentiment Analysis studies has been done about the English language with deficiencies of research in all other languages. Unfortunately, Arabic is one of the languages that seems to lack substantial research, despite the rapid growth of its use on social media outlets. Furthermore, specific Arabic dialects should be studied, not just Modern Standard Arabic. In this paper, we experiment sentiments analysis of Iraqi Arabic dialect using word embedding. First, we made a large corpus from previous works to learn word representations. Second, we generated word embedding model by training corpus using Doc2Vec representations based on Paragraph and Distributed Memory Model of Paragraph Vectors (DM-PV) architecture. Lastly, the represented feature used for training four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine and Naive Bayes) to detect sentiment. We also experimented different values of parameters (window size, dimension and negative samples). In the light of the experiments, it can be concluded that our approach achieves a better performance for Logistic Regression and Support Vector Machine than the other classifiers.