Comparative Analysis of Traditional Machine Learning and Transformer-based Deep Learning Models for Text Classification


Aydın N., Erdem O. A., Tekerek A.

JOURNAL OF POLYTECHNIC-POLITEKNIK DERGISI, 2024 (ESCI)

Abstract

In today's information age, generating and using vast amounts of textual data has become increasingly important. Within artificial intelligence, and natural language processing in particular, text classification plays a crucial role in organizing and making sense of this data. Text classification assigns pieces of text to predefined classes, a process that machine learning and deep learning methods have advanced significantly. This study evaluates the effectiveness of conventional machine learning algorithms, namely Decision Tree (DT), Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM), alongside state-of-the-art Transformer-based models (BERT, DistilBERT, GPT-2, GPT-3, and RoBERTa) on text classification tasks. The findings indicate that, among the traditional methods, Naive Bayes reaches 65% accuracy, while GPT-3 surpasses them with an accuracy and F1 score of 77%. These results highlight the significant promise and efficiency of Transformer-based models for text classification.
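The two metrics reported in the abstract, accuracy and F1 score, can be illustrated with a minimal sketch. The abstract does not state how F1 is averaged across classes, so macro averaging is an assumption here, and the labels and predictions below are hypothetical toy data, not results from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the abstract does not say
    which averaging scheme the authors used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[t] += 1   # class t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels for a three-class problem (not data from the paper)
y_true = ["sports", "politics", "sports", "tech", "tech", "politics"]
y_pred = ["sports", "sports", "sports", "tech", "politics", "politics"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

The same two functions could be applied to the predictions of any of the compared models, which is what makes a single accuracy or F1 figure comparable across traditional and Transformer-based classifiers.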