Detecting and correcting automatic speech recognition errors with a new model


Arslan R. S., Barışçı N., Arıcı N., Kocer S.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, cilt.29, sa.5, ss.2298-2311, 2021 (SCI-Expanded) identifier identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 29 Sayı: 5
  • Basım Tarihi: 2021
  • Doi Numarası: 10.3906/elk-2010-117
  • Dergi Adı: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), TR DİZİN (ULAKBİM)
  • Sayfa Sayıları: ss.2298-2311
  • Anahtar Kelimeler: Automatic speech recognition, automatic speech recognition error correction, artificial intelligence, alternative hypothesis suggestion, natural language processing
  • Gazi Üniversitesi Adresli: Evet

Özet

The purpose of automatic speech recognition (ASR) systems is to recognize speech signals obtained from people and convert them into text so that they can be processed by a computer. Although many ASR applications are versatile and widely used in the real world, they still generate relatively inaccurate results. They tend to generate spelling errors in recognized words, especially in noisy environments, in situations where the vocabulary size is increased, and at times when the input speech is of poor quality. The permanent presence of errors in ASR systems has led to the need to find alternative methods for automatic detection and correction of such errors. In this study, the basic principles of ASR evaluation are first summarized, and then a new approach based on the suggestion of an alternative hypothesis is proposed for the detection and correction of these errors generated by ASR systems. The proposed method involves a series of processes such as identifying incorrect words, selecting the ones that can be corrected, and identifying candidate words to replace these words. As a result of the tests carried out by creating different test environments, significant performance improvements for Turkish were achieved and an average of 4.60 % performance improvement was provided.