Effect of Plosives on Isolated Speaker Recognition System Performance

Senturk Z., SALOR DURNA Ö.

9th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, Türkiye, 26 - 28 November 2015, pp.1263-1265, (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume Number:
City of Publication: Bursa
Country of Publication: Türkiye
Page Numbers: pp.1263-1265
Gazi University Affiliated: Yes

Abstract

This paper investigates how choosing keywords with or without plosive sounds affects the performance of an isolated speaker recognition system. For this study, a Turkish word database was created, consisting of 48 words that include plosives and 7 words without plosives. The recordings were made at a sampling frequency of 16 kHz in a sound-insulated professional recording studio. For each participant, the recordings were collected over three or four sessions held at different times of the day, so that the database reflects the variability of the human vocal tract. A speaker recognition system was developed that uses Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and Dynamic Time Warping (DTW) for time alignment. After the system training stage, the average speaker recognition rates for test-set keywords including and excluding plosives were found to be 98.24% and 91.76%, respectively.
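
The DTW matching step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the local distance, the step pattern, or the decision rule, so the Euclidean frame distance, the symmetric three-way step, and the nearest-template rule below are all assumptions. MFCC extraction is omitted: each utterance is assumed to be already available as an array of feature frames, and the function names `dtw_distance` and `identify_speaker` are hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Accumulated cost of the best warping path between two sequences.

    a, b: arrays of shape (frames, features), e.g. MFCC frames.
    Assumes Euclidean local cost and steps (i-1, j), (i, j-1), (i-1, j-1).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Local cost: Euclidean distance between every pair of frames.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] holds the cheapest cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j],      # advance in a only
                acc[i, j - 1],      # advance in b only
                acc[i - 1, j - 1],  # advance in both
            )
    return acc[n, m]


def identify_speaker(test_features, templates):
    """Return the speaker whose stored template is closest under DTW.

    templates: dict mapping a speaker label to a list of feature arrays
    recorded for the same keyword (an assumed storage layout).
    """
    scored = (
        (dtw_distance(test_features, template), speaker)
        for speaker, sequences in templates.items()
        for template in sequences
    )
    return min(scored, key=lambda pair: pair[0])[1]


# Toy one-dimensional "features" standing in for MFCC frames.
reference = np.array([[0.0], [1.0], [2.0]])
slower = np.array([[0.0], [0.0], [1.0], [2.0], [2.0]])  # same shape, stretched
shifted = reference + 5.0                                 # a different "voice"

templates = {"speaker_a": [reference], "speaker_b": [shifted]}
print(identify_speaker(slower, templates))
```

In this toy case the stretched sequence aligns with the reference at zero cost, which is the property that lets DTW compare utterances spoken at different rates.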