Speech Record Speed Up with Machine Learning Technics

Delul Celen P., HARDALAÇ F.

25th Signal Processing and Communications Applications Conference (SIU), Antalya, Türkiye, 15 - 18 May 2017 (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume Number:
DOI: 10.1109/siu.2017.7960316
City of Publication: Antalya
Country of Publication: Türkiye
Affiliated with Gazi University: Yes

Abstract

Previous studies have shown that listening can process information much faster than speaking delivers it. The aim of this study is to speed up the comprehension of recorded speech while keeping it recognizable. To this end, the speech data is divided into windows, and root mean square (RMS), zero-crossing rate (ZCR), and fast Fourier transform (FFT) features are extracted for each window. An SVM-based classifier is trained on the training data and classifies each sample as either speech or non-speech. The recognizable output samples were produced by collecting the speech parts. The performance of the study was measured with a speech-to-text program, which converted the recognizable portion of the sample speech into text.
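
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the window length, the exact FFT-derived feature, the SVM kernel, or the data, so the choices below are assumptions. In particular, the sketch assumes NumPy and scikit-learn, non-overlapping 50 ms windows, a normalized spectral centroid as the FFT feature, a linear kernel, and a synthetic signal in which a 200 Hz tone stands in for speech and low-level noise stands in for non-speech. All function names are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def window_features(signal, win_len):
    """Split a 1-D signal into non-overlapping windows; return (windows, features)."""
    n_win = len(signal) // win_len
    windows = signal[: n_win * win_len].reshape(n_win, win_len)
    # RMS: root mean square energy of each window
    rms = np.sqrt(np.mean(windows ** 2, axis=1))
    # ZCR: fraction of adjacent sample pairs whose sign differs
    signs = np.signbit(windows)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # FFT feature (assumption): magnitude-weighted spectral centroid, scaled to [0, 1]
    spectrum = np.abs(np.fft.rfft(windows, axis=1))
    bins = np.arange(spectrum.shape[1])
    centroid = (spectrum * bins).sum(axis=1) / spectrum.sum(axis=1) / bins[-1]
    return windows, np.column_stack([rms, zcr, centroid])


def make_signal(rng, labels, win_len, sr):
    """Synthetic stand-in data: a 200 Hz tone plus noise for 1, noise only for 0."""
    t = np.arange(win_len) / sr
    parts = []
    for is_speech in labels:
        noise = 0.01 * rng.standard_normal(win_len)
        tone = 0.5 * np.sin(2 * np.pi * 200 * t) if is_speech else 0.0
        parts.append(tone + noise)
    return np.concatenate(parts)


def speed_up(signal, clf, win_len):
    """Keep only the windows the classifier labels as speech and join them."""
    windows, feats = window_features(signal, win_len)
    kept = clf.predict(feats) == 1
    return windows[kept].reshape(-1), kept


rng = np.random.default_rng(0)
WIN, SR = 400, 8000  # 50 ms windows at 8 kHz (assumed values)

# Train the SVM on labeled windows (1 = speech, 0 = non-speech)
train_labels = np.array([1, 0] * 20)
_, X_train = window_features(make_signal(rng, train_labels, WIN, SR), WIN)
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X_train, train_labels)

# Shorten a new recording by dropping the windows classified as non-speech
test_labels = np.array([1, 1, 0, 1, 0, 0, 1, 0])
recording = make_signal(rng, test_labels, WIN, SR)
shortened, kept = speed_up(recording, clf, WIN)
print(len(recording), "->", len(shortened), "samples")
```

On real recordings the features would overlap far more than in this synthetic case, so overlapping windows, a richer spectral feature vector, and smoothing of the per-window decisions would likely be needed before the shortened audio is passed to a speech-to-text program for evaluation.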