Speech Record Speed Up with Machine Learning Technics

Delul Celen P., HARDALAÇ F.

25th Signal Processing and Communications Applications Conference (SIU), Antalya, Türkiye, 15-18 May 2017

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI Number: 10.1109/siu.2017.7960316
  • City of Publication: Antalya
  • Country of Publication: Türkiye


Previous studies have shown that listening is a much faster information-processing channel than speaking. The aim of this study is to increase the speed at which recorded speech can be comprehended while keeping it recognizable. To this end, the speech data is divided into windows, and the RMS (root mean square), ZCR (zero-crossing rate), and FFT (fast Fourier transform) features are extracted for each window. An SVM-based classifier is trained on the training data and classifies each sample as either speech or non-speech. The recognizable output samples were produced by collecting the parts identified as speech. Performance was measured with a speech-to-text program, which converted the recognizable portion of the sample speech into text.
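
The windowing and feature-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the non-overlapping windows, and the naive DFT (used here in place of an FFT so that only the standard library is needed) are all assumptions. The `is_speech` argument is a hypothetical stand-in for the trained SVM classifier, which the sketch does not include.

```python
import cmath
import math


def split_windows(samples, size):
    """Split a sample list into consecutive non-overlapping windows of `size`."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def rms(window):
    """Root mean square energy of one window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def zcr(window):
    """Fraction of adjacent sample pairs whose signs differ (needs size >= 2)."""
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(window) - 1)


def dft_magnitudes(window):
    """Magnitudes of the first N/2 DFT bins (naive O(N^2) stand-in for an FFT)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window)))
        for k in range(n // 2)
    ]


def features(window):
    """Feature vector for one window: [RMS, ZCR, spectrum magnitudes...]."""
    return [rms(window), zcr(window)] + dft_magnitudes(window)


def keep_speech(samples, size, is_speech):
    """Concatenate the windows accepted by `is_speech` (hypothetical classifier)."""
    kept = []
    for window in split_windows(samples, size):
        if is_speech(features(window)):
            kept.extend(window)
    return kept
```

As a usage example, `keep_speech(samples, 8, lambda f: f[0] > 0.1)` would drop every 8-sample window whose RMS is at most 0.1. That energy threshold is purely illustrative; in the study, the speech/non-speech decision is made by the SVM.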