Thesis Type: Postgraduate
Institution Of The Thesis: Gazi University, Graduate School of Natural and Applied Sciences, Department of Electrical and Electronics Engineering, Turkey
Approval Date: 2021
Thesis Language: Turkish
Student: Ayşe Nur ÇAYIR
Supervisor: Tuğba Selcen Navruz
Open Archive Collection: AVESIS Open Access Collection
Abstract:
Voice recognition and classification are important building blocks for artificial intelligence applications, and deep learning gives successful results in systems that recognize voice commands. In this thesis, a system that recognizes 12 English voice commands is designed using a convolutional neural network, one of the deep learning architectures. The designed system achieved a test accuracy of 94.64%. After the model was trained, a data set of recordings from non-native English speakers was created to investigate the real-life success of the model, and an accuracy of 63.29% was obtained. Because the data available for Turkish are limited, the thesis also examines whether data augmentation techniques can increase the success of a deep learning model when only limited data are available. A very small data set containing 12 Turkish voice commands was created and then enlarged by manipulating the voice data. The average test accuracy rose from 53.947% before data augmentation to 85.756% after it, and the average real-life accuracy rose from 58.074% to 64.761%. These experimental results show that accuracy can be increased by data augmentation even with limited data. The effect of data augmentation was also examined on a per-class basis. This examination suggests that applying class-specific data augmentation techniques would increase the success further and contribute to future studies in this direction.
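The abstract does not say which manipulations were used to enlarge the Turkish data set. As a minimal sketch only, the Python code below shows three commonly used waveform augmentations (time shifting, gain change, and additive noise) applied to a synthetic clip. The function names, parameter ranges, and the 16 kHz sample rate are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

def add_noise(wave, snr_db, rng):
    """Mix in white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise

def time_shift(wave, shift):
    """Shift the clip by `shift` samples, padding with silence (no wrap-around)."""
    out = np.zeros_like(wave)
    if shift > 0:
        out[shift:] = wave[:-shift]
    elif shift < 0:
        out[:shift] = wave[-shift:]
    else:
        out[:] = wave
    return out

def change_gain(wave, gain_db):
    """Scale the amplitude by a gain given in decibels."""
    return wave * (10 ** (gain_db / 20))

def augment(wave, rng, copies=4):
    """Return `copies` randomly perturbed variants of one clip.

    The ranges below (10% maximum shift, +/-6 dB gain, 10-30 dB SNR)
    are assumed values chosen for illustration.
    """
    max_shift = len(wave) // 10
    variants = []
    for _ in range(copies):
        w = time_shift(wave, int(rng.integers(-max_shift, max_shift + 1)))
        w = change_gain(w, rng.uniform(-6.0, 6.0))
        w = add_noise(w, rng.uniform(10.0, 30.0), rng)
        variants.append(w)
    return variants

# Usage: a synthetic one-second 440 Hz tone stands in for a recorded command.
rng = np.random.default_rng(0)
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
clip = 0.5 * np.sin(2 * np.pi * 440.0 * t)
variants = augment(clip, rng)
```

Each original recording yields several perturbed copies that keep the same label, which is how a very small data set can be enlarged before training. Applying different perturbation ranges per command would be one way to explore the class-specific augmentation the abstract proposes.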
Key Words
Deep learning, convolutional neural network, voice command, data augmentation