Advances in Distributed Computing and Artificial Intelligence Journal, vol. 13, 2024 (ESCI)
Deep learning has recently become one of the most common approaches to speech recognition. State-of-the-art results have been obtained with speech recognition systems built on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Because CNNs capture local features effectively, they are applied to tasks with relatively short-term dependencies, such as keyword detection or phoneme-level sequence recognition. This paper presents the development of a deep learning-based speech command recognition system. The Google Speech Commands Dataset was used for training. The dataset contains 65,000 one-second-long utterances of 30 short English words. Of these, 80% were used for training and 20% for testing. Each one-second voice command was converted into a spectrogram and used to train different artificial neural network (ANN) models. Several CNN variants are used in deep learning applications. The performance of the proposed model reached 94.60%.
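The preprocessing described above, converting a one-second command into a spectrogram and splitting the data 80/20, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: the 16 kHz sampling rate, the 25 ms window, the 10 ms hop, and the function names `log_spectrogram` and `split_80_20` are assumptions, and a synthetic tone stands in for a real recording.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sampling rate in Hz (not stated in the abstract)
FRAME_LEN = 400       # assumed 25 ms analysis window
HOP_LEN = 160         # assumed 10 ms hop between windows


def log_spectrogram(signal, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Return a (frames, frequency bins) log-magnitude spectrogram."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each frame; small offset avoids log(0).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-6)


def split_80_20(n_items, seed=0):
    """Shuffle item indices and split them into 80% train / 20% test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    cut = (n_items * 80) // 100
    return order[:cut], order[cut:]


# Synthetic stand-in for one recording: a one-second 440 Hz tone.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
clip = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(clip)              # 2-D array a CNN could take as input
train_idx, test_idx = split_80_20(65_000)  # dataset size taken from the abstract
```

Under these assumed settings each clip becomes a two-dimensional time-frequency array, which is why image-style CNN architectures can be applied to it directly.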