A Review on Real Time Speech Emotion Recognition System Using RNN and CNN |
Author(s): |
Pranay Ughade, G H Raisoni College of Engineering Nagpur; Sonali Guhe, G H Raisoni College of Engineering Nagpur; Prasad Ambalkar, G H Raisoni College of Engineering Nagpur; Pranay Durutkar, G H Raisoni College of Engineering Nagpur |
Keywords: |
RTSER, CNN, RNN, MFCC, MS, Emotion, Healthcare, Education, Entertainment, Signals |
Abstract |
Speech emotion recognition is an important area of research that aims to automatically detect and recognize human emotions from speech signals. This technology has many potential applications in fields such as healthcare, education, and entertainment. To recognize emotions in speech signals, Mel-frequency cepstral coefficients (MFCC) and modulation spectral (MS) features are extracted and used to train various classifiers. Feature selection techniques are employed to identify the most significant subset of features. Different machine learning paradigms are used to classify seven emotions. The first classifier is a recurrent neural network (RNN), whose performance is compared with multivariate linear regression (MLR) and support vector machines (SVM). These three techniques are commonly used for emotion recognition in spoken audio signals. In addition, this paper proposes a new model for Real Time Speech Emotion Recognition (RTSER) that uses a convolutional neural network (CNN) to learn deep frequency features with a modified pooling strategy. The proposed model has low computational complexity and high recognition accuracy. The model was trained on frequency features extracted from speech data and was tested on predicting emotions. |
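The MFCC extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the frame length, window, number of mel filters, or number of coefficients, so the values below (Hamming window, 20 filters, 13 coefficients) and all function names are assumptions chosen for illustration, not the authors' configuration. The sketch uses a naive DFT to stay dependency-free; a practical system would use an FFT library.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (an assumed variant; the paper does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of a single frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz (synthetic input).
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
coeffs = mfcc(frame, 8000)
```

Each frame yields one coefficient vector; stacking the vectors over time gives the kind of frequency-feature input that a classifier such as the RNN or CNN described above could be trained on.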
Other Details |
Paper ID: IJSRDV11I20019 Published in: Volume: 11, Issue: 2 Publication Date: 01/05/2023 Page(s): 20-22 |