%0 Journal Article %A SUN Hanyu %A HUANG Lixia %A ZHANG Xueying %A LI Juan %T Speech Emotion Recognition Based on Dual-Channel Convolutional Gated Recurrent Network %D 2023 %R 10.3778/j.issn.1002-8331.2107-0249 %J Computer Engineering and Applications %P 170-177 %V 59 %N 2 %X To build an efficient speech emotion recognition model that makes full use of the information contained in different emotion features, a dual-channel convolutional gated recurrent network model based on the self-attention mechanism is constructed, taking spectrogram features and LLDs features as input. At the same time, to address the problem that the cross-entropy loss function cannot increase the compactness and separability of speech emotion features, a new loss, CCC-Loss, which incorporates the concordance correlation coefficient, is proposed. First, the two feature sets are separately fed into CGRU models to extract deep features, and the self-attention mechanism is used to assign higher weights to key moments. Then, the model is trained jointly with CCC-Loss and cross-entropy loss. CCC-Loss takes as its loss term the ratio of the sum of concordance correlation coefficients of emotion samples from different classes to that of samples from the same class, which improves the intra-class correlation of sample features and strengthens the model's feature discrimination ability. Finally, the classification results of the two networks are combined by decision fusion. The proposed method achieves recognition accuracies of 92.90%, 88.54%, and 90.58% on the EMODB, RAVDESS, and CASIA databases, outperforming baseline models such as ACRNN and DSCNN. %U http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2107-0249