Computer Engineering and Applications (计算机工程与应用) ›› 2015, Vol. 51 ›› Issue (10): 219-222.

• Information and Signal Processing •

Speech emotion recognition using GW-MFCC model space parameters

沈  燕,肖仲喆,李冰洁,周孝进,周  强,陶  智   

  1. School of Physical Science and Technology, Soochow University, Suzhou, Jiangsu 215006
  • Publication date: 2015-05-15  Release date: 2015-05-15

Speech emotion recognition using GW-MFCC feature

SHEN Yan, XIAO Zhongzhe, LI Bingjie, ZHOU Xiaojin, ZHOU Qiang, TAO Zhi   

  1. School of Physical Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Online:2015-05-15 Published:2015-05-15

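Abstract: A single type of speech feature conveys the emotional information in speech incompletely. To address this, LSF parameters, which have good quantization and interpolation properties, are fused with MFCC parameters, which reflect the auditory characteristics of the human ear, to form a new feature, the line-spectrum-weighted MFCC (WMFCC). A Gaussian mixture model is then used to build a model space for this feature, yielding the GW-MFCC model space parameters, which capture higher-dimensional detail and further improve emotion recognition performance. In validation on the Berlin emotional speech corpus, the recognition rate of the new parameters is 5.7% and 6.9% higher than that of conventional MFCC and LSF, respectively. The experimental results show that the proposed WMFCC and GW-MFCC parameters represent emotional information in speech effectively and improve the speech emotion recognition rate.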

Keywords: speech emotion recognition, line spectral frequency (LSF), Mel-frequency cepstral coefficients (MFCC), Gaussian mixture model, model space

Abstract: To address the incomplete expression of speech emotion by any single type of speech feature, a new feature, weighted MFCC (WMFCC), is proposed that combines LSF, which has good interpolation and quantization properties, with MFCC, which reflects the characteristics of human hearing. A GMM is applied to this feature to obtain the high-level model space parameter GW-MFCC, which captures more detailed information and further improves the emotion recognition rate. Experiments are carried out on EMO-DB. The correct recognition rates are 5.7% and 6.9% higher than those obtained with MFCC and LSF, respectively. The experimental results show that the GW-MFCC feature effectively conveys emotional information in speech and thus improves emotion recognition performance.

Key words: speech emotion recognition, Line Spectral Frequency (LSF), Mel-Frequency Cepstral Coefficients (MFCC), Gaussian Mixture Model (GMM), model space
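
The model space step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the WMFCC weighting formula, nor does it say which GMM parameters make up the GW-MFCC vector, so everything below is an assumption made for illustration: the frame-level features are random stand-ins for WMFCC frames, the GMM uses diagonal covariances fitted by plain EM, and the hypothetical helper `model_space_vector` simply concatenates the fitted weights, means, and variances into one fixed-length vector per utterance.

```python
import numpy as np


def fit_diag_gmm(frames, n_components=4, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM to (n_frames, dim) features with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = frames.shape
    # Initialise means from randomly chosen frames and share the global variance.
    means = frames[rng.choice(n, size=n_components, replace=False)].copy()
    variances = np.tile(frames.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame (log domain).
        diff = frames[:, None, :] - means[None, :, :]            # (n, k, d)
        log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                           + np.sum(np.log(2 * np.pi * variances), axis=1))
        log_resp = np.log(weights) + log_prob                    # (n, k)
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ frames) / nk[:, None]
        diff = frames[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
    return weights, means, variances


def model_space_vector(frames, n_components=4):
    """Hypothetical utterance-level vector: concatenated GMM parameters.

    This layout is an assumption, not the paper's definition of GW-MFCC.
    """
    weights, means, variances = fit_diag_gmm(frames, n_components)
    return np.concatenate([weights, means.ravel(), variances.ravel()])


# Usage: 200 frames of 12-dimensional features (random stand-ins for WMFCC).
frames = np.random.default_rng(1).normal(size=(200, 12))
vec = model_space_vector(frames, n_components=4)
```

With `k` components and `d`-dimensional frames the vector has `k + 2*k*d` entries regardless of utterance length, which is what lets a fixed-input classifier consume utterances of different durations.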