Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (2): 166-171. DOI: 10.3778/j.issn.1002-8331.1505-0072

• Pattern Recognition and Artificial Intelligence •

Wavelet domain energy spectrum and nonlinear dimensionality reduction in pathological voice recognition

CHANG Jingya, ZHANG Xiaojun, GU Lingling, YUAN Yue, GU Jihua, TAO Zhi

  1. College of Physics, Optoelectronics and Energy, Soochow University, Suzhou, Jiangsu 215006, China
  • Online: 2017-01-15 Published: 2017-05-11

Abstract: A wavelet-domain modeling and analysis method is proposed for recognizing pathological voice. First, a multi-scale continuous wavelet transform is applied to the pathological voice for time-frequency analysis. The energy spectrum along the scale axis is then fitted with a Gaussian mixture model, and the model's statistical parameters, obtained by maximum likelihood estimation, serve as the feature parameters. An improved dynamic weighted locally linear embedding method is then used to reduce the dimensionality of the feature parameters nonlinearly. Experimental results show that the wavelet-domain energy spectrum features, after nonlinear dimensionality reduction, achieve a pathological voice recognition rate of 97.45%, and that the improved dynamic weighted locally linear embedding method reduces dimensionality more effectively than principal component analysis and locally linear embedding.
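The feature-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: a real-valued Morlet-style mother wavelet, the normalized scale-wise energy treated as a distribution over scale, a two-component mixture fitted by expectation-maximization (a standard route to the maximum likelihood estimate), and a synthetic two-tone test signal. The function names `morlet`, `scale_energy_spectrum`, and `fit_gmm_weighted` are hypothetical. The dynamic weighted locally linear embedding step is not shown.

```python
import math

def morlet(t, w0=5.0):
    """Real-valued Morlet-style wavelet (assumed mother wavelet)."""
    return math.exp(-0.5 * t * t) * math.cos(w0 * t)

def scale_energy_spectrum(signal, scales):
    """Sum of squared CWT coefficients over time, one value per scale."""
    n = len(signal)
    energies = []
    for s in scales:
        half = int(4 * s)  # truncate the wavelet at +/- 4 scale units
        kernel = [morlet(k / s) / math.sqrt(s) for k in range(-half, half + 1)]
        energy = 0.0
        for b in range(n):  # b is the time shift
            coeff = 0.0
            for j, w in enumerate(kernel):
                i = b + j - half
                if 0 <= i < n:  # samples outside the signal count as zero
                    coeff += signal[i] * w
            energy += coeff * coeff
        energies.append(energy)
    return energies

def fit_gmm_weighted(xs, weights, k=2, iters=100, var_floor=1e-2):
    """EM fit of a 1-D Gaussian mixture to points xs with non-negative weights."""
    total = sum(weights)
    p = [w / total for w in weights]  # normalized weights, sum to 1
    lo, hi = min(xs), max(xs)
    mus = [lo + (c + 0.5) * (hi - lo) / k for c in range(k)]
    variances = [max(((hi - lo) / (2 * k)) ** 2, var_floor)] * k
    pis = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in xs:
            dens = [pis[c] * math.exp(-0.5 * (x - mus[c]) ** 2 / variances[c])
                    / math.sqrt(2 * math.pi * variances[c]) for c in range(k)]
            tot = sum(dens) or 1e-300
            resp.append([d / tot for d in dens])
        # M-step: weighted maximum likelihood updates
        for c in range(k):
            nk = sum(p[i] * resp[i][c] for i in range(len(xs)))
            if nk < 1e-12:
                continue  # component collapsed; keep its previous parameters
            mus[c] = sum(p[i] * resp[i][c] * xs[i] for i in range(len(xs))) / nk
            var = sum(p[i] * resp[i][c] * (xs[i] - mus[c]) ** 2
                      for i in range(len(xs))) / nk
            variances[c] = max(var, var_floor)
            pis[c] = nk
    norm = sum(pis)
    return [w / norm for w in pis], mus, variances

# Synthetic stand-in for a voice frame: two tones at 8 Hz and 40 Hz.
fs = 256
signal = [math.sin(2 * math.pi * 8 * t / fs) + 0.5 * math.sin(2 * math.pi * 40 * t / fs)
          for t in range(fs)]
scales = [float(s) for s in range(1, 33)]
energies = scale_energy_spectrum(signal, scales)
pis, mus, variances = fit_gmm_weighted(scales, energies, k=2)
features = pis + mus + variances  # mixture parameters used as the feature vector
print(features)
```

In this sketch the six mixture parameters (weights, means, variances) form the feature vector for one frame. In a full pipeline such vectors would then be passed to a dimensionality reduction step and a classifier.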

Key words: wavelet energy spectrum, dynamic weighted locally linear embedding, pathological voice, voice recognition