Research of extracting of pathological voice’s characteristics and recognition based on wavelet transformation and Gaussian mixture model

Computer Engineering and Applications, 2009, Vol. 45, Issue (22): 194-196. DOI: 10.3778/j.issn.1002-8331.2009.22.062

• Engineering and Applications •


YU Yan-ping¹,², HU Wei-ping¹

1. College of Physics and Electronic Engineering, Guangxi Normal University, Guilin, Guangxi 541004, China
2. Department of Electronic Engineering, Liuzhou Railway Vocational Technical College, Liuzhou, Guangxi 545007, China

Received: 2008-04-23 Revised: 2008-07-24 Online: 2009-08-01 Published: 2009-08-01
Contact: YU Yan-ping


Abstract

Based on an analysis of the mechanism of voice production and of the differences between pathological and normal voices in the frequency domain, this paper uses wavelet decomposition to bring out the characteristics of pathological voice. It proposes a new feature vector set for recognition, the Entropy Coefficient based on De-noise, Decomposition of Multi-scale Analysis (ECDDMA), which consists of entropy coefficients obtained from wavelet de-noising and decomposition under multi-scale analysis. ECDDMA is compared with the Mel Frequency Cepstrum Coefficient (MFCC), a classic feature parameter in speech recognition. Using each of the two feature sets in turn, 242 normal voice samples and 234 pathological voice samples are recognized with a Gaussian Mixture Model (GMM). The results show that ECDDMA represents the differences between normal and pathological voices more accurately than the traditional MFCC, which mimics the non-linear frequency response of human hearing, and its dynamic features, and that it helps improve the recognition rates of both pathological and normal voices.

Key words: Gaussian Mixture Model (GMM), pathological voice, Mel Frequency Cepstrum Coefficient (MFCC), wavelet transformation
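
The abstract does not give the exact definition of ECDDMA, so the following is only a minimal sketch of the general idea it describes: de-noise and decompose a voice frame over several wavelet scales, then summarize each scale by an entropy value. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, soft thresholding as the de-noising step, Shannon entropy of the normalized coefficient energy as the "entropy coefficient", and the function names themselves.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar DWT; returns (approximation, detail).

    A trailing unpaired sample is dropped, so frame lengths that are a
    multiple of 2**levels are the convenient case.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t (assumed de-noising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def shannon_entropy(coeffs):
    """Shannon entropy (in nats) of the normalized energy of one subband."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)


def subband_entropy_features(frame, levels=3, threshold=0.0):
    """Hypothetical ECDDMA-style vector for one frame.

    Returns one entropy per thresholded detail subband (finest scale first),
    followed by the entropy of the final approximation subband.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(shannon_entropy(soft_threshold(detail, threshold)))
    features.append(shannon_entropy(approx))
    return features
```

In a pipeline like the one the abstract outlines, vectors of this kind would be computed per frame for the normal and pathological recordings, one GMM would be trained per class, and a test sample would be assigned to the class whose model gives the higher likelihood. The wavelet family, number of levels, and threshold used in the paper are not stated in the abstract, so the defaults above are placeholders.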

YU Yan-ping, HU Wei-ping. Research of extracting of pathological voice’s characteristics and recognition based on wavelet transformation and Gaussian mixture model[J]. Computer Engineering and Applications, 2009, 45(22): 194-196.


