Formant extraction algorithm of speech signal by combining EMD and WMCEP

Abstract: This paper presents a method for extracting formants from speech signals. The speech signal is first decomposed by Empirical Mode Decomposition (EMD) into a set of Intrinsic Mode Functions (IMFs), and the IMFs that contain formant information are summed to reconstruct a new speech signal. Weighted mel-cepstrum analysis is then applied to the reconstructed signal to obtain the weighted mel-cepstrum coefficients (WMCEP), which capture the main components of the spectrum. A Discrete Cosine Transform (DCT) based smoothing algorithm is applied to these coefficients to obtain a smooth spectral envelope, whose peaks serve as candidate formants. The formant frequencies are selected from the candidates according to a continuity constraint and the frequency range of the formants. Experiments show that the proposed method yields smaller formant estimation errors than the method based on weighted mel-cepstrum analysis alone, and that it still extracts formants accurately when the signal-to-noise ratio is below 20 dB.
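
The envelope smoothing and candidate selection steps can be illustrated with a minimal sketch. It is not the authors' implementation: it smooths a plain log-magnitude spectrum rather than weighted mel-cepstrum coefficients, and it omits the EMD stage and the continuity constraint. The function names, the number of retained DCT coefficients (30), and the 200 to 4000 Hz search range are assumptions made for illustration only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)
    return mat

def smooth_envelope(log_spectrum, n_keep):
    """Keep the first n_keep DCT coefficients and transform back."""
    d = dct_matrix(len(log_spectrum))
    coeffs = d @ log_spectrum
    coeffs[n_keep:] = 0.0
    # The matrix is orthonormal, so its transpose is the inverse transform.
    return d.T @ coeffs

def candidate_formants(envelope, freqs, f_min=200.0, f_max=4000.0):
    """Return frequencies of local maxima of the envelope within [f_min, f_max]."""
    inner = envelope[1:-1]
    is_peak = (inner > envelope[:-2]) & (inner >= envelope[2:])
    peak_freqs = freqs[1:-1][is_peak]
    return peak_freqs[(peak_freqs >= f_min) & (peak_freqs <= f_max)]

# Synthetic test spectrum (assumed values): three broad resonances plus a
# harmonic ripple with 125 Hz spacing. On this grid of bin centres the ripple
# coincides with a single high-order DCT basis function, which keeps the demo clean.
n_bins = 256
freqs = (np.arange(n_bins) + 0.5) * (4000.0 / n_bins)
true_formants = (700.0, 1500.0, 2600.0)
amplitudes = (3.0, 2.5, 2.0)
log_spec = sum(a * np.exp(-((freqs - f) / 250.0) ** 2)
               for a, f in zip(amplitudes, true_formants))
log_spec = log_spec + 0.5 * np.cos(2 * np.pi * freqs / 125.0)

env = smooth_envelope(log_spec, n_keep=30)
cands = candidate_formants(env, freqs)
print(cands)
```

Because every local maximum counts as a candidate, very small ripples in flat regions of the envelope may also appear in the output. In the method described above, the continuity constraint and the frequency range of the formants are what prune such candidates.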

Key words: formant, Empirical Mode Decomposition (EMD), Intrinsic Mode Function (IMF), Weighted Mel-Cepstrum coefficients (WMCEP), Discrete Cosine Transform (DCT)

ZHAO Taotao, YANG Hongwu. Formant extraction algorithm of speech signal by combining EMD and WMCEP[J]. Computer Engineering and Applications, 2015, 51(9): 207-212.
