Speech endpoint detection based on EMD and cross-entropy

Abstract

To address the loss of accuracy and adaptability that speech endpoint detection based on Empirical Mode Decomposition (EMD) suffers in adverse environments, this paper proposes a novel speech endpoint detection algorithm based on EMD and cross-entropy. An analysis of the EMD decomposition shows that the probability distribution of white noise in each Intrinsic Mode Function (IMF) is fixed and independent of the noise amplitude. Because this distribution differs from that of a speech signal, cross-entropy is used to measure the difference between speech frames and noise frames. The EMD energy feature and the cross-entropy feature are complementary, so they are combined into a single decision criterion for speech endpoint detection. An adaptive threshold tracks changes in the noise energy and updates itself accordingly, which improves detection accuracy in adverse environments. Simulation results indicate that the algorithm is effective and superior at low Signal-to-Noise Ratio (SNR) and in non-stationary noise.

Key words: endpoint detection, Empirical Mode Decomposition (EMD), cross-entropy, adaptive threshold, low Signal-to-Noise Ratio (SNR)
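
The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the authors' implementation. It assumes relative entropy (cross-entropy minus entropy) as the distribution gap, works on raw frames instead of IMFs (the EMD step is skipped entirely), and uses a sinusoid as a stand-in for a voiced frame. All names (`amplitude_histogram`, `relative_entropy`, `AdaptiveThreshold`) and parameter values are hypothetical.

```python
import math
import random


def amplitude_histogram(frame, bins=16, span=3.0):
    """Histogram of RMS-normalized samples over [-span, span], add-one smoothed."""
    # Dividing by the RMS makes the histogram independent of the frame's amplitude.
    rms = math.sqrt(sum(x * x for x in frame) / len(frame)) or 1.0
    counts = [1] * bins  # add-one smoothing keeps every probability positive
    for x in frame:
        z = max(-span, min(span, x / rms))  # clip outliers into the edge bins
        idx = min(int((z + span) / (2 * span) * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q):
    """Cross-entropy of p against q minus the entropy of p (assumed gap measure)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


class AdaptiveThreshold:
    """Threshold that follows the noise floor of a non-negative feature (assumed rule)."""

    def __init__(self, noise_level, margin=2.0, smoothing=0.9):
        self.noise_level = noise_level
        self.margin = margin
        self.smoothing = smoothing

    def is_speech(self, feature):
        speech = feature > self.margin * self.noise_level
        if not speech:
            # Only frames judged to be noise update the tracked noise floor.
            self.noise_level = (self.smoothing * self.noise_level
                                + (1.0 - self.smoothing) * feature)
        return speech


rng = random.Random(0)
# Reference distribution from white Gaussian noise.
noise_ref = amplitude_histogram([rng.gauss(0.0, 1.0) for _ in range(1024)])
# The same kind of noise at eight times the amplitude.
loud_noise = amplitude_histogram([rng.gauss(0.0, 8.0) for _ in range(1024)])
# A pure tone as a crude stand-in for a voiced speech frame.
tone = amplitude_histogram(
    [math.sin(2 * math.pi * 5 * n / 1024) for n in range(1024)])

gap_noise = relative_entropy(loud_noise, noise_ref)
gap_tone = relative_entropy(tone, noise_ref)
print(gap_noise, gap_tone)
```

Because each frame is normalized by its RMS, the louder noise frame should stay close to the reference while the tone scores a clearly larger gap, which mirrors the amplitude independence the abstract relies on. The threshold class updates its noise estimate only on frames it classifies as noise, one simple way to follow a changing noise floor.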


XUE Juntao, WENG Yuru, ZHANG Jun. Speech endpoint detection based on EMD and cross-entropy[J]. Computer Engineering and Applications, 2016, 52(20): 149-153.


