Speech endpoint detection based on frequency domain and time domain analyses

Abstract

In the frequency domain, voiced speech is detected by computing the harmonic energy at the fundamental frequency of the speech spectrum. Because a harmonic spectrum is the basic characteristic of tonal sound, this approach effectively suppresses many kinds of non-tonal noise and is both sensitive and accurate. In the time domain, the detected voiced positions and fundamental-frequency values are combined with the short-time stationarity of speech: the cross-correlation coefficient identifies adjacent pitch periods, so the start and end points of each voiced segment are located precisely. Because unvoiced sounds are concentrated at higher frequencies, the speech signal is first passed through a second-order difference to boost its high-frequency energy. The Teager energy operator, which responds to changes in both energy and frequency, is then used to detect the start and end points of unvoiced sounds in clean speech. Experiments show that the endpoint detection algorithm is reliable and accurate.

Key words: harmonic, cross-correlation function, Teager energy operator
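The abstract names three time-domain tools without giving their formulas: a second-order difference, the Teager energy operator, and a cross-correlation coefficient between adjacent pitch periods. The sketch below uses the standard textbook definitions of each, which may differ in detail from the paper's own. The function names, the `tone` test signal, and the sample rate used in the note that follows are assumptions made for illustration and do not come from the article.

```python
import math


def tone(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Sampled cosine, used here only as a simple illustrative test signal."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [amplitude * math.cos(omega * n) for n in range(num_samples)]


def second_difference(x):
    """Second-order difference y[n] = x[n+1] - 2*x[n] + x[n-1].

    Its gain grows with frequency (4*sin^2(omega/2) for a pure tone),
    so it emphasizes the high-frequency part of the signal.
    The output is two samples shorter than the input.
    """
    return [x[n + 1] - 2.0 * x[n] + x[n - 1] for n in range(1, len(x) - 1)]


def teager_energy(x):
    """Discrete Teager energy operator psi[n] = x[n]^2 - x[n-1]*x[n+1].

    For a pure tone A*cos(omega*n) this equals A^2 * sin^2(omega),
    so it depends on both amplitude and frequency.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def normalized_cross_correlation(a, b):
    """Correlation coefficient between two equal-length segments.

    Values near 1 indicate that the segments (for example, two candidate
    adjacent pitch periods) have nearly the same waveform.
    """
    if len(a) != len(b):
        raise ValueError("segments must have the same length")
    numerator = sum(p * q for p, q in zip(a, b))
    denominator = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return numerator / denominator if denominator > 0.0 else 0.0
```

With an assumed 8000 Hz sample rate, `teager_energy(tone(3000, 8000, 400))` gives much larger values than `teager_energy(tone(200, 8000, 400))` at equal amplitude, which illustrates why the operator suits high-frequency unvoiced sounds. Two consecutive 40-sample periods of the 200 Hz tone yield a correlation coefficient of essentially 1.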

WANG Kunchi, YUAN Yan, WANG Jianqiang, ZHANG Yusheng, YANG Yongjie. Speech endpoint detection based on frequency domain and time domain analyses[J]. Computer Engineering and Applications, 2012, 48(34): 144-147.


