Endpoint detection based on instantaneous energy frequency value of HHT for whispered speech

doi:10.3778/j.issn.1002-8331.2010.29.041

Computer Engineering and Applications, 2010, Vol. 46, Issue (29): 147-150. DOI: 10.3778/j.issn.1002-8331.2010.29.041

• Database, Signal and Information Processing •

TAN Xue-dan¹, GU Ji-hua¹, ZHAO He-ming², TAO Zhi¹, HAN Tao¹, WU Jun¹

1. School of Physical Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
2. School of Electronics and Information Engineering, Soochow University, Suzhou, Jiangsu 215021, China

Received: 2009-03-16  Revised: 2009-05-15  Online: 2010-10-11  Published: 2010-10-11
Contact: TAN Xue-dan

Abstract

Because whispered speech has a low signal-to-noise ratio (SNR), traditional endpoint detection algorithms struggle to achieve both high accuracy and good noise robustness on whispers. This paper presents a whispered-speech endpoint detection algorithm based on the Instantaneous Energy Frequency Value (IEFV) of the Hilbert-Huang Transform (HHT). The HHT is applied to separate the instantaneous amplitude and instantaneous frequency of the whispered signal, and the IEFV, a time-energy-frequency feature, is extracted from them. Because the IEFV distinguishes whispers from noise effectively, it is used as the feature for endpoint detection. In a test with 700 whispered-speech samples at 2 to 10 dB SNR, the algorithm's accuracy for both start points and end points is higher than that of the Zero-Energy-Product method, the Spectral Entropy method, and the Fitting Characteristic method. The experiments show that the algorithm adapts to a variety of non-stationary noise backgrounds and detects whispered-speech endpoints accurately.
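The abstract names the two stages of the method (separating instantaneous amplitude and frequency, then detecting endpoints from an energy-frequency feature) but does not give the IEFV formula. The Python sketch below is therefore only illustrative and rests on several assumptions. It skips the empirical mode decomposition stage of the HHT and applies the Hilbert step directly to the signal. The per-frame mean of squared amplitude times absolute frequency is a hypothetical stand-in for the IEFV. The relative threshold `ratio=0.1` and all function names are choices made for this example and are not taken from the paper.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real sequence via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep the DC bin
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist bin
        gain[1:n // 2] = 2.0            # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)  # negative frequencies are zeroed


def instantaneous_amplitude_frequency(x, fs):
    """Instantaneous amplitude and frequency (Hz), each of length len(x) - 1."""
    z = analytic_signal(x)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return amplitude[:-1], frequency


def frame_energy_frequency(x, fs, frame_len):
    """Assumed stand-in for the IEFV: per-frame mean of a^2 * |f|."""
    amplitude, frequency = instantaneous_amplitude_frequency(x, fs)
    n_frames = amplitude.size // frame_len
    product = (amplitude ** 2) * np.abs(frequency)
    return product[:n_frames * frame_len].reshape(n_frames, frame_len).mean(axis=1)


def detect_endpoints(feature, ratio=0.1):
    """Return (first, last) frame index above ratio * max, or None."""
    feature = np.asarray(feature, dtype=float)
    if feature.size == 0:
        return None
    active = np.flatnonzero(feature > ratio * feature.max())
    if active.size == 0:
        return None
    return int(active[0]), int(active[-1])


if __name__ == "__main__":
    fs = 8000                                    # sampling rate in Hz (assumed)
    n = np.arange(fs)                            # one second of samples
    rng = np.random.default_rng(0)
    x = 0.01 * rng.standard_normal(fs)           # low-level background noise
    x[2400:4800] += np.sin(2 * np.pi * 1000 * n[2400:4800] / fs)  # 0.3 s to 0.6 s burst
    feature = frame_energy_frequency(x, fs, frame_len=200)        # 25 ms frames
    print(detect_endpoints(feature))
```

A synthetic tone burst stands in for speech here only to keep the example self-contained. Real whispered speech would need the full EMD stage and the paper's actual IEFV definition.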

CLC Number: TN912.34

TAN Xue-dan, GU Ji-hua, ZHAO He-ming, TAO Zhi, HAN Tao, WU Jun. Endpoint detection based on instantaneous energy frequency value of HHT for whispered speech[J]. Computer Engineering and Applications, 2010, 46(29): 147-150.
