Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (17): 161-164.


Application of finite-state vector quantization in speech endpoint detection

WEI Yanna, ZHANG Jingfeng, JIN Yongtao   

  1. Department of Computer Science and Engineering, North China Institute of Aerospace Engineering, Langfang, Hebei 065000, China
  • Online:2012-06-11 Published:2012-06-20

有限状态矢量量化在语音端点检测中的应用

魏艳娜,张景峰,金永涛   

  1. 北华航天工业学院 计算机科学与工程系,河北 廊坊 065000

Abstract: Endpoint detection is an essential step in speech processing. The traditional algorithm is a double-threshold comparison based on short-term energy and zero-crossing rate, which struggles to give accurate results at low SNR. In addition, the thresholds strongly influence the detection result, yet they are usually chosen empirically, which makes the method unstable. To address these shortcomings, an improved algorithm based on the inter-frame correlation of speech is proposed. The speech signal first passes through double-threshold comparison, which gives a coarse first-stage estimate of the endpoints. Then signal vectors taken from a certain range of the ambiguous frame segments around the start and end points are processed and passed through a finite-state vector quantizer (FSVQ), and a second-stage fine decision on the quantized vectors yields more accurate endpoints. Applied to Mandarin continuous digit speech recognition, the improved algorithm shortens the average recognition time from 0.871 s to 0.719 s and raises the average recognition rate from 81.47% to 89.13%. The experimental results show that the new algorithm is more effective than the traditional one.
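
For readers unfamiliar with the traditional first stage, the double-threshold comparison on short-term energy and zero-crossing rate can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the frame length and hop, the thresholds expressed as fractions of peak energy, and the cap on zero-crossing extension are all assumptions chosen for the example. The FSVQ refinement stage is not shown because the abstract does not give its details.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def double_threshold_endpoints(samples, frame_len=80, hop=40,
                               high_ratio=0.25, low_ratio=0.05,
                               zcr_thresh=0.25, max_extend=25):
    """Return (start_frame, end_frame) of the detected speech, or None.

    All default values are hypothetical; in practice the thresholds
    are tuned empirically, which is the instability the abstract notes.
    """
    frames = frame_signal(samples, frame_len, hop)
    energy = [short_term_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    peak = max(energy)
    high, low = high_ratio * peak, low_ratio * peak

    # Step 1: frames above the high energy threshold are surely speech.
    strong = [i for i, e in enumerate(energy) if e > high]
    if not strong:
        return None
    start, end = strong[0], strong[-1]

    # Step 2: extend outward while energy stays above the low threshold.
    while start > 0 and energy[start - 1] > low:
        start -= 1
    while end < len(frames) - 1 and energy[end + 1] > low:
        end += 1

    # Step 3: extend a bounded number of frames while the zero-crossing
    # rate is high, to catch weak unvoiced sounds at the edges.
    for _ in range(max_extend):
        if start > 0 and zcr[start - 1] > zcr_thresh:
            start -= 1
        else:
            break
    for _ in range(max_extend):
        if end < len(frames) - 1 and zcr[end + 1] > zcr_thresh:
            end += 1
        else:
            break
    return start, end
```

In a two-stage scheme like the one described above, the frame indices returned here would only mark the rough neighbourhood of each endpoint, and the frames around them would be handed to the second-stage decision.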

Key words: endpoint detection, finite-state vector quantization, speech recognition

摘要: 语音端点检测在语音处理中占有非常重要的地位,传统的检测方法是基于短时能量和过零率的双门限比较法,但是在信噪比较低的情况下,利用短时能量和过零率很难得到准确的检测结果。另外,在双门限比较法中,判别门限的取值对整个端点的检测影响很大,而这个门限值往往是靠经验所得,具有不稳定性。因此,针对传统方法的不足,根据语音帧间相关性,提出了一种改进算法。让语音信号通过双门限比较,完成端点检测的一级粗判,在语音起止点的模糊帧段,取一定范围的信号矢量,让这些矢量经过处理后再通过有限状态矢量量化器(FSVQ),得到量化矢量,再对量化矢量进行二级细判,从而得到准确的语音起止点。将改进算法应用于汉语连续数字语音识别,平均识别时间由原来的0.871 s缩短为0.719 s,平均识别率由原来的81.47%上升至89.13%,实验结果表明了该算法的有效性。

关键词: 端点检测, 有限状态矢量量化, 语音识别