Application of finite-state vector quantization in speech endpoint detection

Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (17): 161-164.

• Database, Signal and Information Processing •

Application of finite-state vector quantization in speech endpoint detection

WEI Yanna, ZHANG Jingfeng, JIN Yongtao

Department of Computer Science and Engineering, North China Institute of Aerospace Engineering, Langfang, Hebei 065000, China

Online:2012-06-11 Published:2012-06-20

Abstract

Abstract: Speech endpoint detection plays a very important role in speech processing. The traditional method is double-threshold comparison based on short-time energy and zero-crossing rate, but at low signal-to-noise ratio (SNR) these two features rarely give accurate detection results. In addition, the decision thresholds strongly influence the detected endpoints, yet they are usually set by experience, which makes the method unstable. To address these shortcomings, an improved algorithm is proposed that exploits the correlation between speech frames. The speech signal first passes through double-threshold comparison, which completes a coarse first-stage decision. Then, in the ambiguous frame segments around the speech start and end points, signal vectors within a certain range are taken, processed, and passed through a finite-state vector quantizer (FSVQ) to obtain quantized vectors. A fine second-stage decision on the quantized vectors then yields accurate speech start and end points. When the improved algorithm is applied to Mandarin continuous digit speech recognition, the average recognition time is shortened from 0.871 s to 0.719 s and the average recognition rate rises from 81.47% to 89.13%. The experimental results demonstrate the effectiveness of the algorithm.

Key words: endpoint detection, finite-state vector quantization, speech recognition
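
The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the non-overlapping frame layout, the threshold parameters (`e_high`, `e_low`, `z_thresh`), the codebooks, and the next-state table are all assumptions made for illustration. The first function group shows a simplified double-threshold coarse decision using short-time energy and zero-crossing rate; `fsvq_encode` shows only the generic finite-state quantization step, in which the current state selects the codebook and the chosen codeword selects the next state. The abstract does not specify the second-stage decision rule, so it is not reproduced here.

```python
def frames(signal, frame_len, hop):
    """Split a sample sequence into frames of frame_len samples, hop apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Number of sign changes between consecutive samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def coarse_endpoints(signal, frame_len, hop, e_high, e_low, z_thresh):
    """Simplified double-threshold detection (illustrative assumption).

    Returns (start_frame, end_frame), or None if no frame exceeds e_high.
    """
    fr = frames(signal, frame_len, hop)
    energy = [short_time_energy(f) for f in fr]
    zcr = [zero_crossing_rate(f) for f in fr]

    # Frames above the high energy threshold are taken as certain speech.
    high = [i for i, e in enumerate(energy) if e > e_high]
    if not high:
        return None
    start, end = high[0], high[-1]

    # Extend outward while a neighbouring frame still exceeds the low
    # energy threshold or the zero-crossing threshold.
    while start > 0 and (energy[start - 1] > e_low or zcr[start - 1] > z_thresh):
        start -= 1
    while end < len(fr) - 1 and (energy[end + 1] > e_low or zcr[end + 1] > z_thresh):
        end += 1
    return start, end


def fsvq_encode(vectors, codebooks, next_state, state=0):
    """Generic finite-state VQ encoding (hypothetical tables).

    codebooks[s] is the list of codewords used in state s;
    next_state[s][k] is the state entered after choosing codeword k in state s.
    Returns a list of (state, codeword_index) pairs.
    """
    out = []
    for v in vectors:
        book = codebooks[state]
        # Nearest codeword by squared Euclidean distance.
        idx = min(range(len(book)),
                  key=lambda k: sum((a - b) ** 2 for a, b in zip(v, book[k])))
        out.append((state, idx))
        state = next_state[state][idx]
    return out
```

In this sketch the coarse stage returns frame indices; a refinement stage in the spirit of the abstract would take feature vectors from the frames around those indices and pass them to `fsvq_encode`. Because each state carries its own small codebook, the quantizer can exploit the inter-frame correlation that the abstract mentions.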

WEI Yanna, ZHANG Jingfeng, JIN Yongtao. Application of finite-state vector quantization in speech endpoint detection[J]. Computer Engineering and Applications, 2012, 48(17): 161-164.
