Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (6): 208-211.

• Signal Processing •

融合音素串编辑距离的随机段模型解码算法

晁  浩   

  1. 河南理工大学 计算机科学与技术学院,河南 焦作 454000
  • 出版日期:2015-03-15 发布日期:2015-03-13

Decoding algorithm of integrating phonetic string edit distance into stochastic segment models

CHAO Hao   

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
  • Published: 2015-03-15  Online: 2015-03-13

摘要: 解码时声学特性最优的路径蕴含了揭示当前路径是否正确的重要参考信息,为此提出了一种随机段模型系统的解码优化方法。训练能够准确地衡量当前路径与声学最优路径相似性程度的上下文相关音素串编辑距离模型,在N-Best重打分的过程中将音素串编辑距离加入到路径总得分中。在“863-test”测试集上进行的连续语音识别实验显示汉语字的相对错误率下降了8.1%。实验结果表明了将音素串编辑距离应用到随机段模型的可行性。

关键词: 语音识别, 音素串编辑距离, 随机段模型, 解码

Abstract: The acoustically optimal path found during decoding carries important reference information about whether the current hypothesized path is correct. Based on this observation, a decoding optimization method for stochastic segment model systems is proposed. A context-dependent phonetic string edit distance model is trained to measure accurately how similar the current hypothesized path is to the acoustically optimal path, and the phonetic string edit distance is then added to the total path score during N-Best rescoring. Continuous speech recognition experiments on the “863-test” set show a relative reduction of 8.1% in the Chinese character error rate. The results demonstrate the feasibility of applying phonetic string edit distance to stochastic segment models.

Key words: speech recognition, phonetic string edit distance, stochastic segment model, decoding
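
The rescoring idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the paper trains a context-dependent edit distance model, whereas this sketch substitutes a plain unit-cost Levenshtein distance over phoneme symbols. The function names `edit_distance` and `rescore_nbest`, the `weight` parameter, and the convention that scores are log-scores where higher is better are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Hypothesis = Tuple[float, List[str]]  # (path score, phoneme sequence); hypothetical layout


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two phoneme sequences.

    Assumption: every insertion, deletion, and substitution costs 1.
    The paper's trained, context-dependent costs are not reproduced here.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]  # deleting the first i phonemes of a
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def rescore_nbest(nbest: Sequence[Hypothesis],
                  acoustic_best: Sequence[str],
                  weight: float = 1.0) -> List[Hypothesis]:
    """Penalize each N-Best hypothesis by its distance to the acoustically optimal path.

    Assumption: scores are log-scores (higher is better), so the weighted
    distance is subtracted. `weight` is a hypothetical tuning parameter.
    """
    rescored = [(score - weight * edit_distance(phones, acoustic_best), phones)
                for score, phones in nbest]
    # Sort on the score only, best hypothesis first.
    return sorted(rescored, key=lambda item: item[0], reverse=True)
```

Calling `rescore_nbest(nbest, acoustic_best, weight)` returns the hypotheses reordered by the combined score. In practice the weight would have to be tuned on held-out data, and the unit costs would be replaced by the trained model's distances.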