Computer Engineering and Applications, 2014, Vol. 50, Issue 23: 21-25.

Integrating tone models into speech recognition system based on articulatory feature

CHAO Hao, SONG Cheng, LIU Zhizhong   

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
  • Online: 2014-12-01 Published: 2014-12-12

A tone integration algorithm based on articulatory features for speech recognition

晁  浩,宋  成,刘志中   

  1. 河南理工大学 计算机科学与技术学院,河南 焦作 454000

Abstract: This paper improves a tone model based on articulatory features and proposes a framework for integrating it into a Mandarin speech recognition system based on the stochastic segment model (SSM). A set of seven articulatory features is defined to represent articulatory information. In addition to prosodic features, the posterior probabilities that the speech signal belongs to each of the 35 pronunciation categories of these articulatory features are used for tone modeling. In keeping with the decoding properties of segment models, the tone model scores are fused into the SSM-based recognizer after the second pruning stage. Tone recognition experiments on the “863-test” set show an absolute accuracy gain of about 3.11% with the new articulatory features. When the proposed tone model is integrated into the SSM system, the character error rate falls from 13.67% to 12.74%, which demonstrates the potential of the method.

Key words: speech recognition, stochastic segment modeling, tone modeling, articulatory feature, hierarchical multilayer perceptron classifiers
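
The abstract builds the tone model input from prosodic features plus the posteriors over the 35 articulatory categories. A minimal Python sketch of that concatenation follows. The function name `build_tone_features`, the ordering of the two parts, and the example prosodic values are assumptions made for illustration; the abstract does not specify the prosodic feature set or how the 35 categories are divided among the seven articulatory features.

```python
from typing import List, Sequence

# From the abstract: 35 pronunciation categories across the seven
# articulatory features. The split per feature is not given there.
NUM_AF_CATEGORIES = 35


def build_tone_features(af_posteriors: Sequence[float],
                        prosodic: Sequence[float]) -> List[float]:
    """Concatenate prosodic features with articulatory posteriors.

    Hypothetical helper: the ordering (prosodic first) is an assumption.
    """
    if len(af_posteriors) != NUM_AF_CATEGORIES:
        raise ValueError("expected %d posteriors, got %d"
                         % (NUM_AF_CATEGORIES, len(af_posteriors)))
    # Each entry is a posterior probability, so it must lie in [0, 1].
    if any(p < 0.0 or p > 1.0 for p in af_posteriors):
        raise ValueError("posteriors must lie in [0, 1]")
    return list(prosodic) + list(af_posteriors)


# Example with made-up values: three prosodic features and flat posteriors.
features = build_tone_features([1.0 / 35] * 35, [0.2, -0.1, 0.5])
print(len(features))  # 38
```

Any classifier that accepts a fixed-length vector could then be trained on such features; the abstract does not say which tone classifier is used on top of them.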

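Abstract (Chinese version): An improved method for tone modeling based on articulatory features is proposed and applied in the one-pass decoding of a stochastic segment model. Based on the pronunciation characteristics of Mandarin, seven articulatory features are defined to distinguish Chinese vowel and consonant information. With these as targets, hierarchical multilayer perceptrons compute the posterior probabilities that the speech signal belongs to each of the 35 articulatory feature categories, and these probabilities are used as articulatory features together with conventional prosodic features for tone modeling. In line with the decoding characteristics of the stochastic segment model, tone model probability scores are computed for the paths that survive two levels of pruning, weighted, and added to each path's total probability score. Experiments on the “863-test” set show that the new articulatory feature set raises tone model recognition accuracy by 3.11%, and that adding tone information lowers the character error rate of the stochastic segment model from 13.67% to 12.74%. This shows that applying tone information to the stochastic segment model is feasible.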

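Key words (Chinese version): speech recognition, stochastic segment model, tone modeling, articulatory feature, hierarchical multilayer perceptron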
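
The abstract describes computing a tone model score for each path that survives pruning, weighting it, and adding it to the path's total score. The Python sketch below illustrates that kind of weighted score fusion in the log domain. The `Path` class, the `rescore_with_tone` function, the `tone_weight` parameter, and all numeric values are hypothetical; the abstract does not give the weight or the score scale used in the actual system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    """A decoding hypothesis that survived pruning (hypothetical structure)."""
    label: str
    ssm_log_score: float   # acoustic/language score from the segment model
    tone_log_score: float  # log-probability assigned by the tone model


def rescore_with_tone(paths: List[Path], tone_weight: float) -> List[Path]:
    """Rank paths by segment-model score plus weighted tone score.

    Only the surviving paths are rescored, so the tone model is evaluated
    on far fewer hypotheses than the full search space.
    """
    return sorted(
        paths,
        key=lambda p: p.ssm_log_score + tone_weight * p.tone_log_score,
        reverse=True,  # higher log score is better
    )


# Made-up example: path "b" has a slightly worse segment-model score
# but a much better tone score.
survivors = [Path("a", -10.0, -5.0), Path("b", -10.5, -1.0)]
best = rescore_with_tone(survivors, tone_weight=0.5)[0]
print(best.label)  # b
```

With `tone_weight=0.0` the ranking falls back to the segment-model score alone, which makes the weight a convenient knob for tuning on held-out data.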