Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (8): 178-181.DOI: 10.3778/j.issn.1002-8331.2009.08.054

• Graphics, Images, Pattern Recognition •

Novel Articulatory Feature based Dynamic Bayesian Network model for speech recognition

WANG Feng-na, JIANG Dong-mei, SONG Pei-yan

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
  • Received: 2008-10-06 Revised: 2008-12-23 Online: 2009-03-11 Published: 2009-03-11
  • Contact: WANG Feng-na

A Dynamic Bayesian Network Speech Recognition Model Incorporating Articulatory Features

王风娜, 蒋冬梅, 宋培岩

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710129
  • Corresponding author: 王风娜

Abstract: This paper presents a new articulatory-feature-based asynchronous Dynamic Bayesian Network model (AWA-DBN), in which the dynamic pronunciation of a word is described by the movement of its articulatory features. In word recognition experiments on Aurora5.0, AWA-DBN is compared with the WS-DBN model (in which a word is composed of a fixed number of states) and the WP-DBN model (in which a word is composed of its phones). The results show that although the WS-DBN model achieves the highest recognition rates, it is suitable only for small-vocabulary isolated word recognition. Both AWA-DBN and WP-DBN can be applied to large-vocabulary continuous speech recognition, and the AWA-DBN model achieves higher recognition rates and is more robust than the WP-DBN model.

Key words: Articulatory Feature (AF), Dynamic Bayesian Network (DBN), speech recognition

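Abstract (translated from Chinese): A new asynchronous whole-word articulatory-feature speech recognition model based on the Dynamic Bayesian Network (DBN), AWA-DBN, is constructed, in which each word is described by the movement of its articulatory features, and the conditional probability distributions of the articulatory feature nodes and the asynchrony check nodes are defined. Speech recognition experiments on the standard digit speech corpus Aurora5.0 compare it with the whole-word-state DBN (WS-DBN, in which each word consists of a fixed number of whole-word states) and the whole-word-phone DBN (WP-DBN, in which each word consists of its corresponding phone sequence). Although WS-DBN has the highest recognition rate, it is suitable only for small-vocabulary isolated word recognition; AWA-DBN and WP-DBN can model large-vocabulary continuous speech, and AWA-DBN achieves a higher recognition rate and greater system robustness than WP-DBN.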

Key words (translated from Chinese): articulatory feature, Dynamic Bayesian Network, speech recognition