Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (21): 141-144.

• Database, Signal and Information Processing •

Dictionary adaptation techniques for non-native Mandarin speech recognition in Xinjiang

LI Binghu, HUANG Hao

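  1. Lab of Multi-lingual Information Technology, School of Information Science & Engineering, Xinjiang University, Urumqi 830046
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-07-21 Published: 2011-07-21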

Application of pronunciation dictionary adaptation for non-native Mandarin speech recognition in Xinjiang

LI Binghu, HUANG Hao

  1. Lab of Multi-lingual Information Technology, School of Information Science & Engineering, Xinjiang University, Urumqi 830046, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-07-21 Published:2011-07-21

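Abstract: When acoustic models trained on standard Putonghua speech data are applied to non-native Mandarin speech recognition for Uighur speakers in Xinjiang, the recognition rate drops sharply because the speakers' Putonghua pronunciation deviates considerably from the standard. To address this problem, a multi-pronunciation dictionary technique is applied to Mandarin speech recognition for Uighur speakers in Xinjiang: the recognizer's errors are analyzed statistically to build a phoneme confusion matrix, from which pronunciation candidates for each phoneme are obtained. A pruning strategy is then used to prune and consolidate these candidates, yielding an expanded alternative dictionary that matches the Mandarin pronunciation patterns of Uighur speakers. The recognition results of the pronunciation dictionaries produced by three pruning methods are compared. Experimental results show that the dictionary produced by the relative maximum pruning strategy significantly improves the system's recognition rate.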

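Key words: pronunciation dictionary, phoneme confusion matrix, pruning strategy, Uighur speakers in Xinjiang, non-native Mandarin speech recognition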

Abstract: When acoustic models trained on a standard Mandarin speech database are applied to Putonghua spoken by Uighur speakers in Xinjiang, recognition accuracy drops sharply because the speakers' pronunciation deviates significantly from the standard. To address this problem, a multi-pronunciation dictionary technique is adopted to improve non-native speech recognition. Recognition errors are analyzed statistically to build phoneme confusion matrices, from which pronunciation candidates are derived. Three pruning schemes are evaluated for removing unhelpful pronunciation alternatives. The remaining candidates are used to expand the pronunciation dictionary for non-native speech recognition. Experiments on continuous speech recognition show that the resulting pronunciation dictionary significantly improves recognition accuracy.

Key words: pronunciation dictionary, phoneme confusion matrix, pruning strategy, Uighur speakers in Xinjiang, non-native Mandarin speech recognition
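
The pipeline summarized in the abstract (confusion matrix, candidate pruning, dictionary expansion) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the three pruning schemes, so the "relative maximum" criterion here (keep a candidate if its count is at least a fixed fraction of the largest count in the same row) is an assumption, and the function names `build_confusion`, `prune_relative_max` and `expand_dictionary`, the `ratio` and `max_variants` parameters, and the toy phoneme data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product


def build_confusion(aligned_pairs):
    """Count how often each reference phoneme was recognized as each phoneme."""
    confusion = defaultdict(Counter)
    for ref, hyp in aligned_pairs:
        confusion[ref][hyp] += 1
    return confusion


def prune_relative_max(confusion, ratio=0.5):
    """Assumed criterion: keep a candidate if its count is at least
    `ratio` times the largest count in the same confusion-matrix row."""
    candidates = {}
    for ref, row in confusion.items():
        top = max(row.values())
        kept = {hyp for hyp, n in row.items() if n >= ratio * top}
        kept.add(ref)  # always retain the canonical phoneme
        candidates[ref] = sorted(kept)
    return candidates


def expand_dictionary(lexicon, candidates, max_variants=8):
    """Add pronunciation variants by substituting surviving candidates."""
    expanded = {}
    for word, phones in lexicon.items():
        options = [candidates.get(p, [p]) for p in phones]
        variants = [list(phones)]  # canonical pronunciation comes first
        for combo in product(*options):
            combo = list(combo)
            if combo not in variants:
                variants.append(combo)
            if len(variants) >= max_variants:
                break
        expanded[word] = variants
    return expanded


# Toy (reference, recognized) phoneme pairs, invented purely for illustration.
pairs = (
    [("zh", "zh")] * 6 + [("zh", "z")] * 4 + [("zh", "j")]
    + [("ang", "ang")] * 9 + [("ang", "an")] * 2
)
confusion = build_confusion(pairs)
candidates = prune_relative_max(confusion, ratio=0.5)
lexicon = {"zhang": ["zh", "ang"]}
print(expand_dictionary(lexicon, candidates))
# {'zhang': [['zh', 'ang'], ['z', 'ang']]}
```

In this toy run, `z` survives pruning for `zh` (4 of a maximum 6) while `j` and `an` are discarded, so the entry gains one alternative pronunciation. Capping the number of variants per word is a design choice of this sketch to keep the expanded dictionary from growing combinatorially.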