Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (33): 48-50.

• Academic Discussion •

Stress detection in English sentences with limited training data

LAI Min1, CHEN Yi-ning2, CHU Min2, HU Fang-yu1

  1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China
  2. Microsoft Research Asia, Beijing 100000, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-11-21 Published: 2007-11-21
  • Contact: LAI Min

Abstract: Labeling stressed syllables by hand costs considerable time and labor, especially when the speech database is very large, so an efficient and reliable automatic prosody labeler is highly desirable. This paper studies how to make the best use of a small amount of manually labeled data when training an automatic stress detector for English speech utterances, and compares four training methods. The detector combines a linguistic classifier and an acoustic classifier with a combining AdaBoost classifier that improves accuracy in a final optimization stage by using more features and manual labels. In the experiments, the best labeling accuracy was obtained by training the acoustic classifier on machine-generated data and reserving the limited manual labels for the combining classifier. The resulting accuracy is 94.0%, which is close to the upper bound on performance of 97.2%, the self-agreement ratio of the same annotator.

Key words: automatic stress detection, automatic prosody labeling, automatic speech recognition