Computer Engineering and Applications, 2011, Vol. 47, Issue (3): 135-138. DOI: 10.3778/j.issn.1002-8331.2011.03.041

• Database, Signal and Information Processing •

Coding distortion compensation of speaker identification based on model synthesis

MA Miaomiao, HE Yongjun, HAN Jiqing
  

  1. College of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • Received: 2009-05-13 Revised: 2009-07-12 Online: 2011-01-21 Published: 2011-01-21
  • Contact: MA Miaomiao

Research on coding distortion compensation by model synthesis in speaker recognition

马苗苗,何勇军,韩纪庆   

  1. College of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
  • Corresponding author: 马苗苗

Abstract: Mismatch between enrollment and test conditions caused by different speech coding schemes is one of the main causes of performance degradation in speaker recognition. Experiments on speech in several coding formats, with bit rates ranging from 5.15 Kb/s to 128 Kb/s, show that high-bit-rate coding introduces little distortion, whereas low-bit-rate coding sharply reduces the recognition rate. To address this problem, speaker model synthesis based on a universal background model (UBM) is used to synthesize speaker models for the target coding environments, compensating for the distortion caused by low-bit-rate coding. Experiments on the NIST 2002 corpus in the one-speaker detection task show that the proposed approach outperforms the system without compensation.
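The abstract does not give the synthesis formulas, so the following is only a minimal sketch of how UBM-based speaker model synthesis is commonly described, not the authors' implementation. It assumes diagonal-covariance GMMs, a source-coding UBM and a target-coding UBM whose components correspond one to one (for example, both adapted from a common root UBM), and a speaker model adapted from the source UBM. The names `DiagGMM` and `synthesize_speaker_model`, the component-wise shift and scale rules, and the weight renormalization are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiagGMM:
    """Diagonal-covariance GMM (hypothetical container for this sketch)."""
    weights: List[float]            # one weight per component
    means: List[List[float]]        # one mean vector per component
    variances: List[List[float]]    # one variance vector per component


def synthesize_speaker_model(speaker_src: DiagGMM,
                             ubm_src: DiagGMM,
                             ubm_tgt: DiagGMM) -> DiagGMM:
    """Map a speaker model from the source coding condition to the target one.

    Assumed rule, per component k:
      mean:     shift by the difference between target and source UBM means
      variance: scale by the ratio of target to source UBM variances
      weight:   scale by the ratio of target to source UBM weights, renormalize
    """
    weights, means, variances = [], [], []
    for k in range(len(speaker_src.weights)):
        weights.append(
            speaker_src.weights[k] * ubm_tgt.weights[k] / ubm_src.weights[k])
        means.append([
            m + (mt - ms)
            for m, ms, mt in zip(speaker_src.means[k],
                                 ubm_src.means[k],
                                 ubm_tgt.means[k])
        ])
        variances.append([
            v * vt / vs
            for v, vs, vt in zip(speaker_src.variances[k],
                                 ubm_src.variances[k],
                                 ubm_tgt.variances[k])
        ])
    total = sum(weights)
    weights = [w / total for w in weights]  # keep weights summing to 1
    return DiagGMM(weights, means, variances)


# Toy example: two components, one feature dimension (made-up numbers).
ubm_high = DiagGMM([0.5, 0.5], [[0.0], [2.0]], [[1.0], [2.0]])  # source coding
ubm_low = DiagGMM([0.5, 0.5], [[0.5], [2.5]], [[2.0], [1.0]])   # target coding
speaker_high = DiagGMM([0.5, 0.5], [[1.0], [3.0]], [[2.0], [4.0]])

speaker_low = synthesize_speaker_model(speaker_high, ubm_high, ubm_low)
```

In this sketch the synthesized model `speaker_low` would be scored against test speech in the low-bit-rate condition, so that enrollment and test models share the same coding environment.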

Key words: speech coding, speaker identification, low-bit rate, model synthesis

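Abstract (Chinese version): Mismatch between coding environments is one of the major factors affecting speaker recognition accuracy. Speech coding at bit rates from 5.15 to 128 Kb/s was analyzed experimentally on a speaker recognition system. The results show that high-rate speech coding has little effect on the system, whereas low-rate speech coding causes system performance to drop sharply. To address this problem, a UBM-based speaker model synthesis algorithm is used to compensate the speaker models for low-rate speech coding. Experiments on the NIST 2002 one-speaker recognition database show that the method significantly improves the recognition rate.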

Key words (Chinese version): speech coding, speaker recognition, low bit rate, model synthesis
