Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (11): 158-163.

• Pattern Recognition and Artificial Intelligence •

Weighted pairwise constraint metric learning in speaker recognition

LUO Jian, YANG Yingen, LEI Zhenchun

  1. School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
  • Online: 2016-06-01 Published: 2016-06-14

Abstract: I-vector speaker recognition systems usually measure the similarity between speaker utterances with a distance. The Weighted Pairwise Constraint Metric Learning (WPCML) algorithm uses weighted constraints on pairs of training samples to learn a metric matrix for computing the Mahalanobis distance. In the sample space described by this metric, samples of the same class lie closer together and samples of different classes lie farther apart. Experiments on the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation data set (SRE08) show that Mahalanobis distance scoring with the metric learned by WPCML outperforms cosine distance scoring. The paper also proposes selecting the training pairs for metric learning according to their Euclidean distance, which further improves performance and outperforms the state-of-the-art PLDA classifier.

Key words: speaker recognition, Mahalanobis distance, distance metric learning, machine learning, pattern recognition
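
The two scoring rules compared in the abstract can be illustrated with a minimal sketch. It assumes i-vectors are plain NumPy arrays and that the Mahalanobis distance takes the usual form sqrt((x - y)^T M (x - y)) for a positive semidefinite metric matrix M. The function names, the toy dimensionality, and the random matrix `M` are hypothetical stand-ins for illustration; `M` is not a metric trained by WPCML, and the WPCML training procedure itself is not shown.

```python
import numpy as np


def cosine_score(x, y):
    """Cosine similarity between two vectors (higher means more similar)."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def mahalanobis_distance(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    d = x - y
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(d @ M @ d, 0.0)))


# Toy data (assumption): 4-dimensional random vectors instead of real i-vectors.
rng = np.random.default_rng(0)
dim = 4
L = rng.standard_normal((dim, dim))
M = L.T @ L  # positive semidefinite by construction; stand-in for a learned metric
x = rng.standard_normal(dim)
y = rng.standard_normal(dim)

dist = mahalanobis_distance(x, y, M)                     # distance under M
dist_identity = mahalanobis_distance(x, y, np.eye(dim))  # M = I gives Euclidean distance
score = cosine_score(x, y)                               # baseline cosine score
```

Because `M = L.T @ L`, the Mahalanobis distance equals the Euclidean distance after mapping both vectors through `L`. In that sense a learned metric reshapes the sample space so that same-speaker pairs can end up closer and different-speaker pairs farther apart.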