Weighted pairwise constraint metric learning in speaker recognition

Computer Engineering and Applications, 2016, Vol. 52, Issue (11): 158-163.

Weighted pairwise constraint metric learning in speaker recognition

LUO Jian, YANG Yingen, LEI Zhenchun

School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China

Online: 2016-06-01  Published: 2016-06-14

Abstract

Abstract: I-vector speaker recognition systems usually measure the similarity between speaker utterances with a distance. The Weighted Pairwise Constraint Metric Learning (WPCML) algorithm uses weighted constraints on pairs of training samples to learn a metric matrix for computing the Mahalanobis distance. In the sample space described by this metric, samples of the same class lie closer together and samples of different classes lie farther apart. Experiments on the NIST 2008 Speaker Recognition Evaluation (SRE08) data set show that Mahalanobis distance scoring with the metric learned by WPCML performs better than cosine distance scoring. The paper also proposes selecting training sample pairs according to their Euclidean distance when constructing the metric learning training set; this further improves performance, and the resulting system outperforms the state-of-the-art PLDA classifier.

Key words: speaker recognition, Mahalanobis distance, distance metric learning, machine learning, pattern recognition
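
The two scoring schemes compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 3-dimensional vectors, and the hand-picked metric matrix M are assumptions for illustration only, and the squared form of the Mahalanobis distance is used for simplicity. In WPCML the matrix M would instead be learned from weighted pairwise constraints.

```python
import numpy as np

def cosine_score(x, y):
    """Cosine similarity between two vectors (higher means more similar)."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) under metric matrix M."""
    d = x - y
    return float(d @ M @ d)

# Toy vectors standing in for i-vectors (assumption: real i-vectors
# typically have hundreds of dimensions).
x = np.array([1.0, 0.0, 2.0])
y = np.array([0.0, 1.0, 2.0])

# Hypothetical metric matrix. Any symmetric positive semidefinite M is a
# valid metric; building it as A^T A guarantees that property.
A = np.diag([2.0, 1.0, 0.5])
M = A.T @ A

d_identity = mahalanobis_sq(x, y, np.eye(3))  # identity metric: squared Euclidean distance
d_learned = mahalanobis_sq(x, y, M)           # same pair under the hypothetical metric
cos = cosine_score(x, y)

print(d_identity, d_learned, cos)
```

With the identity matrix the Mahalanobis distance reduces to the squared Euclidean distance. A learned M rescales and rotates the space, which is how a metric can pull same-speaker pairs together and push different-speaker pairs apart.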

LUO Jian, YANG Yingen, LEI Zhenchun. Weighted pairwise constraint metric learning in speaker recognition[J]. Computer Engineering and Applications, 2016, 52(11): 158-163.
