Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (6): 151-155.

• Database, Signal and Information Processing •

Application of a probabilistic sequence kernel to speaker recognition

雷震春   

  1. College of Computer Science, Jiangxi Normal University, Nanchang 330022
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-02-21 Published: 2011-02-21

Probabilistic sequence kernel for speaker recognition

LEI Zhenchun   

  1. Department of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-02-21 Published: 2011-02-21

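Abstract: Starting from the background model used in speaker recognition, a speaker feature space is constructed from the individual Gaussian components of the model. Utterances of different lengths are mapped to vectors of the same size in this space, normalized with a correlation matrix, and then classified with a linear support vector machine. Drawing on several common feature normalization schemes, four normalization methods are proposed for the mapped vectors: mean/variance normalization, weight scaling, WLOG scaling and spherical normalization. These are compared against the probabilistic sequence kernel. Using the transitions between adjacent feature vectors in the speech feature sequence, together with the proposed probabilistic sequence kernel, a transfer probabilistic sequence kernel is constructed. Experiments on the NIST 2001 corpus show that the probabilistic sequence kernel model performs close to the classical UBM-MAP model. Fusing the scores of the two models improves recognition performance markedly, and further fusing in the transfer probabilistic sequence kernel improves performance by another 19.1%.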

Key words: speaker recognition, probabilistic sequence kernel, universal background model, support vector machine

Abstract: This paper proposes a probabilistic sequence kernel based on the universal background model. The Gaussian components are used to construct the speaker feature space, and utterances of different lengths are mapped into fixed-size vectors, which are then normalized with a correlation matrix. Four feature normalization methods are proposed for the mapped vectors: mean/variance normalization, weight scaling, WLOG scaling and spherical normalization. Finally, the normalized vectors are input to a linear support vector machine for speaker recognition. A transfer probabilistic sequence kernel is also proposed, which exploits the transition information between neighboring frames. Experiments on NIST 2001 show that the performance of the probabilistic sequence kernel is close to that of the traditional UBM-MAP model, and that performance improves clearly after linear fusion of the models.

Key words: speaker recognition, probabilistic sequence kernel, universal background model, support vector machine
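
The mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so this example assumes that an utterance is mapped to the average posterior probability of each UBM Gaussian over its frames, and that the correlation matrix is the mean outer product of the mapped vectors. The function names (`component_posteriors`, `map_utterance`, `whiten_with_correlation`), the diagonal covariances, the `ridge` term and the random toy data are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def component_posteriors(frames, weights, means, variances):
    """Posterior of each diagonal-covariance Gaussian for every frame.
    frames: (T, D); weights: (M,); means, variances: (M, D). Returns (T, M)."""
    diff = frames[:, None, :] - means[None, :, :]                    # (T, M, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_gauss                          # (T, M)
    log_joint -= log_joint.max(axis=1, keepdims=True)                # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def map_utterance(frames, weights, means, variances):
    """Assumed mapping: average the frame posteriors, giving one M-dim vector
    regardless of how many frames the utterance has."""
    return component_posteriors(frames, weights, means, variances).mean(axis=0)

def whiten_with_correlation(mapped, ridge=1e-6):
    """Assumed normalization: with R the mean outer product of the mapped
    vectors (rows of `mapped`), return vectors whose plain dot product equals
    phi_x^T R^{-1} phi_y. The ridge term is added only to keep R invertible."""
    n, m = mapped.shape
    R = mapped.T @ mapped / n + ridge * np.eye(m)
    L = np.linalg.cholesky(R)                                        # R = L L^T
    return np.linalg.solve(L, mapped.T).T

# Toy data standing in for a trained UBM and real feature frames.
rng = np.random.default_rng(0)
M, D = 4, 3
weights = np.full(M, 1.0 / M)
means = rng.normal(size=(M, D))
variances = np.ones((M, D))

lengths = rng.integers(5, 60, size=20)                               # utterances differ in length
utterances = [rng.normal(size=(int(t), D)) for t in lengths]
mapped = np.stack([map_utterance(u, weights, means, variances) for u in utterances])
whitened = whiten_with_correlation(mapped)
```

Each row of `whitened` has the same dimension M, so the rows could be passed directly to any linear SVM trainer. The four normalization variants and the transfer kernel named in the abstract are not sketched here because the abstract does not define them.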