Probabilistic sequence kernel for speaker recognition

Computer Engineering and Applications, 2011, Vol. 47, Issue 6: 151-155.

Section: Database, Signal and Information Processing


LEI Zhenchun

Department of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China

Received:1900-01-01 Revised:1900-01-01 Online:2011-02-21 Published:2011-02-21

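Chinese title (translated): Application of a probabilistic sequence kernel in speaker recognition

LEI Zhenchun

School of Computer Science, Jiangxi Normal University, Nanchang 330022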


Abstract: This paper proposes a probabilistic sequence kernel based on the universal background model (UBM). The Gaussian components of the UBM are used to construct a speaker feature space, and utterances of different lengths are mapped to fixed-size vectors, which are then normalized with a correlation matrix. Four feature normalization methods are proposed for the mapped vectors: mean/variance normalization, weight scaling, WLOG scaling, and spherical normalization. Finally, the normalized vectors are fed to a linear support vector machine for speaker recognition. A transfer probabilistic sequence kernel is also proposed, which exploits the transition information between neighboring frames. Experiments on NIST 2001 show that the probabilistic sequence kernel performs comparably to the traditional UBM-MAP model, and that performance improves clearly after linear fusion of the models.
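The mapping described in the abstract can be illustrated with a minimal sketch. It assumes that each frame is represented by its posterior probabilities over the UBM components, that an utterance is mapped to the average of those posteriors, and that the correlation matrix is approximated by its diagonal. None of these details are stated in the abstract, and the toy UBM, the frame values, and all function names are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def map_utterance(frames, weights, means, variances):
    """Average per-frame posteriors: one fixed-size vector per utterance."""
    acc = [0.0] * len(weights)
    for x in frames:
        for j, p in enumerate(posteriors(x, weights, means, variances)):
            acc[j] += p
    return [a / len(frames) for a in acc]

def sequence_kernel(u, v, inv_corr_diag):
    """Inner product scaled by the inverse of an assumed diagonal correlation matrix."""
    return sum(a * r * b for a, r, b in zip(u, inv_corr_diag, v))

# Toy two-component UBM over one-dimensional features (hypothetical values).
weights = [0.5, 0.5]
means = [[-1.0], [1.0]]
variances = [[1.0], [1.0]]

utt_a = [[-1.2], [-0.8], [-1.0]]  # three frames
utt_b = [[0.9], [1.1]]            # two frames

vec_a = map_utterance(utt_a, weights, means, variances)
vec_b = map_utterance(utt_b, weights, means, variances)
identity = [1.0, 1.0]  # stand-in for the inverse correlation diagonal
print(len(vec_a), len(vec_b))
print(sequence_kernel(vec_a, vec_a, identity) > sequence_kernel(vec_a, vec_b, identity))
```

Both utterances yield vectors with one entry per UBM component regardless of their length, which is what allows a linear support vector machine to be trained on them directly.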

Key words: speaker recognition, probabilistic sequence kernel, universal background model, support vector machine

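Abstract (translated from the Chinese): Taking the background model used in speaker recognition as a basis, a speaker feature space is constructed from the Gaussian components of the model. Utterances of different lengths are mapped to vectors of equal size in this space and, after normalization with a correlation matrix, are classified by a linear support vector machine for speaker recognition. Drawing on several common feature normalization approaches and applying them to the mapped vectors, four normalization methods are proposed: mean/variance normalization, weight scaling, WLOG scaling, and spherical normalization. These are compared with the probabilistic sequence kernel. Based on the transition relationship between adjacent feature vectors in the speech feature sequence, and combined with the proposed probabilistic sequence kernel, a transfer probabilistic sequence kernel is constructed. Experiments on the NIST 2001 corpus show that the recognition performance of the probabilistic sequence kernel model is close to that of the classical UBM-MAP model. Fusing the scores of these two models improves recognition performance very markedly, and further fusing the transfer probabilistic sequence kernel improves performance by another 19.1%.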
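The transition idea and the score fusion mentioned in the abstract can be sketched as follows. The sketch assumes that transitions between adjacent frames are summarized by averaging the outer product of their component posteriors, and that fusion is a weighted sum of per-model scores. Both are illustrative assumptions rather than the paper's stated formulas, and the function names, posterior values, and fusion weights are hypothetical.

```python
def map_transitions(post_seq):
    """Average the outer product of posteriors of adjacent frames.

    post_seq holds one posterior vector (length M) per frame. The result
    is a flattened M*M vector whose entry i*M + j estimates how often
    component i is followed by component j.
    """
    m = len(post_seq[0])
    acc = [0.0] * (m * m)
    for prev, cur in zip(post_seq, post_seq[1:]):
        for i, p in enumerate(prev):
            for j, q in enumerate(cur):
                acc[i * m + j] += p * q
    pairs = len(post_seq) - 1
    return [a / pairs for a in acc]

def fuse_scores(scores, fusion_weights):
    """Linear fusion: weighted sum of the scores from several models."""
    return sum(w * s for w, s in zip(fusion_weights, scores))

# Three frames with hard assignments to components 0, 1, 1 (hypothetical).
post_seq = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
trans_vec = map_transitions(post_seq)
print(trans_vec)

# Equal-weight fusion of two hypothetical model scores.
fused = fuse_scores([1.0, 2.0], [0.5, 0.5])
print(fused)
```

The transition vector has a fixed size of M*M for any utterance length, so it can be fed to a linear classifier in the same way as the per-component vector.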

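Key words (translated from the Chinese): speaker recognition, probabilistic sequence kernel, universal background model, support vector machine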

LEI Zhenchun. Probabilistic sequence kernel for speaker recognition[J]. Computer Engineering and Applications, 2011, 47(6): 151-155.

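LEI Zhenchun. Application of a probabilistic sequence kernel in speaker recognition (translated Chinese title)[J]. Computer Engineering and Applications, 2011, 47(6): 151-155.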

