Vector quantization technology based on clustering features in speaker recognition

Computer Engineering and Applications, 2007, Vol. 43, Issue 27: 196-198.

Section: Engineering and Applications

XU Li-min, TANG Zhen-min, HE Ke-ke, QIAN Bo

School of Computer, Nanjing University of Science and Technology, Nanjing 210094, China

Online: 2007-09-21 Published: 2007-09-21
Contact: XU Li-min

Abstract: To address the distortion problem in speaker recognition with vector quantization (VQ), this paper proposes applying speaker features obtained by speech clustering to VQ. Before codebook training, each speaker's training samples are clustered and filtered. Experiments show that the codebook size can be reduced from 64 with plain VQ to 8 with VQ based on clustering features. The results indicate that the approach alleviates, to some extent, the distortion caused by insufficient training samples, and that it achieves better recognition results with a smaller codebook. In other words, it improves the efficiency of speaker recognition.

Key words: speaker recognition, vector quantization, clustering features, MFCC
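For readers unfamiliar with the term, the distortion in VQ-based speaker identification is usually formalized as below. This is the textbook formulation, assumed here for illustration rather than taken from the paper; the symbols X, T, K, C_s and c_{s,k} are introduced only for this sketch, with K standing for the codebook size (64 or 8 in the abstract).

```latex
% Average quantization distortion of a test utterance X = {x_1, ..., x_T}
% against the codebook C_s = {c_{s,1}, ..., c_{s,K}} of speaker s
D_s(X) = \frac{1}{T} \sum_{t=1}^{T} \min_{1 \le k \le K} \lVert x_t - c_{s,k} \rVert^2

% Identification: choose the speaker whose codebook yields the least distortion
\hat{s} = \arg\min_{s} D_s(X)
```

Under this reading, a smaller K lowers the per-frame matching cost roughly in proportion, which is one way to interpret the efficiency gain the abstract reports.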

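Abstract (translated from the Chinese): To address the distortion that arises when vector quantization is used for speaker recognition, a method is proposed that, based on the pronunciation characteristics of Chinese speech, combines vector quantization with clustering of speech features: before the VQ codebook is trained, the feature vectors are clustered and filtered. Experimental results show that, for test speech segments of 4 s and a recognition rate held at about 95%, ordinary vector quantization requires a codebook size of 64, whereas the proposed method requires only 8, an eightfold reduction. The results indicate that the method not only alleviates, to some extent, the distortion caused by insufficient training samples, but also achieves good recognition results with fewer codewords, thereby improving recognition efficiency.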

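Keywords (translated from the Chinese): speaker recognition, vector quantization, clustering features, Mel-frequency cepstral coefficients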
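The pipeline outlined in the abstract (cluster and filter the feature vectors, then train a small VQ codebook, then recognize by distortion) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the filtering criterion, so the sketch simply keeps the vectors closest to their cluster centroid, and plain k-means stands in for codebook training. The names `kmeans`, `cluster_filter`, `train_codebook`, `avg_distortion`, `identify` and the `keep` fraction are all hypothetical, and the inputs are assumed to be arrays of per-frame features such as MFCCs.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; returns (centroids, labels). Stand-in for codebook training."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centroids never touches x.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # Recompute labels so they match the final centroids.
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)

def cluster_filter(x, n_clusters=8, keep=0.8):
    """ASSUMED filtering rule: within each cluster, keep the `keep` fraction
    of vectors nearest the centroid and drop the rest."""
    centroids, labels = kmeans(x, n_clusters)
    dist = np.linalg.norm(x - centroids[labels], axis=1)
    mask = np.zeros(len(x), dtype=bool)
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        n_keep = max(1, int(np.ceil(keep * len(idx))))
        mask[idx[np.argsort(dist[idx])[:n_keep]]] = True
    return x[mask]

def train_codebook(x, size=8):
    """Filter the training vectors first, then train a codebook of `size` codewords."""
    codebook, _ = kmeans(cluster_filter(x), size, seed=1)
    return codebook

def avg_distortion(x, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def identify(x, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda s: avg_distortion(x, codebooks[s]))
```

A typical use would be to call `train_codebook` once per enrolled speaker, store the results in a dict keyed by speaker name, and pass the frames of a test segment to `identify`. The default codebook size of 8 mirrors the figure reported in the abstract; whether it reaches a comparable recognition rate depends on the features and on the actual filtering rule, neither of which this sketch reproduces.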

XU Li-min, TANG Zhen-min, HE Ke-ke, QIAN Bo. Vector quantization technology based on clustering features in speaker recognition[J]. Computer Engineering and Applications, 2007, 43(27): 196-198.
