Computer Engineering and Applications, 2007, Vol. 43, Issue (27): 196-198.

• Engineering and Applications •

Vector quantization technology based on clustering features in speaker recognition

XU Li-min (徐利敏), TANG Zhen-min (唐振民), HE Ke-ke (何可可), QIAN Bo (钱博)

  1. School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 1900-01-01  Revised: 1900-01-01  Online: 2007-09-21  Published: 2007-09-21
  • Contact: XU Li-min

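Abstract (Chinese version): To address the distortion that arises when vector quantization (VQ) is used for speaker recognition, this paper proposes a method, motivated by the pronunciation characteristics of Chinese speech, that combines VQ with clustering of speech features: before the VQ codebook is trained, the feature vectors are clustered and screened. Experiments show that, for test speech segments 4 s long and a recognition rate held at about 95%, conventional VQ needs a codebook size of 64, whereas the proposed method needs only 8, a reduction by a factor of 8. The results indicate that the method not only alleviates, to some extent, the distortion caused by insufficient training samples, but also achieves good recognition results with fewer codewords, thereby improving recognition efficiency.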

Abstract: To reduce the distortion that arises when vector quantization (VQ) is used for speaker recognition, we propose applying speaker features obtained by speech clustering to VQ. Before codebook training, each speaker's training samples are clustered and filtered. Experiments show that the codebook size can be reduced from 64 with plain VQ to 8 with VQ based on clustering features. The results show that, on the one hand, the approach alleviates to some extent the distortion caused by a lack of training samples and, on the other hand, it achieves better recognition results with a smaller codebook. In other words, the efficiency of speaker recognition is increased.

Key words: speaker recognition, vector quantization, clustering features, Mel-frequency cepstral coefficients (MFCC)
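
The pipeline summarized in the abstract (screen the feature vectors by clustering, train a small VQ codebook per speaker, then pick the speaker whose codebook gives the lowest average distortion) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the screening works, so `screen_frames` uses a hypothetical rule: it keeps the frames closest to their cluster centroid. Plain k-means stands in for whatever codebook training the paper uses, and all function names and parameters (`n_clusters`, `keep`, `size`) are invented for the example. Feature frames are assumed to be MFCC-like vectors stored as rows of a NumPy array.

```python
import numpy as np


def nearest(x, codebook):
    """Return (index of nearest codeword, squared distance to it) per frame."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), d.min(axis=1)


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means, used here as a stand-in for codebook training."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        labels, _ = nearest(x, centroids)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    labels, _ = nearest(x, centroids)
    return centroids, labels


def screen_frames(x, n_clusters=4, keep=0.8, seed=0):
    """Hypothetical screening rule (an assumption, not from the paper):
    cluster the frames and keep, per cluster, the fraction `keep` of
    frames that lie closest to the cluster centroid."""
    centroids, labels = kmeans(x, n_clusters, seed=seed)
    dist = ((x - centroids[labels]) ** 2).sum(axis=1)
    mask = np.zeros(len(x), dtype=bool)
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        n_keep = max(1, int(round(keep * len(idx))))
        mask[idx[np.argsort(dist[idx])[:n_keep]]] = True
    return x[mask]


def train_codebook(x, size=8, seed=0):
    """Screen the training frames, then fit a codebook of `size` codewords."""
    codebook, _ = kmeans(screen_frames(x, seed=seed), size, seed=seed)
    return codebook


def avg_distortion(x, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    return float(nearest(x, codebook)[1].mean())


def identify(x, codebooks):
    """Return the speaker whose codebook quantizes `x` with least distortion."""
    return min(codebooks, key=lambda name: avg_distortion(x, codebooks[name]))


if __name__ == "__main__":
    # Synthetic 12-dimensional frames standing in for real MFCC vectors.
    rng = np.random.default_rng(1)
    books = {
        "A": train_codebook(rng.normal(0.0, 1.0, size=(400, 12))),
        "B": train_codebook(rng.normal(3.0, 1.0, size=(400, 12))),
    }
    print(identify(rng.normal(0.0, 1.0, size=(100, 12)), books))
```

The default codebook size of 8 mirrors the figure reported in the abstract. The synthetic Gaussian data only exercises the code path and says nothing about the recognition rates reported for real speech.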