Research on GMM text-independent speaker recognition

doi:10.3778/j.issn.1002-8331.2010.11.055

Computer Engineering and Applications, 2010, Vol. 46, Issue (11): 179-182. DOI: 10.3778/j.issn.1002-8331.2010.11.055

• Graphics, Images, and Pattern Recognition •

JIANG Ye, TANG Zhen-min

School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China

Received: 2008-10-07 Revised: 2009-01-06 Online: 2010-04-11 Published: 2010-04-11
Corresponding author: JIANG Ye

Abstract

Abstract: This paper improves on the traditional methods for initializing Gaussian Mixture Model (GMM) parameters during training (random initialization and K-means clustering) and proposes a new method that combines the splitting method with K-means clustering. Experiments show that, compared with the two traditional methods, the proposed method raises the system's average recognition rate by 15.47% and 7.5%, respectively. The effects of the GMM order, the covariance threshold, and the pre-emphasis coefficient on the recognition rate are also studied. The experimental results are analyzed in detail, and the best-performing value of each parameter is selected from the experimental data so that the resulting speaker recognition system achieves a high recognition rate. Experiments show that, under the specified experimental conditions, the system reaches a recognition rate above 90%.

Key words: speaker recognition, Gaussian Mixture Model (GMM), Mel Frequency Cepstral Coefficient (MFCC), combined splitting and K-means clustering
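The abstract names the ingredients of the method but does not give its details. Purely as an illustration, the sketch below assumes an LBG-style procedure: start from one global centroid, repeatedly split every centroid into a perturbed pair, and refine with K-means after each split; the final clusters then supply initial GMM weights, means, and diagonal variances, with a variance floor standing in for the covariance threshold. All function names, the perturbation size `eps`, the floor `var_floor`, the restriction to power-of-two orders, and the pre-emphasis default `coeff=0.97` are assumptions made for this example, not values or design choices taken from the paper.

```python
def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    coeff=0.97 is a common default, not the value selected in the paper.
    """
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]


def column_means(vectors):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def column_vars(vectors, centre):
    """Per-dimension mean squared deviation of the vectors from `centre`."""
    return [sum((x - m) ** 2 for x in col) / len(vectors)
            for col, m in zip(zip(*vectors), centre)]


def assign(data, centres):
    """Index of the nearest centre (squared Euclidean distance) per vector."""
    def nearest(v):
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centres]
        return dists.index(min(dists))
    return [nearest(v) for v in data]


def kmeans_refine(data, centres, n_iter=10):
    """Plain K-means: assign to the nearest centre, then re-average."""
    for _ in range(n_iter):
        labels = assign(data, centres)
        updated = []
        for k, old in enumerate(centres):
            members = [v for v, lab in zip(data, labels) if lab == k]
            # Keep the old centre if a cluster receives no vectors.
            updated.append(column_means(members) if members else old)
        centres = updated
    return centres


def split_kmeans_init(data, order, eps=0.01, var_floor=1e-3):
    """Hypothetical split + K-means initialisation of a diagonal GMM."""
    if order < 1 or order & (order - 1):
        raise ValueError("this sketch needs order to be a power of two")
    global_mean = column_means(data)
    # Perturbation step, scaled by the per-dimension standard deviation.
    delta = [eps * v ** 0.5 for v in column_vars(data, global_mean)]
    centres = [global_mean]
    while len(centres) < order:
        # Split every centre into a +delta / -delta pair, then refine.
        centres = ([[c + d for c, d in zip(ctr, delta)] for ctr in centres] +
                   [[c - d for c, d in zip(ctr, delta)] for ctr in centres])
        centres = kmeans_refine(data, centres)
    labels = assign(data, centres)
    weights, variances = [], []
    for k, ctr in enumerate(centres):
        members = [v for v, lab in zip(data, labels) if lab == k]
        weights.append(len(members) / len(data))
        var = column_vars(members, ctr) if members else [0.0] * len(ctr)
        # Variance floor: an assumed reading of the "covariance threshold".
        variances.append([max(x, var_floor) for x in var])
    return weights, centres, variances
```

A call such as `split_kmeans_init(features, order=16)` would return starting values that an EM training loop could then refine; here `features` stands for a list of MFCC vectors, which this sketch does not compute.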

CLC number: TP391.4

JIANG Ye, TANG Zhen-min. Research on GMM text-independent speaker recognition[J]. Computer Engineering and Applications, 2010, 46(11): 179-182.

