Research on GMM based speaker recognition technology

Computer Engineering and Applications, 2011, Vol. 47, Issue (11): 114-117.

• Database, Signal and Information Processing •

CAO Jie, PAN Peng

College of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-04-11 Published: 2011-04-11

Abstract

To investigate the role of the Gaussian Mixture Model (GMM) in speaker recognition, a GMM-based speaker recognition system is designed. The system consists of four modules: audio signal pre-processing, speech activity detection, speaker modeling, and audio signal recognition. The first three modules form the model training stage, and the last module forms the recognition stage. The novel element of the work is the speech activity detector in the second module, which is itself built with a GMM. Selected tunable parameters and the recognition error rate of the system are tested on audio-visual meeting recordings from the Augmented Multi-party Interaction (AMI) corpus. Simulations show that, with the help of the speech activity detector and several filtering algorithms, the system's recognition accuracy on audio containing overlapping speech can reach 83.02%.

Keywords: Gaussian Mixture Model (GMM), speech activity detection, recognition error rate
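
The abstract names the pipeline stages but gives no implementation detail, so the following is only a minimal sketch of how GMM scoring might serve both the speech activity detection and the recognition modules. It is not the authors' implementation. It assumes diagonal-covariance GMMs, a detector that compares one speech GMM against one non-speech GMM, and hand-set toy parameters in place of trained models. All function names, the 2-D feature vectors, and the speaker labels are hypothetical.

```python
import math


def gmm_log_likelihood(frame, gmm):
    """Log p(frame | gmm) for a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    component_logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x, mu, var in zip(frame, means, variances):
            log_p += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        component_logs.append(log_p)
    # Log-sum-exp over the components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(v - peak) for v in component_logs))


def detect_speech(frames, speech_gmm, nonspeech_gmm):
    """Keep the frames that the speech GMM explains better than the non-speech GMM."""
    return [
        f for f in frames
        if gmm_log_likelihood(f, speech_gmm) > gmm_log_likelihood(f, nonspeech_gmm)
    ]


def identify_speaker(frames, speaker_gmms):
    """Return (best speaker, all scores), scoring by total log-likelihood."""
    scores = {
        name: sum(gmm_log_likelihood(f, gmm) for f in frames)
        for name, gmm in speaker_gmms.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical, hand-set parameters standing in for trained models.
NONSPEECH_GMM = [(1.0, [-5.0, -5.0], [0.5, 0.5])]
SPEECH_GMM = [(0.5, [0.0, 0.0], [2.0, 2.0]), (0.5, [3.0, 3.0], [2.0, 2.0])]
SPEAKER_GMMS = {
    "spk_a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, -1.0], [1.0, 1.0])],
    "spk_b": [(0.5, [3.0, 3.0], [1.0, 1.0]), (0.5, [4.0, 2.0], [1.0, 1.0])],
}

# Toy 2-D feature frames: two lie near the non-speech model, three near spk_b.
frames = [[-5.1, -4.9], [2.9, 3.2], [3.8, 2.1], [-4.8, -5.2], [3.3, 2.7]]

speech_frames = detect_speech(frames, SPEECH_GMM, NONSPEECH_GMM)
winner, scores = identify_speaker(speech_frames, SPEAKER_GMMS)
print(len(speech_frames), winner)
```

In a real system the model parameters would be estimated from training audio (typically with the EM algorithm) over cepstral features. The frame-level speech decisions would probably also be smoothed, which may be the role of the filtering algorithms the abstract mentions.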

CAO Jie, PAN Peng. Research on GMM based speaker recognition technology[J]. Computer Engineering and Applications, 2011, 47(11): 114-117.
