Speaker recognition algorithm based on HHT cepstrum coefficient

Abstract

Linear prediction cepstral coefficients (LPCC) capture only the static characteristics of a speech signal and describe its low-frequency local characteristics poorly. To address this, a speaker recognition algorithm based on the Hilbert-Huang transform (HHT) cepstrum coefficient is proposed. The empirical mode decomposition (EMD) step of the HHT describes the low-frequency local characteristics of the signal better, and the Hilbert transform captures its dynamic characteristics, which remedies the deficiencies of LPCC. The speech signal is first decomposed into intrinsic mode function (IMF) components by EMD. The Hilbert transform is then applied to each component to obtain the Hilbert marginal spectrum. The logarithmic power spectrum of the total marginal spectrum is computed, and a discrete cosine transform (DCT) of it yields a 13-dimensional cepstrum coefficient. This feature is fed into a Gaussian mixture model (GMM) for speaker recognition. Simulation results show that, compared with LPCC, the HHT cepstrum coefficient increases the recognition rate by 12.59%, at the cost of an additional 19.27 s of feature extraction time.
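The feature extraction chain described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the IMFs have already been produced by some EMD routine, processes a single segment without framing, and the function name `hht_cepstrum`, the bin count `n_bins`, and the small constant added before the logarithm are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.signal import hilbert
from scipy.fft import dct


def hht_cepstrum(imfs, fs, n_bins=64, n_ceps=13):
    """Cepstrum-like feature from a set of IMFs (illustrative sketch).

    imfs : array of shape (n_imfs, n_samples), assumed to come from EMD.
    fs   : sampling rate in Hz.
    """
    imfs = np.atleast_2d(np.asarray(imfs, dtype=float))

    # Hilbert transform of each IMF gives the analytic signal.
    analytic = hilbert(imfs, axis=1)
    amplitude = np.abs(analytic)
    phase = np.unwrap(np.angle(analytic), axis=1)

    # Instantaneous frequency in Hz; differencing drops one sample.
    inst_freq = np.diff(phase, axis=1) * fs / (2.0 * np.pi)
    amplitude = amplitude[:, 1:]

    # Keep only frequencies between 0 and the Nyquist frequency.
    valid = (inst_freq >= 0.0) & (inst_freq <= fs / 2.0)

    # Total marginal spectrum: amplitude accumulated over time and
    # over all IMFs within each frequency bin.
    marginal, _ = np.histogram(
        inst_freq[valid],
        bins=n_bins,
        range=(0.0, fs / 2.0),
        weights=amplitude[valid],
    )

    # Log power spectrum (small offset avoids log(0)), then DCT.
    log_power = np.log(marginal ** 2 + 1e-12)
    return dct(log_power, type=2, norm="ortho")[:n_ceps]


# Two synthetic sinusoids stand in for IMFs here (assumption for the demo).
fs = 8000
t = np.arange(fs) / fs
demo_imfs = np.vstack([
    np.sin(2 * np.pi * 440 * t),
    0.5 * np.sin(2 * np.pi * 120 * t),
])
features = hht_cepstrum(demo_imfs, fs)
print(features.shape)
```

In a full system, one such 13-dimensional vector would presumably be computed per speech frame, and the vectors for each speaker would then be used to train that speaker's Gaussian mixture model.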

Key words: speaker recognition, Hilbert-Huang transform (HHT), cepstrum coefficient

摘要： 针对LPCC只反应语音静态特征且不能突出其低频局部特征问题，提出一种以HHT倒谱系数为特征的说话人识别算法，HHT的经验模态分解使语音的低频局部特征得到更好的描述，Hilbert变换能够刻画语音动态特性，改进了LPCC的不足。用经验模态分解将语音分解为一系列固有模态函数分量并做Hilbert变换求得Hilbert边际谱，计算总边际谱的对数功率谱并做DCT得13维倒谱系数，将此特征送入高斯混合模型进行说话人识别。仿真实验结果表明，基于HHT倒谱系数的说话人识别算法，相较LPCC识别率提高了12.59%，但特征提取时间增加了19.27 s。

关键词: 说话人识别, 希尔伯特黄变换（HHT）, 倒谱系数

DU Xiaoqing, YU Fengqin. Speaker recognition algorithm based on HHT cepstrum coefficient[J]. Computer Engineering and Applications, 2014, 50(3): 198-202.

杜晓青，于凤芹. 基于HHT倒谱系数的说话人识别算法[J]. 计算机工程与应用, 2014, 50(3): 198-202.

