Feature Joint Optimization of Deep Belief Network for Speech Enhancement

doi:10.3778/j.issn.1002-8331.1806-0449

Abstract

Deep Belief Network (DBN) models generalize poorly, which degrades speech enhancement performance. To address this, a regression DBN speech enhancement algorithm based on joint feature optimization is proposed. The algorithm makes no prior assumptions about speech or noise. Two features are extracted from the speech signal: the Log-Mel frequency Power Spectrum (LMPS), which is used directly to reconstruct the enhanced speech and so preserves its perceptual quality, and the Mel-Frequency Cepstral Coefficients (MFCC), which serve as an auxiliary secondary feature. The two features are fed jointly into the DBN, and all parameters of the original network architecture are optimized on the combined feature. This joint optimization imposes an MFCC constraint that is absent when LMPS is predicted directly. It improves the model's generalization in estimating LMPS, so the enhanced speech is reconstructed more accurately. Simulation results under different signal-to-noise ratio (SNR) conditions show that, compared with single-feature optimization using the Log Power Spectrum (LPS) or LMPS alone, joint optimization of LMPS and MFCC yields enhanced speech with higher Perceptual Evaluation of Speech Quality (PESQ) scores and higher SNR, improving both speech quality and intelligibility.

Key words: Deep Belief Network (DBN), speech enhancement, joint optimization, regression
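
The two feature streams named in the abstract, and the idea of constraining the LMPS estimate with an MFCC term, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The frame length, hop size, number of mel bands, number of cepstral coefficients, and the weight `alpha` are all assumptions chosen for illustration, and the weighted sum of two mean squared errors in `joint_mse` is only one plausible form of a joint objective; the abstract does not give the exact loss. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def frame_power_spectrum(x, n_fft, hop):
    """Hann-windowed short-time power spectrum, shape (n_frames, n_fft // 2 + 1)."""
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def lmps_and_mfcc(x, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Return (LMPS, MFCC) per frame; all default sizes are assumed values."""
    power = frame_power_spectrum(x, n_fft, hop)
    mel_power = power @ mel_filterbank(n_mels, n_fft, sr).T
    lmps = np.log(mel_power + 1e-10)  # small floor avoids log(0)
    # Orthonormal DCT-II matrix; MFCC is the DCT of the log-mel spectrum.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct = np.sqrt(2.0 / n_mels) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    dct[0] *= np.sqrt(0.5)
    mfcc = lmps @ dct.T
    return lmps, mfcc

def joint_mse(lmps_hat, mfcc_hat, lmps_ref, mfcc_ref, alpha=0.5):
    """Assumed joint objective: LMPS error plus an alpha-weighted MFCC error."""
    lmps_err = np.mean((lmps_hat - lmps_ref) ** 2)
    mfcc_err = np.mean((mfcc_hat - mfcc_ref) ** 2)
    return lmps_err + alpha * mfcc_err

# Usage on one second of synthetic noise standing in for a speech signal.
signal = np.random.default_rng(0).standard_normal(16000)
lmps, mfcc = lmps_and_mfcc(signal)
print(lmps.shape, mfcc.shape)
```

In a regression network trained this way, the output layer would predict both feature vectors for each frame, and only the LMPS part would be used to resynthesize the waveform. The MFCC term acts purely as an additional training constraint.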

WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. Feature Joint Optimization of Deep Belief Network for Speech Enhancement[J]. Computer Engineering and Applications, 2019, 55(9): 38-42.
