Feature Joint Optimization of Deep Belief Network for Speech Enhancement

doi:10.3778/j.issn.1002-8331.1806-0449

Computer Engineering and Applications, 2019, Vol. 55, Issue (9): 38-42. DOI: 10.3778/j.issn.1002-8331.1806-0449

Feature Joint Optimization of Deep Belief Network for Speech Enhancement

WANG Yan, JIA Hairong, JI Huifang, WANG Weimei

College of Information and Computer, Taiyuan University of Technology, Yuci, Shanxi 030600, China

Online: 2019-05-01 Published: 2019-04-28

Abstract

Abstract: To address the weak generalization ability of the Deep Belief Network (DBN) model, which leads to poor speech enhancement performance, a regression DBN speech enhancement algorithm based on joint feature optimization is proposed. The algorithm makes no assumptions about the speech or the noise. It extracts two features from the speech signal: the Log-Mel frequency Power Spectrum (LMPS) and the Mel-Frequency Cepstral Coefficients (MFCC). The LMPS is used directly to reconstruct the enhanced speech, which preserves perceptual quality, while the MFCC serves as an auxiliary secondary feature. The two features are fed jointly into the DBN architecture to optimize the network parameters. This joint optimization adds an MFCC constraint that is absent from the direct prediction of the LMPS, improves the generalization ability of the model in estimating the LMPS, and so reconstructs the enhanced speech more accurately. Simulation results show that, under different signal-to-noise ratio (SNR) conditions, joint optimization of LMPS and MFCC yields enhanced speech with higher PESQ and SNR than single-feature optimization with the Log Power Spectrum (LPS) or the LMPS, improving speech quality and intelligibility.

Key words: Deep Belief Network (DBN), speech enhancement, joint optimization, regression
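
The abstract does not state the training objective, so the following is only a minimal sketch of what a joint LMPS and MFCC regression loss could look like. It assumes the objective is a weighted sum of two mean squared errors, and it derives the MFCC target from the LMPS frame with an orthonormal DCT-II, which is the standard relation between the two features. The function names, the weight `alpha`, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a sequence.

    Applied to log-Mel energies, this yields cepstral coefficients.
    """
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def mfcc_from_lmps(lmps, n_ceps):
    """Keep the first n_ceps DCT coefficients of one LMPS frame."""
    return dct_ii(lmps)[:n_ceps]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def joint_loss(pred_lmps, pred_mfcc, clean_lmps, clean_mfcc, alpha=0.5):
    """Primary LMPS error plus an alpha-weighted auxiliary MFCC error.

    ASSUMPTION: the weighted-sum form and alpha are illustrative only;
    the abstract does not specify the objective.
    """
    return mse(pred_lmps, clean_lmps) + alpha * mse(pred_mfcc, clean_mfcc)


# Toy single-frame example with 4 Mel bands and 2 cepstral coefficients.
clean_lmps = [1.0, 2.0, 3.0, 4.0]
clean_mfcc = mfcc_from_lmps(clean_lmps, n_ceps=2)

# Hypothetical network outputs for the same frame.
pred_lmps = [1.5, 2.0, 2.5, 4.0]
pred_mfcc = [clean_mfcc[0], clean_mfcc[1] + 1.0]

loss = joint_loss(pred_lmps, pred_mfcc, clean_lmps, clean_mfcc)
print(loss)
```

In this sketch only the LMPS output would be used to reconstruct the waveform. The MFCC term acts purely as a regularizing constraint during training, which matches the role of an auxiliary secondary feature described in the abstract.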

WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. Feature Joint Optimization of Deep Belief Network for Speech Enhancement[J]. Computer Engineering and Applications, 2019, 55(9): 38-42.

