Voice conversion using deep belief networks

Computer Engineering and Applications, 2016, Vol. 52, Issue 15: 168-171.

WANG Min, HUANG Fei, LIU Li, WEI Mingfei, WANG Mingming

School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China

Online: 2016-08-01 Published: 2016-08-12

Abstract

This paper presents a voice conversion technique that uses Deep Belief Nets (DBNs) to build high-order eigen spaces for the source and target speakers, in which converting source speech to target speech is easier than in the traditional cepstrum space. A DBN is trained for each speaker, and an Artificial Neural Network (ANN) then connects the two spaces and converts the source speaker's individuality abstractions into those of the target speaker. The converted abstraction is brought back to the cepstrum space by the inverse process of the target speaker's DBN. Voice conversion experiments confirm the efficacy of the method on both subjective and objective criteria in comparison with the conventional Gaussian Mixture Model (GMM)-based method.
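The data flow described in the abstract (source DBN, then ANN mapping, then inverse pass through the target DBN) can be sketched roughly as follows. This is only an illustration of the pipeline's shape, not the authors' implementation: the 24-dimensional cepstral frames, the layer sizes, the sigmoid units, the use of transposed weights for the inverse pass, and the class names `DBN` and `MappingANN` are all assumptions, and the weights are random placeholders because training (layer-wise pre-training of each DBN and supervised training of the ANN) is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DBN:
    """Stack of RBM-style layers. Weights are random placeholders (untrained)."""

    def __init__(self, layer_sizes, rng):
        pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        self.weights = [rng.normal(0.0, 0.1, (n_in, n_out)) for n_in, n_out in pairs]
        self.hid_bias = [np.zeros(n_out) for _, n_out in pairs]
        self.vis_bias = [np.zeros(n_in) for n_in, _ in pairs]

    def encode(self, x):
        # Bottom-up pass: cepstral frames -> high-order representation.
        h = x
        for W, b in zip(self.weights, self.hid_bias):
            h = sigmoid(h @ W + b)
        return h

    def decode(self, h):
        # Top-down ("inverse") pass using transposed weights (an assumption).
        v = h
        for i in reversed(range(len(self.weights))):
            pre = v @ self.weights[i].T + self.vis_bias[i]
            # Assumed: linear visible units at the bottom, sigmoid elsewhere.
            v = pre if i == 0 else sigmoid(pre)
        return v

class MappingANN:
    """One-hidden-layer network linking the two high-order spaces (hypothetical)."""

    def __init__(self, dim, hidden, rng):
        self.W1 = rng.normal(0.0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, dim))
        self.b2 = np.zeros(dim)

    def __call__(self, h):
        return sigmoid(sigmoid(h @ self.W1 + self.b1) @ self.W2 + self.b2)

def convert(frames, src_dbn, ann, tgt_dbn):
    # Source cepstrum -> source high-order space -> target high-order space
    # -> target cepstrum.
    return tgt_dbn.decode(ann(src_dbn.encode(frames)))

rng = np.random.default_rng(0)
src_dbn = DBN([24, 48, 24], rng)   # hypothetical layer sizes
tgt_dbn = DBN([24, 48, 24], rng)
ann = MappingANN(dim=24, hidden=32, rng=rng)

source_frames = rng.normal(size=(5, 24))  # 5 synthetic frames, not real speech
converted = convert(source_frames, src_dbn, ann, tgt_dbn)
print(converted.shape)  # (5, 24)
```

With trained weights in place of the placeholders, `converted` would hold the target-speaker cepstral frames that are passed on to speech synthesis.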

Key words: voice conversion, speaker characteristics, deep belief networks, high-order eigen spaces

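Abstract (translated from Chinese): The representation and extraction of speaker-specific voice characteristics are studied in depth, and a voice conversion method based on Deep Belief Nets (DBN) is proposed. Separate DBNs are trained on the spectral parameters extracted from the source and target speakers' speech, yielding a representation of each speaker's voice characteristics in a high-order space; an Artificial Neural Network (ANN) connects the two high-order spaces and performs the feature conversion; the DBN trained on the target speaker's data then inverts the converted features to obtain the converted spectral parameters, from which the converted speech is synthesized. Experimental results show that the method outperforms the traditional GMM-based method: the converted speech is closer to the target speech in both quality and similarity.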

Key words (translated from Chinese): voice conversion, speaker voice characteristics, deep belief network model, high-order space

WANG Min, HUANG Fei, LIU Li, WEI Mingfei, WANG Mingming. Voice conversion using deep belief networks[J]. Computer Engineering and Applications, 2016, 52(15): 168-171.

