Computer Engineering and Applications, 2021, Vol. 57, Issue (9): 50-59. DOI: 10.3778/j.issn.1002-8331.2101-0044

Overview of Deep Learning Speech Synthesis Technology

ZHANG Xiaofeng, XIE Jun, LUO Jianxin, YANG Tao   

  1. Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
  2. Unit 31121 of PLA, China
  • Online: 2021-05-01 Published: 2021-04-29

深度学习语音合成技术综述

张小峰,谢钧,罗健欣,杨涛   

  1. 中国人民解放军陆军工程大学 指挥控制工程学院,南京 210007
  2. 中国人民解放军 31121部队

Abstract:

Speech synthesis plays an important role in human-machine interaction, and advances in deep learning have driven its rapid development. Deep-learning-based speech synthesis now surpasses traditional speech synthesis in both the quality of the synthesized speech and the speed of synthesis. This paper reviews speech synthesis from the perspective of deep-learning-based vocoders and acoustic models, discussing the working principles, advantages, and disadvantages of each type. On this basis, it surveys the classic deep-learning-based speech synthesis systems and concludes with an outlook on the future of deep-learning-based speech synthesis.

Key words: speech synthesis, vocoder, acoustic model, end-to-end speech synthesis system

摘要:

语音合成技术在人机交互中扮演着重要角色,深度学习的发展带动语音合成技术高速发展。基于深度学习的语音合成技术在合成语音的质量和速度上都超过了传统语音合成技术。从基于深度学习的声码器和声学模型出发对语音合成技术进行综述,探讨各类声码器和声学模型的工作原理及其优缺点,在此基础上对语音合成系统进行综述,系统综述经典的基于深度学习的语音合成系统,对基于深度学习的语音合成技术进行展望。

关键词: 语音合成, 声码器, 声学模型, 端到端语音合成系统