%0 Journal Article
%A WANG Dong
%A WANG Liyuan
%A WANG Daliang
%A QI Hongwei
%T DTZH1505: Large Scale Open Source Mandarin Speech Corpus
%D 2022
%R 10.3778/j.issn.1002-8331.2112-0333
%J Computer Engineering and Applications
%P 295-301
%V 58
%N 11
%X In recent years, deep learning has achieved breakthroughs in speech recognition and has driven the wide application of speech recognition technology in people's daily lives. Further optimization of speech recognition models requires larger-scale labeled data. However, current open source audio data sets are still too small, and their corpora consist mostly of written language from long, news-based texts. Targeting popular speech recognition applications such as human-computer interaction and intelligent customer service, this paper builds and releases, through crowdsourcing, DTZH1505, the largest Chinese Mandarin speech corpus to date. The data set records the natural speech of 6,408 speakers from 8 major Chinese dialect regions and 33 provinces, totaling 1,505 hours and covering scenarios such as social networking, human-computer interaction, intelligent customer service, and in-vehicle commands. It can be widely used in research on corpus linguistics, conversation analysis, speech recognition, and speaker recognition. This paper conducts a series of benchmark speech recognition experiments, and the results show that, compared with AISHELL-2, a Chinese speech corpus of the same scale, the speech recognition model trained on this data set performs better.
%U http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2112-0333