Optimization of acoustic model for Uyghur continuous speech recognition

Abstract

This paper applies the Tandem feature extraction method, which combines the advantages of the Gaussian mixture model and artificial neural network frameworks, to Uyghur acoustic modeling. First, a series of processing steps converts the original Mel-frequency cepstral coefficient (MFCC) features into Tandem features, which serve as the input to a hidden Markov model (HMM) based speech recognition system. The acoustic model is then trained discriminatively under the minimum phone error (MPE) criterion, and recognition experiments are carried out on the test set. Experimental results show that the MPE-trained acoustic model on Tandem features yields a 13% relative reduction in word error rate compared with the maximum likelihood estimated system.
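To make the reported metric concrete, the following is a minimal sketch of how word error rate (WER) and a relative WER reduction are commonly computed. It is not taken from the paper: the function names (`word_errors`, `wer`, `relative_reduction`), the placeholder token strings, and the numeric WER values are all illustrative assumptions, and the paper's actual baseline and final error rates are not given in the abstract.

```python
def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance: the minimum number of
    substitutions, deletions and insertions turning reference into hypothesis."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions align ref[:i] with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(pairs):
    """WER over (reference, hypothesis) pairs: total errors / total reference words."""
    errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    return errors / words


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction of a new system with respect to a baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Placeholder token sequences (hypothetical, not Uyghur data from the paper).
reference = "a b c d e"
baseline_wer = wer([(reference, "a x c e")])  # one substitution, one deletion
improved_wer = wer([(reference, "a b c e")])  # one deletion
print(baseline_wer, improved_wer, relative_reduction(baseline_wer, improved_wer))

# Hypothetical absolute WERs that would correspond to a 13% relative reduction.
print(round(relative_reduction(0.300, 0.261), 2))
```

Note that a relative reduction is measured against the baseline's own error rate, so a 13% relative reduction is a smaller change in absolute terms: in the hypothetical case above, it is a drop of 3.9 percentage points, from 30.0% to 26.1%.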

Key words: Uyghur, speech recognition, minimum phone error, Tandem feature

Nurmemet Yolwas, Wushour Silamu. Optimization of acoustic model for Uyghur continuous speech recognition[J]. Computer Engineering and Applications, 2013, 49(2): 145-147.
