Speech Recognition Based on MLLR and MAP Under Distant Noise Reverberation Environment

doi:10.3778/j.issn.1002-8331.1901-0186

Computer Engineering and Applications, 2020, Vol. 56, Issue (10): 122-126.

LOU Yingdan, XU Jinglin, HUANG Lixia, ZHANG Xueying

College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China

Online: 2020-05-15 Published: 2020-05-13

Abstract:

Adaptation techniques can adjust acoustic model parameters with a small amount of data to improve speech recognition performance, and they are mostly used to adapt to accented speech. In this paper, the Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) adaptation techniques are applied in a distant (far-field) noisy, reverberant environment, and their recognition performance in this environment is analyzed. The experimental results show that, under simulated conditions with a wall reflection coefficient of 0.6, MAP gives the best adaptation performance across the noise environments tested. At Signal-to-Noise Ratios (SNR) of 5 dB, 10 dB, and 15 dB, MAP reduces the Word Error Rate (WER) of distant continuous speech by 1.51%, 12.82%, and 2.95% on average, respectively. Under real conditions, the largest WER reduction achieved by MAP is 37.13%. The good asymptotic behavior of MAP is further verified: with 1,000 adaptation sentences, the WER of distant noisy, reverberant continuous speech obtained with the MAP acoustic model adaptation method is 12.5% lower on average than before adaptation.

Key words: Maximum Likelihood Linear Regression (MLLR), Maximum A Posteriori (MAP), environmental adaptation, distant speech recognition
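
The abstract names the two adaptation methods but does not give their update rules. As a rough illustration only, the sketch below uses the standard textbook forms for adapting a single Gaussian mean: the MAP estimate interpolates between the prior mean and the adaptation data, and MLLR applies a shared affine transform. Whether the paper uses exactly these forms is an assumption; the function names, the prior weight `tau`, and the toy data are all hypothetical.

```python
import numpy as np


def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (assumed form, not taken from the paper).

    prior_mean : (D,) mean of the unadapted model
    frames     : (T, D) adaptation feature vectors
    posteriors : (T,) occupation probabilities of this Gaussian per frame
    tau        : prior weight (hypothetical value); larger means slower adaptation
    """
    occupancy = posteriors.sum()
    weighted_sum = posteriors @ frames  # (T,) @ (T, D) -> (D,)
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)


def mllr_adapt_mean(mean, A, b):
    """Textbook MLLR mean update: one affine transform (A, b) shared by many Gaussians."""
    return A @ mean + b


if __name__ == "__main__":
    prior = np.zeros(2)
    target = np.array([1.0, -1.0])  # toy "environment" mean, purely illustrative
    for n in (10, 100, 1000):
        frames = np.tile(target, (n, 1))
        gammas = np.ones(n)
        # With more adaptation data the MAP estimate moves from the prior toward the data mean.
        print(n, map_adapt_mean(prior, frames, gammas))
```

In this toy setting the MAP estimate approaches the data mean as the number of adaptation frames grows, which is the asymptotic property the abstract refers to. MLLR, by contrast, estimates only the parameters of the shared transform.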

LOU Yingdan, XU Jinglin, HUANG Lixia, ZHANG Xueying. Speech Recognition Based on MLLR and MAP Under Distant Noise Reverberation Environment[J]. Computer Engineering and Applications, 2020, 56(10): 122-126.
