Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (8): 125-129.

Improved speech recognition of GRBM based on parallel tempering

ZHAO Caiguang, ZHANG Shuqun, LEI Zhaoyi   

  1. School of Information Science and Technology, Jinan University, Guangzhou 510632, China
  • Online: 2016-04-15 Published: 2016-04-19

基于并行回火改进的GRBM的语音识别

赵彩光,张树群,雷兆宜   

  1. 暨南大学 信息科学技术学院,广州 510632

Abstract: To improve recognition accuracy on continuous data in speech recognition, a Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) is used to train on and recognize speech signals. An improved GRBM based on Parallel Tempering (GRBM-PT) is proposed: the parallel tempering learning algorithm samples reconstructed data in chains at different temperatures and swaps them between chains, so that sampling covers the entire distribution. The speech signal is pre-trained and modeled with this network, and the outputs are classified with a Support Vector Machine (SVM). Experiments on the core test of TI-Digits for digit speech recognition show that the accuracy reaches 83.14%, and that GRBM-PT performs better than RBM, RBM-PT and GRBM.

Key words: Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM), restricted Boltzmann machine, parallel tempering, speech recognition
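
The sample-and-swap step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no equations, so the energy parametrization, the choice to temper by scaling the energy with an inverse temperature β in (0, 1], and all function and variable names (`energy`, `gibbs_step`, `swap_step`, `pt_sample`, `betas`) are assumptions made for illustration. Feature extraction and the SVM stage are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, b, c, sigma):
    """Assumed GRBM energy, one value per chain (rows of v and h):
    E = sum_i (v_i - b_i)^2 / (2 sigma_i^2) - c.h - sum_ij (v_i / sigma_i^2) W_ij h_j
    """
    quad = np.sum((v - b) ** 2 / (2.0 * sigma ** 2), axis=1)
    inter = np.sum(((v / sigma ** 2) @ W) * h, axis=1)
    return quad - h @ c - inter

def gibbs_step(v, betas, W, b, c, sigma, rng):
    """One Gibbs sweep per chain; chain k targets exp(-betas[k] * E)."""
    bcol = betas[:, None]
    # Hidden units: Bernoulli with a tempered logistic activation.
    p_h = sigmoid(bcol * (c + (v / sigma ** 2) @ W))
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # Visible units: Gaussian with mean b + W h and variance sigma^2 / beta.
    mean = b + h @ W.T
    v = mean + sigma / np.sqrt(bcol) * rng.standard_normal(mean.shape)
    return v, h

def swap_step(v, h, betas, W, b, c, sigma, rng):
    """Propose swaps between neighbouring temperatures (Metropolis rule)."""
    e = energy(v, h, W, b, c, sigma)
    n_accepted = 0
    for k in range(len(betas) - 1):
        log_ratio = (betas[k] - betas[k + 1]) * (e[k] - e[k + 1])
        if rng.random() < np.exp(min(0.0, log_ratio)):
            # Exchange the full states (and cached energies) of chains k, k+1.
            v[[k, k + 1]] = v[[k + 1, k]]
            h[[k, k + 1]] = h[[k + 1, k]]
            e[[k, k + 1]] = e[[k + 1, k]]
            n_accepted += 1
    return v, h, n_accepted

def pt_sample(W, b, c, sigma, betas, n_sweeps, rng):
    """Run all chains and return the state of the beta = betas[-1] chain."""
    n_chains = len(betas)
    v = b + sigma * rng.standard_normal((n_chains, len(b)))
    h = np.zeros((n_chains, len(c)))
    for _ in range(n_sweeps):
        v, h = gibbs_step(v, betas, W, b, c, sigma, rng)
        v, h, _ = swap_step(v, h, betas, W, b, c, sigma, rng)
    return v[-1], h[-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_visible, n_hidden = 6, 4          # toy sizes, not taken from the paper
    W = 0.1 * rng.standard_normal((n_visible, n_hidden))
    b, c, sigma = np.zeros(n_visible), np.zeros(n_hidden), np.ones(n_visible)
    betas = np.linspace(0.2, 1.0, 5)    # last chain is the target model
    v_neg, h_neg = pt_sample(W, b, c, sigma, betas, n_sweeps=50, rng=rng)
    print(v_neg.shape, h_neg.shape)
```

In a training loop, the sample from the β = 1 chain would typically stand in for the negative-phase sample of contrastive divergence; the high-temperature chains mix quickly and the swaps pass their states down to the target chain.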

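Abstract: To improve recognition accuracy in continuous speech recognition, a Gaussian-Bernoulli restricted Boltzmann machine is used for speech training and recognition. Drawing on the parallel tempering algorithm, reconstructed data are sampled and swapped across chains at different temperatures so that the whole distribution is sampled globally, and a modeling method based on a GRBM improved by parallel tempering (GRBM-PT) is proposed. The method pre-trains on and models the continuous data of the speech signal, and finally uses a support vector machine as the classifier for speech recognition. Experimental results on the TI-Digits digit speech training and test database show that the speech recognition rate reaches 83.14%, and that recognition under the GRBM-PT model is clearly better than under the RBM, RBM-PT and GRBM models.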

Key words: Gaussian-Bernoulli restricted Boltzmann machine (GRBM), restricted Boltzmann machine, parallel tempering, speech recognition