Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (8): 125-129.

• Pattern Recognition and Artificial Intelligence •

Improved speech recognition of GRBM based on parallel tempering

ZHAO Caiguang, ZHANG Shuqun, LEI Zhaoyi

  1. School of Information Science and Technology, Jinan University, Guangzhou 510632, China
  • Online: 2016-04-15 Published: 2016-04-19

Abstract: To improve recognition accuracy in continuous speech recognition, a Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) is used to train on and recognize speech signals. Drawing on the parallel tempering algorithm, reconstructed data are sampled and swapped between chains at different temperatures, so that sampling covers the entire distribution globally. On this basis, a modeling method built on a GRBM improved with Parallel Tempering (GRBM-PT) is proposed. The method pre-trains on and models the continuous-valued speech data, and a Support Vector Machine (SVM) serves as the final classifier. Digit recognition experiments on the core test of the TI-Digits database show that the recognition rate reaches 83.14%, and that GRBM-PT clearly outperforms the RBM, RBM-PT and GRBM models.

Key words: Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM), Restricted Boltzmann Machine, parallel tempering, speech recognition
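
The sample-and-swap step summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a unit-variance GRBM with energy E(v, h) = 0.5·||v − b||² − c·h − vᵀWh, tempers each chain by scaling that energy with an inverse temperature beta, and uses the standard Metropolis rule for swapping neighbouring chains. All names (`gibbs_step`, `parallel_tempering`, `betas`) and the toy sizes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def energy(v, h, W, b, c):
    """Joint energy of a unit-variance GRBM (assumed form, see lead-in)."""
    return 0.5 * np.sum((v - b) ** 2) - c @ h - v @ W @ h


def gibbs_step(v, beta, W, b, c, rng):
    """One Gibbs sweep on the tempered distribution exp(-beta * E)."""
    # Hidden units: Bernoulli with activation scaled by beta.
    p_h = sigmoid(beta * (c + v @ W))
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # Visible units: Gaussian with mean b + W h and variance 1 / beta.
    mean_v = b + W @ h
    v = mean_v + rng.standard_normal(mean_v.shape) / np.sqrt(beta)
    return v, h


def parallel_tempering(W, b, c, betas, n_sweeps, rng):
    """Run one chain per beta and propose swaps between neighbours.

    betas must be positive and ascending; the last chain (beta = 1)
    targets the model distribution itself.
    """
    n_vis, n_hid = W.shape
    vs = [rng.standard_normal(n_vis) for _ in betas]
    hs = [np.zeros(n_hid) for _ in betas]
    n_swaps = 0
    for _ in range(n_sweeps):
        # Sampling phase: advance every chain at its own temperature.
        for k, beta in enumerate(betas):
            vs[k], hs[k] = gibbs_step(vs[k], beta, W, b, c, rng)
        # Swap phase: Metropolis test on each neighbouring pair.
        for k in range(len(betas) - 1):
            e_k = energy(vs[k], hs[k], W, b, c)
            e_next = energy(vs[k + 1], hs[k + 1], W, b, c)
            log_ratio = (betas[k] - betas[k + 1]) * (e_k - e_next)
            if rng.random() < np.exp(min(0.0, log_ratio)):
                vs[k], vs[k + 1] = vs[k + 1], vs[k]
                hs[k], hs[k + 1] = hs[k + 1], hs[k]
                n_swaps += 1
    return vs[-1], hs[-1], n_swaps


# Toy usage with random parameters (illustrative sizes only).
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((6, 4))
b = np.zeros(6)
c = np.zeros(4)
betas = np.linspace(0.2, 1.0, 5)
v_sample, h_sample, n_swaps = parallel_tempering(W, b, c, betas, 50, rng)
```

In a training loop, the sample from the beta = 1 chain would typically stand in for the negative-phase sample of contrastive divergence. The high-temperature chains mix more freely, and the swaps let that mobility reach the target chain. The temperature ladder and the number of chains here are arbitrary choices, not values taken from the paper.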