Speech recognition based on k-means clustering and neural network ensemble

Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (12): 144-147.

• Database, Signal and Information Processing •

YAO Minfeng1, LI Xinguang1, HUANG Wentao2

1. School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China
2. School of Automation, Guangdong University of Technology, Guangzhou 510006, China

Online: 2012-04-21 Published: 2012-04-20

Abstract

Abstract: This paper proposes a speech recognition method based on k-means clustering and a BP neural network ensemble. Building on the neural network ensemble model, the method first trains a number of individual neural networks, then uses the k-means clustering algorithm to select a subset of the trained networks whose weights and thresholds differ from one another, and combines only the selected networks in the ensemble. This overcomes the tendency of a single BP neural network to converge to a local optimum and to behave unstably, and it also addresses two problems of traditional ensemble methods: long training time and insufficient diversity among the individual networks. Experiments on speaker-independent isolated-word speech recognition confirm the effectiveness of the method.

Key words: k-means clustering, neural network ensemble, speech recognition
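The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each trained network is represented by its flattened vector of weights and thresholds, that one network per cluster (the one nearest the centroid) is kept, and that the kept networks are combined by majority vote. The abstract specifies none of these details, and the helper names (`farthest_first_init`, `select_diverse_networks`, `majority_vote`) and the toy weight vectors are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(vectors, k):
    """Deterministic seeding (an assumption, not from the paper): start from
    the first vector, then keep adding the vector farthest from all seeds."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors, key=lambda v: min(squared_distance(v, c) for c in centroids)
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, max_iterations=100):
    """Plain k-means on lists of floats; returns (centroids, labels)."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = farthest_first_init(vectors, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def select_diverse_networks(weight_vectors, k):
    """Return one network index per non-empty cluster: the member whose
    weight vector lies nearest the cluster centroid (assumed rule)."""
    centroids, labels = kmeans(weight_vectors, k)
    selected = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            selected.append(
                min(members, key=lambda i: squared_distance(weight_vectors[i], centroids[c]))
            )
    return selected


def majority_vote(predictions):
    """Combine the class labels predicted by the selected networks."""
    return Counter(predictions).most_common(1)[0][0]


# Toy weight vectors for six hypothetical networks forming three similar pairs.
weights = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.3, 5.0], [-4.0, 4.0], [-4.1, 4.4]]
selected = select_diverse_networks(weights, k=3)
print("selected network indices:", selected)
print("ensemble decision:", majority_vote(["yes", "no", "yes"]))
```

Clustering in weight space keeps only one of several near-identical networks, so the ensemble contains fewer members while retaining the dissimilar ones. In practice the weight vectors would come from independently trained BP networks rather than hand-written values.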

YAO Minfeng, LI Xinguang, HUANG Wentao. Speech recognition based on k-means clustering and neural network ensemble[J]. Computer Engineering and Applications, 2012, 48(12): 144-147.

