Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (7): 115-118.

Multiclass classification pre-selection of SVM in speech recognition application

HE Yuanyuan1, ZHANG Xueying1, LIU Xiaofeng2   

  1. College of Information Engineering, Taiyuan University of Technology, Taiyuan 030024, China
  2. Department of Math, College of Science, Taiyuan University of Technology, Taiyuan 030024, China
  • Online:2013-04-01 Published:2013-04-15

多类分类预选取的SVM在语音识别中的应用

贺元元1,张雪英1,刘晓峰2   

  1. 太原理工大学 信息工程学院,太原 030024
  2. 太原理工大学 理学院 数学系,太原 030024

Abstract: During Support Vector Machine (SVM) training, much of the time is wasted on complex calculations over non-support vectors. For large-scale speech recognition systems in particular, this unnecessary training overhead becomes even more pronounced. Kernel Fuzzy C-Means (KFCM) clustering is a widely used dynamic clustering algorithm, and its kernel function non-linearly maps data from the pattern space to a high-dimensional feature space. The proposed method builds on KFCM clustering and combines it with the one-versus-one approach to multiclass SVM classification. Following established criteria, it pre-selects the samples in the training set that are likely to be support vectors, and it is applied to speech recognition. The experiments achieved good results, and the method effectively improves the learning efficiency and generalization ability of the SVM classifier.

Key words: Support Vector Machine (SVM), Kernel Fuzzy C-Means (KFCM), pre-selection, multiclass classification, speech recognition

摘要: 支持向量机在训练过程中,将很多时间都浪费在对非支持向量的复杂计算上,特别是对于大规模数据量的语音识别系统来说,支持向量机在训练时间上不必要的开销将会更加显著。核模糊C均值聚类是一种常用的典型动态聚类算法,并且有核函数能够把模式空间的数据非线性映射到高维特征空间。在核模糊C均值聚类的基础上,结合了多类分类支持向量机中的一对一方法,按照既定的准则把训练样本集中有可能属于支持向量的样本数据进行预选取,并应用到语音识别中。实验取得了较好的结果,该方法有效地提高了支持向量机分类器的学习效率和泛化能力。

关键词: 支持向量机, 核模糊C均值, 预选取, 多类分类, 语音识别
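
The pre-selection idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criteria, so everything specific here is an assumption made for illustration: the Gaussian kernel, the initialization of cluster centers at the class means, the rule "keep a sample when its largest fuzzy membership is below `threshold`" (that is, the sample lies ambiguously between the two clusters), and all function names. The membership and center updates follow the commonly used KFCM formulation with kernel-induced distance 1 - K(x, v); the paper's actual procedure may differ.

```python
import math
from itertools import combinations


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v); the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def memberships(points, centers, m=2.0, sigma=1.0):
    """Fuzzy memberships u[k][i] of point k in cluster i."""
    rows = []
    for x in points:
        # Kernel-induced distance 1 - K(x, v), clamped to avoid division by zero.
        dists = [max(1.0 - gaussian_kernel(x, v, sigma), 1e-12) for v in centers]
        weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def kfcm(points, init_centers, m=2.0, sigma=1.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [list(c) for c in init_centers]
    for _ in range(iterations):
        u = memberships(points, centers, m, sigma)
        new_centers = []
        for i, v in enumerate(centers):
            # Each point is weighted by u^m * K(x, v) in the center update.
            coeffs = [(u[k][i] ** m) * gaussian_kernel(x, v, sigma)
                      for k, x in enumerate(points)]
            total = sum(coeffs)
            new_centers.append(
                [sum(c * x[d] for c, x in zip(coeffs, points)) / total
                 for d in range(len(v))]
            )
        centers = new_centers
    return centers, memberships(points, centers, m, sigma)


def class_mean(points):
    """Component-wise mean of a list of points."""
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]


def preselect_pair(points_a, points_b, threshold=0.8, m=2.0, sigma=1.0):
    """Indices (into points_a + points_b) of samples near the class boundary.

    Hypothetical criterion: keep a sample if its largest membership is
    below `threshold`, i.e. it is not clearly assigned to either cluster.
    """
    points = points_a + points_b
    init = [class_mean(points_a), class_mean(points_b)]
    _, u = kfcm(points, init, m, sigma)
    return [k for k, row in enumerate(u) if max(row) < threshold]


def preselect_one_vs_one(samples_by_class, threshold=0.8, m=2.0, sigma=1.0):
    """Run the pairwise pre-selection over every pair of classes.

    Returns, per class label, the sorted indices of samples kept by at
    least one pairwise run.
    """
    kept = {label: set() for label in samples_by_class}
    for a, b in combinations(sorted(samples_by_class), 2):
        pa, pb = samples_by_class[a], samples_by_class[b]
        for k in preselect_pair(pa, pb, threshold, m, sigma):
            if k < len(pa):
                kept[a].add(k)
            else:
                kept[b].add(k - len(pa))
    return {label: sorted(idx) for label, idx in kept.items()}
```

In a full pipeline the reduced sample set for each class pair would then be passed to an SVM trainer in place of the complete training set. That step is omitted here, and the threshold would need tuning so that genuine support vectors are not discarded.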