Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (14): 150-155.

• Pattern Recognition and Artificial Intelligence •

Multiple voice features types evolutionary selection algorithm (多类型语音特征进化选择算法)

ZHANG Xiaoheng (张小恒)1,2, XIE Wenbin (谢文宾)2, LI Yongming (李勇明)2

  1. Chongqing Radio & TV University, Chongqing 400052, China
  2. College of Communication Engineering, Chongqing University, Chongqing 400044, China
  • Online: 2016-07-15 Published: 2016-07-18

Abstract: Obtaining speech features through feature selection is currently one of the more effective approaches to speaker recognition. However, the optimal speech features vary with the specific application environment. This paper therefore proposes a wrapper-based genetic feature selection algorithm built on four types of speech features (FSF-WrGAF). The algorithm extracts four types of speech feature parameters and performs dynamic wrapper feature selection using the Chainlike Agent Genetic Algorithm (CAGA) and the Gaussian Mixture Model-Universal Background Model (GMM-UBM) to achieve high recognition accuracy. Its performance is evaluated with several metrics and compared against several other algorithms. Experimental results show that FSF-WrGAF is simple to implement and yields clear improvements over comparable algorithms on several metrics: recognition rate, equal error rate (EER), and DET curve (detection cost).

Key words: speaker recognition, multiple types of voice features, chain-like agent genetic algorithm, Gammatone Frequency Cepstrum Coefficient (GFCC), Mel Frequency Cepstrum Coefficient (MFCC), Linear Prediction Cepstrum Coefficient (LPCC)
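
The general idea of wrapper-style feature selection with a genetic algorithm can be illustrated with a minimal sketch. It is not the paper's method: it uses a plain generational GA rather than the chain-like agent GA (CAGA), and a toy fitness function stands in for the GMM-UBM recognition score that a real wrapper would compute. The names `ga_select`, `toy_fitness`, and `USEFUL`, along with all parameter values, are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = keep the feature dimension, 0 = drop it

# Hypothetical set of "informative" dimensions, used only by the toy fitness.
USEFUL = {0, 3, 5, 8}


def toy_fitness(mask: Mask) -> float:
    """Stand-in for a classifier score (assumption: a real wrapper would
    train and evaluate a speaker model on the selected features instead)."""
    hits = sum(mask[i] for i in USEFUL)
    extras = sum(mask) - hits
    return hits - 0.5 * extras  # reward useful dimensions, penalize the rest


def ga_select(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 40,
    p_mut: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Plain generational GA over binary feature masks (not CAGA).

    Requires n_features >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    best_fit = fitness(best)

    for _ in range(generations):
        scored = [(fitness(m), m) for m in pop]
        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            next_pop.append(child)
        pop = next_pop
        gen_best = max(pop, key=fitness)
        gen_fit = fitness(gen_best)
        if gen_fit > best_fit:
            best, best_fit = gen_best[:], gen_fit
    return best, best_fit


best_mask, best_score = ga_select(12, toy_fitness)
```

In a wrapper setting, the `fitness` argument is where the classifier enters the loop: each candidate mask is scored by how well a model trained on the selected features performs, which is why wrapper methods tend to be more expensive than filter methods.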