Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (6): 162-168. DOI: 10.3778/j.issn.1002-8331.1510-0147


Multi-set canonical correlations analysis based on ensemble learning

QIU Aikun, ZHU Jiagang   

  1. College of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2017-03-15 Published: 2017-05-11

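A multi-set canonical correlation analysis method based on ensemble learning

QIU Aikun (邱爱昆), ZHU Jiagang (朱嘉钢)

  1. College of Internet of Things Engineering, Jiangnan University (江南大学 物联网工程学院), Wuxi, Jiangsu 214122, China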

Abstract: Feature extraction is an important problem in pattern recognition and plays a significant role in improving classification performance. Commonly used feature extraction methods include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Canonical Correlation Analysis (CCA). Multi-set Canonical Correlation Analysis (MCCA), which extends traditional CCA, performs feature extraction on multiple datasets. This paper proposes a new method, EMCCA, which combines MCCA with ensemble learning: it divides the sample dataset into several smaller datasets and performs feature extraction on them. Experimental results on standard UCI datasets show that, compared with traditional PCA and CCA, EMCCA extracts better features and, with the help of ensemble learning, achieves better classification performance.

Key words: feature extraction, multi-set canonical correlations analysis, ensemble learning, pattern recognition

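Abstract (from the Chinese original): Feature extraction is one of the key problems in pattern recognition and is important for improving a system's classification performance. Commonly used feature extraction methods include principal component analysis, linear discriminant analysis, and canonical correlation analysis. Multi-set canonical correlation analysis was developed from traditional canonical correlation analysis and performs feature extraction on multiple (more than two) feature datasets. The ensemble-learning-based multi-set canonical correlation analysis method (EMCCA) divides the samples into several smaller samples to form several feature datasets, applies multi-set canonical correlation analysis to this group of datasets for feature extraction, and then combines this with ensemble learning to classify the samples. Experimental results on the multi-feature handwriting dataset from UCI show that, compared with the traditional PCA and CCA feature extraction methods, multi-set canonical correlation analysis extracts better features and, combined with ensemble learning, achieves better classification results.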

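Key words (from the Chinese original): feature extraction, multi-set canonical correlation analysis, ensemble learning, pattern recognition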
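
The abstract describes EMCCA only at a high level, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes that the "several datasets" are formed by splitting the feature columns into groups (views), that MCCA is solved as the common generalized eigenproblem C w = λ D w (C the full covariance, D its regularized block diagonal), and that the ensemble is a majority vote over one nearest-centroid classifier per view. All function names (`mcca_fit`, `emcca_fit`, `emcca_predict`) and the parameters `n_views`, `n_components`, and `reg` are hypothetical.

```python
import numpy as np


def mcca_fit(views, n_components=2, reg=1e-3):
    """Basic MCCA (assumed formulation): solve C w = lambda D w."""
    means = [v.mean(axis=0) for v in views]
    Z = np.hstack([v - m for v, m in zip(views, means)])
    C = Z.T @ Z / (Z.shape[0] - 1)            # full block covariance
    dims = [v.shape[1] for v in views]
    offsets = np.concatenate([[0], np.cumsum(dims)])
    D = np.zeros_like(C)                      # regularized block diagonal
    for a, b in zip(offsets[:-1], offsets[1:]):
        D[a:b, a:b] = C[a:b, a:b] + reg * np.eye(b - a)
    # Whiten with D^{-1/2} so an ordinary symmetric eigensolver suffices.
    evals, evecs = np.linalg.eigh(D)
    D_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    lam, U = np.linalg.eigh(D_inv_sqrt @ C @ D_inv_sqrt)
    top = np.argsort(lam)[::-1][:n_components]
    W = D_inv_sqrt @ U[:, top]
    weights = [W[a:b] for a, b in zip(offsets[:-1], offsets[1:])]
    return means, weights


def centroid_fit(F, y):
    """Nearest-centroid classifier: store one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([F[y == c].mean(axis=0) for c in classes])


def centroid_predict(model, F):
    classes, centroids = model
    d = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]


def emcca_fit(X, y, n_views=3, n_components=2):
    """Assumption: views are contiguous groups of feature columns."""
    groups = np.array_split(np.arange(X.shape[1]), n_views)
    means, weights = mcca_fit([X[:, g] for g in groups], n_components)
    models = [centroid_fit((X[:, g] - m) @ w, y)
              for g, m, w in zip(groups, means, weights)]
    return groups, means, weights, models


def emcca_predict(fitted, X):
    """Majority vote over per-view predictions (non-negative int labels)."""
    groups, means, weights, models = fitted
    votes = np.stack([centroid_predict(mod, (X[:, g] - m) @ w)
                      for g, m, w, mod in zip(groups, means, weights, models)])
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

A typical call would be `fitted = emcca_fit(X_train, y_train)` followed by `emcca_predict(fitted, X_test)`, where `X_train` is an n-by-d feature matrix and `y_train` holds non-negative integer class labels. The nearest-centroid base classifier and the majority vote are stand-ins chosen for brevity; the paper may use a different base classifier or combination rule.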