Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (22): 126-129. DOI: 10.3778/j.issn.1002-8331.1605-0416

• Pattern Recognition and Artificial Intelligence •

Grassmann average algorithm based on sample weight

ZHONG Qian (钟倩), YANG Li (杨丽), LIANG Zhizhen (梁志贞)

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Online: 2017-11-15 Published: 2017-11-29

Abstract: The Grassmann average subspace corresponds to the leading principal component for Gaussian data and provides a scalable alternative to principal component analysis (PCA). However, the algorithm assumes that the contribution of each sample is determined by its length, which can make it sensitive to outliers. To address this, this paper builds sample weights from the local characteristics of the data in unsupervised learning, or from the class information of the samples in supervised learning, and proposes a Grassmann average algorithm based on sample weighting. Experimental results on UCI data sets and the ORL face database show that the new algorithm is more robust and improves the recognition rate by 1%~2% over existing methods.

Key words: sample weight, Grassmann average, principal component analysis (PCA), robustness
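
The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's weight formula, so everything below is an assumption made for illustration: the function names `grassmann_average` and `knn_weights`, the exponential nearest-neighbour weighting, and the parameters `k`, `iters`, and `tol` are hypothetical and are not taken from the paper. The sketch also assumes the samples are already centred and non-zero. It estimates a single direction by repeatedly flipping each sample into the half-space of the current estimate and averaging, with each sample's length-based contribution multiplied by an extra per-sample weight.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def knn_weights(samples, k=3):
    """Hypothetical local weights (not the paper's formula): a sample whose
    k nearest neighbours are far away receives a small weight."""
    mean_dists = []
    for i, x in enumerate(samples):
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        nearest = dists[:k]
        mean_dists.append(sum(nearest) / len(nearest))
    # Use the median neighbour distance as the length scale.
    scale = sorted(mean_dists)[len(mean_dists) // 2] or 1.0
    return [math.exp(-d / scale) for d in mean_dists]


def grassmann_average(samples, extra_weights=None, iters=100, tol=1e-10, seed=0):
    """Estimate one unit direction q as a weighted average of 1-D subspaces.

    Each sample x spans the subspace of u = x / ||x|| and contributes with
    weight ||x|| * s, where s comes from extra_weights (1.0 for every sample
    if extra_weights is None, i.e. the purely length-based weighting).
    """
    dim = len(samples[0])
    if extra_weights is None:
        extra_weights = [1.0] * len(samples)
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(_dot(q, q))
    q = [v / norm for v in q]
    for _ in range(iters):
        acc = [0.0] * dim
        for x, s in zip(samples, extra_weights):
            # Flip x so that it lies in the same half-space as q.
            sign = 1.0 if _dot(x, q) >= 0.0 else -1.0
            # ||x|| * u equals x, so the length weight is implicit in x.
            for j in range(dim):
                acc[j] += s * sign * x[j]
        norm = math.sqrt(_dot(acc, acc))
        if norm == 0.0:
            break
        new_q = [v / norm for v in acc]
        converged = math.dist(new_q, q) < tol
        q = new_q
        if converged:
            break
    return q
```

Calling `grassmann_average(samples)` gives the length-weighted estimate, while `grassmann_average(samples, knn_weights(samples))` down-weights isolated samples. For supervised data, class-based weights could be passed through the same `extra_weights` argument; how the paper actually constructs them is not stated in the abstract.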