Computer Engineering and Applications, 2010, Vol. 46, Issue (20): 172-175. DOI: 10.3778/j.issn.1002-8331.2010.20.048

• Artificial Intelligence •

Judgment of suitability of neighborhood size in manifold learning algorithms

SHAO Chao1, ZHANG Bin2, WAN Chun-hong1

  1. School of Computer and Information Engineering, Henan University of Finance and Economics, Zhengzhou 450002, China
  2. Experimental Teaching Center of Economics and Management, Henan University of Finance and Economics, Zhengzhou 450002, China
  • Received: 2010-04-14 Revised: 2010-05-18 Online: 2010-07-11 Published: 2010-07-11
  • Contact: SHAO Chao



Abstract: The success of manifold learning algorithms depends greatly on selecting a suitable neighborhood size, but how to do this efficiently remains an open problem. To address it, this paper proposes an efficient method for judging the suitability of a given neighborhood size, which in turn allows a suitable neighborhood size to be selected efficiently. Based on the locally Euclidean property of the manifold, the method uses the PCA (Principal Component Analysis) reconstruction error to measure the linearity of each neighborhood in the neighborhood graph. It then judges the suitability of the corresponding neighborhood size according to the number of clusters formed by all the PCA reconstruction errors in the neighborhood graph, which is detected in this paper by BIC (Bayesian Information Criterion). Because the method judges the suitability of a given neighborhood size without running the time-consuming manifold learning algorithm, it is much more efficient than methods based on residual variance. Experimental results verify the effectiveness of the method.

Key words: manifold learning, neighborhood size, Principal Component Analysis(PCA) reconstruction error, Bayesian Information Criterion(BIC)
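
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the brute-force k-nearest-neighbor search, the normalization of the error by total neighborhood variance, and the use of a 1-D Gaussian mixture fitted by EM for the BIC step are all assumptions made for illustration. The abstract does not spell out the decision rule; one plausible reading is that a single cluster of errors indicates uniformly linear neighborhoods, while several clusters suggest that some neighborhoods are too large.

```python
import numpy as np


def pca_reconstruction_errors(X, k, d):
    """Relative PCA reconstruction error of each point's k-NN neighborhood.

    X: (n, D) data; k: neighborhood size; d: assumed intrinsic dimension.
    All three parameter names are illustrative, not taken from the paper.
    """
    # Pairwise squared Euclidean distances (brute force; fine for small n).
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Indices of the k + 1 closest points (normally the point plus k neighbors).
    idx = np.argsort(dist2, axis=1)[:, :k + 1]
    errors = np.empty(len(X))
    for i, nb in enumerate(idx):
        centered = X[nb] - X[nb].mean(axis=0)
        var = np.linalg.svd(centered, compute_uv=False) ** 2
        total = var.sum()
        # Share of variance left outside the top-d principal directions.
        errors[i] = var[d:].sum() / total if total > 0 else 0.0
    return errors


def bic_num_clusters(values, max_k=4, n_iter=200):
    """Number of 1-D Gaussian mixture components with the lowest BIC."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    # Variance floor (an assumption) to keep components from collapsing.
    floor = max(1e-6 * x.var(), 1e-12)
    best_k, best_bic = 1, np.inf
    for k in range(1, max_k + 1):
        # Initialize means at evenly spaced quantiles with a shared variance.
        mu = np.quantile(x, (np.arange(k) + 0.5) / k)
        var = np.full(k, max(x.var(), floor))
        w = np.full(k, 1.0 / k)
        for _ in range(n_iter):
            # E-step: log of weighted component densities, shape (n, k).
            log_p = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
                     - 0.5 * (x[:, None] - mu) ** 2 / var)
            log_norm = np.logaddexp.reduce(log_p, axis=1)
            resp = np.exp(log_p - log_norm[:, None])
            # M-step: update weights, means, and variances.
            nk = resp.sum(axis=0) + 1e-12
            w = nk / n
            mu = (resp * x[:, None]).sum(axis=0) / nk
            var = np.maximum(
                (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, floor)
        # Log-likelihood from the last E-step; 3k - 1 free parameters.
        bic = -2.0 * log_norm.sum() + (3 * k - 1) * np.log(n)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k
```

Under these assumptions, a candidate neighborhood size k would be screened by calling `bic_num_clusters(pca_reconstruction_errors(X, k, d))` for each k of interest, without running the manifold learning algorithm itself.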

