Computer Engineering and Applications, 2011, Vol. 47, Issue (24): 182-185.

• Graphics, Images, and Pattern Recognition •

Fast fuzzy clustering algorithm based on L-ISOMAP for dimensional reduction

SUN Liping, DING Nan, WANG Yunzhong, MA Honglian

  1. Department of Electronic and Information Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Online: 2011-08-21 Published: 2011-08-21

Abstract: The fuzzy C-means (FCM) clustering algorithm is one of the most widely used algorithms in unsupervised pattern recognition. However, FCM requires a large amount of computation during its iterations. When the feature vectors are high-dimensional, clustering is not only inefficient but may also suffer from the "curse of dimensionality". To address this problem, this paper analyzes the behavior of FCM in high-dimensional feature analysis, where solving for the cluster centers is an NP-hard problem. To improve the effectiveness and real-time performance of FCM in high-dimensional feature analysis, an improved algorithm, FCM-LI, is proposed by combining FCM with the landmark isometric mapping (L-ISOMAP) algorithm. FCM-LI first performs a preliminary analysis of the samples, then uses the clustering results and the correlation among the sample data to reduce the dimensionality with L-ISOMAP, and finally performs a further analysis on the reduced data to obtain the final result. Experimental results demonstrate the effectiveness and real-time performance of FCM-LI in high-dimensional data analysis.

Key words: fuzzy C-means clustering, isometric feature mapping, nonlinear dimensionality reduction
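
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard FCM iteration that the abstract refers to, assuming squared Euclidean distance and a fuzzifier m > 1. It is not the FCM-LI algorithm proposed in the paper: the L-ISOMAP dimensionality-reduction step is omitted, and the function name `fcm` and its parameters are hypothetical, chosen only for illustration.

```python
import random

def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length numeric tuples.

    Illustrative sketch only; assumes m > 1 and squared Euclidean distance.
    Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w) or 1.0  # avoid division by zero for a weightless cluster
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update from squared distances to every center.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # Point coincides with one or more centers: crisp membership.
                zeros = [k for k in range(c) if d2[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                inv = [dk ** (-1.0 / (m - 1.0)) for dk in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u
```

Each iteration touches every point, every cluster, and every feature dimension, so the per-iteration cost grows with the dimension of the feature vectors. This is the cost that reducing the dimensionality before clustering is meant to lower.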
