Computer Engineering and Applications, 2011, Vol. 47, Issue 6: 176-180.

• Graphics, Image, and Pattern Recognition •

Kernel-based fast improved possibilistic C-means clustering method

HAN Xudong, XIA Shixiong, LIU Bing, ZHOU Yong

  1. College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-02-21 Published: 2011-02-21

一种基于核的快速可能性聚类算法

韩旭东,夏士雄,刘 兵,周 勇   

  1. 中国矿业大学 计算机学院,江苏 徐州 221116

Abstract: Most traditional fast clustering algorithms are based on the fuzzy C-means (FCM) algorithm. However, FCM is sensitive to the initial cluster centers and to noise, and it is prone to converging to local minima, which leads to low clustering accuracy. The possibilistic C-means clustering algorithm addresses the noise sensitivity of FCM but tends to find identical (coincident) clusters. A clustering algorithm that combines FCM with possibilistic C-means clustering solves this coincidence problem. To further improve the convergence speed and robustness of that algorithm, this paper proposes a kernel-based fast improved possibilistic C-means clustering algorithm. The algorithm introduces the idea of kernel clustering and, at the same time, uses the sample variance to optimize the parameter η in the objective function. Experimental results on standard and synthetic data sets show that the proposed algorithm improves both clustering accuracy and convergence speed.

Key words: fuzzy C-means clustering, possibilistic clustering, kernel clustering
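
The abstract names two ingredients, kernel clustering and a variance-based choice of η, without giving formulas. The sketch below is therefore not the paper's algorithm. It is a minimal illustration of a generic kernel possibilistic C-means loop under stated assumptions: a Gaussian kernel, the kernel-induced distance 2(1 - K(x, v)), the classical possibilistic typicality update, and a hypothetical helper `eta_from_variance` that uses a feature-space analogue of the sample variance as η. All function names and default parameters are illustrative.

```python
import math


def gaussian_kernel(x, v, sigma2):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma2)


def kernel_distance2(x, v, sigma2):
    """Squared feature-space distance; equals 2 * (1 - K) because K(x, x) = 1."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma2))


def centroid(points):
    """Coordinate-wise mean of the points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sample_variance(points):
    """Mean squared Euclidean distance of the points from their centroid."""
    mean = centroid(points)
    return sum(sum((a - b) ** 2 for a, b in zip(p, mean)) for p in points) / len(points)


def eta_from_variance(points, sigma2):
    """Hypothetical choice of eta: mean kernel-induced distance to the centroid."""
    mean = centroid(points)
    return sum(kernel_distance2(p, mean, sigma2) for p in points) / len(points)


def typicality(d2, eta, m=2.0):
    """Classical possibilistic typicality: 1 / (1 + (d2 / eta)^(1 / (m - 1)))."""
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def kernel_pcm(points, centers, sigma2, eta, m=2.0, max_iter=100, tol=1e-8):
    """Alternate typicality and center updates until the centers stop moving."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        shift = 0.0
        new_centers = []
        for c in centers:
            ks = [gaussian_kernel(x, c, sigma2) for x in points]
            # Weight of each point: typicality^m times its kernel value.
            ws = [typicality(2.0 * (1.0 - k), eta, m) ** m * k for k in ks]
            total = sum(ws)
            if total == 0.0:
                new = c  # all kernel values underflowed; keep the old center
            else:
                new = [sum(w * x[d] for w, x in zip(ws, points)) / total
                       for d in range(len(c))]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, c)))
            new_centers.append(new)
        centers = new_centers
        if shift < tol:
            break
    typ = [[typicality(kernel_distance2(x, c, sigma2), eta, m) for x in points]
           for c in centers]
    return centers, typ
```

A typical call under these assumptions would set `sigma2 = sample_variance(points)` and `eta = eta_from_variance(points, sigma2)`, then pass initial centers taken from a prior FCM run or from the data. Because the typicality of a point depends only on its own distance to a center, a far-away noise point receives a low weight rather than pulling the center toward it, which is the robustness property the abstract attributes to possibilistic clustering.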

摘要: 传统的快速聚类算法大多基于模糊C均值算法(Fuzzy C-means,FCM),而FCM对初始聚类中心敏感,对噪音数据敏感并且容易收敛到局部极小值,因而聚类准确率不高。可能性C-均值聚类较好地解决了FCM对噪声敏感的问题,但容易产生一致性聚类。将FCM和可能性C-均值聚类结合的聚类算法较好地解决了一致性聚类问题。为进一步提高算法收敛速度和鲁棒性,提出一种基于核的快速可能性聚类算法。该方法引入核聚类的思想,同时使用样本方差对目标函数中参数η进行优化。标准数据集和人造数据集的实验结果表明这种基于核的快速可能性聚类算法提高了算法的聚类准确率,加快了收敛速度。

关键词: 模糊C-均值聚类, 可能性聚类, 核聚类