Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (6): 178-181.

• Database and Information Processing •

K-means based on shared nearest neighbor

SHAN Shi-min (单世民), YU Hong (于红), ZHANG Ye-jia-cheng (张业嘉诚), LIU Xin-yue (刘馨月)

  1. Software College, Dalian University of Technology, Dalian, Liaoning 116620, China
  • Received: 2007-06-12 Revised: 2007-08-24 Online: 2008-02-21 Published: 2008-02-21
  • Contact: SHAN Shi-min

Abstract: Cluster analysis is an important data mining method, and the K-means clustering algorithm is widely applied in the field. However, K-means requires the number of clusters to be set manually and is prone to converging to local optima. To address these shortcomings, this paper proposes a K-means clustering algorithm based on shared nearest neighbors (KSNN). KSNN searches the data set for core nodes (center points), determines the number of clusters from them, and supplies that number to K-means as its parameter automatically. This removes the need to set the number of clusters by hand and also gives better global convergence. Experimental results show that KSNN achieves better clustering results than K-means, particle swarm K-means (PSO), and the multi-center (multiseed) clustering algorithm (MCA).
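The abstract does not give the details of KSNN, so the following is only a rough sketch of the general idea it describes: derive the number of clusters and the starting centers from a shared-nearest-neighbor graph, then hand them to ordinary K-means. It is not the authors' algorithm. The function names (`snn_seed_centers`, `kmeans`), the linking rule, and the parameters `k`, `min_shared`, and `min_size` are all assumptions made for illustration.

```python
import numpy as np


def snn_seed_centers(X, k=10, min_shared=5, min_size=5):
    """Estimate cluster count and seed centers from a shared-nearest-neighbor graph.

    Assumed rule (not taken from the paper): two points are linked when one is
    among the other's k nearest neighbors and their neighbor lists share at
    least `min_shared` points. Each connected group with at least `min_size`
    points contributes one seed center (the group mean).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)

    # k nearest neighbors of every point, excluding the point itself.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]

    # member[i, j] is True when j is in the k-NN list of i.
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], nbrs] = True

    # shared[i, j] = number of neighbors that i and j have in common.
    m = member.astype(int)
    shared = m @ m.T
    linked = (member | member.T) & (shared >= min_shared)

    # Connected components of the link graph, found by depth-first search.
    group = -np.ones(n, dtype=int)
    n_groups = 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = n_groups
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(linked[i]):
                if group[j] == -1:
                    group[j] = n_groups
                    stack.append(j)
        n_groups += 1

    # Small groups are treated as noise and do not produce a seed.
    centers = [X[group == g].mean(axis=0)
               for g in range(n_groups) if np.sum(group == g) >= min_size]
    return np.array(centers)


def _nearest_center(X, centers):
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations started from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = _nearest_center(X, centers)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return _nearest_center(X, centers), centers
```

A call such as `labels, centers = kmeans(X, snn_seed_centers(X))` then runs K-means without a hand-picked number of clusters. The result still depends on `k`, `min_shared`, and `min_size`, so in this sketch the tuning burden moves to those parameters and does not disappear.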