Computer Engineering and Applications, 2018, Vol. 54, Issue 7: 152-158. DOI: 10.3778/j.issn.1002-8331.1610-0342


K-means algorithm of clustering number and centers self-determination

JIA Ruiyu, LI Yugong   

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
  • Online: 2018-04-01 Published: 2018-04-16

类簇数目和初始中心点自确定的K-means算法

贾瑞玉,李玉功   

  1. 安徽大学 计算机科学与技术学院,合肥 230601

Abstract: K-means is a classical partition-based clustering algorithm. However, the number of clusters is difficult to determine in advance, and the result is sensitive to the choice of initial cluster centers. To address these two defects, an improved K-means algorithm is proposed. The main contribution is a new method of calculating the density of each sample object, combined with residual analysis of the decision diagram to obtain the initial cluster centers and the number of clusters automatically. Experimental results show that the algorithm obtains better clustering results.

Key words: clustering, local density, decision diagram, residual analysis
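
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's density definition or its exact residual rule, so everything below is an assumption: local density uses the Gaussian kernel familiar from density-peaks clustering, and "residual analysis" is read as fitting a straight line of delta against rho and treating points with large standardized residuals as centers. The names `decision_values`, `pick_centers`, `kmeans`, `dc` and `threshold` are hypothetical and do not come from the paper.

```python
import numpy as np


def decision_values(X, dc):
    """Return (rho, delta), the two axes of a density-peaks style decision diagram."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # ASSUMPTION: Gaussian-kernel density with cutoff dc (self-term removed).
    # The paper defines its own density measure, which the abstract does not state.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices from densest to sparsest
    delta = np.empty(len(X))
    delta[order[0]] = d[order[0]].max()  # densest point: farthest distance
    for k in range(1, len(X)):
        i = order[k]
        delta[i] = d[i, order[:k]].min()  # distance to nearest denser point
    return rho, delta


def pick_centers(rho, delta, threshold=3.0):
    """ASSUMPTION: centers are outliers of a linear fit of delta on rho."""
    slope, intercept = np.polyfit(rho, delta, 1)
    resid = delta - (slope * rho + intercept)
    z = resid / resid.std()  # standardized residuals
    return np.flatnonzero(z > threshold)


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations started from the supplied centers."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties out
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=c, scale=0.3, size=(40, 2))
                   for c in ([0, 0], [6, 0], [0, 6])])
    rho, delta = decision_values(X, dc=0.5)
    idx = pick_centers(rho, delta)
    labels, centers = kmeans(X, X[idx])
    print("clusters found:", len(idx))
```

In this sketch the number of clusters is never passed in: it falls out as the number of points flagged by `pick_centers`, and those same points seed the K-means iterations. The cutoff `dc` and the residual `threshold` are tuning choices of the sketch, not values reported by the authors.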
