Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (2): 157-159.

• Database, Data Mining, Machine Learning •

Improved hierarchical K-means clustering algorithm

HU Wei

  1. Experimental Teaching Center, Shanxi University of Finance and Economics, Taiyuan 030006, China
  • Online: 2013-01-15 Published: 2013-01-16

Abstract: Traditional K-means clustering requires the number of clusters K to be chosen before clustering, often arbitrarily, and a poor choice leads to unsatisfactory clustering results. To address this problem, this paper proposes an improved hierarchical K-means clustering algorithm that exploits the hierarchical structure of the data space. The method first performs a preliminary K-means clustering and checks whether the result is satisfactory, which determines whether clustering continues at a finer level. Repeating this step produces a hierarchical K-means clustering tree, on which the number of clusters can be selected automatically. Experimental results on standard UCI datasets show that the proposed method achieves better clustering results than the traditional K-means method.

Key words: K-means clustering, number of clusters, hierarchical structure, hierarchical K-means algorithm, clustering tree
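The iterative procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not state how a clustering result is judged satisfactory, nor how many children each node receives, so the choices below are assumptions for illustration only and not the paper's method: each node is split into two parts (`branch=2`), and a cluster is accepted as a leaf once its mean squared deviation from its centroid is at most `tol`. All names (`build_tree`, `leaves`, `tol`, `Node`) are hypothetical.

```python
from dataclasses import dataclass, field


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def kmeans(points, k=2, iters=50):
    """Plain Lloyd's K-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[idx].append(p)
        # An empty group keeps its previous center.
        new_centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return [g for g in groups if g]


@dataclass
class Node:
    points: list
    children: list = field(default_factory=list)


def build_tree(points, tol, branch=2):
    """Split recursively until every leaf is compact enough.

    Assumed stopping rule (not from the paper): a cluster is accepted
    when its mean squared deviation from the centroid is <= tol.
    """
    node = Node(points)
    if len(points) <= 1 or sse(points) / len(points) <= tol:
        return node
    parts = kmeans(points, branch)
    if len(parts) < 2:  # degenerate split, keep the node as a leaf
        return node
    node.children = [build_tree(part, tol, branch) for part in parts]
    return node


def leaves(node):
    """Leaf clusters of the tree; their count is the selected K."""
    if not node.children:
        return [node.points]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy data: three tight groups of four points each.
data = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]
# Tolerance set to 10% of the spread of the whole data set (an assumption).
tree = build_tree(data, tol=0.1 * sse(data) / len(data))
print(len(leaves(tree)))
```

In this sketch the number of clusters is never supplied by the caller: it is read off as the number of leaves of the tree. The tolerance still has to be chosen, so the sketch trades the choice of K for the choice of a compactness threshold.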