计算机工程与应用 ›› 2014, Vol. 50 ›› Issue (22): 149-153.

• 数据库、数据挖掘、机器学习 • 上一篇    下一篇

改进的最小生成树自适应分层聚类算法

徐晨凯,高茂庭   

  1. 上海海事大学 信息工程学院,上海 201306
  • 出版日期:2014-11-15 发布日期:2014-11-13

Improved adaptive hierarchical clustering algorithm based on minimum spanning tree

XU Chenkai, GAO Maoting   

  1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
  • Online:2014-11-15 Published:2014-11-13

摘要: 针对传统最小生成树聚类算法需要事先知道聚类数目和使用静态全局分类依据,导致聚类密度相差较大时,算法有效性下降,计算复杂度大等问题,提出一种改进的最小生成树自适应分层聚类算法,根据最近邻关系,自动为每个聚类簇设定独立的阈值,使之适应分布密度相差较大的情况,并能自动确定聚类数目。实验表明,算法具有较好的性能,尤其对数据密度分布不均匀的情况也能得到较好的聚类结果。

关键词: 最近邻, 自适应聚类, 最小生成树, 聚类分析

Abstract: Classical clustering algorithm based on the minimum spanning tree often needs to know the number of clusters beforehand and use static global threshold to cluster, which leads to the performance of the algorithm low and the computation complex for the uneven distributed data. An improved adaptive hierarchical clustering algorithm based on minimum spanning tree is proposed, which automatically generates different thresholds for every cluster to adapt for the uneven distributed data according to the nearest neighbor relationship and adaptively determines the number of clusters. Experiments demonstrate that this algorithm has good performance, especially could cluster effectively for the uneven distributed data.

Key words: nearest neighbor, adaptive clustering, minimum spanning tree, clustering analysis