Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (27): 120-123. DOI: 10.3778/j.issn.1002-8331.2010.27.033

• Database, Signal and Information Processing •

Research on cluster ensembles methods based on hierarchical clustering

LI Kai, WANG Lan

  1. School of Mathematics and Computer Science, Hebei University, Key Lab in Machine Learning and Computational Intelligence of Hebei Province, Baoding, Hebei 071002, China
  • Received: 2009-03-03 Revised: 2009-05-11 Online: 2010-09-21 Published: 2010-09-21
  • Contact: LI Kai

Abstract: Cluster ensembles are regarded as a more robust and accurate alternative to a single clustering run. A cluster ensemble consists of two main stages: generating the individual members and fusing their results. In this paper, the individual members are obtained with the k-means clustering algorithm, and their results are fused by hierarchical clustering. Three consensus functions are studied for the fusion step: single linkage, complete linkage, and average linkage. The Adjusted Rand Index (ARI) is used to evaluate the performance of the cluster ensembles. Experimental results show that cluster ensembles fused with average linkage outperform those fused with single linkage or complete linkage. The relationship between clustering accuracy and ensemble size is also studied and discussed for the three fusion methods.
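The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. Several parts of it are assumptions rather than details taken from the paper: the abstract does not say how the k-means partitions are passed to the hierarchical step, so the sketch uses a co-association matrix (the fraction of members that place two points in the same cluster), a common choice for this kind of fusion. The Iris data set, the ensemble size of 20, k = 3, and the helper names `co_association` and `consensus_labels` are likewise illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score


def co_association(X, n_members, k, seed=0):
    """Fraction of k-means runs in which each pair of points shares a cluster.

    Assumption: the paper's abstract does not specify this representation.
    """
    n = X.shape[0]
    co = np.zeros((n, n))
    for m in range(n_members):
        # One randomly initialized k-means run per ensemble member.
        labels = KMeans(n_clusters=k, n_init=1, random_state=seed + m).fit_predict(X)
        co += labels[:, None] == labels[None, :]
    return co / n_members


def consensus_labels(co, n_clusters, method):
    """Fuse the ensemble by hierarchical clustering on 1 - co-association."""
    dist = 1.0 - co
    np.fill_diagonal(dist, 0.0)
    # linkage() expects a condensed distance vector.
    Z = linkage(squareform(dist, checks=False), method=method)
    # Cut the dendrogram into at most n_clusters flat clusters.
    return fcluster(Z, t=n_clusters, criterion="maxclust")


# Illustrative data set and settings (not taken from the paper).
X, y = load_iris(return_X_y=True)
co = co_association(X, n_members=20, k=3)

# Compare the three linkage rules with the Adjusted Rand Index.
scores = {
    method: adjusted_rand_score(y, consensus_labels(co, 3, method))
    for method in ("single", "complete", "average")
}
for method, ari in scores.items():
    print(f"{method:>8} linkage: ARI = {ari:.3f}")
```

Rerunning the sketch with different values of `n_members` gives a rough way to explore the relationship between ensemble size and accuracy that the abstract mentions, although the numbers it produces are not the paper's results.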
