Computer Engineering and Applications, 2010, Vol. 46, Issue (33): 121-123. DOI: 10.3778/j.issn.1002-8331.2010.33.034

• Database, Signal and Information Processing •

Fast clustering algorithm based on representative points

JIA Rui-yu (贾瑞玉), GENG Jin-wei (耿锦威), NING Zai-zao (宁再早), HE Cheng-gang (何成刚)

  1. School of Computer Science and Technology, Anhui University, Hefei 230039, China
  • Received: 2010-05-24 Revised: 2010-08-27 Online: 2010-11-21 Published: 2010-11-21
  • Contact: JIA Rui-yu

Abstract: Traditional hierarchical clustering algorithms merge only the single closest pair of clusters in each iteration, are easily affected by outliers, and tend to find only convex or spherical clusters. Inspired by the CURE algorithm, this paper represents each cluster by a fixed number of representative points when computing inter-cluster distances and, combined with the 90_10 rule, proposes an improved hierarchical clustering algorithm named REPBFC (REpresentative Points Based Fast Clustering). Experiments show that the algorithm is effective.

Key words: 90_10 rule, multistage clustering, clustering algorithm
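
The representative-point idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' REPBFC implementation: the abstract does not specify how the 90_10 rule or the multistage scheme work, so neither is modeled here. The sketch only shows the CURE-style step of describing each cluster by a fixed number of scattered points and measuring inter-cluster distance between those points. The function names (`representatives`, `cluster_distance`, `agglomerate`) and the parameters `count` and `shrink` are hypothetical, and shrinking representatives toward the centroid is taken from CURE as an assumption.

```python
import math
from itertools import combinations


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def representatives(points, count=4, shrink=0.3):
    """Pick up to `count` well-scattered points, then shrink them toward
    the centroid (CURE-style; `count` and `shrink` are assumed values)."""
    center = centroid(points)
    distinct = list(dict.fromkeys(points))  # drop duplicates, keep order
    # Start with the point farthest from the centroid, then repeatedly add
    # the point farthest from the representatives chosen so far.
    chosen = [max(distinct, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(count, len(distinct)):
        chosen.append(max(
            (p for p in distinct if p not in chosen),
            key=lambda p: min(math.dist(p, r) for r in chosen),
        ))
    return [tuple(x + shrink * (c - x) for x, c in zip(p, center))
            for p in chosen]


def cluster_distance(reps_a, reps_b):
    """Distance between two clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)


def agglomerate(points, k, count=4, shrink=0.3):
    """Naive agglomerative loop: merge the closest pair until k clusters remain.
    This is the one-pair-per-iteration baseline, not the paper's fast variant."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        reps = [representatives(c, count, shrink) for c in clusters]
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(reps[ij[0]], reps[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters
```

Because only `count` points per cluster enter each distance computation, the cost of comparing two clusters does not grow with cluster size, which is the property the abstract attributes to using a fixed number of representative points.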

CLC number: