Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (12): 100-111.


Clustering algorithm research advances on data mining

ZHOU Tao, LU Huiling   

  1. School of Science, Ningxia Medical University, Yinchuan, Ningxia 750004, China
  • Online: 2012-04-21 Published: 2012-04-20

数据挖掘中聚类算法研究进展

周  涛,陆惠玲   

  1. 宁夏医科大学 理学院,宁夏 银川 750004

Abstract: Clustering analysis is one of the important research branches of data mining. This paper describes clustering criteria and similarity measures, summarizes five kinds of traditional clustering algorithms, and points out their latest developments. More than 20 newer clustering algorithms are then explained and summarized from six aspects: the membership relation of samples, sample data preprocessing, the similarity measure between samples, the sample update strategy, the high dimensionality of samples, and integration with other disciplines. The algorithms covered include granular clustering, uncertainty clustering, quantum clustering, kernel clustering, spectral clustering, clustering ensemble, concept clustering, spherical shell clustering, and affinity propagation clustering. The survey offers a useful summary of the field and should be of positive significance for the further development of clustering.

Key words: data mining, clustering algorithm, clustering criterion

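Abstract (translated from the Chinese): Clustering analysis is one of the important research topics in data mining. This paper summarizes clustering criteria and gives a fairly comprehensive review of the current state and progress of five classes of traditional clustering algorithms. It also surveys a number of newer clustering algorithms. More than 20 of them, including granular clustering, uncertainty clustering, quantum clustering, kernel clustering, spectral clustering, clustering ensemble, concept clustering, spherical shell clustering, affinity propagation clustering, and data stream clustering, are each summarized in detail from six aspects: sample membership relation, sample data preprocessing, sample similarity measure, sample update strategy, high dimensionality of samples, and integration with other disciplines. This is a useful summary of clustering and is of positive significance for its development.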

Key words (translated from the Chinese): data mining, clustering algorithm, clustering criterion