Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (14): 15-26.DOI: 10.3778/j.issn.1002-8331.2103-0547

Previous Articles     Next Articles

Survey of Spectral Clustering Algorithms

BAI Lu, ZHAO Xin, KONG Yuting, ZHANG Zhenghang, SHAO Jinxin, QIAN Yurong   

  1. 1.College of Software, Xinjiang?University, Urumqi 830046, China
    2.Key Laboratory of Software Engineering, Xinjiang?University, Urumqi 830046, China
    3.Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi 830046, China
  • Online:2021-07-15 Published:2021-07-14

谱聚类算法研究综述

白璐,赵鑫,孔钰婷,张正航,邵金鑫,钱育蓉   

  1. 1.新疆大学 软件学院,乌鲁木齐 830046
    2.新疆大学 软件学院重点实验室,乌鲁木齐 830046
    3.新疆维吾尔自治区信号检测与处理重点实验室,乌鲁木齐 830046

Abstract:

Cluster analysis is a common analysis method. As a branch of cluster analysis, spectral clustering has attracted much attention because of its characteristics such as not being restricted by sample shape. In order to timely grasp the current research trends of spectral clustering algorithm, the spectral clustering optimization algorithms are divided into three categories from three perspectives:semi-supervised learning, two-stage clustering algorithm selection and algorithm execution efficiency optimization, and the optimization ideas of each category of algorithms are summarized. Firstly, the classical k-way spectral clustering and its basic theory are introduced, the reasons and influences of the selection of similarity matrix, eigenvalues and eigenvectors are introduced and analyzed. The purpose is to clarify its importance in the clustering process and the necessity of optimization in this part. Furthermore, based on the difference of algorithm improvement strategies, it sorts out and summarizes the improvement ideas, research status, advantages and disadvantages of each type of algorithm. Finally, the spectral clustering algorithm and optimization algorithms are compared on UCI data sets and handwritten data sets, and the future research trends in spectral clustering optimization algorithm are discussed.

Key words: clustering algorithm, spectral clustering;[K]-means, affinity matrix

摘要:

聚类分析是一种常见的分析方法,谱聚类作为聚类分析的一支,因其不受样本形状约束等特点备受瞩目。为及时掌握当前谱聚类算法研究动态,通过对比分析众多谱聚类优化算法,从半监督学习、二阶段聚类算法选择、算法执行效率优化等三个角度,将谱聚类优化算法分为三类,并对每类算法的优化思想进行综述。介绍经典多路谱聚类与基本理论,并分析相似矩阵及其特征值、特征向量选取原因及影响,旨在明确特征矩阵的重要性与优化的必要性。基于算法改进策略差异,梳理并总结每类算法的改进思想、研究现状及优缺点。在UCI数据集与手写体数据集上,针对谱聚类算法与优化算法进行实验对比,并对谱聚类优化算法的未来研究方向进行展望。

关键词: 聚类算法, 谱聚类算法, [K]-均值算法, 相似矩阵