Survey on K-Means Clustering Algorithm

doi:10.3778/j.issn.1002-8331.1908-0347

Computer Engineering and Applications, 2019, Vol. 55, Issue (23): 7-14. DOI: 10.3778/j.issn.1002-8331.1908-0347

YANG Junchuang, ZHAO Chao

College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056038, China

Online: 2019-12-01 Published: 2019-12-11

Abstract: K-Means is a partition-based algorithm for cluster analysis and an unsupervised learning method. Because its idea is simple, it is easy to implement, and it works well in practice, it is widely used in machine learning and related fields. It also has limitations: the number of clusters K is hard to determine, the initial cluster centers must be chosen, outliers must be detected and removed, and a distance or similarity measure must be selected. This paper summarizes improvements to K-Means from several of these aspects, compares them with the classical K-Means algorithm, analyzes the advantages and disadvantages of the improved algorithms, and points out the problems that remain. Finally, it discusses the development directions and trends of K-Means.

Key words: K-Means, clustering algorithm, cluster center, outliers
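
For readers who want a concrete picture of the classical algorithm that the abstract refers to, the following is a minimal sketch of the usual assign-then-update iteration. It is not taken from the paper. The function names, the toy data, and the choice of squared Euclidean distance are assumptions made for illustration, and K is implied by the number of initial centers passed in.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumed choice of measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], centers: List[Point], max_iter: int = 100):
    """Assign-then-update iteration; K equals the number of initial centers."""
    centers = [tuple(c) for c in centers]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(old)
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, labels


# Hypothetical toy data: two well-separated groups in 2-D.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
final_centers, final_labels = kmeans(data, centers=[(0.0, 0.0), (10.0, 10.0)])
print(final_centers)
print(final_labels)
```

In this sketch both K and the starting centers are inputs supplied by the caller, and every point contributes equally to the mean. Those are the places where the limitations listed in the abstract (choosing K, choosing initial centers, handling outliers, picking a distance measure) would enter.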

YANG Junchuang, ZHAO Chao. Survey on K-Means Clustering Algorithm[J]. Computer Engineering and Applications, 2019, 55(23): 7-14.
