Study on K-means Clustering Algorithm of Quadratic Power Coupling

Computer Engineering and Applications, 2021, Vol. 57, Issue (14): 95-102. DOI: 10.3778/j.issn.1002-8331.2006-0318

XIANG Yixuan, JIANG He, PAN Pinchen, SUN Conghui

School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China

Online: 2021-07-15  Published: 2021-07-14

Abstract

Clustering research usually assumes that the objects and attributes of a data set are independent and identically distributed (IID) and do not affect one another. In practice, latent relationships exist among them, so the data are non-IID. To mine these relationships, the data set is raised to the second power and Pearson correlation coefficients are computed, yielding a quadratic-power-coupled data set. To address the sensitivity of the K-means clustering algorithm to the choice of initial centers, a density-based approach computes density parameters to adjust the high-density region and then selects the centers through clustering iteration. The point of maximum density in the high-density region is taken as the first center, and the point farthest from it as the second. Clustering iteration with these two points as centers yields two centroids, the point farthest from the two centroids becomes the third center, and so on for the remaining centers. Experimental results show that the proposed algorithm achieves higher accuracy, fewer iterations, and relatively good clustering quality.

Key words: non-IID (non-independent and identically distributed), quadratic power coupling, Pearson correlation coefficient, clustering iteration, K-means clustering algorithm
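
The abstract describes the procedure only in words, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes that coupling means appending squared attributes and weighting them by their Pearson correlations, that the density parameter counts the points within the mean pairwise distance, and that "farthest" means the largest distance to the nearest current centroid. All function names are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def quadratic_power_coupling(data):
    """Append squared attributes, then mix attributes by Pearson correlation.

    Assumption: each coupled attribute is the correlation-weighted sum of all
    expanded attributes; the paper's exact formula may differ.
    """
    expanded = [list(row) + [x * x for x in row] for row in data]
    cols = list(zip(*expanded))
    corr = [[pearson(ci, cj) for cj in cols] for ci in cols]
    return [[sum(r * x for r, x in zip(corr_row, row)) for corr_row in corr]
            for row in expanded]


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations from the given centers; returns the centroids."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[idx].append(p)
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else list(old)  # keep the old center if a cluster is empty
            for members, old in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def select_initial_centers(points, k):
    """Pick k initial centers: densest point first, then farthest points."""
    n = len(points)
    pair = [[dist(p, q) for q in points] for p in points]
    # Assumed density parameter: neighbours within the mean pairwise distance.
    radius = sum(map(sum, pair)) / (n * (n - 1))
    density = [sum(1 for d in row if d <= radius) for row in pair]
    centers = [list(points[density.index(max(density))])]
    while len(centers) < k:
        # Assumed rule: farthest = largest distance to the nearest centroid.
        far = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(list(far))
        if len(centers) < k:
            # Refine the current centers into centroids before the next pick.
            centers = kmeans(points, centers)
    return centers
```

A typical use would pass the output of `quadratic_power_coupling` to `select_initial_centers` and then run `kmeans` from the returned centers. Whether the earlier centers should stay as refined centroids or as the originally selected points is not stated in the abstract; the sketch keeps the centroids.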

XIANG Yixuan, JIANG He, PAN Pinchen, SUN Conghui. Study on K-means Clustering Algorithm of Quadratic Power Coupling[J]. Computer Engineering and Applications, 2021, 57(14): 95-102.


