Improved K-Prototypes Clustering Algorithm

doi:10.3778/j.issn.1002-8331.1912-0106

Abstract

The K-Prototypes clustering algorithm requires the initial cluster centers and the number of clusters to be specified manually, which lowers its accuracy and stability. To address this, the paper proposes a K-Prototypes clustering algorithm based on density optimization. The algorithm adaptively sets the number of clusters and the initial cluster centers according to the density distribution of the data objects. It also improves the dissimilarity calculation formula by distinguishing the different influence weights of each attribute on the clustering result, which raises clustering accuracy. Experimental results on synthetic and UCI data sets show that the proposed method achieves better clustering results: compared with K-Prototypes, DPCM and Fuzzy K-Prototypes, its average accuracy is improved by 8.52%, 4.28% and 8.33%, respectively.

Key words: clustering algorithm, initial center points, density peak, mixed attributes
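
The two ideas named in the abstract, an attribute-weighted dissimilarity for mixed data and density-based selection of initial centers, can be illustrated with a minimal Python sketch. This is not the paper's formulation: the abstract gives no formulas, so the sketch assumes a squared Euclidean term for numeric attributes, a weighted mismatch count for categorical attributes, and a density-peak style score (local density times distance to the nearest denser point). The function names, the weight vectors `w_num` and `w_cat`, and the `cutoff` parameter are all hypothetical, and the number of clusters `k` is passed in rather than chosen adaptively.

```python
import numpy as np

def mixed_distance(x_num, x_cat, y_num, y_cat, w_num, w_cat):
    """Weighted dissimilarity between two mixed-type records (assumed form)."""
    # Numeric part: weighted squared differences.
    num_part = np.sum(w_num * (x_num - y_num) ** 2)
    # Categorical part: each mismatching attribute contributes its weight.
    cat_part = np.sum(w_cat * (x_cat != y_cat))
    return float(num_part + cat_part)

def density_peak_scores(num, cat, w_num, w_cat, cutoff):
    """Score each record by local density times distance to a denser record."""
    n = len(num)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = mixed_distance(num[i], cat[i], num[j], cat[j], w_num, w_cat)
            dist[i, j] = dist[j, i] = d
    # Local density: number of other records closer than the cutoff.
    rho = (dist < cutoff).sum(axis=1) - 1
    delta = np.empty(n)
    for i in range(n):
        denser = np.where(rho > rho[i])[0]
        # Records with no denser neighbour get their largest distance instead.
        delta[i] = dist[i, denser].min() if denser.size else dist[i].max()
    return rho * delta

def pick_initial_centers(num, cat, w_num, w_cat, cutoff, k):
    """Return the indices of the k highest-scoring records."""
    scores = density_peak_scores(num, cat, w_num, w_cat, cutoff)
    return np.argsort(scores)[-k:]
```

Records with a high score are both dense and far from any denser record, which is why they are plausible starting prototypes. In this sketch the weights are supplied by the caller; how the paper derives them is not stated in the abstract.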


SUN Zhiran, SU Hang, LIANG Yi. Improved K-Prototypes Clustering Algorithm[J]. Computer Engineering and Applications, 2020, 56(21): 54-59.


