K-means clustering algorithm based on coefficient of variation

Abstract

The performance of the k-means clustering algorithm depends on the choice of distance metric. Euclidean distance is the most commonly used similarity measure in k-means, but it treats all features equally and therefore does not accurately reflect the dissimilarity among samples. To address this problem, this paper proposes a k-means clustering algorithm based on the coefficient of variation (CV-k-means). CV-k-means uses a coefficient-of-variation weight vector to reduce the effect of irrelevant features. The experimental results show that the proposed algorithm produces better clustering results than the k-means algorithm.

Key words: k-means clustering, dissimilarity measure, weighting, coefficient of variation
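
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each feature's weight is its coefficient of variation (standard deviation divided by the absolute mean) normalized to sum to 1, and that the weights enter a weighted squared Euclidean distance. The names `cv_weights` and `cv_kmeans` are hypothetical.

```python
import numpy as np


def cv_weights(X):
    """Assumed weighting: per-feature coefficient of variation, normalized to sum to 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    cv = std / np.abs(mean)  # assumes no feature has a zero mean
    return cv / cv.sum()


def weighted_sq_dist(X, centers, w):
    """Weighted squared Euclidean distance from every sample to every center."""
    diff = X[:, None, :] - centers[None, :, :]
    return (w * diff ** 2).sum(axis=2)


def cv_kmeans(X, k, n_iter=100, seed=0):
    """Lloyd-style k-means that uses the CV-weighted distance for assignment."""
    rng = np.random.default_rng(seed)
    w = cv_weights(X)
    # Start from k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = weighted_sq_dist(X, centers, w).argmin(axis=1)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    labels = weighted_sq_dist(X, centers, w).argmin(axis=1)
    return labels, centers, w
```

Under this assumed scheme, a feature that varies little relative to its mean receives a small weight and contributes little to the distance, which is one way to read the abstract's goal of reducing the influence of irrelevant features.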

FAN Alin, REN Shuhua. K-means clustering algorithm based on coefficient of variation[J]. Computer Engineering and Applications, 2012, 48(35): 114-117.

范阿琳，任树华. 一种融合变异系数的k-mean聚类分析方法[J]. 计算机工程与应用, 2012, 48(35): 114-117.
