Computer Engineering and Applications, 2015, Vol. 51, Issue (11): 124-128.

• Databases, Data Mining, Machine Learning •

模糊C-均值聚类算法的优化

熊拥军,刘卫国,欧鹏杰   

  1. 中南大学 信息科学与工程学院,长沙 410083
  • 出版日期:2015-06-01 发布日期:2015-06-12

New optimized fuzzy C-means clustering algorithm

XIONG Yongjun, LIU Weiguo, OU Pengjie   

  1. School of Information Science and Engineering, Central South University, Changsha 410083, China
  • Online: 2015-06-01 Published: 2015-06-12

摘要: 针对传统模糊C-均值聚类算法(FCM算法)初始聚类中心选择的随机性和距离向量公式应用的局限性,提出一种基于密度和马氏距离优化的模糊C-均值聚类算法(Fuzzy C-Means Based on Mahalanobis and Density,FCMBMD算法)。该算法通过计算样本点的密度来确定初始聚类中心,避免了初始聚类中心随机选取而产生的聚类结果的不稳定;采用马氏距离计算样本集的相似度,以满足不同度量单位数据的要求。实验结果表明,FCMBMD算法在聚类中心、收敛速度、迭代次数以及准确率等方面具有良好的效果。

关键词: 聚类, 模糊C-均值, 密度函数, 马氏距离, 基于密度和马氏距离优化的模糊C-均值聚类(FCMBMD)算法

Abstract: To address the random selection of initial cluster centers and the limitations of the distance formula in the traditional fuzzy C-means clustering algorithm (FCM), an optimized fuzzy C-means clustering algorithm based on density and the Mahalanobis distance (Fuzzy C-Means Based on Mahalanobis and Density, FCMBMD) is proposed. The algorithm determines the initial cluster centers by computing the density of the sample points, which avoids the unstable clustering results caused by randomly chosen initial centers. It measures the similarity between samples with the Mahalanobis distance, so that data expressed in different units of measurement can be handled. Experimental results show that FCMBMD performs well in terms of cluster centers, convergence speed, number of iterations, and accuracy.

Key words: clustering, Fuzzy C-Means (FCM), density function, Mahalanobis distance, Fuzzy C-Means Based on Mahalanobis and Density (FCMBMD) algorithm
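
The two ideas named in the abstract, density-based selection of initial centers and the Mahalanobis distance, can be illustrated with a minimal sketch. The abstract gives neither the density function nor the exact distance formula, so everything here is an assumption: a Gaussian-kernel density in the style of subtractive clustering stands in for the paper's density measure, the names `density_seeds`, `mahalanobis_2d` and `radius` are hypothetical, and the data are limited to two features so that the covariance inverse can be written out by hand.

```python
import math


def density_seeds(points, k, radius):
    """Pick k initial centers by density.

    Assumption: Gaussian-kernel density as in subtractive clustering,
    not necessarily the density function used by FCMBMD.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Density of a point: kernel-weighted count of its neighbours.
    density = [
        sum(math.exp(-sq_dist(p, q) / radius ** 2) for q in points)
        for p in points
    ]
    seeds = []
    for _ in range(k):
        best = max(range(len(points)), key=lambda i: density[i])
        seeds.append(points[best])
        # Suppress density around the chosen seed so that the next
        # seed is taken from a different dense region.
        peak = density[best]
        density = [
            d - peak * math.exp(-sq_dist(p, points[best]) / radius ** 2)
            for d, p in zip(density, points)
        ]
    return seeds


def mahalanobis_2d(a, b, points):
    """Mahalanobis distance between a and b, using the sample
    covariance of `points` (two features only)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # must be non-zero for the inverse to exist
    dx, dy = a[0] - b[0], a[1] - b[1]
    # d^2 = [dx dy] * inverse(S) * [dx dy]^T with the 2x2 inverse expanded.
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Two well-separated groups of sample points (made-up data).
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
initial_centers = density_seeds(samples, k=2, radius=2.0)
distance = mahalanobis_2d(samples[0], samples[5], samples)
```

Because the seeds are computed from the data rather than drawn at random, repeated runs start from the same centers. Rescaling one feature, for example from metres to millimetres, leaves `mahalanobis_2d` unchanged while the Euclidean distance changes, which is the property the abstract relies on for data in different units of measurement.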