Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (14): 84-89.

• Big Data and Cloud Computing •

Rough k-means clustering algorithm based on improved manifold distance

OU Hui, XIA Zhuoqun, WU Zhiwei

  1. College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
  • Online: 2016-07-15  Published: 2016-07-18

Abstract: Existing clustering algorithms based on manifold distance perform better on “absolute manifold” datasets than on “relative manifold” datasets, and their clustering performance degrades when the parameter ρ varies over a wide range. To address these problems, a rough k-means clustering algorithm based on an improved manifold distance is proposed. The algorithm selects the initial cluster centers by attribute partitioning and the max-min distance method, optimizes k-means with the improved manifold distance and rough sets, and adds a termination criterion, with the aim of handling boundary data and improving clustering quality. Simulation results show that the algorithm improves clustering on both “absolute manifold” and “relative manifold” datasets, and that variation of the parameter has a smaller impact on clustering performance.

Key words: k-means algorithm, max-min distance, improved manifold distance, rough set, criterion function
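
As a rough illustration of two ingredients named in the abstract, the sketch below computes a manifold distance and then picks initial centers by the max-min distance rule. It is not the authors' method: the edge length ρ^d − 1 is the commonly used density-sensitive form and is assumed here in place of the paper's improved distance, the first seed is simply taken as index 0 rather than chosen by attribute partitioning, and the names `manifold_distances` and `max_min_seeds` are hypothetical. The rough-set step is not shown.

```python
import numpy as np

def manifold_distances(X, rho=2.0):
    """Shortest-path distances over edges of length rho**d - 1 (assumed form, rho > 1)."""
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    D = rho ** euclid - 1.0  # long jumps are penalised exponentially
    # Floyd-Warshall over the complete graph, one vectorised pass per pivot k.
    for k in range(len(X)):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

def max_min_seeds(D, k, first=0):
    """Pick k seed indices; each new seed maximises its distance to the nearest chosen seed."""
    seeds = [first]  # assumption: the first seed is given, not derived from attributes
    while len(seeds) < k:
        nearest = D[:, seeds].min(axis=1)
        seeds.append(int(nearest.argmax()))
    return seeds

# Three loose groups of points; the seeds should land in different groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]])
D = manifold_distances(X, rho=2.0)
seeds = max_min_seeds(D, 3)
```

With this edge length, a larger ρ makes a single long jump far more expensive than a chain of short hops, so paths tend to follow dense regions of the data. That dependence on ρ is the sensitivity the abstract refers to.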