Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (9): 156-161. DOI: 10.3778/j.issn.1002-8331.1901-0117

• Pattern Recognition and Artificial Intelligence •

Improved Fuzzy C-Means Clustering Validity Index

YAN Jiazhan (严加展), CHEN Hua (陈华), LI Yang (李阳)

  1. College of Electrical Engineering, Xinjiang University, Urumqi 830047, China
  2. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Publication date: 2020-05-01 Release date: 2020-04-29

Abstract:

Existing validity indices for fuzzy C-means do not take into account the true geometric distribution structure of a data set or its prior information. To find clusters that match the natural distribution of the data samples, an improved validity index, VCSC, is proposed. The index defines a compactness measure from the within-cluster sum of squared errors, membership-degree weights, and square-root weights; a separation measure from the minimum distance between cluster centers, the membership degrees, and the sum of distances from each cluster center to the mean cluster center; and a combination measure from the range of the membership degrees and the distribution of the samples. Experimental results show that the proposed index evaluates clustering results effectively and accurately identifies the optimal number of clusters in the data.

Key words: fuzzy clustering, validity index, membership degree, geometric structure, optimal clustering
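
The abstract does not state the VCSC formula, so it cannot be reproduced here. As a rough illustration of how a fuzzy validity index trades compactness against separation, the sketch below computes the classical Xie-Beni index, which is a different and older index, not the proposed VCSC. The function names (`sq_dist`, `fcm_memberships`, `xie_beni`), the fuzzifier default `m=2.0`, and the toy data are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy C-means membership update for fixed centers.

    Returns u where u[i][k] is the membership of point k in cluster i.
    """
    c = len(centers)
    u = [[0.0] * len(points) for _ in range(c)]
    for k, p in enumerate(points):
        d = [sq_dist(p, v) for v in centers]
        zero = [i for i, x in enumerate(d) if x == 0.0]
        if zero:
            # The point coincides with one or more centers: share membership among them.
            for i in zero:
                u[i][k] = 1.0 / len(zero)
            continue
        for i in range(c):
            # Squared distances are used, so the exponent is 1 / (m - 1).
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** (1.0 / (m - 1.0)) for j in range(c))
    return u


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index (generalized to fuzzifier m): lower values suggest a better partition."""
    # Compactness: membership-weighted within-cluster sum of squared errors.
    compactness = sum(
        u[i][k] ** m * sq_dist(p, v)
        for i, v in enumerate(centers)
        for k, p in enumerate(points)
    )
    # Separation: minimum squared distance between any two cluster centers.
    separation = min(
        sq_dist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)


if __name__ == "__main__":
    # Toy data: two well-separated square blobs (hypothetical, for illustration only).
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    good = [(0.5, 0.5), (10.5, 10.5)]   # centers that match the blobs
    bad = [(5.0, 5.0), (6.0, 6.0)]      # centers that ignore the structure
    for name, centers in (("good", good), ("bad", bad)):
        u = fcm_memberships(points, centers)
        print(name, xie_beni(points, centers, u))
```

On this toy data the centers that match the two blobs score much lower than the poorly placed ones. In practice such an index is evaluated over a range of candidate cluster counts and the count with the best score is selected.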