Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (19): 58-65. DOI: 10.3778/j.issn.1002-8331.1903-0292

• Theory, Research and Development •

Feature-Reduction Possibilistic Fuzzy Clustering Algorithm

YU Bingguang, LIU Dongmei

  1. School of Electrical Engineering and Automation, Hefei University of Technology, Hefei 230009, China
  • Published: 2019-10-01 Online: 2019-09-30

Abstract: The Fuzzy C-Means (FCM) and Possibilistic Fuzzy C-Means (PFCM) clustering algorithms do not account for how much each feature and each sample contributes to the clustering, which makes them sensitive to noise. The Feature-Reduction Fuzzy C-Means (FRFCM) algorithm removes irrelevant features from a dataset and weights the remaining ones, and therefore achieves better clustering performance. Building on PFCM and combining it with FRFCM, a new Feature-Reduction Possibilistic Fuzzy C-Means (FRPFCM) algorithm is proposed. FRPFCM resolves the parameter dependence of PFCM and, during iteration, automatically eliminates irrelevant features and updates the contribution of each remaining feature to the clustering. Experimental results on synthetic and UCI datasets show that the proposed FRPFCM algorithm achieves higher clustering accuracy and converges faster, requiring fewer iterations.

Key words: clustering analysis; fuzzy clustering; possibilistic fuzzy clustering; feature reduction
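
For readers unfamiliar with the baseline, the following is a minimal sketch of the textbook FCM iteration that the abstract takes as its starting point. It is not the FRPFCM algorithm proposed in the paper, whose update equations are not given on this page. The function names (`fcm_memberships`, `fcm_centers`, `fcm`), the toy data, and the fuzzifier value m = 2 are all illustrative assumptions.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM membership update:
    u_ik = 1 / sum_j (d_ik / d_jk) ** (2 / (m - 1))."""
    eps = 1e-12  # avoids division by zero when a point sits on a center
    memberships = []
    for x in points:
        dists = [max(math.dist(x, c), eps) for c in centers]
        row = []
        for d_i in dists:
            denom = sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
            row.append(1.0 / denom)
        memberships.append(row)
    return memberships

def fcm_centers(points, memberships, n_clusters, m=2.0):
    """Textbook FCM center update: mean of the points weighted by u_ik ** m."""
    dim = len(points[0])
    centers = []
    for i in range(n_clusters):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fcm(points, init_centers, m=2.0, tol=1e-6, max_iter=100):
    """Alternate the two updates until the centers move less than tol."""
    centers = list(init_centers)
    for iteration in range(1, max_iter + 1):
        u = fcm_memberships(points, centers, m)
        new_centers = fcm_centers(points, u, len(centers), m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(points, centers, m), iteration

# Hypothetical toy data: two compact groups plus one distant noise point.
points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (10.0, 0.0), (11.0, 0.0), (10.0, 1.0), (11.0, 1.0),
    (5.5, 30.0),  # noise point, far from both groups
]
centers, memberships, n_iter = fcm(points, [(1.0, 0.0), (10.0, 0.0)])
print("centers:", centers)
print("noise-point memberships:", memberships[-1])
print("iterations:", n_iter)
```

In this sketch every row of `memberships` sums to 1 by construction, so the distant noise point cannot receive a low membership in both clusters at once and still contributes to both center updates. This is one way to see the noise sensitivity mentioned in the abstract; possibilistic variants such as PFCM relax that constraint.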