Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (4): 154-157.


Fuzzy C-means clustering algorithm based on stacked sparse autoencoders

DUAN Baobin (1,2), HAN Lixin (2), XIE Jin (1)

  1. Department of Mathematics and Physics, Hefei University, Hefei 230601, China
  2. College of Computer and Information, Hohai University, Nanjing 211100, China
  Online: 2015-02-15   Published: 2015-02-04


Abstract: The fuzzy C-means clustering algorithm is sensitive to outliers and to randomly initialized cluster centers. To address this, stacked sparse autoencoders are combined with the traditional fuzzy C-means algorithm. Stacked sparse autoencoders extract features of the original data set from low level to high level, and the high-level features usually reflect the essential characteristics of the samples to be clustered better than the original data does. Clustering on these high-level features instead of the original data therefore helps to improve the clustering results. Experiments on several standard UCI data sets show that the improved algorithm is effective and feasible.

Key words: stacked sparse autoencoders, fuzzy C-means clustering, features, deep learning
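
The pipeline described in the abstract (learn high-level features with stacked sparse autoencoders, then run fuzzy C-means on those features) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the loss function, network sizes, or hyperparameters, so the KL-divergence sparsity penalty, sigmoid units, plain batch gradient descent, the values of `rho`, `beta`, `lr`, `epochs`, and the function names `train_sparse_layer`, `fuzzy_c_means`, and `ssae_fcm` are all assumptions made here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_layer(X, n_hidden, rho=0.05, beta=3.0, lr=0.5, epochs=300, seed=0):
    """Train one sparse autoencoder layer (assumed KL sparsity penalty).

    Returns the encoder weights and bias only; the decoder is discarded.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)              # hidden activations
        Y = sigmoid(H @ W2 + b2)              # reconstruction
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        # Gradient of the mean squared reconstruction error at the output pre-activation
        dZ2 = (Y - X) * Y * (1 - Y) / n
        # Gradient of beta * KL(rho || rho_hat) with respect to each hidden activation
        kl_grad = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        dZ1 = (dZ2 @ W2.T + kl_grad) * H * (1 - H)
        W2 -= lr * (H.T @ dZ2); b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1); b1 -= lr * dZ1.sum(axis=0)
    return W1, b1

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means; returns the membership matrix U and the centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)         # each row of memberships sums to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

def ssae_fcm(X, hidden_sizes, c):
    """Greedy layer-wise feature extraction, then fuzzy C-means on the top-level features."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    F = (X - lo) / np.where(hi > lo, hi - lo, 1.0)   # scale inputs to [0, 1]
    for k, h in enumerate(hidden_sizes):
        W, b = train_sparse_layer(F, h, seed=k)
        F = sigmoid(F @ W + b)                # output of this layer feeds the next
    return fuzzy_c_means(F, c)
```

Calling `ssae_fcm(X, hidden_sizes=[3, 2], c=2)` on a numeric array `X` of shape `(n_samples, n_features)` returns a membership matrix with one row per sample and one column per cluster, plus the cluster centers in the learned feature space. Taking `U.argmax(axis=1)` gives a hard assignment if one is needed.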
