Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (19): 112-118.

• Database, Data Mining, Machine Learning •

Research and application of harmony search fuzzy clustering with feature selection

WANG Huaqiu1, LUO Jiang1, Michael GERNDT2, Ventsislav PETKOV2

  1. School of Computer Science, Chongqing University of Technology, Chongqing 400054, China
  2. Institute of Informatics I10, Technical University of Munich, D-85748 Munich, Germany
  • Online: 2013-10-01 Published: 2015-04-20

Abstract: Three measures are taken to improve clustering quality. First, because the feature attributes in each dimension of the data influence the clustering result to different degrees, a statistics-based dimension weighting method is used for feature selection. Second, the pitch adjustment probability of the harmony search algorithm is improved, and the improved harmony search is combined with fuzzy clustering to find the optimal cluster centers quickly. Third, clustering quality is tested iteratively over a range of center counts to obtain the best number of cluster centers. The proposed algorithm is applied to parallel computing performance analysis, where it identifies the category of performance bottleneck on each processor while a parallel program runs. Experimental results show that the proposed algorithm outperforms the other algorithms compared, and that this performance analysis method can improve the running efficiency of parallel programs.

Key words: harmony search, fuzzy clustering, feature selection, parallel performance analysis
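
The three measures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the weighting statistic, the modified pitch adjustment rule, or the cluster validity measure, so the sketch assumes variance-share feature weights, a pitch adjustment probability that rises linearly over the iterations, and the Xie-Beni index for choosing the number of centers. All function names and parameter defaults (`hms`, `hmcr`, `par_min`, `par_max`, `bw`) are hypothetical.

```python
import random

def variance_weights(data):
    """Assumed weighting: each feature's share of the total variance."""
    n, d = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(d)]
    var = [sum((r[j] - means[j]) ** 2 for r in data) / n for j in range(d)]
    total = sum(var) or 1.0
    return [v / total for v in var]

def wdist2(a, b, weights):
    """Feature-weighted squared Euclidean distance."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def fcm_objective(centers, data, weights, m=2.0):
    """Fuzzy c-means objective, using the optimal memberships for the given centers."""
    cost = 0.0
    for x in data:
        d2 = [wdist2(x, c, weights) + 1e-12 for c in centers]
        inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
        s = sum(inv)
        cost += sum((u / s) ** m * v for u, v in zip(inv, d2))
    return cost

def harmony_search(data, weights, k, hms=10, hmcr=0.9, par_min=0.1,
                   par_max=0.9, bw=0.05, iters=300, seed=0):
    """Search for k cluster centers; each harmony is a flat list of k*d values."""
    rng = random.Random(seed)
    d = len(data[0])
    lo = [min(r[j] for r in data) for j in range(d)]
    hi = [max(r[j] for r in data) for j in range(d)]

    def split(h):
        return [h[i * d:(i + 1) * d] for i in range(k)]

    def cost(h):
        return fcm_objective(split(h), data, weights)

    memory = [[rng.uniform(lo[i % d], hi[i % d]) for i in range(k * d)]
              for _ in range(hms)]
    costs = [cost(h) for h in memory]
    for t in range(iters):
        # Assumed schedule: pitch adjustment probability grows linearly.
        par = par_min + (par_max - par_min) * t / iters
        new = []
        for i in range(k * d):
            j = i % d
            if rng.random() < hmcr:
                v = rng.choice(memory)[i]          # memory consideration
                if rng.random() < par:             # pitch adjustment
                    v += rng.uniform(-1.0, 1.0) * bw * (hi[j] - lo[j])
                    v = min(max(v, lo[j]), hi[j])
            else:
                v = rng.uniform(lo[j], hi[j])      # random selection
            new.append(v)
        c = cost(new)
        worst = max(range(hms), key=costs.__getitem__)
        if c < costs[worst]:                       # replace the worst harmony
            memory[worst], costs[worst] = new, c
    best = min(range(hms), key=costs.__getitem__)
    return split(memory[best]), costs[best]

def xie_beni(centers, data, weights):
    """Xie-Beni validity index (lower is better); an assumed quality measure."""
    sep = min(wdist2(a, b, weights)
              for i, a in enumerate(centers) for b in centers[i + 1:])
    return fcm_objective(centers, data, weights) / (len(data) * sep + 1e-12)

def best_k(data, k_range=range(2, 5)):
    """Try each center count and keep the one with the lowest validity index."""
    weights = variance_weights(data)
    scored = []
    for k in k_range:
        centers, _ = harmony_search(data, weights, k)
        scored.append((xie_beni(centers, data, weights), k, centers))
    _, k, centers = min(scored, key=lambda s: s[0])
    return k, centers
```

Calling `best_k(data)` on a list of equal-length numeric rows returns the selected number of centers and the centers themselves. Variance-share weighting depends on feature scale, so under this assumption the features would normally be standardized or otherwise put on comparable units first.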