Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (16): 191-192.

• Database and Information Processing •

Fuzzy clustering of high-dimensional data based on a genetic algorithm

WANG Bao-wen1, YAN Jun-mei1, LIU Wen-yuan1, SHI Yan2

  1. Information Science and Engineering Institute of Yanshan University, Qinhuangdao, Hebei 066004, China
  2. Department of Information System Engineering, School of Engineering, Kyushu Tokai University, Japan
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-06-01 Published: 2007-06-01
  • Contact: WANG Bao-wen

Abstract: A fuzzy clustering method for high-dimensional data based on a genetic algorithm is proposed. A fuzzy dissimilarity matrix is introduced to express the degree of dissimilarity between any two high-dimensional samples, and the samples are initialized onto a two-dimensional plane. A genetic algorithm then iteratively optimizes the coordinates of the two-dimensional samples so that the Euclidean distances between them gradually approach the fuzzy dissimilarities between the original samples, which maps the high-dimensional samples onto the two-dimensional plane. Finally, the optimal two-dimensional samples are clustered with the fuzzy C-means (FCM) algorithm, which removes the dependence of clustering validity on the spatial distribution of the high-dimensional samples. Simulation experiments show that the method gives good clustering results and converges faster than clustering the high-dimensional data directly with FCM.
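The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the fuzzy dissimilarity matrix or the genetic operators, so the sketch substitutes max-scaled Euclidean distance for the dissimilarity and uses binary tournament selection, uniform crossover, Gaussian mutation and elitism. The function names (`dissimilarity`, `stress`, `ga_embed`, `fcm`) and all parameter values are hypothetical.

```python
import numpy as np

def dissimilarity(X):
    """Pairwise dissimilarity in [0, 1]. ASSUMPTION: max-scaled Euclidean
    distance stands in for the paper's fuzzy dissimilarity matrix."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return d / d.max()

def stress(Y, D):
    """Sum over sample pairs of (2-D Euclidean distance - dissimilarity)^2."""
    d = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return float(np.sum(np.triu(d - D, k=1) ** 2))

def ga_embed(D, pop_size=40, generations=300, sigma=0.05, seed=0):
    """Evolve 2-D coordinates whose distances approximate D.
    ASSUMPTION: the operators below are generic choices, not the paper's."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    pop = rng.random((pop_size, n, 2))           # random initial 2-D layouts
    cost = np.array([stress(Y, D) for Y in pop])
    history = [cost.min()]
    for _ in range(generations):
        children = [pop[cost.argmin()].copy()]   # elitism: keep the best layout
        while len(children) < pop_size:
            i, j = rng.integers(pop_size, size=2)        # binary tournament 1
            a = pop[i] if cost[i] < cost[j] else pop[j]
            i, j = rng.integers(pop_size, size=2)        # binary tournament 2
            b = pop[i] if cost[i] < cost[j] else pop[j]
            mask = rng.random((n, 1)) < 0.5              # uniform crossover per point
            child = np.where(mask, a, b)
            mutate = rng.random((n, 1)) < 0.1            # mutate about 10% of points
            child = child + rng.normal(0, sigma, child.shape) * mutate
            children.append(child)
        pop = np.array(children)
        cost = np.array([stress(Y, D) for Y in pop])
        history.append(cost.min())
    return pop[cost.argmin()], history

def fcm(Y, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns memberships U (n x c) and the centers
    computed in the last iteration."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(Y), c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ Y) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=-1) + 1e-12
        inv = dist ** (-2.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers
```

A typical call chain would be `D = dissimilarity(X)`, then `Y, history = ga_embed(D)`, then `U, centers = fcm(Y, c=2)`, where `X` is an array of shape (samples, features). Because the best layout is carried over unchanged each generation, `history` never increases, which makes it a convenient check that the two-dimensional distances are moving toward the dissimilarities.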