High-dimensional data fuzzy clustering based on genetic algorithm

Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (16): 191-192.

WANG Bao-wen¹, YAN Jun-mei¹, LIU Wen-yuan¹, SHI Yan²

1. Information Science and Engineering Institute, Yanshan University, Qinhuangdao, Hebei 066004, China
2. Department of Information System Engineering, School of Engineering, Kyushu Tokai University, Japan

Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-06-01 Published: 2007-06-01
Contact: WANG Bao-wen

Abstract

Abstract: A fuzzy clustering method for high-dimensional data based on a genetic algorithm is proposed. A fuzzy dissimilarity matrix is introduced to express the degree of dissimilarity between each pair of high-dimensional samples, and the samples are initialized to points in a two-dimensional plane. A genetic algorithm then iteratively optimizes the coordinates of the two-dimensional points so that the Euclidean distances between them gradually approach the fuzzy dissimilarities between the corresponding samples, which maps the high-dimensional samples onto the plane. Finally, the optimized two-dimensional points are clustered with the fuzzy C-means (FCM) algorithm, which removes the dependence of clustering validity on the spatial distribution of the high-dimensional samples. Simulation experiments show that the method gives better clustering results and converges faster than clustering the high-dimensional data directly with FCM.
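The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the fuzzy dissimilarity matrix, so `dissimilarity` below substitutes pairwise Euclidean distances rescaled to [0, 1] as an assumed stand-in. The genetic operators (binary tournament selection, uniform crossover, Gaussian mutation, elitism), all parameter values, and the function names `ga_embed` and `fcm` are likewise illustrative assumptions.

```python
import numpy as np


def pairwise_dist(pts):
    """Euclidean distance matrix between the rows of pts."""
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def dissimilarity(X):
    """Assumed stand-in for the paper's matrix: distances rescaled to [0, 1]."""
    D = pairwise_dist(X)
    return D / D.max()


def stress(flat, R):
    """Squared mismatch between 2-D distances and target dissimilarities."""
    D2 = pairwise_dist(flat.reshape(-1, 2))
    iu = np.triu_indices(len(R), k=1)
    return float(((D2[iu] - R[iu]) ** 2).sum())


def ga_embed(R, pop_size=40, generations=200, sigma=0.05, seed=0):
    """Evolve 2-D coordinates whose pairwise distances approximate R."""
    rng = np.random.default_rng(seed)
    n = len(R)
    pop = rng.random((pop_size, 2 * n))  # each row is one flattened layout
    for _ in range(generations):
        fit = np.array([stress(ind, R) for ind in pop])
        elite = pop[fit.argmin()].copy()
        # Binary tournament selection: the lower-stress candidate wins.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = pop[np.where(fit[a] < fit[b], a, b)]
        # Uniform crossover with a shuffled copy of the parents.
        mates = parents[rng.permutation(pop_size)]
        mask = rng.random(parents.shape) < 0.5
        children = np.where(mask, parents, mates)
        # Gaussian mutation, then elitism keeps the best layout found so far.
        children += rng.normal(0.0, sigma, children.shape)
        children[0] = elite
        pop = children
    fit = np.array([stress(ind, R) for ind in pop])
    return pop[fit.argmin()].reshape(n, 2), float(fit.min())


def fcm(X, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns memberships U (n x c) and centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic 10-dimensional blobs (illustrative data, not from the paper).
    X = np.vstack([rng.normal(0, 0.3, (15, 10)), rng.normal(3, 0.3, (15, 10))])
    Y, s = ga_embed(dissimilarity(X))
    U, _ = fcm(Y, c=2)
    print("final stress:", s)
    print("hard labels:", U.argmax(axis=1))
```

Because the best layout of each generation is copied into the next, the stress of the returned layout never exceeds that of the best initial random layout. A plain genetic algorithm like this one may still need many generations to reach a good embedding, so the population size, mutation scale, and generation count should be treated as starting points to tune.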

WANG Bao-wen, YAN Jun-mei, LIU Wen-yuan, SHI Yan. High-dimensional data fuzzy clustering based on genetic algorithm[J]. Computer Engineering and Applications, 2007, 43(16): 191-192.