Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (8): 236-238.

• Engineering and Applications •

A clustering method for gene expression data based on fuzzy similarity relations

姜永森1,陆 媛2,杨慧中2   

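  1. Science and Technology Office, Beihua University, Jilin 132013
  2. School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-03-11 Published: 2011-03-11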

Improved clustering algorithm based on fuzzy similarity relation

JIANG Yongsen1, LU Yuan2, YANG Huizhong2

  1. Science and Technology Office, Beihua University, Jilin 132013, China
  2. School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-03-11 Published: 2011-03-11

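Abstract: For time-series gene expression data, traditional clustering algorithms all use distance as the similarity measure and do not consider the similar trends with which genes change over time. Starting from the trend of gene changes, a new fuzzy similarity relation matrix is constructed, an improved clustering algorithm based on fuzzy similarity relations is proposed, and this algorithm is used to compute the initial cluster centers for FCM. The method is applied to yeast gene expression data. Experimental results show that the algorithm not only overcomes the FCM algorithm's tendency to fall into local minima and its sensitivity to initial values, but can also discover co-regulated genes whose expression patterns follow similar trends.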

Keywords: fuzzy similarity relation matrix, cluster centers, fuzzy C-means (FCM) clustering algorithm, time-series gene expression data

Abstract: For time-series gene expression data, traditional clustering algorithms measure similarity by distance and do not consider the coherent trends that gene expression patterns exhibit over time. A new fuzzy similarity relation matrix is constructed, and a modified clustering algorithm based on fuzzy similarity relations is proposed. On this basis, a new method is used to find the initial cluster centers of the FCM algorithm. The method is applied to yeast gene expression data. Experimental results show that the method not only overcomes the limitations of the FCM algorithm (its tendency to fall into local minima and its sensitivity to initial values), but also identifies cell-cycle-regulated genes whose expression levels change periodically during the cell cycle.

Key words: fuzzy similarity relation matrix, cluster centers, fuzzy C-means (FCM) algorithm, time-series gene expression data
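
The pipeline outlined in the abstract (a trend-based fuzzy similarity matrix, a grouping step that yields initial cluster centers, then FCM) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the construction of the similarity matrix, so the sketch assumes cosine similarity between the first differences of each expression profile, rescaled to [0, 1], and a simple greedy threshold grouping. The function names `trend_similarity`, `initial_centers`, `memberships`, and `fcm`, the threshold `lam`, and the toy data are all hypothetical.

```python
import numpy as np

def trend_similarity(X):
    """Fuzzy similarity in [0, 1] between the trends of the rows of X.
    Assumption: cosine similarity of first differences, not the paper's matrix."""
    D = np.diff(X, axis=1)                        # change between consecutive time points
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # flat profiles: avoid division by zero
    U = D / norms
    S = np.clip((1.0 + U @ U.T) / 2.0, 0.0, 1.0)  # map cosine from [-1, 1] to [0, 1]
    np.fill_diagonal(S, 1.0)                      # reflexive; U @ U.T is already symmetric
    return S

def initial_centers(X, S, lam, c):
    """Greedy grouping (an assumed stand-in for the paper's clustering step):
    pick the unassigned gene with the most unassigned neighbours whose
    similarity is >= lam, and use the mean profile of that group as a center.
    May return fewer than c centers if the genes run out first."""
    unassigned = np.ones(len(X), dtype=bool)
    centers = []
    while len(centers) < c and unassigned.any():
        neigh = (S >= lam) & unassigned[None, :] & unassigned[:, None]
        seed = np.argmax(neigh.sum(axis=1))
        group = neigh[seed]                       # includes the seed itself
        centers.append(X[group].mean(axis=0))
        unassigned &= ~group
    return np.array(centers)

def memberships(X, V, m):
    """Standard FCM membership update for fuzzifier m > 1."""
    d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
    w = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)

def fcm(X, V, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy C-means starting from the given centers V."""
    for _ in range(n_iter):
        Um = memberships(X, V, m) ** m
        V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return memberships(X, V, m), V

# Toy data (hypothetical): three rising and three falling profiles.
t = np.arange(6.0)
X = np.array([t, t + 1, 2 * t, 20 - t, 21 - t, 20 - 2 * t])

S = trend_similarity(X)
V0 = initial_centers(X, S, lam=0.9, c=2)   # centers derived from trend groups
U, V = fcm(X, V0)                          # FCM refined from those centers
labels = U.argmax(axis=1)
```

Because the starting centers come from the trend groups rather than from a random draw, repeated runs give the same partition, which is the kind of insensitivity to initialization the abstract refers to. Note that the FCM step itself still uses Euclidean distance in this sketch.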