Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (18): 242-245.

• Engineering and Applications •

Gene selection and sample classification for colon cancer based on genetic algorithms

何爱香   

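  1. School of Information and Electronics Engineering, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China
  • Received: 1900-01-01 Revised: 1900-01-01 Published: 2007-06-21 Released: 2007-06-21
  • Corresponding author: 何爱香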

Gene selection and classification for microarray data of colon and normal tissues using genetic algorithms

HE Ai-xiang   

  1. School of Information and Electronics Engineering, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-06-21 Published: 2007-06-21
  • Contact: HE Ai-xiang

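Abstract: A new method based on a two-round genetic algorithm is proposed for gene selection and sample classification on colon cancer microarray data. The method first uses the Bhattacharyya distance of each gene to filter out most of the genes that are irrelevant to classification. It then applies GA/CFS, which combines a genetic algorithm with CFS (Correlation-based Feature Selection), to select high-quality gene subsets, and records these subsets in an archive. Candidate subsets for further search are chosen according to how frequently each gene is selected across the archived subsets. Finally, GA/SVM, which combines a genetic algorithm with an SVM, selects the feature subset used for classification from the candidate gene subset. This GA/CFS-GA/SVM method is applied to colon cancer microarray data; the experimental results and a comparison with the literature indicate that the method performs well.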
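The filtering step can be illustrated with a minimal sketch. The abstract does not say how the Bhattacharyya distance is computed, so the code below assumes the common closed form for two univariate Gaussian class distributions; the function names `bhattacharyya_distance` and `filter_genes`, the 0/1 label encoding, and the `eps` variance floor are all hypothetical choices made for illustration, not the paper's implementation.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_distance(a, b, eps=1e-12):
    """Bhattacharyya distance between two 1-D samples.

    Assumption: each class is modelled as a univariate Gaussian, giving
    B = 1/4 * (mu_a - mu_b)^2 / (var_a + var_b)
        + 1/2 * ln((var_a + var_b) / (2 * sigma_a * sigma_b)).
    `eps` is a small floor that keeps zero-variance genes from dividing by zero.
    """
    mu_a, mu_b = mean(a), mean(b)
    var_a = pvariance(a, mu_a) + eps
    var_b = pvariance(b, mu_b) + eps
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def filter_genes(expression, labels, keep):
    """Return the indices of the `keep` genes with the largest distance.

    expression: list of rows, one row of expression values per gene.
    labels: one 0/1 class label per sample (hypothetical encoding).
    """
    scores = []
    for gene_index, row in enumerate(expression):
        class_one = [x for x, y in zip(row, labels) if y == 1]
        class_zero = [x for x, y in zip(row, labels) if y == 0]
        scores.append((bhattacharyya_distance(class_one, class_zero), gene_index))
    scores.sort(reverse=True)  # largest distance first
    return [gene_index for _, gene_index in scores[:keep]]
```

Genes whose two class distributions overlap almost completely score near zero and are dropped first, which is the sense in which the filter removes genes irrelevant to classification.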

Abstract: We describe a novel approach to gene selection and cancer classification for microarray data that combines Support Vector Machines (SVM), Correlation-based Feature Selection (CFS) and Genetic Algorithms (GA). First, the Bhattacharyya distance of each gene is used as the criterion for filtering out genes that are irrelevant to classification. Then GA combined with CFS is used to find informative gene subsets, which are recorded in an archive. Finally, the 50 genes selected most frequently across the archived subsets are taken as a candidate subset, from which the GA evolves gene subsets whose fitness is evaluated by an SVM classifier. Assessed on the colon dataset, the method selects small gene subsets while still improving classification accuracy.
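The archive-based step can be sketched as a simple frequency count. The abstract gives only the cutoff of 50 genes; the archive layout (a list of gene-index subsets), the function name `candidate_genes`, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def candidate_genes(archive, top_n=50):
    """Pick the genes that appear most often across archived subsets.

    archive: iterable of gene-index subsets recorded during the GA/CFS round
             (hypothetical layout).
    A gene is counted at most once per subset; ties are broken by the
    smaller gene index so the result is deterministic (an assumption).
    """
    counts = Counter(gene for subset in archive for gene in set(subset))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:top_n]]
```

For example, `candidate_genes([[3, 7, 9], [7, 9], [9, 1]], top_n=3)` returns `[9, 7, 1]`: gene 9 appears in three subsets, gene 7 in two, and gene 1 wins the tie with gene 3 on index. The returned list would then serve as the search space for the second, SVM-evaluated GA round.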