Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (17): 245-249.

• Engineering and Applications •

Strong correlative tree for gene selection and AE-RSVM for classification

ZHANG Yan, YAN Deqin, LV Zhichao, ZHENG Hongliang

  1. School of Computer and Information Technology, Liaoning Normal University, Dalian, Liaoning 116081, China
  • Publication date: 2013-09-01 Release date: 2013-09-13

Abstract: When analyzing tumor gene expression profiles, the key to effectively distinguishing normal samples from tumor samples is to accurately identify the smallest set of feature genes that determines the sample class, and then to predict the class with a well-performing classifier. To address this problem, the paper uses a Revised Feature Score Criterion (RFSC) to remove genes irrelevant to the classification task; improves the pairwise redundancy method by proposing a strong correlative tree method for removing redundant genes; and improves the Rough Support Vector Machine (RSVM) by proposing the Approximate Equivalence Rough Support Vector Machine (AE-RSVM), which is used to classify the sample sets. Experiments on tumor sample sets demonstrate the feasibility and effectiveness of the proposed method.

Key words: gene expression profile, tumor classification, gene selection, support vector machine, equivalence class
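
The two gene-selection stages named in the abstract (drop irrelevant genes, then drop redundant ones) can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract gives no formula for RFSC and no construction for the strong correlative tree, so the score below is a generic stand-in (difference of class means over the sum of class standard deviations) and the redundancy step is a plain greedy pairwise-correlation filter, that is, the kind of baseline the paper says it improves on. The names `feature_score`, `pearson`, `select_genes`, `top_k`, and `max_corr` are hypothetical, and the AE-RSVM classifier is not sketched at all.

```python
from statistics import mean, pstdev


def feature_score(values, labels):
    """Stand-in relevance score (NOT the paper's RFSC):
    |mean(class 1) - mean(class 0)| / (std(class 1) + std(class 0)).
    Assumes both classes appear in `labels`."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = abs(mean(pos) - mean(neg))
    spread = pstdev(pos) + pstdev(neg)
    if spread == 0:
        # No within-class variation: perfectly separating or constant gene.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def select_genes(expr, labels, top_k, max_corr):
    """expr maps gene name -> expression values, one per sample.

    Stage 1: keep the top_k genes by relevance score.
    Stage 2: walk them in score order and keep a gene only if its absolute
    correlation with every gene already kept is below max_corr.
    """
    ranked = sorted(expr, key=lambda g: feature_score(expr[g], labels),
                    reverse=True)[:top_k]
    kept = []
    for gene in ranked:
        if all(abs(pearson(expr[gene], expr[h])) < max_corr for h in kept):
            kept.append(gene)
    return kept
```

Because genes that separate two classes well tend to correlate strongly with one another, the `max_corr` threshold in a filter like this has to be set high; a tree-based grouping of correlated genes, as the abstract proposes, is one way to avoid relying on a single pairwise cutoff.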