Computer Engineering and Applications, 2008, Vol. 44, Issue 14: 228-230.

• Engineering and Applications •

Application of Boosting algorithm to sample categorization of gene expression profiles
(Boosting算法在基因表达谱样本分类中的应用)

LIU Quan-jin (刘全金) 1, LI Ying-xin (李颖新) 2

  1. School of Physics & Electronic Engineering, Anqing Teachers College, Anqing, Anhui 246011, China
  2. CCD Department, Beijing Jingwei Textile Machinery New Technology Co., LTD, Beijing 100176, China
  • Received: 2007-08-24  Revised: 2007-10-24  Online: 2008-05-11  Published: 2008-05-11
  • Contact: LIU Quan-jin

Abstract: This paper proposes an approach to classifying gene expression profile samples that is based on the structure of the profiles. First, the Bhattacharyya distance of each gene is used to measure how much sample-class information the gene carries, and genes with a small Bhattacharyya distance are filtered out as “noise genes”. Second, the multi-edit-nearest-neighbor algorithm is modified to eliminate “noise samples”. Third, a combination classifier of support vector machines is constructed with the Boosting algorithm and used to classify the samples. Finally, the method is applied to colon cancer gene expression profile samples. The experimental results show that the method is simple and effective, and that it is well suited in practice to classifying gene expression profile samples.

Key words: Bhattacharyya distance, multi-edit-nearest-neighbor algorithm, Boosting algorithm
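
The gene-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or threshold the authors used, so this sketch assumes that each gene's expression level in each of two classes is approximately univariate Gaussian and applies the standard closed-form Bhattacharyya distance for that case. The function names, the dictionary layout, the threshold value, and the sample data are all hypothetical.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_distance(class_a, class_b):
    """Bhattacharyya distance between two groups of expression levels.

    Assumption (not stated in the abstract): each group is modeled as a
    univariate Gaussian with nonzero variance.
    """
    mu_a, mu_b = mean(class_a), mean(class_b)
    var_a, var_b = pvariance(class_a), pvariance(class_b)
    var_sum = var_a + var_b
    # Term driven by the separation of the class means.
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / var_sum
    # Term driven by the mismatch between the class variances.
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def filter_noise_genes(expression, labels, threshold):
    """Keep genes whose Bhattacharyya distance is at least `threshold`.

    expression: dict mapping gene name -> list of levels, one per sample
    labels:     list of 0/1 class labels, one per sample
    Returns a dict mapping each kept gene to its distance.
    """
    kept = {}
    for gene, levels in expression.items():
        group_0 = [x for x, y in zip(levels, labels) if y == 0]
        group_1 = [x for x, y in zip(levels, labels) if y == 1]
        distance = bhattacharyya_distance(group_0, group_1)
        if distance >= threshold:
            kept[gene] = distance
    return kept


# Made-up data: one gene that separates the classes and one that does not.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "g_informative": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    "g_noise": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
}
kept = filter_noise_genes(expression, labels, threshold=1.0)
```

A gene whose two classes have nearly the same mean and variance gets a distance close to zero and is dropped as a “noise gene”, while a gene with well-separated class means is kept. In practice the threshold would have to be chosen from the data, for example by keeping a fixed number of top-ranked genes.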