Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (10): 146-149.

• Database, Signal and Information Processing •

Research on Noise Processing for a Colon Cancer Gene Expression Profile Dataset

易  波,文天柱,张  原   

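  1. Graduate Students' Brigade, Naval Aeronautical and Astronautical University (NAAU), Yantai, Shandong 264001, China
  • Publication date: 2012-04-01 Release date: 2012-04-11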

Study of disposal of noises in gene expression profiles of colon tumor

YI Bo, WEN Tianzhu, ZHANG Yuan   

  1. Graduate Students’ Brigade, NAAU, Yantai, Shandong 264001, China
  • Online: 2012-04-01 Published: 2012-04-11

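Abstract (from the Chinese): For relatively large-scale colon cancer gene expression profile data, this paper studies the role of noise processing in the extraction of informative genes. First, ignoring noise, the ReCorre algorithm is used to determine the classification genes, and the plus-l-minus-r search algorithm is then used to build candidate informative gene sets; leave-one-out cross-validation based on a support vector machine is applied to each candidate set to determine the optimal informative genes. The influence of noise is then analyzed: numerical noise in the data is filtered out by wavelet threshold denoising, while useless genes are handled with an alternating selection algorithm, after which the informative genes are determined again. Experiments show that processing the noise in tumor gene expression profiles helps to obtain informative genes with better classification ability.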

Keywords (from the Chinese): gene expression profiles, noise, wavelet transform
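Wavelet threshold denoising, as named in the abstract, can be illustrated with a minimal sketch. The abstract does not say which wavelet, how many decomposition levels, or which threshold rule the authors used, so everything here is an assumption: a single-level orthonormal Haar transform, soft thresholding of the detail coefficients, and a hand-picked `threshold` parameter. The function names are hypothetical.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input assumed)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward: rebuild each pair of samples from (approx, detail)."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Assumed pipeline: transform, threshold the detail band only, invert."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


if __name__ == "__main__":
    # Hypothetical expression values for one gene across four samples.
    print(denoise([1.0, 1.2, 5.0, 5.1], threshold=1.0))
```

With a threshold larger than every detail coefficient, each pair of neighbouring values is replaced by its mean; with a threshold of zero the input is reconstructed unchanged.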

Abstract: For a relatively large set of colon tumor gene expression profiles, this paper studies the role of noise processing in the selection of informative genes. The ReCorre algorithm is first applied without regard to noise, and candidate sets of informative genes are picked out with the plus-l-minus-r search algorithm; each set is then evaluated by Leave-One-Out Cross Validation (LOOCV) based on the Support Vector Machine (SVM) in order to find the best informative genes. The influence of noise in the gene expression profiles is then analyzed. Numerical noise is filtered out by wavelet threshold de-noising, while redundant genes are dealt with by the alternating selection algorithm, so that new informative genes can be selected. Finally, 2000 colon tumor gene expression samples are analyzed, and the processing of noise in tumor gene expression profiles turns out to be feasible and effective.

Key words: gene expression profiles, noise, wavelet transform
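The selection step described in the abstract, a plus-l-minus-r search scored by leave-one-out cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for the SVM so that the example needs no external library, the ReCorre pre-selection is omitted, and the data, the function names, and the `l`, `r`, and `target` values are all hypothetical.

```python
def loocv_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier (SVM stand-in)."""
    correct = 0
    for held in range(len(samples)):
        sums, counts = {}, {}
        for i, (x, y) in enumerate(zip(samples, labels)):
            if i == held:
                continue  # the held-out sample is never used for training
            acc = sums.setdefault(y, [0.0] * len(genes))
            for k, g in enumerate(genes):
                acc[k] += x[g]
            counts[y] = counts.get(y, 0) + 1

        def dist(y):
            # Squared distance from the held-out sample to the centroid of class y.
            return sum((samples[held][g] - sums[y][k] / counts[y]) ** 2
                       for k, g in enumerate(genes))

        if min(sorted(sums), key=dist) == labels[held]:
            correct += 1
    return correct / len(samples)


def plus_l_minus_r(n_genes, score, l=2, r=1, target=3):
    """Repeatedly add the l best genes, then drop the r least useful ones."""
    if not (l > r >= 0) or not (1 <= target <= n_genes):
        raise ValueError("need l > r >= 0 and 1 <= target <= n_genes")
    selected = []
    while True:
        for _ in range(l):  # plus-l: greedy forward steps
            rest = [g for g in range(n_genes) if g not in selected]
            selected.append(max(rest, key=lambda g: score(selected + [g])))
            if len(selected) == target:
                return selected
        for _ in range(r):  # minus-r: drop the gene whose removal hurts least
            drop = max(selected,
                       key=lambda g: score([x for x in selected if x != g]))
            selected.remove(drop)


if __name__ == "__main__":
    # Toy data (hypothetical): gene 0 separates the classes, genes 1-3 are noise.
    samples = [[1.0, 5.0, 3.0, 2.0], [1.2, 1.0, 3.1, 7.0], [0.9, 4.0, 2.9, 4.0],
               [5.0, 1.5, 3.0, 6.5], [5.2, 4.5, 3.2, 2.5], [4.8, 5.2, 2.8, 4.2]]
    labels = ["normal"] * 3 + ["tumor"] * 3
    print(plus_l_minus_r(4, lambda gs: loocv_accuracy(samples, labels, gs)))
```

The search stops as soon as the set reaches `target` genes, which is a simplification; requiring `l > r` guarantees that each add-then-remove round grows the set by at least one gene.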