Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (4): 180-182. DOI: 10.3778/j.issn.1002-8331.2009.04.051

• Graphics, Images, and Pattern Recognition •

Eukaryotic promoter prediction based on feature-boosting

ZENG Qing-shang, WU Shuan-hu

  1. Department of Computer, Yantai University, Yantai, Shandong 264005, China
  • Received: 2008-07-21 Revised: 2008-10-16 Online: 2009-02-01 Published: 2009-02-01
  • Contact: ZENG Qing-shang

Abstract: A new promoter prediction method is presented, based on the hypothesis that a promoter is determined by a set of word patterns and that different promoters are determined by different words. The most promising features are selected by divergence distance and used to build a sequence of weak classifiers through feature-boosting. A number of these weak classifiers are then combined into a strong classifier, which achieves better performance. Unlike other classifiers, the method adopts a different training and classification strategy. Experimental results on large genomic sequences, and comparisons with several well-performing algorithms, show that the method is effective at predicting promoter regions, with higher sensitivity and specificity.

Key words: DNA sequence analysis, promoter prediction, word patterns, feature-boosting
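
The pipeline outlined in the abstract (word patterns, divergence-based feature selection, boosted weak classifiers) can be illustrated with a minimal sketch. The abstract does not give the word length, the exact divergence measure, or the form of the weak classifier, so everything concrete here is an assumption: 3-letter words, a symmetric Kullback-Leibler divergence over smoothed word frequencies, and standard AdaBoost over single-word threshold stumps. It is not the authors' implementation, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

K = 3  # word length: an assumption, not stated in the abstract


def kmer_counts(seq, k=K):
    """Count overlapping k-letter words in a DNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def word_frequencies(seqs, k=K, pseudo=1.0):
    """Smoothed relative frequency of every possible k-letter word."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = Counter()
    for s in seqs:
        total.update(kmer_counts(s, k))
    denom = sum(total[w] for w in words) + pseudo * len(words)
    return {w: (total[w] + pseudo) / denom for w in words}


def divergence_scores(pos, neg, k=K):
    """Per-word term of the symmetric KL divergence (assumed measure)."""
    p, q = word_frequencies(pos, k), word_frequencies(neg, k)
    return {w: (p[w] - q[w]) * math.log(p[w] / q[w]) for w in p}


def train_boosted(pos, neg, n_features=8, rounds=5, k=K):
    """AdaBoost over threshold stumps, restricted to the top-scoring words."""
    scores = divergence_scores(pos, neg, k)
    top = sorted(scores, key=scores.get, reverse=True)[:n_features]
    X = [kmer_counts(s, k) for s in pos + neg]
    y = [1] * len(pos) + [-1] * len(neg)
    w = [1.0 / len(y)] * len(y)
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for word in top:
            for thr in sorted({x[word] for x in X}):
                for sign in (1, -1):
                    pred = [sign if x[word] >= thr else -sign for x in X]
                    err = sum(wi for wi, pi, yi in zip(w, pred, y) if pi != yi)
                    if best is None or err < best[0]:
                        best = (err, word, thr, sign, pred)
        err, word, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, word, thr, sign))
        # Re-weight: misclassified sequences gain weight for the next round.
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, pred)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model


def predict(model, seq, k=K):
    """Weighted vote of the weak classifiers: 1 = promoter, -1 = non-promoter."""
    counts = kmer_counts(seq, k)
    score = sum(a * (s if counts[word] >= thr else -s)
                for a, word, thr, s in model)
    return 1 if score >= 0 else -1
```

Calling `train_boosted(promoter_seqs, background_seqs)` on two lists of DNA strings returns a list of weighted stumps, and `predict(model, seq)` applies their weighted vote to a new sequence. A real evaluation would need longer words, far more training data, and a sliding window over the genomic sequence, none of which is specified in the abstract.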