Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (11): 201-204.

• Graphics, Image, and Pattern Recognition •

Promoter recognition method based on feature integration

刘咏梅,董宜堃   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001
  • Publication date: 2012-04-11 Release date: 2012-04-16

Promoter recognition method based on integrated features

LIU Yongmei, DONG Yikun   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Online: 2012-04-11 Published: 2012-04-16

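Abstract: To address the high false-positive rate of current eukaryotic promoter recognition, a eukaryotic promoter recognition method based on feature integration is proposed. Statistics of nucleotide k-mers in human promoters are extracted as features, and principal component analysis is used to extract the principal components. The 10-dimensional principal-component features are integrated with 2-dimensional CpG island features and used together as the input of a BP neural network that recognizes promoters. Prediction results on promoters of human gene sequences show that the method not only effectively reduces false positives but also achieves good sensitivity and specificity.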

Key words: promoter, promoter recognition, principal component analysis, CpG island, back propagation (BP) neural network

Abstract: To reduce the high false-positive rate in eukaryotic promoter recognition, a promoter prediction method based on integrated features is presented. The method uses codon and pentamer statistics of human promoters as features and extracts their principal components by principal component analysis (PCA). The 10-dimensional principal-component features are combined with 2-dimensional CpG island features, and the integrated feature vector is fed to a BP neural network that acts as the classifier. Experimental results on human gene sequences show that the method not only reduces false positives but also achieves good sensitivity and specificity.

Key words: promoter, promoter recognition, principal component analysis, CpG island, back propagation (BP) neural network
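
The feature-integration step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the k-mer sizes (3 and 5, following the "codon and pentamer" wording), the choice of GC content and observed/expected CpG ratio as the two CpG island features, and all function names. PCA is done with a plain SVD in NumPy, and the BP neural network that would consume the resulting 12-dimensional vectors is not shown.

```python
from itertools import product

import numpy as np

BASES = "ACGT"


def kmer_frequencies(seq, k):
    """Relative frequency of every length-k nucleotide word in seq."""
    index = {"".join(p): i for i, p in enumerate(product(BASES, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip words containing N or other symbols
            counts[j] += 1
    total = counts.sum()
    return counts / total if total else counts


def cpg_features(seq):
    """Assumed 2-D CpG descriptor: GC content and observed/expected CpG ratio."""
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    gc_content = (c + g) / n if n else 0.0
    obs_exp = seq.count("CG") * n / (c * g) if c and g else 0.0
    return np.array([gc_content, obs_exp])


def pca_project(X, n_components):
    """Project the rows of X onto the leading principal components (via SVD)."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def integrated_features(seqs, n_components=10):
    """Concatenate PCA-reduced k-mer statistics with the CpG features."""
    kmers = np.array([
        np.concatenate([kmer_frequencies(s, 3), kmer_frequencies(s, 5)])
        for s in seqs
    ])
    cpg = np.array([cpg_features(s) for s in seqs])
    return np.hstack([pca_project(kmers, n_components), cpg])


if __name__ == "__main__":
    # Random sequences stand in for real promoter data (illustration only).
    rng = np.random.default_rng(0)
    seqs = ["".join(rng.choice(list(BASES), size=200)) for _ in range(30)]
    print(integrated_features(seqs).shape)
```

Each row of the returned matrix holds 10 principal-component values followed by the 2 CpG values, which matches the 12-dimensional classifier input described in the abstract. In practice the PCA projection would be fitted on training sequences and reused for test sequences; the sketch fits it on whatever batch it is given.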