Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (31): 182-184.

• Graphics, Images, and Pattern Recognition •

Identification of splice sites based on probability statistical features

LI Shaoyan, DENG Wei

  1. School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-11-01 Published: 2011-11-01

Abstract: Sequence conservation near splice sites has motivated a variety of machine-learning identification methods, including methods based on statistical probability, Hidden Markov Models (HMM), and Support Vector Machines (SVM). These methods achieve fairly high accuracy, but their algorithms are complex. Based on the correlations and statistical features of the bases near splice sites, this paper constructs a network structure graph over the bases at fixed positions and, on that basis, proposes a formula for identifying splice sites from probability statistical features. The method is compared with other traditional methods on the N269 database. The experimental results show that the method predicts human splice sites well and, compared with some other algorithms, uses fewer parameters and achieves higher accuracy.

Key words: splice site recognition, machine learning, probability statistical features
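
The abstract does not give the network structure graph or the identification formula, so the following is only an illustrative sketch of the general idea of scoring a candidate site from position-specific dependencies between neighbouring bases. It uses a first-order position-dependent Markov chain (often called a weight array model), which is a swapped-in standard technique and not the authors' method. The function names, the pseudocount, and the toy sequences are all assumptions made for illustration; the sequences are invented and are not taken from the N269 database.

```python
import math

BASES = "ACGT"

# Toy, made-up 8-base windows (hypothetical data, not from N269).
# Both sets have "GT" at positions 2-3, as a donor site and a decoy would.
TRUE_SITES = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
DECOY_SITES = ["ACGTTCAC", "TTGTCCGA", "GAGTCTTC", "CCGTATCG"]


def train_wam(seqs, pseudocount=1.0):
    """Estimate P(first base) and, for each position i >= 1,
    P(base at i | base at i-1) from equal-length aligned sequences."""
    length = len(seqs[0])
    first = {b: pseudocount for b in BASES}
    trans = [
        {a: {b: pseudocount for b in BASES} for a in BASES}
        for _ in range(length - 1)
    ]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, length):
            trans[i - 1][s[i - 1]][s[i]] += 1

    # Normalise counts into probabilities.
    total = sum(first.values())
    first = {b: c / total for b, c in first.items()}
    for table in trans:
        for a in BASES:
            row_total = sum(table[a].values())
            for b in BASES:
                table[a][b] /= row_total
    return first, trans


def log_prob(seq, model):
    """Log-probability of seq under a model returned by train_wam."""
    first, trans = model
    lp = math.log(first[seq[0]])
    for i in range(1, len(seq)):
        lp += math.log(trans[i - 1][seq[i - 1]][seq[i]])
    return lp


def score(seq, pos_model, neg_model):
    """Log-odds score: positive values favour the true-site model."""
    return log_prob(seq, pos_model) - log_prob(seq, neg_model)
```

A candidate window would be scored with `score(window, train_wam(TRUE_SITES), train_wam(DECOY_SITES))` and called a splice site when the score exceeds a threshold chosen on held-out data. A model of this kind has few parameters (one 4×4 table per adjacent pair of positions), which is in the spirit of the low parameter count the abstract reports, though the paper's actual parameterisation may differ.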