Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (10): 120-123.

• Database, Data Mining, Machine Learning •

Splice sites identification based on multi-scale component features and adjacent positions relationship features

ZHOU Xiong

  1. Rongzhi College of Chongqing Technology and Business University, Chongqing 400033, China
  • Online: 2014-05-15  Published: 2014-05-14

Abstract: To improve the accuracy of splice site identification, a model that fuses Multi-Scale Component features with Adjacent Positions Relationship features (MSC-APR) is proposed. First, the window length over which the sequence around a splice site is conserved is determined. Multi-scale component features and adjacent positions relationship features are then extracted from each windowed sequence, and the two feature sets are combined and fed to a least squares support vector machine to build the splice site classifier. Simulation experiments on the HS3D and NN269 data sets show that MSC-APR identifies splice sites markedly more accurately than the comparison models.

Key words: splice sites, least squares support vector machine, adjacent positions relationship features, multi-scale component features
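
The pipeline summarized in the abstract (windowed sequence, two feature families, least squares support vector machine) can be illustrated with a minimal sketch. The abstract does not give the scales, the encoding of adjacent positions, the kernel, or any hyperparameters, so everything below is an assumption made for illustration: k-mer frequencies for k = 1, 2, 3 stand in for the multi-scale component features, one-hot dinucleotides at each adjacent position pair stand in for the adjacent positions relationship features, and the classifier is the standard LS-SVM linear system with an RBF kernel. All function names and the values of `gamma` and `sigma` are hypothetical and are not taken from the paper.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"


def kmer_composition(seq, k):
    """Relative frequency of every k-mer over ACGT (4**k values)."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers with ambiguous bases such as N
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def multi_scale_component(seq, scales=(1, 2, 3)):
    """Assumed multi-scale component features: k-mer frequencies per scale."""
    return np.concatenate([kmer_composition(seq, k) for k in scales])


def adjacent_position_features(seq):
    """Assumed adjacency features: one-hot dinucleotide at each pair (i, i+1)."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=2))}
    feats = np.zeros((len(seq) - 1, 16))
    for i in range(len(seq) - 1):
        j = index.get(seq[i:i + 2])
        if j is not None:
            feats[i, j] = 1.0
    return feats.ravel()


def extract_features(seq):
    """Concatenate both feature families for one fixed-length window."""
    seq = seq.upper()
    return np.concatenate(
        [multi_scale_component(seq), adjacent_position_features(seq)]
    )


def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2 * sigma ** 2))


def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM classifier system for bias b and multipliers alpha.

    [[0, y^T], [y, Omega + I/gamma]] [b, alpha]^T = [0, 1]^T,
    with Omega_ij = y_i * y_j * K(x_i, x_j) and labels y in {-1, +1}.
    """
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Predict +1 (splice site) or -1 (non-site) for each row of X_new."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)
```

To use the sketch, each candidate window would be passed through `extract_features`, the resulting rows stacked into a matrix `X` with labels `y` in {-1, +1}, and `lssvm_train` called once; `lssvm_predict` must then be given the same `sigma` that was used for training. Solving one dense linear system is what distinguishes the least squares variant from a standard SVM, which requires a quadratic program, but it also means the cost grows cubically with the number of training windows.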