Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (14): 203-207.

• Engineering and Applications •

Tomato miRNA prediction based on SVM classification algorithm

SUN Chao1, MENG Jun1, LUAN Yushi2

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023, China
  2. School of Life Science and Biotechnology, Dalian University of Technology, Dalian, Liaoning 116023, China
  • Online: 2012-05-11 Published: 2012-05-14

Abstract: To identify potential miRNAs in the tomato genome, two support vector machine (SVM) models, sly_pre_SVM and sly_SVM, are built from the features of known miRNAs to predict tomato miRNA precursor sequences and mature miRNA sequences, respectively. The encoding of miRNA feature vectors, miRNA feature selection, and parameter optimization are studied. On the tomato test set, sly_pre_SVM, which predicts miRNA precursors, achieves an accuracy, sensitivity, and specificity of 99.69%, 100%, and 99.66%; sly_SVM, which predicts the mature region on miRNA precursors, achieves 89.79%, 88.89%, and 90%. The models predict 41 mature tomato miRNA sequences, 14 of which have not been reported before, laying a foundation for further miRNA biology experiments.

Key words: support vector machines, tomato, miRNA, prediction
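
The abstract reports accuracy, sensitivity, and specificity for both models. As a reminder of how these three measures are conventionally derived from a binary classifier's confusion matrix, the following is a minimal sketch. The function name `classification_metrics` and the counts passed to it are hypothetical illustrations, not the paper's code or data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp / fn: positives (e.g. real miRNAs) classified correctly / incorrectly
    tn / fp: negatives (e.g. pseudo miRNAs) classified correctly / incorrectly
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    accuracy = (tp + tn) / (positives + negatives)  # fraction of all samples classified correctly
    sensitivity = tp / positives                    # fraction of positives recovered
    specificity = tn / negatives                    # fraction of negatives rejected
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only for illustration (not taken from the paper).
acc, sen, spe = classification_metrics(tp=45, fn=5, tn=90, fp=10)
print(f"accuracy={acc:.2%} sensitivity={sen:.2%} specificity={spe:.2%}")
```

Under these definitions a model can reach 100% sensitivity while accuracy stays below 100%, as reported for sly_pre_SVM: every positive is recovered, but some negatives are misclassified.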