Computer Engineering and Applications, 2007, Vol. 43, Issue (16): 228-230.

• Engineering and Applications •

Classification of protein thermostability using Support Vector Machines and K-Nearest Neighbors

DING Yan-rui1,2, CAI Yu-jie2,3, SUN Jun1, XU Wen-bo1

  1. School of Information Technology, Southern Yangtze University, Wuxi, Jiangsu 214122, China
  2. Key Laboratory of Industrial Biotechnology, Ministry of Education, Southern Yangtze University, Wuxi, Jiangsu 214036, China
  3. School of Biotechnology, Southern Yangtze University, Wuxi, Jiangsu 214036, China
  • Online: 2007-06-01  Published: 2007-06-01
  • Contact: DING Yan-rui

Abstract: Using amino acid composition as the feature vector, this paper examines how accurately Support Vector Machines (SVM) and K-Nearest Neighbors (KNN) predict protein thermostability. SVM gives the better classification: its local and global prediction rates are 82.4% and 83.4%, respectively, whereas those of KNN are 77.6% and 79.9%. The prediction rates of both methods indicate that amino acid composition is the main factor influencing protein thermostability.
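To make the feature representation concrete, the following is a minimal sketch of turning a sequence into a 20-dimensional amino acid composition vector and classifying it with a plain K-Nearest Neighbors vote. It is not the authors' implementation: the function names `composition` and `knn_predict`, the Euclidean distance, the value `k=3`, the class labels, and the toy sequences are all assumptions made for illustration, and none of the data comes from the paper. An SVM would take the same feature vectors as input but needs an external library, so it is left out here.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # non-standard symbols are ignored
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean
    (an assumption, the abstract does not state the metric).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy sequences and labels, invented for this sketch.
train = [
    (composition("EEKKRRIIVVEEKKRRIIVV"), "thermophilic"),
    (composition("EKRIVEKRIVEEKKRRIVYY"), "thermophilic"),
    (composition("QQSSTTNNAAQQSSTTNNAA"), "mesophilic"),
    (composition("QSTNAQSTNAQQSSTTNNGG"), "mesophilic"),
]

print(knn_predict(train, composition("EEKKRIVVEEKRRIIV"), k=3))
```

Because the composition vector discards residue order, two sequences with the same residue counts map to the same point, which is exactly the simplification the abstract's feature choice implies.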