Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (6): 219-220. DOI: 10.3778/j.issn.1002-8331.2009.06.063

• Engineering and Applications •

Predicting protein secondary structure by balancing data method

LI Wei, ZHAO Ya-ou, CHEN Yue-hui

  1. Department of Information Science and Engineering, University of Jinan, Jinan 250022, China
  • Received: 2008-01-16 Revised: 2008-04-21 Online: 2009-02-21 Published: 2009-02-21
  • Contact: LI Wei

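Improving protein secondary structure prediction with a balanced-data method

李 伟, 赵亚欧, 陈月辉

  1. School of Information Science and Engineering, University of Jinan, Jinan 250022, China
  • Corresponding author: 李 伟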

Abstract: Traditional protein secondary structure prediction methods often suffer from unbalanced training: because the three structure classes contribute very different numbers of residues, the prediction accuracy differs widely among the three classes. To correct this, the traditional method is modified along the lines of bagging so that the difference in the number of residues per class is reduced during training. In experiments on the CB396 data set, the method gives a better accuracy for the sheet (β-strand) class, and the overall three-state accuracy reaches 77%.

Key words: Position-Specific Scoring Matrix (PSSM), BP neural network, CB396, protein secondary structure prediction

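Abstract (from the Chinese): In traditional protein secondary structure prediction, the three structure classes occur in different numbers in amino acid sequences, which easily leads to unbalanced training and to large differences in prediction accuracy among the three classes. To remedy this defect, the traditional method is improved, inspired by the bagging principle, so that the gap between the numbers of the three structure classes is narrowed during training. Experiments on the CB396 data set show that the method significantly improves the prediction accuracy for sheets and reaches an overall prediction accuracy of 77.3%, so it predicts protein secondary structure well.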

Key words (from the Chinese): PSSM matrix, BP neural network, CB396 data set, protein secondary structure
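
The balancing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption: the three-state labels H (helix), E (sheet) and C (coil), the choice to draw each class with replacement down to the size of the rarest class, and the hypothetical helper names `balanced_bootstrap_indices` and `make_bags`. Training a BP network on each bag and combining the outputs is not shown.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap_indices(labels, rng, per_class=None):
    """Return residue indices with the same number of draws per class.

    Assumption: each class is sampled with replacement (bagging-style);
    the paper's actual balancing rule is not stated in the abstract.
    """
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    if per_class is None:
        # Default to the size of the rarest class.
        per_class = min(len(ix) for ix in by_class.values())
    picked = []
    for ix in by_class.values():
        picked.extend(rng.choices(ix, k=per_class))  # with replacement
    rng.shuffle(picked)
    return picked


def make_bags(labels, n_bags, seed=0):
    """Build several balanced training bags, one per ensemble member."""
    rng = random.Random(seed)
    return [balanced_bootstrap_indices(labels, rng) for _ in range(n_bags)]


# Toy, made-up labels: coil (C) dominates and sheet (E) is rare.
labels = list("CCCCCCCCCCHHHHHHEEE")
bags = make_bags(labels, n_bags=3)
class_counts = [Counter(labels[i] for i in bag) for bag in bags]
print(sorted(class_counts[0].items()))  # [('C', 3), ('E', 3), ('H', 3)]
```

Each bag holds equal numbers of the three classes, so a network trained on it is not dominated by the most frequent class. Drawing several bags keeps most of the majority-class residues in use across the ensemble.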