Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (17): 22-25. DOI: 10.3778/j.issn.1002-8331.2009.17.007

• Doctoral Forum •

Functional discrimination of membrane proteins using SVM and FFT

GAO Jian-zhao, WANG Kui, HU Gang, ZHANG Hua

  1. School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
  • Received: 2009-02-23 Revised: 2009-03-24 Online: 2009-06-11 Published: 2009-06-11
  • Contact: GAO Jian-zhao

Functional classification of membrane proteins using SVM and FFT

高建召,王 奎,胡 刚,张 华   

  1. School of Mathematical Sciences and LMP, Nankai University, Tianjin 300071
  • Corresponding author: 高建召

Abstract: Membrane transport proteins play an important role in living cells. Currently, there are several methods to predict membrane transport proteins. However, only a limited number of published methods discriminate membrane transport proteins by function. To address this problem, a support vector machine based on the fast Fourier transform is used to discriminate channels/pores, electrochemical potential-driven transporters, and primary active transporters. Amino acid occurrence, together with modified Kyte-Doolittle hydrophobic scales, Ponnuswamy hydrophobic scales, mean polarity, and solvation free energy, each transformed by the fast Fourier transform, is used as the input vector of the support vector machine. The method discriminates the three transporter classes in the Transport Classification Database with a five-fold cross-validation accuracy of 72.1% on a data set of 1718 membrane proteins. A preliminary literature-based validation had a cross-validation accuracy of 68.1%. This study suggests that the method can achieve better accuracy and effectively classify transporters into channels/pores, electrochemical potential-driven transporters, and primary active transporters.
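The feature construction described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not say how the spectrum is reduced to a fixed-length vector, so zero-padding to a power of two and keeping the first `n_coeffs` magnitudes are assumptions, as are the function names. The standard Kyte-Doolittle hydropathy values stand in for the modified scale used in the paper, and only one of the physicochemical scales is shown.

```python
import cmath

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Standard Kyte-Doolittle hydropathy values. The paper uses a *modified*
# scale that the abstract does not specify, so this is only a stand-in.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def spectrum_features(seq, scale, n_coeffs=8):
    """Map residues to a numeric signal, FFT it, keep the first magnitudes.

    Assumption: the signal is zero-padded to the next power of two and the
    magnitudes are divided by the sequence length.
    """
    signal = [scale.get(a, 0.0) for a in seq]  # unknown residues map to 0.0
    size = 1
    while size < max(len(signal), n_coeffs):
        size *= 2
    signal += [0.0] * (size - len(signal))
    return [abs(c) / len(seq) for c in fft(signal)[:n_coeffs]]


def feature_vector(seq, n_coeffs=8):
    """Amino acid composition followed by frequency-domain hydropathy features."""
    return composition(seq) + spectrum_features(seq, KYTE_DOOLITTLE, n_coeffs)
```

A vector built this way for each protein could then be passed to any SVM implementation (for example scikit-learn's `SVC`) and evaluated with five-fold cross-validation. The other properties named in the abstract would need their own scale tables.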

Key words: Support Vector Machine (SVM), Fast Fourier Transform (FFT), hydrophobic scales, mean polarity, solvation free energy

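Abstract: Membrane proteins play an important role in the life activities of cells. At present, many methods exist for predicting and classifying membrane transport proteins. However, little work has been done on predicting the function of membrane proteins. To address this problem, a support vector machine approach that combines protein sequence information with the fast Fourier transform is used to predict the function of 1817 proteins from the TCDB database belonging to three classes of membrane transport proteins: channels/pores, electrochemical potential-driven transporters, and primary active transporters. The model takes the distribution of the 20 amino acids and the hydrophobicity, mean polarity, and solvation free energy of the residues as raw feature data, and uses the fast Fourier transform to convert them into frequency-domain information that serves as the feature input for machine learning. Under five-fold cross-validation the prediction accuracy reaches 72.1%, whereas the accuracy previously reported in the literature is 68.1%. The study shows that the method can effectively classify membrane transport proteins into the three functional classes: channels/pores, electrochemical potential-driven transporters, and primary active transporters.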

Key words: support vector machine, fast Fourier transform, hydrophobicity, mean polarity, solvation free energy