Prediction of Membrane Protein Based on Sequence Information Fusion and Two-Stage Feature Selection

doi:10.3778/j.issn.1002-8331.1712-0265

Abstract

Predicting membrane protein types is of great significance because the type of a membrane protein is closely related to its function. To address the high dimensionality of the features that arises when representing membrane proteins, this study proposes a two-stage feature selection method (MIC-GA) that combines the Maximum Information Coefficient (MIC) with a Genetic Algorithm (GA). Three feature representations are extracted from the membrane protein sequence and fused: pseudo amino acid composition (PseAAC), dipeptide composition (DC), and the position-specific scoring matrix (PSSM). During feature fusion, an improved ReliefF algorithm (FReliefF) is proposed to obtain more effective feature scores. Finally, extremely randomized trees are applied twice within a Stacking ensemble learning framework to predict membrane protein types. The results show that the proposed method effectively improves the accuracy of membrane protein prediction.

Key words: membrane protein type prediction, maximum information coefficient, genetic algorithm, feature selection, feature fusion, extremely randomized tree
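
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: scikit-learn's `mutual_info_classif` stands in for MIC in the filter stage, the genetic algorithm is a toy version with truncation selection, one-point crossover, and bit-flip mutation, the data are synthetic rather than fused PseAAC/DC/PSSM features, and FReliefF is omitted. The helper names `filter_stage`, `fitness`, `ga_stage`, and `build_stacker` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score


def filter_stage(X, y, keep, seed=0):
    """Stage 1: keep the `keep` highest-scoring features.

    Assumption: mutual information is a stand-in for MIC here.
    """
    scores = mutual_info_classif(X, y, random_state=seed)
    return np.argsort(scores)[::-1][:keep]


def fitness(mask, X, y, seed=0):
    """Cross-validated accuracy of extremely randomized trees on masked features."""
    if not mask.any():
        return 0.0
    clf = ExtraTreesClassifier(n_estimators=30, random_state=seed)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()


def ga_stage(X, y, pop_size=8, generations=4, p_mut=0.1, seed=0):
    """Stage 2: toy genetic algorithm over binary feature masks."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5
    for _ in range(generations):
        fit = np.array([fitness(m, X, y, seed) for m in pop])
        # Truncation selection: the better half survives unchanged.
        parents = pop[np.argsort(fit)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat)          # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_feat) < p_mut    # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    fit = np.array([fitness(m, X, y, seed) for m in pop])
    return pop[np.argmax(fit)]


def build_stacker(seed=0):
    """Extremely randomized trees used twice: as base learner and as meta-learner."""
    base = [("et", ExtraTreesClassifier(n_estimators=50, random_state=seed))]
    meta = ExtraTreesClassifier(n_estimators=50, random_state=seed + 1)
    return StackingClassifier(estimators=base, final_estimator=meta, cv=3)


# Synthetic stand-in for a fused feature matrix (not real protein data).
X, y = make_classification(n_samples=150, n_features=40, n_informative=6,
                           n_classes=3, random_state=0)
top = filter_stage(X, y, keep=12)      # stage 1: filter
mask = ga_stage(X[:, top], y)          # stage 2: wrapper search
selected = top[mask]                   # indices into the original feature space
model = build_stacker().fit(X[:, selected], y)
print(len(selected), "features selected")
```

Running the filter first shrinks the search space, so the more expensive wrapper search only has to evaluate subsets of the 12 retained features instead of all 40.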


GUO Lei, WANG Shunfang. Prediction of Membrane Protein Based on Sequence Information Fusion and Two-Stage Feature Selection[J]. Computer Engineering and Applications, 2019, 55(6): 145-150.

