基于强化学习的HP模型优化方法研究

doi:10.3778/j.issn.1002-8331.1811-0059

计算机工程与应用 ›› 2019, Vol. 55 ›› Issue (12): 132-139.DOI: 10.3778/j.issn.1002-8331.1811-0059

基于强化学习的HP模型优化方法研究

吴宏杰1，2，杨茹1，傅启明1，陈建平1，陆卫忠1

1.苏州科技大学电子与信息工程学院，江苏苏州 215009
2.苏州大学江苏省计算机信息处理技术重点实验室，江苏苏州 215006

出版日期:2019-06-15 发布日期:2019-06-13

Research on HP Model Optimization Method Based on Reinforcement Learning

WU Hongjie1，2, YANG Ru1, FU Qiming1, CHEN Jianping1, LU Weizhong1

1.School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
2.Jiangsu Provincial Key Lab for Information Processing Technologies, Soochow University, Suzhou, Jiangsu 215006, China

Online:2019-06-15 Published:2019-06-13

摘要/Abstract

摘要： 蛋白质结构预测问题一直是生物信息学中的重要问题。基于疏水极性模型的蛋白质二维结构预测问题是一个典型的NP难问题。目前疏水极性模型优化的方法有贪心算法、粒子群算法、遗传算法、蚁群算法和蒙特卡罗模拟方法等，但这些方法成功收敛的鲁棒性不高，容易陷入局部最优。由此提出一种基于强化学习的HP模型优化方法，利用其连续马尔可夫最优决策与最大化全局累计回报的特点，在全状态空间中，构建基于能量函数的奖赏函数，引入刚性重叠检测规则，充分挖掘生物序列中的全局进化关系，从而进行有效与稳定的预测。以3条经典论文序列和5条Uniref50序列为实验对象，与贪心算法和粒子群算法分别进行了鲁棒性、收敛性与运行时间的比较。贪心算法只能在62.5%的序列上进行收敛，该文方法能在5万次训练后稳定的在所有序列上达到了收敛。与粒子群算法相比，两者都能找到最低能量结构，但该文的运行时间较粒子群算法降低了63.9%。

关键词: 强化学习, HP模型, 结构预测

Abstract: Protein structure prediction is an important factor in the area of bioinformatics. Predicting the two-dimensional structure of proteins based on the Hydrophobic Polarity model（HP model） is a typical Non-deterministic Polynomial（NP）-hard problem. Currently, HP model optimization methods include the greedy algorithm, particle swarm optimization, genetic algorithm, ant colony algorithm and the Monte-Carlo simulation method. However, the robustness of these methods are not sufficient, and it is easy to fall into a local optimum. Therefore, a HP model optimization method, based on reinforcement learning is proposed. In the full state space, a reward function based on energy function is designed and a rigid overlap detection rule is introduced. By using the characteristics of the continuous Markov optimal decision and maximizing global cumulative return, the global evolutionary relationship in biological sequences is fully exploited, and effective and stable predictions are retrieved. Eight classical sequences from publications and Uniref50 are selected as experimental objects. The robustness, convergence and running time are compared with the greedy algorithm and particle swarm optimization algorithm, respectively. Both reinforcement method and swarm optimization method can find all the lowest energy structures for these eight sequences, while the greedy algorithm only detects 62.5%. Compared with particle swarm optimization, the running time of the reinforcement method is 63.9% lower than that of particle swarm optimization.

Key words: reinforcement learning, Hydrophobic Polarity（HP） model, structure prediction

吴宏杰1，2，杨茹1，傅启明1，陈建平1，陆卫忠1. 基于强化学习的HP模型优化方法研究[J]. 计算机工程与应用, 2019, 55(12): 132-139.

WU Hongjie1，2, YANG Ru1, FU Qiming1, CHEN Jianping1, LU Weizhong1. Research on HP Model Optimization Method Based on Reinforcement Learning[J]. Computer Engineering and Applications, 2019, 55(12): 132-139.

[1]	张鑫，张席. 优先状态估计的双深度Q网络[J]. 计算机工程与应用, 2021, 57(8): 78-83.
[2]	王晓，唐伦，贺小雨，陈前斌. 基于深度强化学习的服务功能链多维资源优化[J]. 计算机工程与应用, 2021, 57(4): 68-76.
[3]	赖俊，魏竞毅，陈希亮. 分层强化学习综述[J]. 计算机工程与应用, 2021, 57(3): 72-79.
[4]	马志豪，朱响斌. 拟双曲动量梯度的对抗深度强化学习研究[J]. 计算机工程与应用, 2021, 57(24): 90-99.
[5]	李宝帅，叶春明. 深度强化学习算法求解作业车间调度问题[J]. 计算机工程与应用, 2021, 57(23): 248-254.
[6]	王军，曹雷，陈希亮，赖俊，章乐贵. 多智能体博弈强化学习研究综述[J]. 计算机工程与应用, 2021, 57(21): 1-13.
[7]	成怡，郝密密. 改进深度强化学习的室内移动机器人路径规划[J]. 计算机工程与应用, 2021, 57(21): 256-262.
[8]	况立群，李思远，冯利，韩燮，徐清宇. 深度强化学习算法在智能军事决策中的应用[J]. 计算机工程与应用, 2021, 57(20): 271-278.
[9]	孔松涛，刘池池，史勇，谢义，王堃. 深度强化学习在智能制造中的应用展望综述[J]. 计算机工程与应用, 2021, 57(2): 49-59.
[10]	李浩，宁浩宇，康雁，梁文韬，霍雯. 针对文本情感转换的SMRFGAN模型[J]. 计算机工程与应用, 2021, 57(2): 170-176.
[11]	张荣霞，武长旭，孙同超，赵增顺. 深度强化学习及在路径规划中的研究进展[J]. 计算机工程与应用, 2021, 57(19): 44-56.
[12]	杨薛钰，陈建平，傅启明，陆悠，吴宏杰. 基于随机方差减小方法的DDPG算法[J]. 计算机工程与应用, 2021, 57(19): 104-111.
[13]	宋浩楠，赵刚，王兴芬. 融合知识表示和深度强化学习的知识推理方法[J]. 计算机工程与应用, 2021, 57(19): 189-197.
[14]	王科银，石振，杨正才，杨亚会，王思山. 改进强化学习算法应用于移动机器人路径规划[J]. 计算机工程与应用, 2021, 57(18): 270-274.
[15]	张俊，朱庆伟，严俊杰，温波. 改进强化学习算法的UAV室内三维航迹规划[J]. 计算机工程与应用, 2021, 57(16): 175-181.

基于强化学习的HP模型优化方法研究

Research on HP Model Optimization Method Based on Reinforcement Learning

PDF

可视化

摘要/Abstract

引用本文

使用本文

参考文献

相关文章 15

编辑推荐

Metrics