[1] SUTTON R S, BARTO A G. Reinforcement Learning: an Introduction[M]. 2nd ed. Cambridge: MIT Press, 2018: 17-35.
[2] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[3] SIGAUD O. Combining evolution and deep reinforcement learning for policy search: a survey[J]. arXiv:2203.14009, 2022.
[4] KHADKA S, TUMER K. Evolution-guided policy gradient in reinforcement learning[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: MIT Press, 2018: 1196-1208.
[5] POURCHOT A, SIGAUD O. CEM-RL: combining evolutionary and gradient-based methods for policy search[C]//Proceedings of the 7th International Conference on Learning Representations, 2019.
[6] KHADKA S, MAJUMDAR S, NASSAR T, et al. Collaborative evolutionary reinforcement learning[C]//Proceedings of the 36th International Conference on Machine Learning, 2019: 3341-3350.
[7] BODNAR C, DAY B, LIÒ P. Proximal distilled evolutionary reinforcement learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020: 3283-3290.
[8] LV S, HAN S, ZHOU W, et al. Recruitment-imitation mechanism for evolutionary reinforcement learning[J]. Information Sciences, 2021, 553: 172-188.
[9] MARCHESINI E, CORSI D, FARINELLI A. Genetic soft updates for policy evolution in deep reinforcement learning[C]//Proceedings of the 8th International Conference on Learning Representations, 2020.
[10] HAO J Y, LI P Y, TANG H Y, et al. ERL-Re^2: efficient evolutionary reinforcement learning with shared state representation and individual policy representation[C]//Proceedings of the 11th International Conference on Learning Representations, 2023.
[11] STANLEY K O, CLUNE J, LEHMAN J, et al. Designing neural networks through neuroevolution[J]. Nature Machine Intelligence, 2019, 1: 24-35.
[12] SALEHI A, CONINX A, DONCIEUX S. Few-shot quality-diversity optimization[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 4424-4431.
[13] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv:1509.02971, 2015.
[14] LÜ S, GONG X Y, ZHANG Z H, et al. Survey of deep reinforcement learning methods with evolutionary algorithms[J]. Chinese Journal of Computers, 2022, 45(7): 1478-1499. (in Chinese)
[15] FUJIMOTO S, VAN HOOF H, MEGER D J. Addressing function approximation error in actor-critic methods[C]//Proceedings of the 35th International Conference on Machine Learning, 2018: 2587-2601.
[16] FINN C, ABBEEL P, LEVINE S, et al. Model-agnostic meta-learning for fast adaptation of deep networks[C]//Proceedings of the 34th International Conference on Machine Learning - Volume 70. New York: ACM, 2017: 1126-1135.
[17] LÜDERS B, SCHLÄGER M, KORACH A, et al. Continual and one-shot learning through neural networks with dynamic external memory[M]//Applications of evolutionary computation. Cham: Springer, 2017: 886-901.
[18] LEHMAN J, CHEN J, CLUNE J, et al. Safe mutations for deep and recurrent neural networks through output gradients[C]//Proceedings of the Genetic and Evolutionary Computation Conference. New York: ACM, 2018: 117-124.
[19] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv:1707.06347, 2017.