[1] SUN Y, CAO L, CHEN X L, et al. Overview of multi-agent deep reinforcement learning[J]. Computer Engineering and Applications, 2020, 56(5): 13-24.
[2] YANG X, LI X T. Research on automatic driving technology based on deep reinforcement learning[J]. Network Security Technology and Application, 2021(1): 136-138.
[3] YE D, LIU Z, SUN M, et al. Mastering complex control in MOBA games with deep reinforcement learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020: 6672-6679.
[4] WU X G, LIU S W, YANG L, et al. A gait control method for biped robot on slope based on deep reinforcement learning[J]. Acta Automatica Sinica, 2021, 47(8): 1976-1987.
[5] LOWE R, WU Y, TAMAR A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments[C]//Proceedings of the 31st Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2017.
[6] SUNEHAG P, LEVER G, GRUSLYS A, et al. Value decomposition networks for cooperative multi-agent learning based on team reward[C]//Proceedings of the 17th International Conference on Autonomous Agents and Multi-Agent Systems, 2018: 2085-2087.
[7] HAFIZ A M. Image classification by reinforcement learning with two-state Q-Learning[J]. arXiv:2007.01298, 2020.
[8] SUKHBAATAR S, FERGUS R. Learning multiagent communication with backpropagation[C]//Advances in Neural Information Processing Systems, 2016: 2244-2252.
[9] PENG P, YUAN Q, WEN Y, et al. Multiagent bidirectionally coordinated nets for learning to play StarCraft combat games[J]. arXiv:1703.10069, 2017.
[10] UZKENT B, YEH C, ERMON S. Efficient object detection in large images using deep reinforcement learning[C]//IEEE Winter Conference on Applications of Computer Vision, 2020: 1824-1833.
[11] QIAO J F, WANG G M, LI W J, et al. An adaptive deep Q-learning strategy for handwritten digit recognition[J]. Neural Networks, 2018, 107: 61-71.
[12] MOUSAVI H K, NAZARI M, TAKÁČ M, et al. Multi-agent image classification via reinforcement learning[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019: 5020-5027.
[13] SUTTON R S, MCALLESTER D, SINGH S, et al. Policy gradient methods for reinforcement learning with function approximation[C]//Advances in Neural Information Processing Systems, 2000.
[14] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[J]. arXiv:1312.5602, 2013.
[15] RASHID T, SAMVELYAN M, DE WITT C S, et al. QMIX: monotonic value function factorisation for deep multi-agent reinforcement learning[J]. arXiv:1803.11485, 2018.
[16] HOSTALLERO W, SON K, KIM D, et al. Learning to factorize with transformation for cooperative multiagent reinforcement learning[C]//Proceedings of the 36th International Conference on Machine Learning, 2019.
[17] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[18] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[19] CHENG G, HAN J W, LU J W. Remote sensing image scene classification: benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10): 1865-1883.