[1] 周伟, 陈亚宁. 智能化战争背景下人工智能技术在军事指挥控制中的应用[J]. 军事文摘, 2022(8): 27-30.
ZHOU W, CHEN Y N. Application of artificial intelligence technology in military command and control in the context of intelligent warfare[J]. Military Digest, 2022(8): 27-30.
[2] 施伟, 黄红蓝, 冯旸赫, 等. 面向多类别分类问题的子抽样主动学习方法[J]. 系统工程与电子技术, 2021, 43(3): 700-708.
SHI W, HUANG H L, FENG Y H, et al. Subsampling oriented active learning method for multi-category classification problem[J]. Systems Engineering and Electronics, 2021, 43(3): 700-708.
[3] 陈明昊, 朱月瑶, 孙毅, 等. 计及高渗透率光伏消纳与深度强化学习的综合能源系统预测调控[J]. 电工技术学报, 2024, 39(19): 6054-6071.
CHEN M H, ZHU Y Y, SUN Y, et al. The predictive-control optimization method for park integrated energy system considering the high penetration of photovoltaics and deep reinforcement learning[J]. Transactions of China Electrotechnical Society, 2024, 39(19): 6054-6071.
[4] 施伟, 冯旸赫, 程光权, 等. 基于深度强化学习的多机协同空战方法研究[J]. 自动化学报, 2021, 47(7): 1610-1623.
SHI W, FENG Y H, CHENG G Q, et al. Research on multi-aircraft cooperative air combat method based on deep reinforcement learning[J]. Acta Automatica Sinica, 2021, 47(7): 1610-1623.
[5] 田雨萌. 面向云边应用协同的资源调度方法研究[D]. 北京: 军事科学院, 2024.
TIAN Y M. Research on resource scheduling method for cloud-edge application collaboration[D]. Beijing: Academy of Military Science, 2024.
[6] 尹奇跃, 赵美静, 倪晚成, 等. 兵棋推演的智能决策技术与挑战[J]. 自动化学报, 2023, 49(5): 913-928.
YIN Q Y, ZHAO M J, NI W C, et al. Intelligent decision making technology and challenge of wargame[J]. Acta Automatica Sinica, 2023, 49(5): 913-928.
[7] 周志杰, 曹友, 胡昌华, 等. 基于规则的建模方法的可解释性及其发展[J]. 自动化学报, 2021, 47(6): 1201-1216.
ZHOU Z J, CAO Y, HU C H, et al. The interpretability of rule-based modeling approach and its development[J]. Acta Automatica Sinica, 2021, 47(6): 1201-1216.
[8] VINYALS O, BABUSCHKIN I, CZARNECKI W M, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning[J]. Nature, 2019, 575(7782): 350-354.
[9] 于泽, 宁念文, 郑燕柳, 等. 深度强化学习驱动的智能交通信号控制策略综述[J]. 计算机科学, 2023, 50(4): 159-171.
YU Z, NING N W, ZHENG Y L, et al. Review of intelligent traffic signal control strategies driven by deep reinforcement learning[J]. Computer Science, 2023, 50(4): 159-171.
[10] HUANG H L, SHI W, FENG Y H, et al. Active client selection for clustered federated learning[J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(11): 16424-16438.
[11] BROOKS R. A robust layered control system for a mobile robot[J]. IEEE Journal on Robotics and Automation, 1986, 2(1): 14-23.
[12] BRATMAN M E, ISRAEL D J, POLLACK M E. Plans and resource-bounded practical reasoning[J]. Computational Intelligence, 1988, 4(3): 349-355.
[13] KELLER J. DARPA to develop swarming unmanned vehicles for better military reconnaissance[J]. Military & Aerospace Electronics, 2017, 28(2): 4-6.
[14] REN X M, GU H X, WEI W T. Tree-RNN: tree structural recurrent neural network for network traffic classification[J]. Expert Systems with Applications, 2021, 167: 114363.
[15] SHI W, FENG Y H, HUANG H L, et al. Efficient hierarchical policy network with fuzzy rules[J]. International Journal of Machine Learning and Cybernetics, 2022, 13(2): 447-459.
[16] LI J, WANG R, NANTOGMA S, et al. Genetic fuzzy tree based learning algorithm toward the weapon-target assignment problem[C]//Proceedings of 2021 International Conference on Autonomous Unmanned Systems (ICAUS 2021). Singapore: Springer, 2022: 1677-1686.
[17] YANG X T, JI Z, WU J, et al. Hierarchical reinforcement learning with universal policies for multistep robotic manipulation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(9): 4727-4741.
[18] POPE A P, IDE J S, MIĆOVIĆ D, et al. Hierarchical reinforcement learning for air combat at DARPA's AlphaDogfight trials[J]. IEEE Transactions on Artificial Intelligence, 2023, 4(6): 1371-1385.
[19] GÜRTLER N, BÜCHLER D, MARTIUS G. Hierarchical reinforcement learning with timed subgoals[C]//Proceedings of the 35th Conference on Neural Information Processing Systems, 2021.
[20] ZARE M, KEBRIA P M, KHOSRAVI A, et al. A survey of imitation learning: algorithms, recent developments, and challenges[J]. IEEE Transactions on Cybernetics, 2024, 54(12): 7173-7186.
[21] BEN-PORAT O, MANSOUR Y, MOSHKOVITZ M, et al. Principal-agent reward shaping in MDPs[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2024: 9502-9510.
[22] 宋浩楠, 赵刚, 王兴芬. 融合知识表示和深度强化学习的知识推理方法[J]. 计算机工程与应用, 2021, 57(19): 189-197.
SONG H N, ZHAO G, WANG X F. Knowledge reasoning method combining knowledge representation with deep reinforcement learning[J]. Computer Engineering and Applications, 2021, 57(19): 189-197.
[23] MAO Y X, ZHANG H C, CHEN C, et al. Supported trust region optimization for offline reinforcement learning[C]//Proceedings of the 40th International Conference on Machine Learning, 2023: 23829-23851.
[24] QU Y, WANG B Y, SHAO J Z, et al. Hokoff: real game dataset from Honor of Kings and its offline reinforcement learning benchmarks[C]//Proceedings of the 37th Conference on Neural Information Processing Systems, 2023.
[25] SHAO J Z, QU Y, CHEN C, et al. Counterfactual conservative Q learning for offline multi-agent reinforcement learning[C]//Proceedings of the 37th Conference on Neural Information Processing Systems, 2023.
[26] NAIR A, GUPTA A, DALAL M, et al. AWAC: accelerating online reinforcement learning with offline datasets[J]. arXiv:2006.09359, 2020.
[27] 薛建强, 史彦军, 李波. 面向无人集群的边缘计算技术综述[J]. 兵工学报, 2023, 44(9): 2546-2555.
XUE J Q, SHI Y J, LI B. A review of edge computing technology for unmanned swarms[J]. Acta Armamentarii, 2023, 44(9): 2546-2555.
[28] QI J J, ZHOU Q H, LEI L, et al. Federated reinforcement learning: techniques, applications, and open challenges[J]. arXiv:2108.11887, 2021.
[29] NADIGER C, KUMAR A, ABDELHAK S. Federated reinforcement learning for fast personalization[C]//Proceedings of the 2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering. Piscataway: IEEE, 2019: 123-127.
[30] LIU B Y, WANG L J, LIU M. Lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic systems[J]. IEEE Robotics and Automation Letters, 2019, 4(4): 4555-4562.
[31] REN J J, WANG H C, HOU T T, et al. Federated learning-based computation offloading optimization in edge computing-supported Internet of Things[J]. IEEE Access, 2019, 7: 69194-69201.
[32] WANG X F, WANG C Y, LI X H, et al. Federated deep reinforcement learning for Internet of Things with decentralized cooperative edge caching[J]. IEEE Internet of Things Journal, 2020, 7(10): 9441-9455.
[33] ZHUO H H, FENG W F, LIN Y F, et al. Federated deep reinforcement learning[J]. arXiv:1901.08277, 2019.
[34] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv:1707.06347, 2017.
[35] ZHUANG Z F, LEI K, LIU J X, et al. Behavior proximal policy optimization[J]. arXiv:2302.11312, 2023.
[36] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[J]. arXiv:1511.05952, 2015.
[37] HUANG H L, SHI W, FENG Y H, et al. A novel federated reinforcement learning algorithm with historical model update momentum[C]//Proceedings of the 2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining. Piscataway: IEEE, 2023: 328-333.
[38] MCMAHAN H B, MOORE E, RAMAGE D, et al. Communication-efficient learning of deep networks from decentralized data[C]//Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017.
[39] FU J, KUMAR A, NACHUM O, et al. D4RL: datasets for deep data-driven reinforcement learning[J]. arXiv:2004.07219, 2020.