Research Review on Deep Reinforcement Learning for Solving End-to-End Navigation Problems of Mobile Robots

doi:10.3778/j.issn.1002-8331.2312-0256

Abstract

Abstract: Autonomous navigation is the prerequisite and foundation for mobile robots to accomplish complex tasks. Traditional autonomous navigation systems depend on map accuracy and cannot adapt to highly complex industrial and service scenarios. End-to-end navigation methods, in which a mobile robot does not rely on prior map information and instead learns to make autonomous decisions by interacting with the environment through deep reinforcement learning, have become a new research hotspot. Most existing classification schemes do not comprehensively summarize the challenges and opportunities of end-to-end navigation. Based on the characteristics of end-to-end navigation systems, this review attributes the challenges of the navigation problem to three key issues: poor perception ability of the navigation agent, low learning efficiency, and weak generalization ability of the navigation policy. It describes the research status and development trends of end-to-end navigation systems, details representative recent work addressing each of these key issues, and summarizes the advantages and shortcomings of that work. Finally, it discusses future development trends of end-to-end navigation for mobile robots from the perspectives of vision-and-language navigation, multi-agent collaborative navigation, end-to-end navigation that fuses super-resolution reconstructed images, and interpretable end-to-end navigation, providing some ideas for research on and application of end-to-end navigation for mobile robots.

Key words: end-to-end navigation, deep reinforcement learning, perception ability, learning efficiency, generalization ability

HE Li, YAO Jiacheng, LIAO Yuxin, ZHANG Wenzhi, LU Zhaoqing, YUAN Liang, XIAO Wendong. Research Review on Deep Reinforcement Learning for Solving End-to-End Navigation Problems of Mobile Robots[J]. Computer Engineering and Applications, 2024, 60(14): 1-13.
