
Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (10): 66-78. DOI: 10.3778/j.issn.1002-8331.2409-0215
GAO Yuning, WANG Ancheng, ZHAO Huakai, LUO Haolong, YANG Zidi, LI Jiansheng
Online: 2025-05-15
Published: 2025-05-15
Abstract: Traditional visual navigation methods depend heavily on high-precision maps and suffer from unavoidable error accumulation, so they often perform poorly on navigation tasks in complex dynamic environments. Visual navigation methods based on deep reinforcement learning, which imitate the way humans themselves navigate, can safely navigate to a specified goal in an end-to-end manner directly from visual information, and have become an emerging research hotspot in the field of visual navigation. To examine the latest research problems in deep-reinforcement-learning visual navigation and to compare the latest methods in this direction intuitively, this paper first introduces the background and theory of deep-reinforcement-learning navigation methods. Focusing on the main research problems of the past five years, it then summarizes and analyzes the important methods from three aspects: data utilization, policy optimization, and scene generalization. Finally, it offers reflections on the current state of research and on open problems, summarizing the latest research trends while providing a reference for future work on related methods.
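The end-to-end loop the abstract describes can be illustrated with a deliberately tiny sketch (an illustrative toy example, not a method from the survey): tabular Q-learning steering an agent to a goal on a 4×4 grid. The deep methods reviewed here replace the lookup table with a neural network over raw camera images, but the interaction loop — observe, act, receive a reward, update the policy — is the same.

```python
import random

random.seed(0)

SIZE, GOAL = 4, (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Deterministic grid transition with walls clamped at the border."""
    r, c = state
    dr, dc = ACTIONS[action]
    ns = (min(max(r + dr, 0), SIZE - 1), min(max(c + dc, 0), SIZE - 1))
    # +1 on reaching the goal, a small step cost otherwise
    return ns, (1.0 if ns == GOAL else -0.01), ns == GOAL

Q = {}  # deep RL methods replace this table with a network over images
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):  # training episodes
    s, done = (0, 0), False
    while not done:
        if random.random() < eps:
            a = random.randrange(4)  # explore
        else:
            a = max(range(4), key=lambda i: Q.get((s, i), 0.0))  # exploit
        ns, r, done = step(s, a)
        best_next = max(Q.get((ns, i), 0.0) for i in range(4))
        q = Q.get((s, a), 0.0)
        Q[(s, a)] = q + alpha * (r + gamma * best_next - q)  # TD update
        s = ns

# Greedy rollout: the learned policy should walk from (0, 0) to the goal.
s, path = (0, 0), [(0, 0)]
for _ in range(20):
    a = max(range(4), key=lambda i: Q.get((s, i), 0.0))
    s, _, done = step(s, a)
    path.append(s)
    if done:
        break
print(path)
```

The "error accumulation" contrast in the abstract is visible even here: the agent never builds a map or integrates odometry; it only learns a reactive mapping from the current observation to an action.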
GAO Yuning, WANG Ancheng, ZHAO Huakai, LUO Haolong, YANG Zidi, LI Jiansheng. Review on Visual Navigation Methods Based on Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2025, 61(10): 66-78.