ZHANG Feng, GU Qiran, YUAN Shuai. Path Planning Method for Mobile Robot Based on Curiosity Distillation Double Q-Network[J]. Computer Engineering and Applications, 2023, 59(19): 316-322.
[1] LI W B.Obstacle avoidance path planning method for industrial robots based on deep reinforcement learning[J].Manufacturing Automation,2022,44(1):127-130.
[2] MNIH V,KAVUKCUOGLU K,SILVER D,et al.Playing Atari with deep reinforcement learning[J].arXiv:1312.5602,2013.
[3] WANG J,YANG Y X,LI L.Mobile robot path planning based on improved deep reinforcement learning[J].Electronic Measurement Technology,2021,44(22):19-24.
[4] CHRISTIANO P F,LEIKE J,BROWN T,et al.Deep reinforcement learning from human preferences[C]//Advances in Neural Information Processing Systems,2017.
[5] BELLEMARE M,SRINIVASAN S,OSTROVSKI G,et al.Unifying count-based exploration and intrinsic motivation[C]//Advances in Neural Information Processing Systems,2016.
[6] WU Q,ZHANG Y,GUO K,et al.Reinforcement learning path planning algorithm combined with LSTM in dynamic environments[J].Journal of Chinese Computer Systems,2021,42(2):334-339.
[7] FENG S,SHU H,XIE B Q.3D environment path planning based on improved deep reinforcement learning[J].Computer Applications and Software,2021,38(1):250-255.
[8] ZHANG J J,ZHANG C,ZHAO H J.Dueling deep Q network algorithm with state value reuse[J].Computer Engineering and Applications,2021,57(4):134-140.
[9] ZHAO Y N,LIU P,ZHAO W,et al.Twice sampling method in deep Q-network[J].Acta Automatica Sinica,2019,45(10):1870-1882.
[10] ZHANG J,SPRINGENBERG J T,BOEDECKER J,et al.Deep reinforcement learning with successor features for navigation across similar environments[J].arXiv:1612.05533,2016.
[11] LYU L,ZHANG S,DING D,et al.Path planning via an improved DQN-based learning policy[J].IEEE Access,2019,7:67319-67330.
[12] CHEN X L,CAO L,LI C X,et al.Deep reinforcement learning via good choice resampling experience replay memory[J].Control and Decision,2018,33(4):600-606.
[13] DONG Y F,YANG C,DONG Y,et al.Robot path planning based on improved DQN[J].Computer Engineering and Design,2021,42(2):552-558.
[14] XU Z X,CAO L,ZHANG Y L,et al.Research on deep reinforcement learning algorithm based on dynamic fusion target[J].Computer Engineering and Applications,2019,55(7):157-161.
[15] LIU Q,YAN Y,ZHU F,et al.A deep recurrent Q network with exploratory noise[J].Chinese Journal of Computers,2019,42(7):1588-1604.
[16] KIM K,KIM D,LEE J.Deep learning based on smooth driving for autonomous navigation[C]//Proceedings of the IEEE Conference,2018.
[17] XIA Z T,QIN J.Improved algorithm for deep Q network[J].Application Research of Computers,2019,36(12):3661-3665.
[18] LONG P,FAN T,LIAO X,et al.Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA),2018:6252-6259.
[19] YUE P,XIN J,ZHAO H,et al.Experimental research on deep reinforcement learning in autonomous navigation of mobile robot[C]//Proceedings of the 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA),2019:1612-1616.
[20] VAN HASSELT H,GUEZ A,SILVER D.Deep reinforcement learning with double Q-learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence,2016.
[21] WANG Z Y,DE FREITAS N,LANCTOT M.Dueling network architectures for deep reinforcement learning[J].arXiv:1511.06581,2015.