AGV Path Planning Based on Memristor Reinforcement Learning in Warehouse Environment

doi:10.3778/j.issn.1002-8331.2204-0491

Abstract

Abstract: To address AGV path planning in a dynamic warehouse environment, the grid method is used to model the warehouse. Path planning in the static environment is carried out by an ant colony algorithm with an improved transition probability function and improved pheromone. Because a memristor behaves similarly to a biological synapse, it is used as the synaptic structure of the neural network to improve the traditional DQN algorithm, and the resulting memristor-array-based DQN algorithm performs dynamic local obstacle avoidance. The path planning mechanism is switched in real time according to whether dynamic obstacles are present within the sensing range of the AGV, so as to achieve efficient AGV handling. Simulation experiments are carried out in MATLAB, and the results show that the method can effectively plan, in real time, a safe, collision-free optimal path for the AGV.

Key words: automated guided vehicle (AGV), dynamic environment, deep Q-network (DQN), memristor, path planning
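
The switching rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `dynamic_obstacle_in_range` and `select_planner`, the planner labels, the square sensing window (Chebyshev distance on grid cells), and the default radius of 2 cells are all assumptions made for the example.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column) index of a grid cell


def dynamic_obstacle_in_range(agv: Cell, obstacles: List[Cell], sensing_radius: int) -> bool:
    """Return True if any dynamic obstacle lies inside the AGV's sensing window.

    Assumption: the sensing range is a square window measured with the
    Chebyshev distance on the grid; the abstract does not specify its shape.
    """
    return any(
        max(abs(agv[0] - row), abs(agv[1] - col)) <= sensing_radius
        for row, col in obstacles
    )


def select_planner(agv: Cell, obstacles: List[Cell], sensing_radius: int = 2) -> str:
    """Choose the planning mechanism for the current time step.

    "global_aco" stands for following the path from the improved ant colony
    algorithm; "local_dqn" stands for memristor-array-based DQN obstacle
    avoidance. Both labels are hypothetical names used only in this sketch.
    """
    if dynamic_obstacle_in_range(agv, obstacles, sensing_radius):
        return "local_dqn"
    return "global_aco"
```

Calling `select_planner` at every time step gives the real-time switching behaviour: the AGV stays on the global path until a dynamic obstacle enters its sensing window, and returns to it once the window is clear again.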

YANG Hailan, QI Yongqiang, RONG Dan. AGV Path Planning Based on Memristor Reinforcement Learning in Warehouse Environment[J]. Computer Engineering and Applications, 2023, 59(17): 318-327.
