Application of action prediction in multi-robot reinforcement learning cooperation

Abstract

In multi-robot systems, the size of the learning space that reinforcement learning must explore in a cooperative environment grows exponentially with the number of robots, and this enormous learning space makes convergence very slow. To address this problem, a reinforcement learning algorithm based on action prediction, together with an action selection strategy, is applied to multi-robot cooperation. Predicting the probability of the actions that other robots may execute accelerates the convergence of the learning algorithm. Experimental results show that the reinforcement learning algorithm based on action prediction obtains the multi-robot cooperation strategy much faster than the original algorithm.

Key words: action prediction, reinforcement learning, multi-robot cooperation
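
The abstract does not give the update rule, so the following is only a minimal sketch of how action prediction could be combined with Q-learning, not the authors' algorithm. It assumes two robots, a tabular Q-function indexed by the robot's own action and its teammate's action, a frequency-count predictor with Laplace smoothing, and epsilon-greedy selection over prediction-weighted values. The class name `PredictiveQLearner` and all of its parameters are hypothetical.

```python
import random
from collections import defaultdict


class PredictiveQLearner:
    """One robot's learner; predicts a teammate's action from observed frequencies.

    Illustrative assumption only: the source abstract does not specify this design.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)     # (state, own_action, other_action) -> value
        self.counts = defaultdict(int)  # (state, other_action) -> times observed
        self.rng = random.Random(seed)

    def predict(self, state):
        """Estimated probability of each teammate action in `state` (Laplace-smoothed)."""
        total = sum(self.counts[(state, b)] + 1 for b in self.actions)
        return {b: (self.counts[(state, b)] + 1) / total for b in self.actions}

    def expected_value(self, state, action):
        """Value of `action`, averaged over the predicted teammate actions."""
        probs = self.predict(state)
        return sum(p * self.q[(state, action, b)] for b, p in probs.items())

    def select_action(self, state):
        """Epsilon-greedy choice on the prediction-weighted values."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.expected_value(state, a))

    def update(self, state, action, other_action, reward, next_state):
        """Record the teammate's action, then apply a one-step Q-learning update."""
        self.counts[(state, other_action)] += 1
        best_next = max(self.expected_value(next_state, a) for a in self.actions)
        key = (state, action, other_action)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
```

In this sketch the learner weights each of its own actions by how likely the teammate's responses are, so joint actions the teammate rarely takes contribute little to the choice. With more than two robots the same idea would need one prediction per teammate, which is an extension the sketch does not cover.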


CAO Jie, ZHU Ningning. Application of action prediction in multi-robot reinforcement learning cooperation[J]. Computer Engineering and Applications, 2013, 49(8): 257-260.


