Reinforcement learning for Multi-Agents Systems and its application in RoboCup


Computer Engineering and Applications, 2008, Vol. 44, Issue (23): 46-48. DOI: 10.3778/j.issn.1002-8331.2008.23.014

• Theoretical Research •


LIU Guo-dong, YANG Bao-qing

School of Communication and Control Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Received: 2007-10-18 Revised: 2008-01-21 Online: 2008-08-11 Published: 2008-08-11
Contact: LIU Guo-dong

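Chinese title: 多智能体的增强学习及其在RoboCup中的应用 (Reinforcement Learning for Multi-Agent Systems and Its Application in RoboCup)

Affiliation as given in Chinese: Research Center of Control Science and Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China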

Abstract

Abstract: Because other agents are present, the environment of a Multi-Agent System (MAS) cannot simply be treated as a Markov Decision Process (MDP). Current reinforcement learning methods, which are based on MDPs, must be adapted before they can be applied to MAS. Building on an agent's independent learning ability, this paper proposes a novel Q-learning algorithm for MAS in which an agent learns the action policies of other agents by observing the joint action. The policies of the other agents are expressed as action probability distribution matrices, and a concise yet useful update method for these matrices is proposed. The full joint probability distribution over the matrices guarantees that the learning agent chooses its optimal action. In experiments, the successful implementation of the agent and the improved performance of the AFU team show that the approach is valid and efficient.

Key words: Multi-Agent Systems (MAS), reinforcement learning, Robot World Cup (RoboCup)

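Abstract (translated from the Chinese): For multi-agent systems in non-deterministic Markov environments, a multi-agent Q-learning model and algorithm are proposed. The algorithm learns the behavior policies of other agents from statistics of the joint actions, and uses the full probability distribution of the agents' policy vectors to guarantee selection of the jointly optimal action. In experiments, agent decision-making was implemented successfully and the overall competitive ability of the AFU team improved, demonstrating the effectiveness and feasibility of the algorithm.

Key words (translated from the Chinese): multi-agent, reinforcement learning, Robot World Cup soccer championship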
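
The abstract gives only an outline of the method, so the following is a minimal sketch of the general idea rather than the authors' algorithm: a Q-learner over joint actions that estimates another agent's policy from observed action frequencies and picks its own action by expected value under that estimate. The class name `JointActionQLearner`, its methods, the single-opponent setting, the count-based frequency update, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions introduced for illustration; the paper's actual matrix update rule is not stated in the abstract.

```python
import random
from collections import defaultdict


class JointActionQLearner:
    """Illustrative Q-learner over joint actions with a count-based
    model of one other agent (hypothetical, not the paper's code)."""

    def __init__(self, my_actions, other_actions, alpha=0.1, gamma=0.9):
        self.my_actions = list(my_actions)
        self.other_actions = list(other_actions)
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        # Q-values indexed by (state, my_action, other_action).
        self.q = defaultdict(float)
        # Observation counts: state -> other agent's action -> count.
        self.counts = defaultdict(lambda: defaultdict(int))

    def other_policy(self, state):
        """Estimated action distribution of the other agent in `state`
        (uniform when nothing has been observed yet)."""
        seen = self.counts[state]
        total = sum(seen.values())
        if total == 0:
            uniform = 1.0 / len(self.other_actions)
            return {b: uniform for b in self.other_actions}
        return {b: seen[b] / total for b in self.other_actions}

    def expected_value(self, state, my_action):
        """Q-value of `my_action` averaged over the estimated policy."""
        probs = self.other_policy(state)
        return sum(p * self.q[(state, my_action, b)] for b, p in probs.items())

    def choose(self, state, epsilon=0.0):
        """Epsilon-greedy choice on the expected values."""
        if random.random() < epsilon:
            return random.choice(self.my_actions)
        return max(self.my_actions, key=lambda a: self.expected_value(state, a))

    def update(self, state, my_action, other_action, reward, next_state):
        """Record the observed joint action, then apply a Q-learning step."""
        self.counts[state][other_action] += 1
        best_next = max(self.expected_value(next_state, a) for a in self.my_actions)
        key = (state, my_action, other_action)
        target = reward + self.gamma * best_next
        self.q[key] += self.alpha * (target - self.q[key])
```

With more than one other agent, the same structure would hold one distribution per agent and weight each joint action by the product of the individual probabilities, which is one reading of the "full joint probability" mentioned in the abstract.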

LIU Guo-dong, YANG Bao-qing. Reinforcement learning for Multi-Agents Systems and its application in RoboCup[J]. Computer Engineering and Applications, 2008, 44(23): 46-48.

刘国栋,杨宝庆. 多智能体的增强学习及其在RoboCup中的应用[J]. 计算机工程与应用, 2008, 44(23): 46-48.

