RoboCup regional cooperative strategy based on multi-Agent Q-learning

Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (23): 127-130.

• Databases, Data Mining, Machine Learning •

ZHAO Fajun (赵发君), LI Longshu (李龙澍)

School of Computer Science and Engineering, Anhui University, Hefei 230601, China

Online: 2014-12-01 Published: 2014-12-12

Abstract

Abstract: To address the problem of cooperative strategy among multiple agents in RoboCup (Robot World Cup), this paper adopts a regionally cooperative multi-Agent Q-learning method. By subdividing the field into regions and refining the agents' reward values, the method strengthens cooperation among agents and thereby improves the team's offensive and defensive abilities. Restricting the range over which the algorithm is applied reduces the time spent learning and preserves the real-time performance a match requires. Experiments on the RoboCup simulation 2D platform show that the method performs better than the previous approach and meets the initial design goals.

Key words: stochastic game, Q-learning, real-time performance, regional cooperation, RoboCup simulation 2D, cooperative strategy
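
The abstract names three ingredients: a subdivided field, graded rewards, and a restricted region in which learning is active. The following is a minimal Python sketch of how these could fit together in a tabular Q-learning update over joint actions of two cooperating agents. It is not the authors' implementation. The grid resolution, the action set, the reward values, the choice of the attacking third as the learning zone, and all names (`region_of`, `in_learning_zone`, `shaped_reward`, `RegionalQLearner`) are assumptions made for illustration; the 105 m x 68 m pitch is assumed from the 2D soccer simulator.

```python
import random
from collections import defaultdict

# Assumed pitch size for the 2D simulation, origin at the pitch centre.
FIELD_LENGTH, FIELD_WIDTH = 105.0, 68.0
# Hypothetical grid resolution; the paper's actual subdivision is not given here.
COLS, ROWS = 6, 4
# Illustrative action set for each of the two cooperating agents.
ACTIONS = ("pass", "dribble", "shoot", "hold")


def region_of(x, y):
    """Map a position to a grid-cell index in row-major order."""
    col = min(COLS - 1, max(0, int((x + FIELD_LENGTH / 2) / FIELD_LENGTH * COLS)))
    row = min(ROWS - 1, max(0, int((y + FIELD_WIDTH / 2) / FIELD_WIDTH * ROWS)))
    return row * COLS + col


def in_learning_zone(x, y):
    """Assumption: learning is only active in the attacking third (x > 17.5)."""
    return x > FIELD_LENGTH / 6


def shaped_reward(prev_region, new_region, scored=False):
    """Illustrative graded reward: a goal dominates, forward progress earns a little."""
    if scored:
        return 1.0
    progress = (new_region % COLS) - (prev_region % COLS)
    return 0.1 * progress


class RegionalQLearner:
    """Tabular Q-learning over joint actions of two agents in one region."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, joint_action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.joint_actions = [(a, b) for a in ACTIONS for b in ACTIONS]

    def choose(self, state):
        """Epsilon-greedy selection of a joint action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.joint_actions)
        return max(self.joint_actions, key=lambda ja: self.q[(state, ja)])

    def update(self, state, joint_action, reward, next_state):
        """Standard one-step Q-learning update on the joint-action table."""
        best_next = max(self.q[(next_state, ja)] for ja in self.joint_actions)
        td_target = reward + self.gamma * best_next
        key = (state, joint_action)
        self.q[key] += self.alpha * (td_target - self.q[key])
```

In such a sketch, a decision loop would call `in_learning_zone` first and fall back to a fixed hand-coded policy outside the zone, which is one way to keep the per-cycle cost of learning bounded.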

ZHAO Fajun (赵发君), LI Longshu (李龙澍). RoboCup regional cooperative strategy based on multi-Agent Q-learning[J]. Computer Engineering and Applications, 2014, 50(23): 127-130.

