Real-time contract-net-protocol scheduling model based on R-learning

Computer Engineering and Applications, 2014, Vol. 50, Issue (10): 221-226.

ZHAO Lianghui, XIONG Zuozhen

School of Economics & Management, Wuyi University, Jiangmen, Guangdong 529020, China

Online: 2014-05-15 Published: 2014-05-14

Abstract

Abstract: This paper proposes an R-learning method that incorporates the operating mechanism of the contract net protocol, and builds agents around this method to form a real-time scheduling model with learning capability. The R-learning procedure is embedded in the machine agents' decision process, enabling them to handle bid invitations in a more sophisticated way than in a plain contract net protocol environment. The model's primary objective is to minimize the cumulative average flow ratio of jobs; in addition, the design of the reinforcement learning reward reduces the imbalance of machine loads, so the scheduling process is optimized in two respects. Experiments on real-time scheduling instances in a simulated environment demonstrate the model's performance. Furthermore, a mixed machine environment comprising both reinforcement learning agents and non-learning agents was built and tested. The results indicate that an implicit cooperation emerges spontaneously among the agents through the reinforcement learning process, and that this cooperation is what ensures high-quality real-time scheduling output.

Key words: R-learning, contract net protocol, multi-agent cooperation, real-time scheduling
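
The abstract does not state the update rule, state space, or reward used by the machine agents. As a rough illustration only, the following minimal sketch shows the standard tabular R-learning update (an average-reward reinforcement learning method) applied to a bid-or-pass decision. The class name `RLearningBidder`, the two actions, the state labels, and the step sizes are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import defaultdict


class RLearningBidder:
    """Tabular R-learning agent (hypothetical) that decides whether to bid."""

    ACTIONS = ("bid", "pass")  # assumed action set, not from the paper

    def __init__(self, alpha=0.1, beta=0.01, epsilon=0.1, seed=0):
        self.alpha = alpha      # step size for relative action values
        self.beta = beta        # step size for the average-reward estimate
        self.epsilon = epsilon  # exploration probability
        self.rho = 0.0          # running estimate of average reward per step
        self.q = defaultdict(float)  # (state, action) -> relative value
        self.rng = random.Random(seed)

    def best_value(self, state):
        """Largest relative value over all actions in the given state."""
        return max(self.q[(state, a)] for a in self.ACTIONS)

    def choose(self, state):
        """Epsilon-greedy choice between bidding and passing."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ACTIONS)
        return max(self.ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Apply one R-learning step after observing a transition."""
        best_next = self.best_value(next_state)
        delta = reward - self.rho + best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * delta
        # The average-reward estimate is adjusted only when the action
        # taken is greedy with respect to the updated values.
        best_here = self.best_value(state)
        if self.q[(state, action)] == best_here:
            self.rho += self.beta * (reward - self.rho + best_next - best_here)
```

In a contract-net setting, a machine agent of this kind could call `choose` when it receives a bid invitation and `update` once the outcome of the award is known; how the reward would encode flow ratio and load balance is left open here because the abstract does not specify it.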

ZHAO Lianghui, XIONG Zuozhen. Real-time contract-net-protocol scheduling model based on R-learning[J]. Computer Engineering and Applications, 2014, 50(10): 221-226.
