计算机工程与应用 ›› 2025, Vol. 61 ›› Issue (21): 342-350.DOI: 10.3778/j.issn.1002-8331.2504-0299

• 工程与应用 •

柔性作业车间中图嵌入的深度强化调度策略研究

陈明童,张健欣,侯淑君   

  1.内蒙古工业大学 电力学院,呼和浩特 010000
    2.中国电子科技集团公司 第十五研究所,北京 100083
  • 出版日期:2025-11-01 发布日期:2025-10-31

Graph Embedded Deep Reinforcement Learning Strategy for Flexible Job Shop Scheduling

CHEN Mingtong, ZHANG Jianxin, HOU Shujun   

  1.College of Electrical Engineering, Inner Mongolia University of Technology, Hohhot 010000, China
    2.The 15th Institute of China Electronics Technology Group Corporation, Beijing 100083, China
  • Online:2025-11-01 Published:2025-10-31

摘要: 柔性作业车间调度问题(FJSP)因其NP-hard特性、资源约束和工序强耦合,在大规模实例中求解难度高、效率低。启发式调度方法受限于计算复杂度和泛化性不足。为此,提出融合图神经网络(GNN)与深度强化学习(DRL)的状态负载感知(state-load-aware)、掩码引导(mask-guided)、图结构增强(graph-enhanced)的SMG-DRL调度框架。SMG-DRL通过状态负载感知实现全局与局部的状态嵌入,采用掩码引导机制优化动作选择,利用稀疏图建模与图同构网络(GIN)提取结构特征,显著提升了调度框架的求解能力。在Brandimarte数据集上的实验结果显示,SMG-DRL相较于三种调度规则(SPT、MWKR和LWKR)平均误差降低约42%(三种启发式规则的平均Gap为35.55%,SMG-DRL在10×5和20×10实例上的平均Gap为20.55%);与Zhang和Lei提出的DRL方法相比,SMG-DRL在Gap水平和结果稳定性上更具优势,具有更强的收敛性与鲁棒性。相较于OR-Tools的求解时间(978 s),计算时间缩短至约3 s,效率提升约300倍;且在30×10和40×10大规模实例中Gap低至6.23%~8.76%,展现出良好的泛化能力。
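The mask-guided action selection described in the abstract can be sketched as a masked softmax over the policy's action scores, so that operation-machine pairs violating precedence or machine constraints receive zero probability. This is a minimal illustrative sketch; the function name and the exact softmax formulation are assumptions, not the paper's implementation.

```python
import math

def masked_action_probs(scores, feasible):
    """Softmax over policy scores with infeasible actions masked out.

    scores:   raw policy scores, one per candidate operation-machine pair
    feasible: booleans; False marks actions blocked by precedence or
              machine-availability constraints
    """
    # Infeasible actions get a score of -inf, so exp(...) maps them to 0
    masked = [s if ok else float("-inf") for s, ok in zip(scores, feasible)]
    m = max(masked)
    exps = [math.exp(s - m) for s in masked]  # math.exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Three candidate actions; the second is infeasible and is never sampled
probs = masked_action_probs([2.0, 1.0, 3.0], [True, False, True])
```

Masking before the softmax (rather than filtering afterwards) keeps the output a valid probability distribution over feasible actions only.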

关键词: 柔性作业车间调度, 图神经网络, 强化学习

Abstract: The flexible job shop scheduling problem (FJSP), characterized by its NP-hard nature, complex resource constraints, and strong coupling between operations, is difficult to solve efficiently at large scale in terms of both solution quality and computation time. Traditional heuristic methods suffer from limited scalability and poor generalization. To address these limitations, this paper proposes the SMG-DRL scheduling framework, which integrates graph neural networks (GNNs) with deep reinforcement learning (DRL) and incorporates three key designs: state-load awareness, mask-guided action selection, and graph structural enhancement. Specifically, SMG-DRL embeds both global and local state features via load-aware mechanisms, improves action-selection efficiency through a masking strategy, and combines sparse graph modeling with a graph isomorphism network (GIN) to extract structural features while reducing graph density. Experimental results on the Brandimarte benchmark show that SMG-DRL achieves an approximately 42% reduction in average relative error (Gap) compared with three classic dispatching rules (SPT, MWKR, and LWKR): the three rules yield an average Gap of 35.55%, whereas the proposed framework achieves 20.55% on the 10×5 and 20×10 training instances. Compared with the DRL-based methods of Zhang and Lei, SMG-DRL attains lower Gap values with greater solution stability, reflecting stronger convergence and robustness. In terms of computational efficiency, SMG-DRL reduces the average solving time from 978 seconds (OR-Tools) to approximately 3 seconds, a roughly 300-fold speed-up. Furthermore, on large-scale instances (30×10 and 40×10), the framework maintains Gap values between 6.23% and 8.76%, demonstrating strong scalability and generalization.
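The Gap figures quoted above follow the usual relative-error convention, and the claimed ~42% reduction is the relative improvement of one average Gap over another. A short sketch of this arithmetic, using only the numbers stated in the abstract (function names are illustrative):

```python
def gap(makespan, best_known):
    """Relative error (Gap, %) of a schedule's makespan vs. the best-known value."""
    return 100.0 * (makespan - best_known) / best_known

def relative_reduction(baseline_gap, method_gap):
    """Relative improvement (%) of one average Gap over a baseline average Gap."""
    return 100.0 * (baseline_gap - method_gap) / baseline_gap

# Figures from the abstract: the heuristic rules average a 35.55% Gap,
# SMG-DRL averages 20.55% -> (35.55 - 20.55) / 35.55 ~= 42% relative reduction
reduction = relative_reduction(35.55, 20.55)
```

For example, a schedule with makespan 11 against a best-known makespan of 10 has a Gap of 10%.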

Key words: flexible job shop scheduling, graph neural network, reinforcement learning
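The graph isomorphism network (GIN) update named in the abstract aggregates a node's own features with the sum of its neighbors' features. The sketch below is a minimal single-layer version with a linear map standing in for GIN's MLP; the abstract does not specify the architecture, so the shapes, fixed epsilon, and ReLU choice are assumptions.

```python
import numpy as np

def gin_layer(h, adj, eps, W):
    """One GIN update: h_v' = relu(W @ ((1 + eps) * h_v + sum_{u in N(v)} h_u)).

    h:   (n, d) node features of the sparse disjunctive graph
    adj: (n, n) 0/1 adjacency matrix
    eps: the GIN epsilon (learnable in practice; fixed here for illustration)
    W:   (d, d_out) weight matrix standing in for GIN's MLP
    """
    agg = (1.0 + eps) * h + adj @ h   # weighted self term plus neighbor sum
    return np.maximum(agg @ W, 0.0)   # ReLU nonlinearity

# Tiny example: two mutually adjacent nodes with one-hot features
h = np.eye(2)
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
out = gin_layer(h, adj, 0.0, np.eye(2))
```

The sum aggregator (rather than mean or max) is what gives GIN its injective-aggregation property on multisets of neighbor features.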