Computer Engineering and Applications, 2025, Vol. 61, Issue (17): 344-354. DOI: 10.3778/j.issn.1002-8331.2405-0153

• Engineering and Applications •

Multi-Intersection Traffic Signal Cooperative Control Method Based on Multi-Agent Sequential Decision Making

WANG Zhiwen, LU Yumei, ZHANG Haipeng, PANG Yuli

  1. College of Electronic Engineering, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  2. College of Computer Science and Technology (College of Software), Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  • Online: 2025-09-01 Published: 2025-09-01

Abstract: Deep reinforcement learning can exploit the strengths of large sequence models to solve the multi-intersection traffic signal cooperative control problem. To this end, a multi-intersection traffic signal cooperative control method based on multi-agent sequential decision making is proposed. First, following the multi-agent advantage decomposition theorem and using the properties of sequence models, multi-intersection traffic signal control is modeled as a sequence problem. This turns real-time multi-intersection signal control into a multi-agent sequential decision-making problem and makes full use of the striking connection between the decision process of multi-agent reinforcement learning and sequence-model prediction. Then, a few-shot Transformer sequence model learns the optimal control policy of each agent online, achieving cooperative control of traffic signals across multiple intersections. This addresses two problems: the centralized-training, decentralized-execution paradigm struggles to cover the full complexity of multi-agent interaction, and solving for the optimal joint value function becomes harder as the number of agents grows. Experimental results show that the proposed method can significantly improve the performance of the traffic signal control algorithm and reduce its implementation complexity.

Key words: multi-agent advantage decomposition, sequential decision making, multiple intersections, cooperative control, reinforcement learning
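
The abstract does not spell out the advantage decomposition it builds on. As a rough illustration only, the sketch below checks numerically the form of the identity commonly stated in the multi-agent reinforcement learning literature: for a fixed state and an ordered set of agents, the joint advantage equals the sum of each agent's advantage conditioned on the actions already chosen by the agents before it. This is an assumption about the general idea, not the paper's implementation; the random value table, the three-intersection setup, and all function names are hypothetical.

```python
import numpy as np

def prefix_q(q_joint, policies, prefix):
    """Expected joint value at a fixed state when the first len(prefix) agents
    take the actions in `prefix` and the remaining agents follow their policies."""
    q = q_joint[tuple(prefix)]               # fix the actions of the leading agents
    for pi in reversed(policies[len(prefix):]):
        q = q @ pi                           # average out the last remaining agent
    return float(q)

def sequential_advantages(q_joint, policies, joint_action):
    """Per-agent advantage terms, each conditioned on the preceding agents' actions."""
    terms = []
    for m in range(1, len(joint_action) + 1):
        with_m = prefix_q(q_joint, policies, joint_action[:m])
        without_m = prefix_q(q_joint, policies, joint_action[:m - 1])
        terms.append(with_m - without_m)
    return terms

# Hypothetical toy setup: three intersections with 3, 4 and 2 signal phases.
rng = np.random.default_rng(0)
action_counts = (3, 4, 2)
q_joint = rng.normal(size=action_counts)                        # made-up joint action values
policies = [rng.dirichlet(np.ones(k)) for k in action_counts]   # independent agent policies

joint_action = (1, 2, 0)
terms = sequential_advantages(q_joint, policies, joint_action)

# Joint advantage: value of the joint action minus the state value.
joint_advantage = float(q_joint[joint_action]) - prefix_q(q_joint, policies, ())

# The per-agent terms telescope to the joint advantage.
assert np.isclose(sum(terms), joint_advantage)
```

Because the terms telescope, each agent can choose its action in turn, given the choices made before it, which is what allows joint control to be framed as a sequence prediction task.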