Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (17): 224-229. DOI: 10.3778/j.issn.1002-8331.2102-0169

• Pattern Recognition and Artificial Intelligence •

Adaptive Adversarial Learning for Solving the Traveling Salesman Problem

XIONG Wenrui, TAO Jiping

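  1. School of Aerospace Engineering, Xiamen University, Xiamen, Fujian 361005
  2. Key Laboratory of Big Data Intelligent Analysis and Decision-making, Xiamen University, Xiamen, Fujian 361005
  • Publication date: 2022-09-01 Release date: 2022-09-01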

Adaptive Adversarial Learning for Solving TSP

XIONG Wenrui, TAO Jiping   

  1. School of Aerospace Engineering, Xiamen University, Xiamen, Fujian 361005, China
  2. Key Laboratory of Big Data Intelligent Analysis and Decision-making of Xiamen Province, Xiamen University, Xiamen, Fujian 361005, China
  • Online: 2022-09-01 Published: 2022-09-01

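Abstract: Deep learning provides a new line of attack on combinatorial optimization problems. Research in this direction currently concentrates on improving models and training methods, and many papers bring in new models from natural language processing to improve solution quality, while little attention is paid to the generalization ability and robustness of the model from the standpoint of how instance data are generated. To address this gap, this paper borrows the idea of adversarial learning and studies a classic combinatorial optimization problem, the traveling salesman problem, from the data generation side. A generator network is designed and trained with supervised learning to produce adversarial samples, and these adversarial samples are mixed into the random samples for training so as to improve the model's generalization on this class of problems. In addition, an adaptive mechanism is proposed, based on how the discriminator model is updated during reinforcement learning training, to train the adversarial model. The final model obtains good results on both randomly distributed samples and adversarial samples. Simulations verify the effectiveness of the proposed method.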

Keywords: adversarial training, reinforcement learning, model generalization, traveling salesman problem

Abstract: Deep learning offers new ways to solve combinatorial optimization problems. Most related work focuses on improving models and training methods, and many papers import new models from natural language processing to raise solution quality, but few examine generalization and robustness from the perspective of instance data generation. Targeting the classic traveling salesman problem (TSP), this paper takes data generation as its starting point and, inspired by adversarial learning, designs a generator network. Supervised learning is used to produce adversarial samples, which are mixed with random samples during training to improve the model's generalization. In addition, a self-adaptive mechanism, derived from the way the discriminator is updated during reinforcement learning training, is used to train the adversarial model. The resulting model achieves good results on both randomly distributed samples and adversarial samples. Simulation results demonstrate the effectiveness of the proposed approach.

Key words: adversarial training, reinforcement learning, model generalization, traveling salesman problem (TSP)
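
The mixed-training idea summarized in the abstract can be illustrated with a minimal Python sketch. It only shows how a training batch might combine uniformly random TSP instances with harder, differently distributed ones. Everything named here is an assumption made for illustration: `clustered_instance` is a hand-written stand-in for the paper's learned generator network, and `adv_ratio`, `mixed_batch`, and the other function names are hypothetical. The adaptive mechanism is not sketched, since the abstract does not specify it.

```python
import math
import random


def random_instance(n, rng):
    """Uniform random TSP instance: n cities in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def clustered_instance(n, rng, spread=0.02):
    """Hypothetical stand-in for an adversarial sample.

    This is NOT the paper's generator network: it simply packs cities
    around three random centres so that the instance distribution
    differs from the uniform one.
    """
    centres = [(rng.random(), rng.random()) for _ in range(3)]
    cities = []
    for _ in range(n):
        cx, cy = rng.choice(centres)
        # Clamp to the unit square so both instance types share a domain.
        x = min(1.0, max(0.0, rng.gauss(cx, spread)))
        y = min(1.0, max(0.0, rng.gauss(cy, spread)))
        cities.append((x, y))
    return cities


def tour_length(cities, order):
    """Length of the closed tour that visits cities in the given order."""
    k = len(order)
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % k]])
        for i in range(k)
    )


def mixed_batch(batch_size, n, adv_ratio, rng):
    """Batch of (cities, is_adversarial) pairs with a fixed mixing ratio."""
    n_adv = round(batch_size * adv_ratio)
    batch = [(clustered_instance(n, rng), True) for _ in range(n_adv)]
    batch += [(random_instance(n, rng), False)
              for _ in range(batch_size - n_adv)]
    rng.shuffle(batch)
    return batch
```

A call such as `mixed_batch(8, 10, 0.25, random.Random(0))` yields eight 10-city instances, two of them from the stand-in generator. In an actual training loop the solver would be updated on such batches, and `tour_length` would score the tours it produces.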