Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (4): 258-269. DOI: 10.3778/j.issn.1002-8331.2210-0079

• Graphics and Image Processing •

Unsupervised Landscape Painting Style Transfer Network with Multiscale Semantic Information

ZHOU Yuechuan, ZHANG Jianxun, DONG Wenxin, GAO Linfeng, NI Jinyuan   

  1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
  • Online:2024-02-15 Published:2024-02-15

Abstract: Image-to-image translation generative adversarial networks tend to produce cluttered textures and low-quality images on unsupervised style transfer tasks. To address these problems, this paper proposes CCME-GAN (circulatory correction multiscale evaluation-generative adversarial networks), which builds on the cycle consistency loss. First, in terms of network architecture, a multiscale evaluation architecture based on three levels of image semantic information is proposed to strengthen the transfer from the source domain to the target domain. Second, in terms of the loss function, a multiscale adversarial loss and a cyclic correction loss are proposed to guide the iterative optimization of the model toward a stricter objective and to generate images with better visual quality. Finally, to prevent mode collapse, an attention mechanism is added to the style feature encoding stage to extract important feature information, and the ACON activation function is introduced at each stage of the network to strengthen its nonlinear expressive ability and avoid dead neurons. Experimental results show that, on the landscape painting style transfer dataset, the FID of the proposed method is 21.80% and 34.33% lower than that of CycleGAN and ACL-GAN, respectively. To verify the generalization ability of the model, comparative experiments are also conducted on two public datasets, Vangogh2Photo and Monet2Photo, where the FID is reduced by 7.58% and 18.14%, and by 4.65% and 6.99%, respectively, relative to the two baseline networks.

Key words: unsupervised style transfer, generative adversarial networks (GAN), multiscale evaluation, CycleGAN
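
For readers unfamiliar with the ACON activation named in the abstract, the following is a minimal sketch of one published variant, ACON-C, which computes f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x. The abstract does not state which ACON variant CCME-GAN uses or how its parameters are set, so the choice of ACON-C, the use of fixed scalar parameters instead of learnable per-channel ones, and the names `acon_c`, `p1`, `p2`, and `beta` are all assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """Scalar ACON-C sketch (assumed variant, not confirmed by the abstract).

    f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x

    In a real network p1, p2 and beta would normally be learnable
    per-channel parameters; here they are plain floats for clarity.
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the shape of the function.
    for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(value, acon_c(value), acon_c(value, p1=1.0, p2=0.25, beta=50.0))
```

With `p1 = 1` and `p2 = 0` the function reduces to Swish, `x * sigmoid(beta * x)`. As `beta` grows it approaches `max(p1 * x, p2 * x)`, so a nonzero `p2` keeps a slope on negative inputs, which is the property commonly associated with avoiding dead neurons.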