Computer Engineering and Applications, 2024, Vol. 60, Issue (7): 212-221. DOI: 10.3778/j.issn.1002-8331.2211-0196

• Graphics and Image Processing •

Generative Adversarial Network with Dual Discriminator and Mixed Attention

WANG Lei, YANG Jun, ZHANG Chiyu, DAI Zaiyan   

  1. College of Computer Science, Sichuan Normal University, Chengdu 610101, China
  • Online:2024-04-01 Published:2024-04-01

结合混合注意力的双判别生成对抗网络

王磊,杨军,张驰宇,代在燕   

  1. 四川师范大学 计算机科学学院,成都 610101

Abstract: Improving the quality of generated images is a key problem in image generation. The multi-layer convolutional structure used by current generative adversarial networks (GANs) carries a locality inductive bias, so the network cannot focus accurately on key information; image features are lost during training and the generated images are of poor quality. This paper proposes a generative adversarial network with dual discriminators and mixed attention, termed DDMA-GAN. First, DDMA-GAN introduces a mixed attention mechanism that uses channel attention and spatial attention modules to capture image feature information along both dimensions. Second, to address the discrimination error of a single discriminator, a dual-discriminator structure is proposed: a fusion coefficient fuses the two judgment results so that the parameters propagated back are more objective, and an embedded data augmentation module further improves the robustness of the model. Finally, the hinge loss is used as the loss function to maximize the distance between real and fake samples and to sharpen the decision boundary. The model is evaluated on the public datasets LSUN and CelebA. Experimental results show that images generated by DDMA-GAN are more realistic and richer in texture detail, and that its FID and MMD values are significantly reduced and better than those of other common models, which demonstrates the effectiveness of the model.

Key words: image generation, convolutional neural network (CNN), mixed attention, dual discriminator, data augmentation, generative adversarial network (GAN)
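
The abstract names two ingredients that are easy to illustrate numerically: fusing the outputs of two discriminators with a fusion coefficient, and training with the hinge loss. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes the fusion is a convex combination of the two discriminators' raw scores (the abstract does not give the formula), and it uses the standard GAN hinge loss. The names `fuse_scores`, `d_hinge_loss`, `g_hinge_loss`, and `lam` are hypothetical.

```python
from statistics import fmean


def fuse_scores(scores_a, scores_b, lam=0.5):
    """Fuse two discriminators' raw scores sample by sample.

    ASSUMPTION: the fusion is a convex combination weighted by `lam`;
    the source only says that a "fusion coefficient" is used.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(scores_a, scores_b)]


def d_hinge_loss(real_scores, fake_scores):
    """Standard discriminator hinge loss.

    Real samples are penalized when their score is below +1,
    fake samples when their score is above -1.
    """
    real_term = fmean([max(0.0, 1.0 - s) for s in real_scores])
    fake_term = fmean([max(0.0, 1.0 + s) for s in fake_scores])
    return real_term + fake_term


def g_hinge_loss(fake_scores):
    """Standard generator hinge loss: raise the mean score of fake samples."""
    return -fmean(fake_scores)


# Toy scores from two hypothetical discriminators on a batch of two images.
real_fused = fuse_scores([2.0, 0.5], [1.0, 1.5], lam=0.5)
fake_fused = fuse_scores([-2.0, 0.0], [-1.0, -1.0], lam=0.5)

print(d_hinge_loss(real_fused, fake_fused))
print(g_hinge_loss(fake_fused))
```

With `lam` at 0.5 both discriminators contribute equally; moving it toward 0 or 1 lets one discriminator dominate the fused judgment. How DDMA-GAN actually chooses the coefficient is not stated in the abstract.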

摘要: 图像生成任务中,如何提升生成图片的质量是一个关键问题。当前,生成对抗网络采用的多层卷积结构存在局部性归纳偏置的问题,无法准确聚焦关键信息,导致图像特征丢失严重,生成图像效果较差。为此,提出了结合混合注意力的双判别生成对抗网络(DDMA-GAN)。设计了一种混合注意力机制,利用通道注意力和空间注意力模块,从两个维度充分捕获图像特征信息;为解决单判别器存在判别误差的问题,提出一种双判别器结构,使用融合系数将判定结果融合,使回传参数更具客观性,并嵌入数据增强模块,进一步提升模型鲁棒性;采用铰链损失作为模型损失函数,最大化真假样本间的距离,明确决策边界。模型在公开数据集LSUN和CelebA上进行验证,实验结果表明,DDMA-GAN生成的图像更加真实,纹理细节更加丰富,其FID和MMD值均显著降低且优于其他常见模型,证明了模型的有效性。

关键词: 图像生成, 卷积神经网络, 混合注意力, 双判别器, 数据增强, 生成对抗网络