Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (19): 221-229. DOI: 10.3778/j.issn.1002-8331.2307-0036

• Graphics and Image Processing •

DL-GAN Semi Supervised Semantic Segmentation Model for Generative Adversarial Network

LIU Fan, DUAN Xianhua, HU Weikang

  1. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212100, China
  • Publication date: 2024-10-01 Release date: 2024-09-30

Abstract: In the currently mainstream fully supervised approach to semantic segmentation, the quality and quantity of the training data determine how well the network trains, and large, high-quality training sets can be obtained only at a substantial annotation cost. Semi-supervised semantic segmentation emerged in response: it reduces the annotation cost, and it is attracting growing attention. Based on the current state of image semantic segmentation methods, this paper proposes DL-GAN, a semi-supervised semantic segmentation model that combines DeepLabv2 with a generative adversarial network. First, DeepLabv2 serves as the generator network of the generative adversarial network, and a fully convolutional network serves as the discriminator network. Second, the generator is improved by applying the CBAM attention mechanism together with depthwise separable convolution to DeepLabv2 for the first time: CBAM is added before the final convolutional layer of DeepLabv2, and the standard convolutions in the ResNet residual blocks of DeepLabv2 are replaced with depthwise separable convolutions. These changes distribute the model's weight parameters more reasonably, improve its representational capacity and computational efficiency, and speed up training. Finally, the standard convolutions in the discriminator are replaced with dilated (atrous) convolutions, which enlarges the discriminator's receptive field, improves training, and raises segmentation accuracy. On the PASCAL VOC 2012 dataset, the proposed method improves mean intersection over union (mIoU) by 6.3 percentage points over AffinityNet, which demonstrates its effectiveness.
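
The efficiency argument in the abstract can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' code: it counts the weights of a standard convolution against those of a depthwise separable one, and gives the spatial extent covered by a dilated kernel. The function names and the layer sizes (256 channels, 3 x 3 kernel) are assumptions chosen only for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dilated_kernel_extent(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    c_in, c_out, k = 256, 256, 3  # hypothetical layer sizes, not from the paper
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print("standard:", std, "separable:", sep, "ratio:", round(std / sep, 2))
    print("3 x 3 kernel, dilation 2, covers", dilated_kernel_extent(3, 2), "pixels per side")
```

For these illustrative sizes the separable layer needs roughly one ninth of the weights, while dilation widens the area a kernel sees without adding any weights.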

Key words: generative adversarial network, attention mechanism, semantic segmentation, depthwise separable convolution
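
The evaluation metric named in the abstract, mean intersection over union, can be sketched as follows. This is an illustrative implementation over flat lists of class labels, not the evaluation script used in the paper. The function name `mean_iou`, the `ignore_index` value of 255 for unlabeled pixels, and the choice to skip classes absent from both prediction and ground truth are assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection over union for two equal-length label sequences.

    Assumes every prediction is a valid class id in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:      # skip pixels without a ground-truth label
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1          # pixel belongs to the ground-truth class
            union[p] += 1          # and to the predicted class
    # Average only over classes that occur in the prediction or the ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(mean_iou(pred, target, num_classes=2))
```

In this small example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.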