Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (19): 214-225. DOI: 10.3778/j.issn.1002-8331.2406-0116

• Graphics and Image Processing •

Weakly-Supervised Semantic Segmentation Method with Saliency Boundary Constraints

BAI Xuefei, ZHANG Lina, WANG Wenjian   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
    2.Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
    3.Department of Network Security, Shanxi Police College, Taiyuan 030401, China
  • Online: 2025-10-01   Published: 2025-09-30

Abstract: To address the insufficient class activation and unclear pseudo-label boundaries of existing weakly-supervised semantic segmentation methods, a weakly-supervised semantic segmentation method with saliency boundary constraints is proposed. A Siamese network with shared parameters serves as the class activation map (CAM) generation network: an image and its affine-transformed counterpart are fed into the two branches, and a consistency loss fuses the complementary information of the resulting CAMs to produce more complete activation maps. A saliency correction module introduces boundary constraints into the CAMs to suppress false activation of background regions, and a saliency affinity module learns a pixel-wise affinity matrix from the saliency map to further refine the initial pseudo-labels, improving the model's semantic segmentation performance. Experimental results show that the proposed method achieves an mIoU of 71.4% on the PASCAL VOC 2012 validation set, 2.1 percentage points above the baseline, and 70.8% on the test set; on the COCO 2014 validation set it achieves an mIoU of 39.2%. These results demonstrate good segmentation quality and show that the method handles the weakly-supervised semantic segmentation task well.
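
As a rough illustration of the Siamese CAM branch described in the abstract, the following PyTorch sketch feeds an image and an affine-transformed copy (a fixed rotation here) through one shared-weight classifier and adds an L1 consistency term between the aligned CAMs. The backbone (ResNet-50), the choice of rotation as the affine transform, the equal loss weighting, and the names `CAMNet` and `consistency_loss` are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): shared-weight Siamese CAM branches
# with an equivariance/consistency loss between the two class activation maps.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from torchvision.models import resnet50

class CAMNet(nn.Module):
    """Classifier whose last conv features yield CAMs via a 1x1 class head."""
    def __init__(self, num_classes=20):
        super().__init__()
        backbone = resnet50(weights=None)                       # assumed backbone
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # [B, 2048, h, w]
        self.classifier = nn.Conv2d(2048, num_classes, kernel_size=1, bias=False)

    def forward(self, x):
        cam = self.classifier(self.features(x))                 # [B, K, h, w]
        logits = F.adaptive_avg_pool2d(cam, 1).flatten(1)       # image-level scores
        return logits, torch.relu(cam)

def consistency_loss(model, img, labels, angle=10.0):
    """Classification loss on both branches + L1 consistency between warped CAMs.

    labels: multi-hot image-level labels, shape [B, K], float.
    """
    img_t = TF.rotate(img, angle)                 # affine transform (rotation as an example)
    logits_a, cam_a = model(img)                  # shared weights: same model used twice
    logits_b, cam_b = model(img_t)
    cam_a_t = TF.rotate(cam_a, angle)             # warp original-branch CAM into the same frame
    cls_loss = F.multilabel_soft_margin_loss(logits_a, labels) + \
               F.multilabel_soft_margin_loss(logits_b, labels)
    cons_loss = F.l1_loss(cam_a_t, cam_b)         # fuse complementary activations
    return cls_loss + cons_loss                   # equal weighting is an assumption
```

In a training loop one would call `consistency_loss(model, images, image_level_labels)` per batch; the rotation angle could also be sampled randomly so the consistency term covers a wider range of affine variations.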
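
The saliency affinity idea can likewise be sketched in a simplified form: here the pixel affinity matrix is computed heuristically from saliency similarity on a down-sampled grid (the paper learns it from the saliency map), and the initial CAM scores are propagated along it for a few random-walk steps so that pseudo-label boundaries follow the salient object. The grid size, similarity kernel, number of steps, and the name `saliency_affinity_refine` are assumptions chosen for illustration.

```python
# Minimal sketch (a heuristic stand-in for the paper's learned affinity module):
# saliency-driven affinity propagation of CAM scores to refine pseudo-labels.
import torch
import torch.nn.functional as F

def saliency_affinity_refine(cam, saliency, size=32, sigma=0.1, steps=3):
    """cam: [K, H, W] class scores; saliency: [H, W] values in [0, 1]."""
    cam_s = F.interpolate(cam[None], size=(size, size), mode="bilinear",
                          align_corners=False)[0]               # [K, s, s]
    sal_s = F.interpolate(saliency[None, None], size=(size, size),
                          mode="bilinear", align_corners=False)[0, 0]  # [s, s]
    sal_flat = sal_s.reshape(-1, 1)                             # [N, 1], N = s*s
    # Affinity: pixels with similar saliency values are strongly connected.
    aff = torch.exp(-torch.cdist(sal_flat, sal_flat) / sigma)   # [N, N]
    trans = aff / aff.sum(dim=1, keepdim=True)                  # row-normalised transition matrix
    scores = cam_s.reshape(cam_s.shape[0], -1)                  # [K, N]
    for _ in range(steps):
        scores = scores @ trans.T                               # propagate scores along affinities
    refined = scores.reshape(cam_s.shape)
    return F.interpolate(refined[None], size=cam.shape[-2:], mode="bilinear",
                         align_corners=False)[0]                # back to full resolution
```

Row-normalising the affinity turns the propagation into a random walk, so repeated multiplication smooths class scores within regions of similar saliency while leaving the salient-object boundary as a barrier, which is the boundary-constraining effect the abstract describes.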

Key words: weakly-supervised semantic segmentation, image-level labels, Transformer, convolutional neural network, siamese network, saliency map