Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (22): 223-232. DOI: 10.3778/j.issn.1002-8331.2206-0247

• Graphics and Image Processing •

Multiscale Attention-Guided Panoptic Segmentation Network

FU Du (付都), QU Shaojun (瞿绍军), FU Ya (付亚)

  1. College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
  2. State Grid Hunan Extra High Voltage Substation Company, Changsha 410004, China
  • Online: 2023-11-15 Published: 2023-11-15

Abstract: Panoptic segmentation is a recently proposed image segmentation task. Most existing panoptic segmentation models represent foreground instance objects and background amorphous regions in different ways, so they require additional post-processing and fusion operations to resolve instance overlaps and semantic conflicts. The fully convolutional panoptic segmentation network achieves a unified feature representation and dispenses with these complex operations, but its segmentation accuracy for foreground instance objects is not high, and it performs poorly on distant small objects in images. To address these problems, a multiscale attention-guided panoptic segmentation network is proposed as an improvement on the fully convolutional panoptic segmentation network. First, the feature extraction network is improved: a bottom-up auxiliary path is added to the backbone to strengthen the model's ability to capture multiscale features. Second, an attention module is proposed that fuses atrous spatial pyramid pooling with channel attention to guide the update of the convolution kernels and generate better-matched weights. In comparison experiments against the fully convolutional panoptic segmentation network on the Cityscapes dataset, the instance-level panoptic segmentation quality improves by 2.74 percentage points, the panoptic segmentation quality for background amorphous regions and the overall panoptic segmentation quality improve by 1.36 and 1.94 percentage points, respectively, and the class-level detection accuracy for small objects such as traffic lights and motorcycles improves by 4.4 and 8.3 percentage points, respectively. The proposed network combines the advantages of the fully convolutional panoptic segmentation network, multiscale features, and the attention mechanism, yielding higher instance-level panoptic segmentation accuracy.
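The panoptic segmentation quality figures quoted in the abstract are conventionally computed with the panoptic quality (PQ) metric. The abstract does not give the formula, so the following is only a minimal sketch assuming the standard definition: a predicted segment and a ground-truth segment of the same class match when their IoU exceeds 0.5, and PQ is the summed IoU of the matches divided by |TP| + 0.5|FP| + 0.5|FN|. The function names and the representation of each segment as a set of pixel indices are illustrative assumptions, not the authors' evaluation code.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments):
    """PQ for a single class (standard definition, assumed here).

    Segments within each list are assumed not to overlap, as in panoptic
    segmentation; under that assumption an IoU > 0.5 match is unique.
    """
    matched_gt = set()
    iou_sum = 0.0
    for pred in pred_segments:
        for idx, gt in enumerate(gt_segments):
            if idx in matched_gt:
                continue
            overlap = iou(pred, gt)
            if overlap > 0.5:          # true positive
                matched_gt.add(idx)
                iou_sum += overlap
                break
    tp = len(matched_gt)
    fp = len(pred_segments) - tp       # predictions with no match
    fn = len(gt_segments) - tp         # ground-truth segments that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Hypothetical toy data: one good match (IoU 3/4), one spurious prediction,
# and one missed ground-truth segment.
gt = [{0, 1, 2, 3}, {10, 11}]
pred = [{0, 1, 2}, {20, 21}]
pq = panoptic_quality(pred, gt)
print(pq)  # 0.375
```

Averaging this per-class value over the instance ("things") classes, over the amorphous ("stuff") classes, or over all classes gives the three kinds of quality score that the abstract distinguishes.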

Key words: image segmentation, panoptic segmentation, fully convolutional panoptic segmentation network, multiscale features, attention module, atrous spatial pyramid pooling
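To make the two building blocks named in the abstract concrete, here is a deliberately small toy in plain Python. It is not the authors' attention module, whose exact design the abstract does not specify. It assumes a 1-D signal instead of a 2-D feature map, a fixed kernel instead of learned weights, and a channel gate that is just a sigmoid of each channel's mean (the learned layers of a real channel-attention block are omitted). All names are hypothetical.

```python
import math


def atrous_conv1d(signal, kernel, rate):
    """Same-size 1-D convolution with taps spaced `rate` samples apart (zero padding)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate   # dilated tap position
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def aspp(signal, kernel, rates=(1, 2, 4)):
    """Pyramid of parallel atrous branches, one output channel per dilation rate."""
    return [atrous_conv1d(signal, kernel, rate) for rate in rates]


def channel_attention(channels):
    """Squeeze each channel to its mean, gate it with a sigmoid, rescale the channel."""
    gates = [1.0 / (1.0 + math.exp(-sum(c) / len(c))) for c in channels]
    scaled = [[g * v for v in c] for g, c in zip(gates, channels)]
    return scaled, gates


# Toy input: a single impulse, so each branch shows where its taps reach.
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
branches = aspp(signal, kernel)
attended, gates = channel_attention(branches)
```

Larger dilation rates spread the same three taps over a wider span, which is how the pyramid sees context at several scales without adding weights; the gate then reweights each branch as a whole. In a real network both steps would operate on 2-D feature maps with learned parameters.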