计算机工程与应用 (Computer Engineering and Applications), 2024, Vol. 60, Issue (8): 225-233. DOI: 10.3778/j.issn.1002-8331.2211-0439

• Graphics and Image Processing •

融合双重极化注意力的轻量化半监督语义分割

马冬梅,李悦媛,陈曦   

  1.西北师范大学 物理与电子工程学院,兰州 730070
  2.甘肃省智能信息技术与应用工程研究中心,兰州 730070
  3.河北工业大学 电子信息工程学院,天津 300401
  • 出版日期:2024-04-15 发布日期:2024-04-15

Lightweight Semi-Supervised Semantic Segmentation Algorithm Based on Dual-Polarization Self-Attention

MA Dongmei, LI Yueyuan, CHEN Xi   

  1. College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070, China
  2. Engineering Research Center of Gansu Province for Intelligent Information Technology and Application, Lanzhou 730070, China
  3. School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401, China
  • Online: 2024-04-15  Published: 2024-04-15

摘要: 针对目前半监督语义分割方法复杂度高、训练精度低、参数量过大等问题,提出融合双重极化自注意力机制的轻量级半监督语义分割算法。模型使用由位置感知循环卷积构造的Resnet-101残差网络作为分割骨干网络提取深层特征。融合了通道及空间双重极化自注意力机制,在极化通道和空间注意力分支中保持较高内部分辨率。将位置感知循环卷积与通道注意力操作结合起来,提升分割精度并降低计算成本,克服硬件支持等问题。在公开数据集PASCAL VOC 2012上的实验结果显示,该算法其平均交并比可达到76.32%,较基准模型准确率提高了2.52个百分点,参数量减少了9%,模型硬件所占内存减小了61.6%。设计的模型与领域内最新算法相比,该算法在精度、模型复杂度、参数量等方面均展现出了显著的优势。

关键词: 半监督语义分割, 位置感知循环卷积, 极化自注意力, 内部分辨率

Abstract: Current semi-supervised semantic segmentation methods suffer from high complexity, low training accuracy, and large parameter counts. To address these problems, a lightweight semi-supervised semantic segmentation algorithm integrating a dual polarized self-attention mechanism is proposed. Firstly, the model uses a ResNet-101 residual network built from position-aware circular convolution as the segmentation backbone to extract deep features. Secondly, channel and spatial polarized self-attention are integrated, maintaining high internal resolution in both the channel and the spatial attention branches. Finally, position-aware circular convolution is combined with the channel attention operation to improve segmentation accuracy, reduce computational cost, and overcome problems such as hardware support. Experimental results on the public dataset PASCAL VOC 2012 show that the algorithm reaches a mean intersection over union (mIoU) of 76.32%, which is 2.52 percentage points higher than the baseline model, while the number of parameters is reduced by 9% and the memory occupied by the model is reduced by 61.6%. Compared with the latest algorithms in the field, the proposed model shows significant advantages in accuracy, model complexity, and parameter count.
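The accuracy figure in the abstract is a mean intersection over union (mIoU). As a minimal sketch of how such a score is typically computed, the snippet below builds a confusion matrix from flattened label maps and averages the per-class IoU. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper. Standard PASCAL VOC evaluation accumulates the confusion matrix over the whole validation set (21 classes including background) rather than over a single image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    `pred` and `target` are flat sequences of integer class labels.
    This is an illustrative helper, not the authors' evaluation code.
    """
    # confusion[t][p] counts pixels of ground-truth class t predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        intersection = confusion[c][c]
        predicted_c = sum(confusion[t][c] for t in range(num_classes))
        actual_c = sum(confusion[c])
        union = predicted_c + actual_c - intersection
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (made-up labels): six pixels, three classes
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.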

Key words: semi-supervised semantic segmentation, position-aware circular convolution, polarized self-attention, internal resolution
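The abstract names position-aware circular convolution but does not give its formula. The one-dimensional sketch below only illustrates the general idea under stated assumptions: a global kernel as long as the input, indices that wrap around at the border, and a position embedding added to the input before the convolution. The function name `circular_conv1d`, the indexing convention, and the toy values are all hypothetical and may differ from the operator used in the paper.

```python
def circular_conv1d(signal, kernel, position_embedding=None):
    """1-D circular (wrap-around) convolution with an optional position embedding.

    Assumptions for illustration: the kernel spans the whole input, and
    output[i] = sum_k kernel[k] * x[(i + k) mod n].
    """
    n = len(signal)
    if len(kernel) != n:
        raise ValueError("this sketch assumes a global kernel as long as the input")
    if position_embedding is not None:
        # Inject position information so the wrap-around operator can
        # still distinguish where each element came from.
        signal = [s + e for s, e in zip(signal, position_embedding)]
    # Indices wrap around modulo n, so no zero padding is needed
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(n))
        for i in range(n)
    ]


# Toy example: a kernel that picks the next element, wrapping at the end
x = [1, 2, 3, 4]
shifted = circular_conv1d(x, [0, 1, 0, 0])
with_pos = circular_conv1d(x, [1, 0, 0, 0], position_embedding=[10, 20, 30, 40])
print(shifted, with_pos)
```

Because every output position sees the entire input, a single such layer has a global receptive field, which is the usual motivation for circular convolution in lightweight backbones.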