Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (13): 220-228. DOI: 10.3778/j.issn.1002-8331.2203-0618

• Graphics and Image Processing •

Improved ASPP and Multilevel Feature Semantic Fusion Segmentation Method

WANG Yinyu, MENG Fanyun, WANG Jinhe, LIU Zhihao

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong 266000, China
  • Online: 2023-07-01 Published: 2023-07-01

Abstract: To address difficult multi-scale object segmentation and inaccurate class-boundary prediction in image semantic segmentation, a multilevel feature semantic fusion segmentation method based on improved atrous spatial pyramid pooling (ASPP) is proposed. First, the deep-level network features are grouped by channel, and a split ASPP module captures the multi-scale contextual information of each group. Second, a strip pooling module is introduced to supplement and refine this contextual information and to strengthen the representation of global semantic information. Finally, a semantic guidance fusion module establishes the correspondence between feature pixels at different levels, and the deep-level semantic information is gradually incorporated into the low-level, high-resolution image in a bottom-up manner. Experimental results show that the method achieves a mean intersection over union (mIoU) of 73.1% on PASCAL VOC 2012 and 71.8% on Cityscapes, and that it reduces the number of parameters by 39% at the same accuracy.
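For readers unfamiliar with the reported metric, mean intersection over union can be sketched as follows. This is a minimal pure-Python illustration, not the authors' evaluation code. The function name `mean_iou`, the flattened toy label lists, and the use of 255 as an ignored label are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]                                          # true positives
        fn = sum(conf[c]) - tp                                   # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp    # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:  # classes absent from both maps are left out of the mean
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened 2x3 label maps with three classes
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
miou = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoU values are 1/2, 2/3, and 1, so `miou` is 13/18, about 0.722.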

Key words: semantic segmentation, atrous spatial pyramid pooling, feature fusion, contextual information
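The strip pooling idea named in the abstract can be illustrated with a simplified sketch. It assumes a single-channel feature map stored as a list of lists, and it omits the learned convolutions that a real strip pooling module applies to each strip. The function name `strip_pool` and the sigmoid gating are illustrative assumptions; the paper's module may differ in detail.

```python
import math


def strip_pool(feature):
    """Simplified strip pooling on a single-channel H x W map (list of lists)."""
    h, w = len(feature), len(feature[0])
    # Horizontal strips: average each row over its full width (H values)
    row_mean = [sum(row) / w for row in feature]
    # Vertical strips: average each column over its full height (W values)
    col_mean = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]

    out = []
    for i in range(h):
        out_row = []
        for j in range(w):
            # Broadcast both strips back to H x W and fuse them by addition
            context = row_mean[i] + col_mean[j]
            # A sigmoid gate rescales the original activation with long-range context
            gate = 1.0 / (1.0 + math.exp(-context))
            out_row.append(feature[i][j] * gate)
        out.append(out_row)
    return out


# Toy map with one bright horizontal band
feature = [[0.0, 0.0, 0.0],
           [1.0, 1.0, 1.0],
           [0.0, 0.0, 0.0]]
gated = strip_pool(feature)
```

Because each output position sees the average of its entire row and column, elongated structures such as the band above influence the gate along their full length, which is the kind of global context a square pooling window would capture less directly.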