Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (8): 217-226. DOI: 10.3778/j.issn.1002-8331.2204-0141

• Graphics and Image Processing •

Improved SegFormer Network Based Method for Semantic Segmentation of Remote Sensing Images

TIAN Xuewei, WANG Jiali, CHEN Ming, DU Shouqing   

  1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
  2. Key Laboratory of Fisheries Information, Ministry of Agriculture, Shanghai 201306, China
  • Online:2023-04-15 Published:2023-04-15




Abstract: Existing segmentation algorithms struggle to accurately segment small objects and object boundaries in remote sensing images, because such images contain objects at many scales and small objects carry insufficient semantic information. To address this, a semantic segmentation method for remote sensing images based on an improved SegFormer network is proposed, which merges the multi-scale features output by the SegFormer encoder in a cascaded manner. When merging high-level semantic features, a semantic feature fusion module is used to preserve fuzzy boundaries; when merging detail features, a gated attention module filters out part of the high-level semantic features to reduce their interference with the detail features. The multi-scale features are then up-sampled and concatenated, and a multi-local channel attention module recalibrates the mapping of the concatenated features according to channel context to improve the final segmentation result. Experimental results on the UAVid and ISPRS Potsdam datasets show that the improved SegFormer method outperforms the mainstream segmentation methods it is compared against and segments small objects and boundaries in remote sensing images more accurately.

Key words: remote sensing image, semantic segmentation, feature fusion, gated attention, multi-local channel attention
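
The abstract names a gated attention step and a channel attention recalibration step but gives no formulas. The sketch below is only a generic illustration of those two ideas, not the authors' modules: the function names `gated_fusion` and `channel_recalibrate`, the sigmoid gate computed from the detail features, and the mean-based channel weight are all assumptions made for this example. Feature maps are represented as plain lists of channels, each a flat list of values.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(detail, semantic, gate_weight=1.0, gate_bias=0.0):
    """Add semantic features to detail features through a gate.

    Assumption: the gate is a sigmoid of the detail feature itself, so
    each position decides how much of the semantic signal to let through.
    """
    fused = []
    for d_channel, s_channel in zip(detail, semantic):
        fused.append([
            d + sigmoid(gate_weight * d + gate_bias) * s
            for d, s in zip(d_channel, s_channel)
        ])
    return fused


def channel_recalibrate(features):
    """Rescale every channel by a weight derived from its own mean.

    Assumption: a squeeze-and-excitation-style weight, i.e. the sigmoid
    of the channel average, stands in for the channel context.
    """
    recalibrated = []
    for channel in features:
        weight = sigmoid(sum(channel) / len(channel))
        recalibrated.append([weight * value for value in channel])
    return recalibrated


# Hypothetical toy inputs: one channel with two positions each.
detail = [[0.0, 0.0]]
semantic = [[2.0, 4.0]]
fused = gated_fusion(detail, semantic)
output = channel_recalibrate(fused)
```

With a zero detail feature the gate equals 0.5, so half of each semantic value is passed through. A learned implementation would replace the fixed `gate_weight` and `gate_bias` with trained convolution parameters.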
