Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (13): 255-265. DOI: 10.3778/j.issn.1002-8331.2304-0151

• Graphics and Image Processing •


Improved U-net++ Semantic Segmentation Method for Remote Sensing Images

HE Jiajia, XU Yang, ZHANG Yongdan   

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
    2. Guiyang Aluminum-magnesium Design and Research Institute Co., Ltd., Guiyang 550009, China
  • Online: 2024-07-01    Published: 2024-07-01


Abstract: Semantic segmentation of remote sensing images is widely used in fields such as land resource planning and smart cities. High-resolution remote sensing images suffer from blurred boundary segmentation and poorly segmented small targets, so an improved network based on U-net++, named TU-net, is proposed; the model strengthens feature extraction by optimizing the network structure. Firstly, a feature head refinement module is introduced, which builds two branches to enhance the channel and spatial feature representations, thereby improving the ability to parse high-level semantic information. Secondly, a Transformer-based attention aggregation module is introduced to capture global context information in place of the multi-level skip connections of U-net++, and a cross-window interaction module is designed to significantly reduce computational complexity. Finally, a dynamic feature fusion block is designed at the end of the decoder to obtain multi-class, multi-scale semantic information and enhance the final segmentation results. TU-net is evaluated on two datasets, where its OA, mIoU, and mF1 scores exceed those of mainstream models. On the Vaihingen dataset, the IoU and F1 scores of the small-sized car class are 0.896 and 0.962, respectively, 5% and 15.8% higher than those of the second-best model; on the Potsdam dataset, the IoU and F1 scores of the tree class are 0.913 and 0.936, respectively, 6.3% and 4.3% higher than those of the second-best model. The experimental results show that the model segments small targets and target boundaries more accurately.
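
The abstract does not give the internal design of the feature head refinement module; purely as an illustrative sketch, the following PyTorch snippet shows one plausible two-branch block that enhances channel and spatial feature representations, as the abstract describes. The class name FeatureHeadRefine, the reduction parameter, and the squeeze-excite-style channel branch with a 7x7 convolutional spatial branch are assumptions for illustration, not the paper's implementation.

# Hypothetical sketch only: NOT the TU-net implementation.
# One branch re-weights channels, the other re-weights spatial positions.
import torch
import torch.nn as nn

class FeatureHeadRefine(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel branch: global average pooling + bottleneck 1x1 convs (squeeze-excite style)
        self.channel_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                                    # B x C x 1 x 1
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                                               # per-channel weights
        )
        # Spatial branch: a 7x7 convolution produces a per-pixel weight map
        self.spatial_branch = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),           # B x 1 x H x W
            nn.Sigmoid(),                                               # per-pixel weights
        )

    def forward(self, x):
        x = x * self.channel_branch(x)   # enhance channel feature representation
        x = x * self.spatial_branch(x)   # enhance spatial feature representation
        return x

# Example: refine a 256-channel high-level feature map
feat = torch.randn(2, 256, 32, 32)
out = FeatureHeadRefine(256)(feat)
print(out.shape)   # torch.Size([2, 256, 32, 32])

The multiplicative re-weighting keeps such a block drop-in compatible with existing encoder outputs; whether TU-net combines the two branches serially or in parallel is not stated in the abstract.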

Key words: high-resolution remote sensing images, semantic segmentation, small targets, feature head refinement module, cross-window interaction module, dynamic feature fusion block