Computer Engineering and Applications, 2025, Vol. 61, Issue (22): 215-225. DOI: 10.3778/j.issn.1002-8331.2408-0077

• Graphics and Image Processing •

聚合全局重构语义的航空遥感多目标分割模型

吴小所,乔煜栋,贺成龙,刘小明,闫浩文   

  1. 兰州交通大学 电子与信息工程学院,兰州 730070
  2. 青海大学 电子信息学院,西宁 810016
  • 出版日期:2025-11-15 发布日期:2025-11-14

Multi-Target Segmentation Model for Aerial Remote Sensing Based on Global Reconstruction Semantics Aggregation

WU Xiaosuo, QIAO Yudong, HE Chenglong, LIU Xiaoming, YAN Haowen   

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  2. School of Electronics and Information, Qinghai University, Xining 810016, China
  • Online:2025-11-15 Published:2025-11-14

摘要: 为了解决航空遥感图像存在目标尺度多且语义信息不足和特征边界不清晰等问题,设计了一种聚合全局信息再对特征分类后重构语义的分割模型。将Swin-Transformer作为编码结构,利用其对上下文信息的理解来提取特征,再通过设计的深浅语义重构模块和通道残差重构模块将提取到的特征按信息量进行分类后重构,最后通过设计的区域上采样及下采样连接,将重构后的特征与编码器提取的特征融合成全面的特征聚合块后进行输出。对多目标下重构目标特征做到精细化并生成对应的分割图,以此提高分割精度,实现了高质量的逐像素回归。在ISPRS Vaihingen和ISPRS Potsdam两个数据集上的平均交并比(mean intersection over union,mIoU)分数达到了87.2%和82.9%,整体精准度(overall accuracy,OA)分数达到了91.4%和91.2%。

关键词: 航空遥感图像, 语义分割, 深浅语义重构卷积组, 通道残差重构卷积组, 区域上采样, 特征融合模块

Abstract: To address the multiple target scales, insufficient semantic information, and blurred feature boundaries found in aerial remote sensing images, this paper proposes a segmentation model that aggregates global information, classifies the features, and then reconstructs their semantics. Swin-Transformer serves as the encoder, using its grasp of contextual information to extract features. A deep-shallow semantic reconstruction module and a channel residual reconstruction module then classify the extracted features by information content and reconstruct them. Finally, regional upsampling and downsampling connections fuse the reconstructed features with the encoder features into a comprehensive feature aggregation block for output. This refines the reconstructed features of multiple targets and generates the corresponding segmentation maps, improving segmentation accuracy and achieving high-quality pixel-wise regression. On the ISPRS Vaihingen and ISPRS Potsdam datasets, the model achieves mean intersection over union (mIoU) scores of 87.2% and 82.9% and overall accuracy (OA) scores of 91.4% and 91.2%, respectively.

Key words: aerial remote sensing image, semantic segmentation, deep-shallow semantic reconstruction convolution group, channel residual reconstruction convolution group, regional upsampling, feature fusion module
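
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how mIoU and OA are commonly computed from a per-pixel confusion matrix. It is not the authors' evaluation code. The function names, the toy labels, and the choice to average IoU over every class that appears in the data are assumptions made for illustration; the abstract does not state which classes enter the average.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA = correctly labelled pixels / all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def mean_iou(cm: List[List[int]]) -> float:
    """mIoU = mean over classes of TP / (TP + FP + FN)."""
    ious = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                 # true class k, predicted otherwise
        fp = sum(row[k] for row in cm) - tp  # predicted k, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: six flattened pixels, three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"OA   = {overall_accuracy(cm):.3f}")
print(f"mIoU = {mean_iou(cm):.3f}")
```

In practice the label and prediction maps of every test tile would be flattened and accumulated into a single confusion matrix before the two scores are taken.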