Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (13): 255-265. DOI: 10.3778/j.issn.1002-8331.2304-0151

• Graphics and Image Processing •

Improved U-net++ Semantic Segmentation Method for Remote Sensing Images

HE Jiajia, XU Yang, ZHANG Yongdan   

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
  2. Guiyang Aluminum-magnesium Design and Research Institute Co., Ltd., Guiyang 550009, China
  • Online: 2024-07-01    Published: 2024-07-01

Abstract: Semantic segmentation of remote sensing images is widely used in fields such as land resource planning and smart cities. To address the unclear boundary segmentation and poor small-target segmentation in high-resolution remote sensing images, a TU-net model based on U-net++ is proposed to strengthen the model's feature extraction ability. Firstly, a feature head refinement module is introduced, which builds two parallel branches to enhance the channel and spatial feature representations, thereby improving the ability to parse high-level semantic information. Secondly, a Transformer-based attention aggregation module is introduced to capture global context information in place of the multi-level skip connections of U-net++, and a cross-window interaction module is designed to significantly reduce computational complexity. Finally, a dynamic feature fusion block is designed at the end of the decoder to obtain multi-class, multi-scale semantic information and enhance the final segmentation results. TU-net is evaluated on two datasets, and its OA, mIoU, and mF1 scores are higher than those of mainstream models. On the Vaihingen dataset, the IoU and F1 scores for the small-target car class are 0.896 and 0.962, respectively, 5% and 15.8% higher than those of the second-best model. On the Potsdam dataset, the IoU and F1 scores for the tree class are 0.913 and 0.936, respectively, 6.3% and 4.3% higher than those of the second-best model. The experimental results show that the model segments small targets and object boundaries more accurately.
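Illustration: the abstract describes the feature head refinement module only at a high level (two parallel branches for channel and spatial feature representations). The following is a minimal PyTorch sketch of one plausible two-branch design; the class name FeatureHeadRefinement, the reduction ratio, and the CBAM-style pooling choices are assumptions for illustration, not the authors' implementation.

    # Illustrative sketch only: a two-branch (channel + spatial) refinement block,
    # assumed from the abstract's description; not the paper's actual code.
    import torch
    import torch.nn as nn

    class FeatureHeadRefinement(nn.Module):  # hypothetical name
        def __init__(self, channels, reduction=16):
            super().__init__()
            hidden = max(channels // reduction, 1)
            # Channel branch: squeeze spatial dimensions, then re-weight channels.
            self.channel_mlp = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, hidden, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(hidden, channels, kernel_size=1),
                nn.Sigmoid(),
            )
            # Spatial branch: squeeze channels, then re-weight spatial positions.
            self.spatial_conv = nn.Sequential(
                nn.Conv2d(2, 1, kernel_size=7, padding=3),
                nn.Sigmoid(),
            )

        def forward(self, x):
            x = x * self.channel_mlp(x)                # enhanced channel feature representation
            avg_map = x.mean(dim=1, keepdim=True)      # per-pixel channel average
            max_map, _ = x.max(dim=1, keepdim=True)    # per-pixel channel maximum
            attn = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
            return x * attn                            # enhanced spatial feature representation

In this sketch the channel branch re-weights feature maps globally while the spatial branch highlights informative locations, matching the abstract's stated goal of strengthening both channel and spatial feature representations before the high-level semantic features are decoded.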

Key words: high-resolution remote sensing images, semantic segmentation, small target, feature head refinement module, cross-window interaction module, dynamic feature fusion block
