[1] LIN A, CHEN B, XU J, et al. DS-TransUNet: dual swin Transformer U-Net for medical image segmentation[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1-15.
[2] ZHANG L, MA J, LV X, et al. Hierarchical weakly supervised learning for residential area semantic segmentation in remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2019, 17(1): 117-121.
[3] ZHAO J, ZHOU Y, SHI B, et al. Multi-stage fusion and multi-source attention network for multi-modal remote sensing image segmentation[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2021, 12(6): 1-20.
[4] SHEIKH R, MILIOTO A, LOTTES P, et al. Gradient and log-based active learning for semantic segmentation of crop and weed for agricultural robots[C]//Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020: 1350-1356.
[5] ZHANG X, XIAO Z, LI D, et al. Semantic segmentation of remote sensing images using multiscale decoding network[J]. IEEE Geoscience and Remote Sensing Letters, 2019, 16(9): 1492-1496.
[6] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 3431-3440.
[7] YI Y N, ZHANG Z J, ZHANG W C, et al. Semantic segmentation of urban buildings from VHR remote sensing imagery using a deep convolutional neural network[J]. Remote Sensing, 2019, 11(15): 1774-1792.
[8] MINAEE S, BOYKOV Y Y, PORIKLI F, et al. Image segmentation using deep learning: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(7): 3523-3542.
[9] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv:1706.05587, 2017.
[10] TIAN X W, WANG J L, CHEN M, et al. Semantic segmentation of remote sensing image based on improved SegFormer network[J]. Computer Engineering and Applications, 2023, 59(8): 217-226.
[11] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]//Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), 2015: 234-241.
[12] RASTOGI K, BODANI P, SHARMA S A. Automatic building footprint extraction from very high-resolution imagery using deep learning techniques[J]. Geocarto International, 2022, 37(5): 1501-1513.
[13] XIANG J W, CHEN M R, YANG B B. Fine-grained image classification combined with Swin and multi-scale feature fusion[J]. Computer Engineering and Applications, 2023, 59(20): 147-157.
[14] FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 3146-3154.
[15] LI R, ZHENG S, ZHANG C, et al. Multiattention network for semantic segmentation of fine-resolution remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-13.
[16] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[17] LIN H, CHENG X, WU X, et al. CAT: cross attention in vision transformer[C]//Proceedings of the 2022 IEEE International Conference on Multimedia and Expo, 2022: 1-6.
[18] HE X, ZHOU Y, ZHAO J, et al. Swin transformer embedding UNet for remote sensing image semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-15.
[19] GAO L, LIU H, YANG M, et al. STransFuse: fusing swin transformer and convolutional neural network for remote sensing image semantic segmentation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 10990-11003.
[20] WANG L, LI R, DUAN C, et al. A novel transformer based semantic segmentation scheme for fine-resolution remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[21] ALMARZOUQI H, SAOUD L S. Semantic labeling of high resolution images using EfficientUNets and Transformers[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 1-13.
[22] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 10012-10022.
[23] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[24] DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021: 3560-3569.
[25] YUAN Y, HUANG L, GUO J, et al. OCNet: object context for semantic segmentation[J]. International Journal of Computer Vision, 2021, 129(8): 2375-2398.
[26] WANG L, LI R, WANG D, et al. Transformer meets convolution: a bilateral awareness network for semantic segmentation of very fine resolution urban scene images[J]. Remote Sensing, 2021, 13(16): 3065.
[27] CHEN J, ZHANG D, WU Y, et al. A context feature enhancement network for building extraction from high-resolution remote sensing imagery[J]. Remote Sensing, 2022, 14(9): 2276.