Improved SegFormer Network Based Method for Semantic Segmentation of Remote Sensing Images
TIAN Xuewei, WANG Jiali, CHEN Ming, DU Shouqing
1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
2. Key Laboratory of Fisheries Information, Ministry of Agriculture, Shanghai 201306, China
TIAN Xuewei, WANG Jiali, CHEN Ming, DU Shouqing. Improved SegFormer Network Based Method for Semantic Segmentation of Remote Sensing Images[J]. Computer Engineering and Applications, 2023, 59(8): 217-226.