[1] HE J F, CHEN H W, LUO D H. Review of real-time semantic segmentation algorithms for deep learning[J]. Computer Engineering and Applications, 2023, 59(8): 13-27.
[2] TIAN X W, WANG J L, CHEN M, et al. Semantic segmentation of remote sensing images based on improved SegFormer network[J]. Computer Engineering and Applications, 2023, 59(8): 217-226.
[3] ZHOU J H, PU Y W, CHEN R J, et al. Road extraction from high-resolution remote sensing images with an improved UNet3+ network[J]. Laser Journal, 2024, 45(2): 161-168.
[4] XU G X, FENG C, MA F. Review of medical image segmentation based on UNet[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(8): 1776-1792.
[5] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 3431-3440.
[6] RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation[C]//Proceedings of Medical Image Computing and Computer-Assisted Intervention, 2015: 234-241.
[7] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[8] HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[J]. arXiv:1704.04861, 2017.
[9] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 4510-4520.
[10] ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 6848-6856.
[11] SØNDERBY C K, ESPEHOLT L, HEEK J, et al. MetNet: a neural weather model for precipitation forecasting[J]. arXiv:2003.12140, 2020.
[12] WAN Q, HUANG Z, LU J, et al. SeaFormer: squeeze-enhanced axial transformer for mobile semantic segmentation[J]. arXiv:2301.13156, 2023.
[13] HUANG Z, WANG X, HUANG L, et al. CCNet: criss-cross attention for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 603-612.
[14] HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 13713-13722.
[15] ZHU L, WANG X, KE Z, et al. BiFormer: vision transformer with bi-level routing attention[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 10323-10333.
[16] CHEN X, LIU Z, TANG H, et al. SparseViT: revisiting activation sparsity for efficient high-resolution vision transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 2061-2070.
[17] HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 1580-1589.
[18] TAN M, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks[J]. arXiv:1905.11946, 2019.
[19] GO J, RYU J. Spatial bias for attention-free non-local neural networks[J]. arXiv:2302.12505, 2023.
[20] TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image transformers & distillation through attention[C]//Proceedings of the International Conference on Machine Learning, 2021: 10347-10357.
[21] GRAHAM B, EL-NOUBY A, TOUVRON H, et al. LeViT: a vision transformer in convnet’s clothing for faster inference[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 12259-12269.
[22] MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer[J]. arXiv:2110.02178, 2021.
[23] HUANG T, HUANG L, YOU S, et al. Towards light-weight convolution-free vision transformers[J]. arXiv:2207.05557, 2022.
[24] PAN X, YE T, XIA Z, et al. Slide-Transformer: hierarchical vision transformer with local self-attention[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 2082-2091.
[25] ZHAO H, QI X, SHEN X, et al. ICNet for real-time semantic segmentation on high-resolution images[C]//Proceedings of the European Conference on Computer Vision, 2018: 405-420.
[26] YU C, WANG J, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]//Proceedings of the European Conference on Computer Vision, 2018: 325-341.
[27] ZHANG W, HUANG Z, LUO G, et al. TopFormer: token pyramid transformer for mobile semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 12083-12093.
[28] DONG B, WANG P, WANG F. Head-free lightweight semantic segmentation with linear transformer[J]. arXiv:2301.04648, 2023.
[29] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 2881-2890.
[30] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European Conference on Computer Vision, 2018: 801-818.
[31] HONG Y, PAN H, SUN W, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes[J]. arXiv:2101.06085, 2021.
[32] CHENG B, MISRA I, SCHWING A G, et al. Masked-attention mask transformer for universal image segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 1290-1299.
[33] YU C, GAO C, WANG J, et al. BiSeNet v2: bilateral network with guided aggregation for real-time semantic segmentation[J]. International Journal of Computer Vision, 2021, 129: 3051-3068.
[34] XIE E, WANG W, YU Z, et al. SegFormer: simple and efficient design for semantic segmentation with transformers[C]//Advances in Neural Information Processing Systems, 2021, 34: 12077-12090.
[35] CHENG B, SCHWING A, KIRILLOV A. Per-pixel classification is not all you need for semantic segmentation[C]//Advances in Neural Information Processing Systems, 2021, 34: 17864-17875.
[36] JAIN J, LI J, CHIU M T, et al. OneFormer: one transformer to rule universal image segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 2989-2998. |