Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (6): 57-69. DOI: 10.3778/j.issn.1002-8331.2207-0139
YU Ying, WANG Chunping, FU Qiang, KOU Renke, WU Weiyi, LIU Tianyong
Online: 2023-03-15
Published: 2023-03-15
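Abstract: Deep learning algorithms have achieved many breakthroughs in semantic segmentation. Evaluating their performance calls for standard, general, and comprehensive metrics so that the evaluation is objective and valid. This paper summarizes and analyzes current evaluation metrics and measurement methods for semantic segmentation from several angles: pixel-labeling accuracy, depth estimation error measures, execution efficiency, memory footprint, and robustness. It describes in detail the widely used accuracy metrics, including the F1 score, mIoU, mPA, the Dice coefficient, and the Hausdorff distance. It also summarizes methods for improving the robustness of segmentation networks, and points out the requirements for semantic segmentation experiments and the problems in current segmentation quality evaluation.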
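As an illustration of the accuracy metrics named in the abstract, the sketch below computes pixel accuracy (PA), mPA, mIoU, and the mean Dice coefficient from a confusion matrix, plus the Hausdorff distance between two point sets. It follows the commonly used definitions and is not taken from the paper's full text. The function names, the toy labels, and the choice to skip classes with a zero denominator are assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """cm[i][j] = number of pixels with true class i predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """PA, mPA, mIoU and mean Dice from a square confusion matrix."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(k)]
    gt = [sum(cm[i]) for i in range(k)]                         # TP + FN
    pred = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # TP + FP
    # Assumption: a class with a zero denominator is skipped, not scored as 0.
    acc = [tp[i] / gt[i] for i in range(k) if gt[i] > 0]
    iou = [tp[i] / (gt[i] + pred[i] - tp[i])
           for i in range(k) if gt[i] + pred[i] > 0]
    dice = [2 * tp[i] / (gt[i] + pred[i])
            for i in range(k) if gt[i] + pred[i] > 0]
    return {
        "PA": sum(tp) / total if total else 0.0,
        "mPA": _mean(acc),
        "mIoU": _mean(iou),
        "mDice": _mean(dice),
    }


def hausdorff_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance between two non-empty 2D point sets."""
    def directed(src: Sequence[Point], dst: Sequence[Point]) -> float:
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in dst)
                   for p in src)
    return max(directed(a, b), directed(b, a))


# Toy example: 10 "pixels", 3 classes (hypothetical data).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, num_classes=3))
boundary_gap = hausdorff_distance([(0, 0), (0, 1)], [(0, 0), (3, 0)])
print(metrics, boundary_gap)
```

For a single class, the Dice coefficient 2TP / (2TP + FP + FN) coincides with that class's F1 score, which is why the sketch reports only one of the two. The brute-force Hausdorff loop is quadratic in the number of points and is meant for small boundary sets only.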
YU Ying, WANG Chunping, FU Qiang, KOU Renke, WU Weiyi, LIU Tianyong. Survey of Evaluation Metrics and Methods for Semantic Segmentation[J]. Computer Engineering and Applications, 2023, 59(6): 57-69.