Computer Engineering and Applications, 2022, Vol. 58, Issue (8): 45-57. DOI: 10.3778/j.issn.1002-8331.2109-0091
ZHANG Xin (张鑫), YAO Qing’an (姚庆安), ZHAO Jian (赵健), JIN Zhenjun (金镇君), FENG Yuncong (冯云丛)
Online: 2022-04-15
Published: 2022-04-15
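Abstract: Image semantic segmentation is an active research topic in computer vision. With the rapid rise of fully convolutional neural networks, the combination of image semantic segmentation and fully convolutional networks has produced outstanding results. Drawing on high-quality literature from recent years, this paper summarizes image semantic segmentation methods based on fully convolutional neural networks. The collected literature is divided by application scenario into classical semantic segmentation, real-time semantic segmentation, and RGB-D semantic segmentation, and representative methods in each category are described. The paper also summarizes commonly used public datasets and performance evaluation metrics, analyzes experimental results on those datasets, and discusses possible future research directions for fully convolutional neural networks.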
ZHANG Xin, YAO Qing’an, ZHAO Jian, JIN Zhenjun, FENG Yuncong. Image Semantic Segmentation Based on Fully Convolutional Neural Network[J]. Computer Engineering and Applications, 2022, 58(8): 45-57. (Chinese title: 全卷积神经网络图像语义分割方法综述)