Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (8): 13-27. DOI: 10.3778/j.issn.1002-8331.2210-0144
HE Jiafeng, CHEN Hongwei, LUO Dehan
Online: 2023-04-15
Published: 2023-04-15
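Abstract: Semantic segmentation is a technique that separates the different objects in an image at the pixel level and assigns a label to every pixel of the original image. Application areas such as UAV navigation, remote sensing imagery, and medical diagnosis require segmentation to run in real time, so deep-learning-based real-time semantic segmentation has developed rapidly and has already produced many techniques and models. Drawing on a study of the relevant literature, this paper introduces real-time semantic segmentation by way of semantic segmentation and briefly describes its advantages. It then identifies the key challenges and difficulties that real-time semantic segmentation currently faces, describes the existing techniques and models that address them, and summarizes the strengths and weaknesses of those techniques and models. Finally, it looks ahead to the challenges facing real-time semantic segmentation and summarizes the field, providing a theoretical reference for subsequent research.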
HE Jiafeng, CHEN Hongwei, LUO Dehan. Review of Real-Time Semantic Segmentation Algorithms for Deep Learning[J]. Computer Engineering and Applications, 2023, 59(8): 13-27.