Dual-Modal Feature Fusion Semantic Segmentation of RGB-D
LUO Penlin, FANG Yanhong, LI Xin, LI Xue
1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
2. Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
LUO Penlin, FANG Yanhong, LI Xin, LI Xue. Dual-Modal Feature Fusion Semantic Segmentation of RGB-D[J]. Computer Engineering and Applications, 2023, 59(7): 222-231.