Multi-Level Fusion Supervised Saliency Object Detection Based on Global Response

doi:10.3778/j.issn.1002-8331.2306-0341

Abstract

Salient object detection is the task of automatically locating the most salient objects in images and videos. To address common challenges such as scale variation and background misjudgment, existing salient object detection methods have mainly relied on feature fusion, attention mechanisms, and deep supervision to improve detection performance. Building on these directions, this paper proposes a multi-level fusion algorithm based on global response. The algorithm strengthens the network’s representation of salient objects by fusing contour information with high-level semantic features, supplementing the structural features of targets, and suppressing prediction errors. It also improves the network’s sensitivity to scale variation and to salient regions. The global response module emphasizes the global characteristics of an image: it computes the image’s response along the spatial and channel dimensions and evaluates the saliency at different locations. This filters out shallow-layer background noise, allowing the network to lock onto salient regions more quickly and learn more efficiently. Experimental results on commonly used evaluation metrics demonstrate the superiority and efficiency of the proposed algorithm.

Keywords: salient object detection, deep learning, convolutional neural network, feature fusion, attention mechanism
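
The abstract describes the global response module only at a high level, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes a single feature map of shape (C, H, W), uses global average pooling for the channel response and a channel-wise mean for the spatial response, and turns both into sigmoid gates. The function names `sigmoid` and `global_response` and all of these design choices are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def global_response(features):
    """Reweight a (C, H, W) feature map by channel and spatial responses.

    Hypothetical sketch: the gating scheme here is an assumption, not the
    module defined in the paper.
    """
    # Channel response: one global statistic per channel (global average pooling).
    channel_resp = features.mean(axis=(1, 2))            # shape (C,)
    channel_gate = sigmoid(channel_resp - channel_resp.mean())

    # Spatial response: mean activation across channels at each location.
    spatial_resp = features.mean(axis=0)                  # shape (H, W)
    spatial_gate = sigmoid(spatial_resp - spatial_resp.mean())

    # Broadcast both gates back onto the feature map.
    return features * channel_gate[:, None, None] * spatial_gate[None, :, :]


# Toy input: channel 0 has a strongly activated 2x2 patch on a weak background,
# channel 1 is uniformly weak.
features = np.full((2, 4, 4), 0.1)
features[0] = 0.5
features[0, 1:3, 1:3] = 4.0

out = global_response(features)
contrast_before = features[0, 1, 1] / features[0, 0, 0]
contrast_after = out[0, 1, 1] / out[0, 0, 0]
print(contrast_before, contrast_after)
```

In this toy case the gates are below 1 everywhere, but background locations are attenuated more than the activated patch, so the patch-to-background contrast increases. That is the sense in which a global response can suppress shallow background noise under the stated assumptions.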

CHEN Hui, PENG Li. Multi-Level Fusion Supervised Saliency Object Detection Based on Global Response[J]. Computer Engineering and Applications, 2023, 59(24): 238-247.
