Research on Salient Object Detection Using Feature Optimization and Guidance

doi:10.3778/j.issn.1002-8331.2306-0298

Abstract

To address the weak contrast of depth maps and the blurred boundaries of prediction maps, this paper proposes a new salient object detection network. The model comprises a feature optimization module and a feature guidance module. To reduce the negative impact of low-quality depth maps and to highlight salient objects precisely, the feature optimization module applies mixed attention to the depth features at each layer and then fuses them bidirectionally. To resolve boundary blurring, the feature guidance module introduces low-level features through guided fusion to refine object boundaries. In the decoding stage, a weight calculation method that adds no model parameters measures how much the RGB features and the depth features each contribute to the final prediction. Comparative experiments against twelve recent state-of-the-art methods show that the proposed model achieves better detection performance on the NJU2K, NLPR, DES, SIP, STERE, and LFSD test datasets. On the SIP dataset, compared with the second-best method, the proposed model improves the maximum F-measure by 1.3%, the mean F-measure by 1%, the E-measure by 1.7%, and the S-measure by 1.5%. Ablation experiments confirm the effectiveness of the proposed modules.

Key words: depth map, salient object detection, mixed attention, feature fusion
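
The maximum and mean F-measure figures quoted in the abstract can be illustrated with a small sketch. It assumes the conventional definition from the salient object detection literature, F = (1 + β²)·P·R / (β²·P + R) with β² = 0.3, evaluated over a sweep of binarization thresholds on a single image. The function names, the threshold count, and the per-image scope are illustrative assumptions; the paper's own evaluation protocol (for example, how scores are aggregated across a dataset) is not stated here and may differ.

```python
import numpy as np

BETA_SQ = 0.3  # assumed: the value commonly used in saliency benchmarks


def f_measure_curve(pred, gt, num_thresholds=255):
    """F-measure of one saliency map at each binarization threshold.

    pred: float array with values in [0, 1] (predicted saliency map)
    gt:   array of the same shape, nonzero where the object is salient
    """
    gt = gt.astype(bool)
    # Thresholds 0, 1/255, ..., 254/255 (hypothetical sweep, not the paper's)
    thresholds = np.linspace(0.0, 1.0, num_thresholds, endpoint=False)
    scores = np.zeros(num_thresholds)
    for i, t in enumerate(thresholds):
        binary = pred > t
        tp = np.logical_and(binary, gt).sum()
        # Guard against empty predictions or empty ground truth
        precision = tp / max(binary.sum(), 1)
        recall = tp / max(gt.sum(), 1)
        denom = BETA_SQ * precision + recall
        if denom > 0:
            scores[i] = (1 + BETA_SQ) * precision * recall / denom
    return scores


def max_and_mean_f(pred, gt):
    """Return (maximum F-measure, mean F-measure) over the threshold sweep."""
    scores = f_measure_curve(pred, gt)
    return scores.max(), scores.mean()
```

A prediction identical to the ground-truth mask scores 1.0 on both values, and an all-zero prediction scores 0.0, which gives a quick sanity check before running on real saliency maps.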

WU Wenjie, WANG Feng. Research on Salient Object Detection Using Feature Optimization and Guidance[J]. Computer Engineering and Applications, 2024, 60(18): 256-265.
