Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (10): 1-15. DOI: 10.3778/j.issn.1002-8331.2308-0206
YU Junwei (于俊伟), GUO Yuansen (郭园森), ZHANG Zihao (张自豪), MU Yashuang (母亚双)
Online: 2024-05-15
Published: 2024-05-15
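Abstract: Salient object detection aims to accurately detect and localize the most visually conspicuous objects or regions in images or videos, supporting better object recognition and scene analysis. Although fully supervised saliency detection methods have achieved some success, acquiring large-scale pixel-level annotated datasets is difficult and expensive. Weakly supervised detection methods train models with image-level labels or noisy weak labels, which are comparatively easy to obtain, and have shown good results in practical applications. This paper comprehensively compares the mainstream methods and application scenarios of fully supervised and weakly supervised saliency detection, with a focus on commonly used weak-label annotation methods and their impact on salient object detection. It reviews the latest research progress in salient object detection under weak supervision and compares the performance of different weakly supervised methods on commonly used datasets. Finally, it discusses the application prospects of weakly supervised saliency detection in specialized fields such as agriculture, medicine, and the military, and points out the open problems and future trends in this research area.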
YU Junwei, GUO Yuansen, ZHANG Zihao, MU Yashuang. Process of Weakly Supervised Salient Object Detection[J]. Computer Engineering and Applications, 2024, 60(10): 1-15.