Computer Engineering and Applications, 2020, Vol. 56, Issue (19): 182-188. DOI: 10.3778/j.issn.1002-8331.1908-0372

• Graphics and Image Processing •

Super Supervised RGBD Salient Object Detection with Multi-level Upsampling Fusion

XIANG Qian, TANG Jiting, WU Jianguo

  1. Institute of Computer Science and Technology, Anhui University, Hefei 230601, China
  • Publication date: 2020-10-01  Release date: 2020-09-29

Abstract:

Effective multi-modal feature fusion plays an important role in RGBD salient object detection, but learning such a fusion remains a challenging task. Unlike traditional methods, which fuse multi-modal saliency maps by weighting, methods based on convolutional neural networks fuse multi-modal features with simple convolution operations, which is insufficient for fusing large amounts of cross-modal data. To address this problem, a novel upsampling fusion module is proposed that not only has multi-scale perception but also performs global and local context reasoning at the same time. In addition, a super supervised (strongly supervised) residual module improves the stability and effectiveness of network training. Compared with existing methods, the proposed method provides a more stable and flexible fusion stream, and thus fuses RGB and depth information fully and efficiently. Extensive experiments on three widely used RGBD salient object detection datasets demonstrate the effectiveness of the proposed method.

Key words: multi-modal, RGBD salient object detection, super supervised
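
For orientation, the traditional baseline that the abstract contrasts against, a weighted fusion of per-modality saliency maps, can be sketched as below. This is only an illustration of that general idea and not code from the paper: the function name `fuse_saliency_maps`, the weight `w_rgb`, and the toy 2x2 maps are all hypothetical, and the proposed upsampling fusion module itself is not reproduced here.

```python
from typing import List

Map = List[List[float]]  # a saliency map as rows of values in [0, 1]


def fuse_saliency_maps(rgb_map: Map, depth_map: Map, w_rgb: float = 0.5) -> Map:
    """Pixel-wise convex combination of an RGB and a depth saliency map.

    Hypothetical helper for illustration; w_rgb is an assumed fusion weight.
    """
    if not 0.0 <= w_rgb <= 1.0:
        raise ValueError("w_rgb must lie in [0, 1]")
    same_rows = len(rgb_map) == len(depth_map)
    if not same_rows or any(len(r) != len(d) for r, d in zip(rgb_map, depth_map)):
        raise ValueError("saliency maps must have the same shape")
    w_depth = 1.0 - w_rgb
    # Each fused pixel is w_rgb * rgb + (1 - w_rgb) * depth.
    return [
        [w_rgb * r + w_depth * d for r, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_map, depth_map)
    ]


# Toy example: two 2x2 saliency maps fused with equal weights.
rgb_saliency = [[1.0, 0.0], [0.5, 0.5]]
depth_saliency = [[0.0, 0.0], [1.0, 0.5]]
fused = fuse_saliency_maps(rgb_saliency, depth_saliency, w_rgb=0.5)
```

A fixed scalar weight like this treats every pixel and every scale identically, which is the kind of limitation that learned feature-level fusion is meant to address.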