Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue (10): 228-237. DOI: 10.3778/j.issn.1002-8331.2401-0200

• Graphics and Image Processing •

Camouflaged Object Detection Based on Multi-Scale Cross Attention and Edge Perception

HAO Ziqiang (郝子强), ZHANG Qingbao (张庆宝), ZHAO Shihao (赵世豪), WANG Zhuohao (王焯豪), ZHAN Weida (詹伟达)

  1. College of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China
  • Online: 2025-05-15 Published: 2025-05-15

Abstract: To address the problem that current camouflaged object detection (COD) algorithms cannot accurately and completely detect target objects and their edges, a camouflaged object detection network that fuses multi-scale cross attention and edge perception (multi-scale cross attention and edge perception network, MAEP-Net) is proposed. Res2Net-50 is used to extract the original features of the image, and a feature pyramid structure that integrates multi-scale cross attention mines target position information and highlights camouflaged target region features along both the channel and spatial dimensions. A positioning module then accurately localizes the approximate position of the target. The edge perception module suppresses background noise in low-level features and fuses edge features to obtain more edge detail. Finally, the refinement module uses attention mechanisms to attend to target clues from both the foreground and background directions, and further refines the target structure and edge contours using edge, semantic, domain, and region prior knowledge. Experiments on 3 public datasets show that the proposed algorithm outperforms 12 mainstream algorithms on all 4 objective evaluation metrics; on the COD10K dataset in particular, its weighted average F-measure and mean absolute error (MAE) reach 0.797 and 0.031, respectively. These results indicate that the proposed algorithm achieves good detection performance on COD tasks.

Key words: multi-scale cross attention, edge perception, camouflaged object detection, feature pyramid structure
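
The abstract reports a mean absolute error (MAE) of 0.031 on COD10K but does not spell out how the metric is computed. As a minimal sketch, assuming the definition commonly used in camouflaged and salient object detection (the average per-pixel absolute difference between a predicted map and the ground-truth mask, both scaled to [0, 1]), MAE could be computed as below. The function name `mean_absolute_error` and the toy maps are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence

# A 2-D map represented as rows of floats in [0, 1] (an assumption for this sketch).
Map = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map, gt: Map) -> float:
    """Average per-pixel absolute difference between two same-sized maps.

    `pred` is a predicted camouflage map and `gt` is the ground-truth mask;
    both are assumed to be scaled to [0, 1]. Lower is better.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    n_pixels = sum(len(row) for row in pred)
    if n_pixels == 0:
        raise ValueError("maps must contain at least one pixel")
    total = sum(
        abs(p - g)
        for p_row, g_row in zip(pred, gt)
        for p, g in zip(p_row, g_row)
    )
    return total / n_pixels


# Toy 2x2 example with made-up values (not data from the paper).
pred = [[1.0, 0.5], [0.0, 0.75]]
gt = [[1.0, 0.0], [0.0, 1.0]]
print(mean_absolute_error(pred, gt))  # prints 0.1875
```

In practice the per-image MAE would typically be averaged over every image in a test set; under that reading, a value such as 0.031 means the predicted maps differ from the masks by about 3% of the full intensity range per pixel on average.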