Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (1): 245-253. DOI: 10.3778/j.issn.1002-8331.2208-0240

• Graphics and Image Processing •

面向遥感影像目标检测的ACFEM-RetinaNet算法

林文龙,阿里甫·库尔班,陈一潇,袁旭   

  1. 新疆大学 软件学院,乌鲁木齐 830046
  • 出版日期:2024-01-01 发布日期:2024-01-01

ACFEM-RetinaNet Algorithm for Remote Sensing Image Target Detection

LIN Wenlong, Alifu·Kuerban, CHEN Yixiao, YUAN Xu   

  1. School of Software, Xinjiang University, Urumqi 830046, China
  • Online:2024-01-01 Published:2024-01-01

摘要: 针对RetinaNet在遥感目标检测任务中多尺度、密集小目标问题,提出了ACFEM-RetinaNet遥感目标检测算法。针对原主干特征提取不充分的问题,采用Swin Transformer作为主干网络,以提升算法的特征提取能力,提高检测精度。针对遥感图像多尺度问题,提出自适应上下文特征提取模块,使用SK注意力引导不同空洞率的可变形卷积自适应调整感受野、提取上下文特征,改善多尺度目标检测效果。针对遥感图像中密集小目标问题,引入FreeAnchor模块,从极大似然估计的角度设计优化锚框匹配策略,提高检测精度。实验结果表明,在公共遥感图像目标检测数据集RSOD上,ACFEM-RetinaNet算法取得了91.1%的检测精度,相较于原算法提高了4.6个百分点,能更好地应用于遥感图像目标检测。

关键词: 深度学习, RetinaNet, 遥感目标检测, Swin Transformer

Abstract: RetinaNet struggles with multi-scale targets and dense small targets in remote sensing target detection. To address this, the ACFEM-RetinaNet remote sensing target detection algorithm is proposed. Because the original backbone extracts features insufficiently, Swin Transformer is adopted as the backbone network to strengthen feature extraction and improve detection accuracy. For the multi-scale problem in remote sensing images, an adaptive context feature extraction module is proposed, which uses SK attention to guide deformable convolutions with different dilation rates to adaptively adjust the receptive field and extract context features, improving multi-scale target detection. For dense small targets in remote sensing images, the FreeAnchor module is introduced, which designs and optimizes the anchor matching strategy from the perspective of maximum likelihood estimation (MLE) to improve detection accuracy. Experimental results show that ACFEM-RetinaNet achieves 91.1% detection accuracy on the public remote sensing image target detection dataset RSOD, 4.6 percentage points higher than the original algorithm, making it better suited to remote sensing image target detection.
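
The adaptive receptive-field idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 3×3 kernel, the dilation rates (1, 2, 3), the function names `effective_kernel_size` and `fuse_branches`, and the attention logits are all assumptions chosen for illustration, and the deformable sampling offsets are omitted. The sketch shows how a larger dilation rate widens the span a kernel covers, and how a per-channel softmax across branches, in the spirit of SK attention, blends the branch responses.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span along one axis covered by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branch_features, branch_logits):
    """Per-channel softmax-weighted sum across branches.

    branch_features[b][c]: pooled response of branch b for channel c.
    branch_logits[b][c]:   hypothetical attention logit; in a real network
                           these would come from learned layers.
    """
    num_branches = len(branch_features)
    num_channels = len(branch_features[0])
    fused = []
    for c in range(num_channels):
        weights = softmax([branch_logits[b][c] for b in range(num_branches)])
        fused.append(
            sum(weights[b] * branch_features[b][c] for b in range(num_branches))
        )
    return fused


# Assumed setup: three 3x3 branches with dilation rates 1, 2 and 3.
dilations = [1, 2, 3]
spans = [effective_kernel_size(3, d) for d in dilations]

# Toy pooled responses for 2 channels, one row per branch.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Equal logits give every branch the same weight, so fusion is a plain mean.
uniform_logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
fused_uniform = fuse_branches(features, uniform_logits)

# A large logit on the last branch makes the widest receptive field dominate.
peaked_logits = [[0.0, 0.0], [0.0, 0.0], [20.0, 20.0]]
fused_peaked = fuse_branches(features, peaked_logits)
```

With equal logits the fused response is the average of the branches. As one branch's logit grows, the fused response approaches that branch alone. This is the sense in which the attention weights select a receptive field per channel.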

Key words: deep learning, RetinaNet, remote sensing target detection, Swin Transformer