Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (19): 202-213. DOI: 10.3778/j.issn.1002-8331.2406-0063

• Graphics and Image Processing •

Fusion of Island Bi-Directional Feature Pyramid for Remote Sensing Image Object Detection

LIANG Liming, FENG Yao, LONG Pengwei, WANG Zexin

  1. School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
  • Online: 2025-10-01 Published: 2025-09-30

Abstract: To address the complex background interference and large multi-scale differences among targets in remote sensing image object detection, this paper proposes a detection algorithm that integrates an island bi-directional feature pyramid network, referred to as IFD-YOLOv8s. First, an island bi-directional feature pyramid network is designed to improve the model's adaptability to changes in target scale, reduce information loss during multi-level feature fusion, and support the efficient propagation of deep semantic and fine-grained information. Second, a feature context incremental module is proposed to capture the features of ground-object targets more comprehensively and improve detection capability. Third, a dual-path pooling attention module is designed to suppress interference from non-target noise and enhance the discriminability of remote sensing target features. Ablation and comparison experiments on the public datasets RSOD and NWPU VHR-10 yield a mean average precision (mAP) of 98.2% and 91.4%, respectively, which is 1.8 and 2.1 percentage points higher than the baseline algorithm YOLOv8s. Compared with mainstream object detection algorithms, IFD-YOLOv8s detects targets in complex backgrounds and multi-scale targets more effectively. Generalization experiments on the public dataset DOTA give an mAP of 78.7%, an improvement of 1.8 percentage points over the original model.
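The reported gains are stated in percentage points, so the baseline scores they imply follow by simple subtraction. The short Python sketch below works this out from the figures in the abstract; the names `reported` and `implied_baseline` are illustrative, and the baseline values are derived here rather than quoted from the paper.

```python
# mAP (%) of IFD-YOLOv8s and its reported gain in percentage points,
# taken from the abstract.
reported = {
    "RSOD": (98.2, 1.8),
    "NWPU VHR-10": (91.4, 2.1),
    "DOTA": (78.7, 1.8),
}


def implied_baseline(map_pct, gain_pp):
    """Baseline mAP implied by a gain expressed in percentage points."""
    return round(map_pct - gain_pp, 1)


baselines = {name: implied_baseline(m, g) for name, (m, g) in reported.items()}

for name, (map_pct, gain_pp) in reported.items():
    base = baselines[name]
    # A gain in percentage points differs from a relative (percent) gain.
    relative = 100.0 * gain_pp / base
    print(f"{name}: baseline {base}%, +{gain_pp} pp, relative gain {relative:.2f}%")
```

The implied baselines are 96.4% on RSOD, 89.3% on NWPU VHR-10, and 76.9% on DOTA. The relative gains are slightly larger than the percentage-point gains because each baseline is below 100%.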

Key words: remote sensing image, object detection, feature pyramid network, contextual information, attention mechanism
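For readers unfamiliar with bi-directional feature pyramids, the following NumPy sketch shows the general idea of a top-down pass followed by a bottom-up pass over three feature levels. It is a generic illustration only: the abstract does not specify the island design, so the function names, the additive fusion, the nearest-neighbour upsampling, and the max-pool downsampling are all assumptions and not the authors' network.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 max pooling of a (C, H, W) feature map with even H and W."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def bidirectional_fuse(p3, p4, p5):
    """Generic top-down then bottom-up fusion (hypothetical, additive).

    p3 is the highest-resolution level and p5 the lowest; each level is
    assumed to have half the spatial size of the previous one and the
    same channel count.
    """
    # Top-down pass: push deep semantic information to finer levels.
    t4 = p4 + upsample2x(p5)
    t3 = p3 + upsample2x(t4)
    # Bottom-up pass: push fine-grained detail back to coarser levels.
    o3 = t3
    o4 = p4 + t4 + downsample2x(o3)
    o5 = p5 + downsample2x(o4)
    return o3, o4, o5


# Toy input: three levels with 2 channels each.
p3 = np.ones((2, 16, 16))
p4 = np.ones((2, 8, 8))
p5 = np.ones((2, 4, 4))
o3, o4, o5 = bidirectional_fuse(p3, p4, p5)
print(o3.shape, o4.shape, o5.shape)
```

Each output keeps the shape of its input level, so the fused maps can feed the same detection heads. A practical network would replace the plain additions with learned convolutions.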