Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (23): 249-256.DOI: 10.3778/j.issn.1002-8331.2403-0239

• Graphics and Image Processing •

Research on Object Detection Algorithm for Remote Sensing Images Based on Multi-Scale Feature Fusion

XU Siyuan, WU Weilin   

  1. Center for Applied Mathematics of Guangxi, School of Electronic Information, Guangxi Minzu University, Nanning 530006, China
  2. Guangxi Key Laboratory of Machine Vision and Intelligent Control, Wuzhou University, Wuzhou, Guangxi 543003, China
  • Online: 2024-12-01  Published: 2024-11-29


Abstract: To tackle the challenges of complex backgrounds, feature conflicts and variable target scales in remote sensing images, an object detection method based on multi-scale feature fusion is proposed, with YOLOv8s as the baseline model. Firstly, the RepVGG network is adopted as the feature extraction network to strengthen feature extraction and effectively capture global semantic information. Secondly, a tripolar integrative fusion (TIF) module is designed in the neck network, which improves detection accuracy for targets at all scales through the effective fusion of positional and semantic information. Finally, SlideLoss is used as the classification loss function to enhance the detection of difficult targets and thereby improve detection accuracy. Experimental results show that the improved model achieves detection accuracies of 94.4%, 93.0% and 95.5% on the NWPU VHR-10, RSOD and UCAS-AOD datasets respectively, 5.1, 6.0 and 4.4 percentage points higher than the baseline model. The proposed method outperforms the other compared methods in accuracy and better completes the object detection task in remote sensing images.
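The SlideLoss mentioned above reweights classification samples around an adaptive easy/hard boundary. The abstract does not give the exact variant used in this paper; as a point of reference, the weighting function from the original SlideLoss formulation (YOLO-FaceV2) can be sketched as follows, where `mu` is the mean IoU of the batch and serves as the boundary between easy and hard samples:

```python
import math

def slide_weight(iou: float, mu: float) -> float:
    """Sample weight under the original SlideLoss scheme.

    Easy samples (IoU well below mu) keep weight 1; samples near the
    boundary and hard samples get exponentially boosted weights, so the
    classification loss emphasizes difficult targets.
    """
    if iou <= mu - 0.1:
        return 1.0                      # clearly easy sample
    elif iou < mu:
        return math.exp(1.0 - mu)       # boundary region: fixed boost
    else:
        return math.exp(1.0 - iou)      # hard region: boost decays with IoU
```

The per-sample weight multiplies the usual binary cross-entropy classification loss; note that as IoU rises past `mu`, the boost decays smoothly back toward 1, so well-fit samples are not over-emphasized.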

Key words: deep learning, object detection, remote sensing image, multi-scale feature fusion
