Computer Engineering and Applications, 2024, Vol. 60, Issue (2): 200-210. DOI: 10.3778/j.issn.1002-8331.2211-0037

• Graphics and Image Processing •

Remote Sensing Image Object Detection Method Fusing Convolutional Channel Attention

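WANG Huaiji, LI Guangming, ZHANG Hongliang, SHEN Jing'ao, WU Jing

  1. School of Computer Science and Technology, Dongguan University of Technology, Dongguan, Guangdong 523808
  • Publication date: 2024-01-15 Release date: 2024-01-15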

Rotating Object Detection Method Based on Convolutional Block Channel Attention in Remote Sensing Images

WANG Huaiji, LI Guangming, ZHANG Hongliang, SHEN Jing’ao, WU Jing   

  1. School of Computer Science and Technology, Dongguan University of Technology, Dongguan, Guangdong 523808, China
  • Online: 2024-01-15 Published: 2024-01-15

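Abstract: In remote sensing object detection, uneven object distribution, cluttered arrangement, large aspect ratios, and drastic size variation make objects hard to localize. To address this, a rotated object detection method that fuses convolutional channel attention is proposed. First, k-means is improved to give an anchor box design method that increases the distance between clusters at the optimal solution. Second, YOLOv5 is improved with a network model that fuses convolutional channel attention, strengthening the semantic and localization features that the backbone passes to the top and bottom layers of the feature pyramid. Third, a bounding-box loss function is designed with four components: overlap area, center-point distance, aspect ratio, and angle loss. Finally, the width and height regression function of the YOLOv5 bounding box is optimized so that the regression prediction range is generated adaptively. The method is compared with five representative methods on each of two public remote sensing datasets, UCAS-AOD and HRSC2016. On UCAS-AOD, the mean average precision (mAP) reaches 95.9%, which is 0.8 percentage points higher than the CSL method. On HRSC2016, mAP reaches 96.3% and speed reaches 77.5 FPS; compared with the R3Det method, mAP improves by 0.3 percentage points and FPS improves by 5.46 times. The experimental results show that the overall performance of the method exceeds that of some representative methods of recent years, and its effectiveness is verified on both remote sensing datasets.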

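Key words: rotated object detection, YOLO, anchor box, convolutional channel attention, regression function optimization, loss function reconstruction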

Abstract: Object localization in remote sensing images is difficult because of uneven object distribution, complex environments, arbitrary object angles, large aspect ratios, and drastic changes in object size. To address this problem, a rotating object detection method integrating convolutional block channel attention is proposed. Based on k-means, an anchor design method is developed that increases the distance between clusters under the optimal solution. Based on YOLOv5, a network model integrating convolutional block channel attention is designed to enhance the semantic and positioning features conveyed by the backbone to the top and bottom layers of the feature pyramid. An object box loss function is designed that includes four elements: coverage loss, center distance loss, aspect ratio loss, and angle loss. The width and height regression function of the YOLOv5 object box is optimized so that the regression prediction range of the width and height adapts automatically. The method is compared with five representative methods on two public remote sensing datasets, UCAS-AOD and HRSC2016. On the UCAS-AOD dataset, mAP reaches 95.9%, an improvement of 0.8 percentage points over the CSL method. On the HRSC2016 dataset, mAP reaches 96.3% and the speed reaches 77.5 FPS; compared with the R3Det method, mAP increases by 0.3 percentage points and FPS increases by 5.46 times. The experimental results show that the overall performance of the method exceeds that of some representative methods of recent years, and the effectiveness of the method is verified on remote sensing datasets with complex scenes.

Key words: rotating object detection, YOLO, anchor, convolution channel attention, regression function optimization, loss function reconstruction
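
For readers unfamiliar with the anchor step mentioned in the abstract, the following is a minimal sketch of the plain k-means anchor clustering commonly used with YOLO-style detectors, where box sizes are grouped by IoU rather than Euclidean distance. It is only the baseline that the paper says it improves on: the authors' modification that increases the distance between clusters is not described in the abstract and is not reproduced here. The function names `iou_wh` and `kmeans_anchors`, the parameters, and the sample boxes are all hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given only (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Baseline k-means over (w, h) pairs; similarity is IoU (assumed setup)."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct training boxes
    for _ in range(iters):
        # Assignment step: each box joins the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        # Update step: each anchor becomes the mean (w, h) of its members.
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:  # assignments stopped changing
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first


if __name__ == "__main__":
    # Hypothetical box sizes: three small near-square, three elongated.
    boxes = [(10, 10), (12, 11), (11, 12), (100, 20), (110, 22), (105, 18)]
    print(kmeans_anchors(boxes, k=2))
```

On the sample boxes, the two resulting anchors should separate the small near-square group from the elongated group. For rotated boxes with large aspect ratios, as in the datasets named above, the choice of distance and initialization matters more than in this toy case, which is presumably what motivates the paper's modified clustering.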