Computer Engineering and Applications, 2025, Vol. 61, Issue (4): 272-281. DOI: 10.3778/j.issn.1002-8331.2407-0399

• Graphics and Image Processing •

Improved RT-DETR Algorithm for Aerial Small Object Detection

LIU Siyuan, GAO Kai, YONG Longquan

  1. School of Mathematics and Computer Science, Shaanxi University of Technology, Hanzhong, Shaanxi 723001, China
  • Online: 2025-02-15 Published: 2025-02-14

Abstract: To address the missed and false detections of small objects that existing object detection algorithms produce on aerial images, an improved algorithm based on RT-DETR (real-time detection transformer) is proposed. Partial convolution (PConv) is introduced into the backbone network and a PConvBlock structure is designed; a BasicBlock-PConvBlock module built from PConvBlocks then replaces the original BasicBlock, reducing the number of model parameters. The feature fusion module is optimized with a bidirectional feature pyramid network (BiFPN) structure, and the S2 feature is introduced to further improve small object detection. The CARAFE upsampling operator is introduced to strengthen the fast fusion of multi-scale features. Experiments on the VisDrone test set show that the improved model has 13.9% fewer parameters than the RT-DETR model, while mAP0.5 and mAP0.5:0.95 improve by 2.4 and 1.9 percentage points, respectively. The improved model also outperforms RT-DETR on the TT100K and DOTA datasets. It improves detection accuracy while keeping the parameter count and computational cost low, meeting the requirements of real-time detection in drone aerial images.

Key words: small object detection, lightweight, RT-DETR, partial convolution
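
The parameter saving attributed to partial convolution in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' PConvBlock. It only assumes the usual PConv idea: a regular k x k convolution is applied to a fraction of the channels and the remaining channels pass through unchanged. The split ratio of 4, the channel count of 64, and the function names are illustrative assumptions, not values taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def pconv_split(channels, ratio=4):
    """Split channels into (convolved, untouched) groups.

    ratio=4 is an assumed setting: one quarter of the channels is convolved.
    """
    c_p = channels // ratio
    return c_p, channels - c_p


def pconv_params(channels, k, ratio=4):
    """Weight count when only the convolved group gets a k x k convolution."""
    c_p, _ = pconv_split(channels, ratio)
    return conv_params(c_p, c_p, k)


if __name__ == "__main__":
    channels, k = 64, 3  # hypothetical layer size
    full = conv_params(channels, channels, k)
    partial = pconv_params(channels, k)
    print("standard conv weights:", full)
    print("partial conv weights: ", partial)
    print("reduction factor:     ", full // partial)
```

With a split ratio of r, the convolved group has 1/r of the channels on both the input and output side, so the weight count of that layer falls by a factor of r squared. The 13.9% figure reported in the abstract is a whole-model reduction and cannot be derived from this single-layer arithmetic.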