Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (15): 144-155. DOI: 10.3778/j.issn.1002-8331.2501-0293

• Special Topic on Object Detection •

Research on Small Object Detection Method of Improved RT-DETR

CHENG Xinmiao, ZHANG Xuesong, CAO Bingjie, SONG Cunli

  1. School of Railway Intelligent Engineering, Dalian Jiaotong University, Dalian, Liaoning 116052, China
  • Publication date: 2025-08-01 Release date: 2025-07-31

Abstract: To address severe background interference and insufficient feature representation in small object detection in complex scenes, this paper proposes DA-DETR, a small object detection model based on an improved RT-DETR. A multi-order gated aggregation block is introduced into the backbone network; by sharpening the distinction between local and global features, it helps the detector separate foreground objects from noisy backgrounds. A convolutional additive token mixer (CATM) is incorporated to further reduce feature loss and improve the integration of global and local information. An improved loss function, CoreProximity-IoU, is proposed that is more sensitive to IoU changes for small objects. Experimental results show that DA-DETR improves mAP@50 and mAP@50:95 by 2.8 and 2.3 percentage points, respectively, on the VisDrone2019 dataset, and by 0.6 and 0.4 percentage points, respectively, over RT-DETR on the KITTI dataset. The model's computational cost and parameter count are also significantly reduced, further validating the effectiveness and superiority of the proposed method.

Key words: small object detection, RT-DETR, complex scenes, background interference
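
The abstract reports results as mAP@50 and mAP@50:95 and motivates a loss that reacts more strongly to IoU changes on small objects. The sketch below illustrates only the background to that motivation, using plain IoU on axis-aligned boxes; it is not CoreProximity-IoU, whose definition is not given on this page. The helper names (`iou`, `shifted_iou`, `thresholds_passed`) and the box sizes are illustrative assumptions. It shows that the same 2-pixel localization error costs a small box far more IoU than a large one, and how many of the ten IoU thresholds conventionally averaged in mAP@50:95 (0.50 to 0.95 in steps of 0.05) each prediction would still clear.

```python
def iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a size x size ground-truth box and the same box
    displaced by `shift` pixels along both axes (hypothetical setup)."""
    gt = (0.0, 0.0, float(size), float(size))
    pred = (float(shift), float(shift), float(size + shift), float(size + shift))
    return iou(gt, pred)


# IoU thresholds averaged in mAP@50:95: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def thresholds_passed(value):
    """Number of thresholds at which a prediction with this IoU still matches."""
    return sum(value >= t for t in THRESHOLDS)


if __name__ == "__main__":
    for size in (8, 64):  # illustrative "small" and "large" box side lengths
        v = shifted_iou(size, shift=2)
        print(f"{size}x{size} box, 2 px shift: IoU={v:.2f}, "
              f"thresholds passed={thresholds_passed(v)}/10")
```

Under these assumptions the 8x8 box drops to an IoU of about 0.39 and fails even the 0.50 threshold used by mAP@50, while the 64x64 box stays near 0.88. This is one reason small objects are scored harshly by both metrics and why an IoU-based loss tuned for small boxes is a plausible design choice.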