Computer Engineering and Applications, 2024, Vol. 60, Issue (1): 236-244. DOI: 10.3778/j.issn.1002-8331.2208-0025

• Graphics and Image Processing •

Improved DDETR UAV Target Detection Algorithm Incorporating Occlusion Information

ZHOU Jianting, XUAN Shibin, WANG Ting   

  1. College of Electronic Information, Guangxi Minzu University, Nanning 530006, China
  2. College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
  3. Guangxi Key Laboratory of Hybrid Computation and IC Design and Analysis, Nanning 530006, China
  • Online: 2024-01-01  Published: 2024-01-01




Abstract: To address the complex scenes, numerous small targets, and severe occlusion found in UAV aerial images, an improved DDETR (deformable DETR) UAV target detection algorithm that incorporates target occlusion information is proposed. The model replaces the residual network in DDETR with Swin Transformer to obtain richer multi-level semantic features; makes greater use of low-level features in DDETR to improve the detection of small and medium-sized targets; and uses the proposed occlusion degree estimation module to help the model handle occlusion, so that heavily occluded targets are detected more reliably. On the VisDrone dataset, the model achieves a mean average precision (AP) of 32.3%, which is 3.3 percentage points higher than the AP of the standard DDETR model and reaches an advanced level relative to mainstream UAV aerial image target detection methods.
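
The reported gain is stated in percentage points, which is easy to confuse with a relative improvement. The short sketch below works only from the two figures quoted in the abstract; the variable names are illustrative, and the baseline value is inferred by subtraction rather than quoted from the paper.

```python
# Figures quoted in the abstract (VisDrone dataset).
improved_ap = 32.3   # AP of the proposed model, in percent
gain_points = 3.3    # absolute gain over standard DDETR, in percentage points

# Implied AP of the standard DDETR baseline (inferred, not quoted).
baseline_ap = improved_ap - gain_points

# Relative improvement, which differs from the percentage-point gain.
relative_gain = gain_points / baseline_ap * 100

print(f"implied baseline AP: {baseline_ap:.1f}%")
print(f"relative improvement: {relative_gain:.1f}%")
```

Under these assumptions, the 3.3-point gain corresponds to a relative improvement of roughly 11% over the implied baseline.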

Key words: unmanned aerial vehicle (UAV) target detection, deep learning, cross attention, deformable convolution
