UAV Image Small Object Detection on Complex Background

doi:10.3778/j.issn.1002-8331.2303-0154

Abstract

To address the low detection accuracy caused by complex backgrounds and small object features in UAV aerial images, a small object detection algorithm called EMT-ECoTNet is proposed, based on an improved YOLOv7-w6. The algorithm is built around the ECoT Block, which combines CoT modules, which offer global modeling advantages, with MA-ECA channel attention modules, which add a max pooling layer (MaxPool) to extract more texture information from small objects. This block benefits small object feature extraction. In addition, the spatial pyramid pooling structure M-SPPFCSPC, which has a large receptive field, further enhances small object features. The EIoU loss function separately penalizes the width and height differences between the predicted and ground-truth boxes, which improves convergence speed and accuracy. Experimental results show that EMT-ECoTNet achieves an mAP50 of 62.8% on the VisDrone dataset, 3.2 percentage points higher than the baseline model YOLOv7-w6, and that it outperforms mainstream algorithms on UAV small object detection tasks.

Key words: UAV images, complex background, small object detection, attention mechanism, spatial pyramid pooling
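
The width and height penalty mentioned in the abstract can be illustrated with a short sketch. It follows the commonly used EIoU definition (1 − IoU, plus a center-distance term, plus separate width and height terms, each normalized by the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` constant are assumptions made for illustration; this is not the authors' training code.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate penalties on the width and height differences.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # Same center and height, but the predicted box is twice as wide.
    print(eiou_loss((0, 0, 4, 2), (1, 0, 3, 2)))
```

In the usage example the boxes share a center, so the center term is zero and the loss is driven by the IoU and the width term alone, which is the behavior the abstract attributes to EIoU.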

WANG Xiaohong, HU Yu. UAV Image Small Object Detection on Complex Background[J]. Computer Engineering and Applications, 2023, 59(15): 107-114.

