Algorithm for Real-Time Vehicle Detection from UAVs Based on Optimizing and Improving YOLOv8

doi:10.3778/j.issn.1002-8331.2312-0291

Abstract

Abstract: Existing UAV vehicle detection algorithms suffer from low accuracy, susceptibility to background interference, and difficulty detecting small target vehicles. To address these problems, an improved UAV vehicle detection algorithm, YOLOv8-CX, is proposed based on YOLOv8. First, by integrating the advantages of Deformable Convolutional Networks v1-3, a C2f-DCN module is proposed that samples features flexibly and better extracts the features of vehicles of different sizes. Second, drawing on the idea of large separable kernel attention, an SPPF-LSKA module with long-range dependency and self-adaptability is proposed, which effectively reduces background interference in vehicle detection. Third, in the neck network, a CF-FPN (context augmentation and feature refinement network for tiny object detection) feature fusion structure is adopted, which improves small-target detection accuracy by combining contextual information and suppressing conflicting information between features at different scales. Finally, the original YOLOv8 head is replaced with a Dynamic Head detection head, which unifies three attention mechanisms (scale, spatial, and task) to further improve detection performance. Experimental results show that on the Mapsai dataset, compared with the original algorithm, the improved algorithm increases precision (P), recall (R), and mean average precision (mAP) by 8.5, 11.2, and 6.2 percentage points respectively, and reaches a detection speed of 72.6 FPS, meeting the real-time requirements of UAV vehicle detection. Comparison with other mainstream object detection algorithms validates the effectiveness and superiority of the method.

Key words: UAV vehicle detection, YOLOv8, deformable convolution, attention mechanism, feature fusion
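
The abstract reports precision (P), recall (R), and a speed in FPS. As a reading aid, the sketch below shows how these quantities are conventionally defined. It is not the authors' evaluation code: the function names and the true-positive, false-positive, and false-negative counts are hypothetical and chosen only for illustration. The sole figure taken from the abstract is 72.6 FPS. mAP is omitted because it requires per-class precision-recall curves.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted boxes that match a ground-truth vehicle."""
    total = tp + fp
    return tp / total if total else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth vehicles that were detected."""
    total = tp + fn
    return tp / total if total else 0.0


def latency_ms(fps: float) -> float:
    """Average per-frame processing time implied by a frames-per-second figure."""
    return 1000.0 / fps


# Hypothetical counts, NOT results from the paper.
tp, fp, fn = 80, 20, 40
print(f"P = {precision(tp, fp):.3f}")  # P = 0.800
print(f"R = {recall(tp, fn):.3f}")     # R = 0.667

# 72.6 FPS is the speed reported in the abstract.
print(f"{latency_ms(72.6):.2f} ms per frame")  # 13.77 ms per frame
```

At 72.6 FPS the per-frame budget is about 13.8 ms, well under the 33.3 ms available per frame for a 30 FPS video stream, a threshold often used as a working definition of real-time.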

SHI Tao, CUI Jie, LI Song. Algorithm for Real-Time Vehicle Detection from UAVs Based on Optimizing and Improving YOLOv8[J]. Computer Engineering and Applications, 2024, 60(9): 79-89.
