Improved YOLOv8 Multi-Scale and Lightweight Vehicle Object Detection Algorithm

doi:10.3778/j.issn.1002-8331.2309-0145

Abstract

Abstract: To address the high hardware requirements, low detection accuracy, and high miss rate on overlapping targets of traditional vehicle object detection models, an improved vehicle object detection algorithm based on YOLOv8, called RBT-YOLO, is proposed. The backbone network is reconstructed using a multi-scale fusion approach. BiFPN is adapted to YOLOv8 by adding convolution operations and adjusting the numbers of input and output channels, which strengthens its feature fusion capability. A lightweight attention mechanism, Triplet Attention, is applied to the feature maps output by the Neck to improve the feature extraction ability of the model. Because vehicle targets overlap heavily in real scenes, SoftNMS (soft non-maximum suppression) replaces the original NMS so that the model handles candidate boxes more gently, which strengthens its detection capability and improves recall. Experimental results on the Pascal VOC and MS COCO datasets show that RBT-YOLO outperforms the original model: parameters and computation are reduced by approximately 60%, and mAP improves by 2.6 and 3.0 percentage points, respectively. RBT-YOLO also surpasses other classic detection models in both model size and accuracy, demonstrating strong practical utility.
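The difference between hard NMS and the "gentler" handling of candidate boxes can be illustrated with a minimal sketch. It is not the RBT-YOLO implementation: the abstract does not say which Soft-NMS variant is used, so the Gaussian score decay is assumed here, and the function names, the `sigma` and threshold values, and the example boxes are all illustrative.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_thresh=0.5):
    """Classic NMS: drop every box whose IoU with a kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in kept):
            kept.append(i)
    return kept


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant): decay overlapping scores instead of deleting.

    Returns (box_index, rescored_confidence) pairs in selection order.
    """
    candidates = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the highest-scoring remaining candidate.
        best_score, best_idx = max(candidates)
        candidates.remove((best_score, best_idx))
        kept.append((best_idx, best_score))
        # Decay the scores of the others according to their overlap with it.
        decayed = []
        for s, i in candidates:
            overlap = iou(boxes[best_idx], boxes[i])
            s *= math.exp(-(overlap ** 2) / sigma)
            if s >= score_thresh:  # discard only boxes whose score has collapsed
                decayed.append((s, i))
        candidates = decayed
    return kept


# Two heavily overlapping vehicles (IoU = 2/3) and one isolated vehicle.
boxes = [(0, 0, 100, 100), (20, 0, 120, 100), (300, 300, 400, 400)]
scores = [0.9, 0.8, 0.7]

hard_result = hard_nms(boxes, scores)
soft_result = soft_nms(boxes, scores)
print("hard NMS keeps:", hard_result)
print("Soft-NMS keeps:", soft_result)
```

In this toy case hard NMS discards the second vehicle outright, while Soft-NMS keeps it with a reduced score, which is the mechanism by which recall on overlapping targets can improve.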

Key words: vehicle detection, multi-scale, attention mechanism, YOLOv8, non-maximum suppression

ZHANG Lifeng, TIAN Ying. Improved YOLOv8 Multi-Scale and Lightweight Vehicle Object Detection Algorithm[J]. Computer Engineering and Applications, 2024, 60(3): 129-137.
