Research Progress of YOLO Series Target Detection Algorithms

doi:10.3778/j.issn.1002-8331.2301-0081

Abstract

Abstract: YOLO-based algorithms are among the most active research directions in object detection. In recent years, the steady stream of YOLO series algorithms and their improved models has produced excellent results in object detection, and these algorithms are now widely applied across many real-world domains. This article first summarizes the typical datasets and evaluation metrics for object detection, then reviews the overall YOLO framework and the development of the detection algorithms from YOLOv1 through YOLOv7. Next, it summarizes models and their performance across eight improvement directions, including data augmentation, lightweight network construction, and IoU loss optimization, organized by the three stages of input, feature extraction, and prediction. It then introduces the application areas of YOLO algorithms. Finally, in light of the practical problems that object detection still faces, it summarizes and discusses prospects for the future development of YOLO algorithms.

Key words: computer vision, object detection, YOLO, improved model

WANG Linyi, BAI Jing, LI Wenjing, JIANG Jinzhe. Research Progress of YOLO Series Target Detection Algorithms[J]. Computer Engineering and Applications, 2023, 59(14): 15-29.
