Computer Engineering and Applications, 2024, Vol. 60, Issue (21): 38-54. DOI: 10.3778/j.issn.1002-8331.2404-0130
• Research Hotspots and Reviews •
MI Zeng, LIAN Zhe
Online: 2024-11-01
Published: 2024-10-25
米增,连哲
MI Zeng, LIAN Zhe. Review of YOLO Methods for Universal Object Detection[J]. Computer Engineering and Applications, 2024, 60(21): 38-54.
米增, 连哲. 面向通用目标检测的YOLO方法研究综述[J]. 计算机工程与应用, 2024, 60(21): 38-54.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2404-0130