[1] ZOU Z X, CHEN K Y, SHI Z W, et al. Object detection in 20 years: a survey[J]. Proceedings of the IEEE, 2023, 111(3): 257-276.
[2] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587.
[3] GIRSHICK R. Fast R-CNN[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
[4] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[5] CAI Z W, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162.
[6] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//Proceedings of the European Conference on Computer Vision. Cham: Springer, 2016: 21-37.
[7] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007.
[8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
[9] REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525.
[10] LI C, LI L, JIANG H, et al. YOLOv6: a single-stage object detection framework for industrial applications[J]. arXiv:2209.02976, 2022.
[11] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 7464-7475.
[12] WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[J]. arXiv:2402.13616, 2024.
[13] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the European Conference on Computer Vision. Cham: Springer, 2020: 213-229.
[14] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017.
[15] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[16] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[17] TANG F L, XU Z X, HUANG Q M, et al. DuAT: dual-aggregation transformer network for medical image segmentation[C]//Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision. Singapore: Springer, 2024: 343-356.
[18] LI C, ZHOU A, YAO A. Omni-dimensional dynamic convolution[J]. arXiv:2209.07947, 2022.
[19] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2020: 1571-1580.
[20] WANG C Y, LIAO H Y M, YEH I H. Designing network design strategies through gradient path analysis[J]. arXiv:2211.04800, 2022.
[21] ZHANG X, SONG Y Z, SONG T T, et al. LDConv: linear deformable convolution for improving convolutional neural networks[J]. Image and Vision Computing, 2024, 149: 105190.
[22] WANG A, CHEN H, LIU L H, et al. YOLOv10: real-time end-to-end object detection[J]. arXiv:2405.14458, 2024.
[23] WANG C C, HE W, NIE Y, et al. Gold-YOLO: efficient object detector via gather-and-distribute mechanism[C]//Advances in Neural Information Processing Systems, 2024.
[24] GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[J]. arXiv:2107.08430, 2021.
[25] CHIEN C T, JU R Y, CHOU K Y, et al. YOLOv8-AM: YOLOv8 with attention mechanisms for pediatric wrist fracture detection[J]. arXiv:2402.09329, 2024.
[26] 胡峻峰, 李柏聪, 朱昊, 等. 改进YOLOv8的轻量化无人机目标检测算法[J]. 计算机工程与应用, 2024, 60(8): 182-191.
HU J F, LI B C, ZHU H, et al. Improved YOLOv8 lightweight UAV target detection algorithm[J]. Computer Engineering and Applications, 2024, 60(8): 182-191.
[27] BOLYA D, FOLEY S, HAYS J, et al. TIDE: a general toolbox for identifying object detection errors[C]//Proceedings of the 16th European Conference on Computer Vision. Cham: Springer, 2020: 558-573.
[28] XIE G B, XU Z J, LIN Z Y, et al. GRFS-YOLOv8: an efficient traffic sign detection algorithm based on multiscale features and enhanced path aggregation[J]. Signal, Image and Video Processing, 2024, 18(6): 5519-5534.
[29] ZHANG F F, LEONG L V, YEN K S, et al. An enhanced lightweight model for small-scale pedestrian detection based on YOLOv8s[J]. Digital Signal Processing, 2025, 156: 104866.
[30] 高德勇, 陈泰达, 缪兰. 改进YOLOv8n的道路目标检测算法[J]. 计算机工程与应用, 2024, 60(16): 186-197.
GAO D Y, CHEN T D, MIAO L. Improved road object detection algorithm for YOLOv8n[J]. Computer Engineering and Applications, 2024, 60(16): 186-197.
[31] LI Z X, HE Q H, ZHAO H, et al. DoubleM-Net: multi-scale spatial pyramid pooling-fast and multi-path adaptive feature pyramid network for UAV detection[J]. International Journal of Machine Learning and Cybernetics, 2024, 15(12): 5781-5805.
[32] 许德刚, 王双臣, 王再庆, 等. 改进YOLOv8算法的城市车辆目标检测[J]. 计算机工程与应用, 2024, 60(18): 136-146.
XU D G, WANG S C, WANG Z Q, et al. Improved YOLOv8 urban vehicle target detection algorithm[J]. Computer Engineering and Applications, 2024, 60(18): 136-146.