Improved Small Object Detection Method of YOLOv7

doi:10.3778/j.issn.1002-8331.2405-0393

Abstract

To address scale variation, complex background interference, missed detections, and false detections in small object detection, this paper proposes an improved small object detection method based on YOLOv7. A new adaptive feature collection and redistribution (AFCR) module is added to the YOLOv7 detection framework. The module fuses multi-scale features effectively, which strengthens the model's ability to detect small objects and enriches the contextual information of the output features. In addition, feature distillation lets the student model learn key feature representations from the teacher model while avoiding the negative impact of semantic differences across stages, which significantly improves the generalization and robustness of the model. Experiments on three public small object detection datasets, CCTSDB, FloW-Img, and TinyPerson, show that the proposed method achieves detection accuracies of 96.4%, 84.9%, and 33.0%, respectively. Compared with the original YOLOv7, mAP@0.5 improves by 6.5, 3.9, and 2.9 percentage points, respectively.

Key words: small object detection, YOLOv7, knowledge distillation, multi-scale feature fusion
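
The reported gains can be turned back into implied baseline scores with simple arithmetic. The sketch below assumes that the reported detection accuracies are mAP@0.5 values, which the abstract suggests but does not state outright; the names `reported`, `gain_pp`, and `implied_baseline` are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract: accuracy in percent, gain in percentage points.
# Assumption: the accuracies are mAP@0.5, the same metric as the reported gains.
reported = {"CCTSDB": 96.4, "FloW-Img": 84.9, "TinyPerson": 33.0}
gain_pp = {"CCTSDB": 6.5, "FloW-Img": 3.9, "TinyPerson": 2.9}


def implied_baseline(reported, gain_pp):
    """Subtract the gain from the reported score for each dataset."""
    return {name: round(reported[name] - gain_pp[name], 1) for name in reported}


baseline = implied_baseline(reported, gain_pp)
for name, value in baseline.items():
    # Relative gain puts the improvement in proportion to the baseline.
    relative = 100 * gain_pp[name] / value
    print(f"{name}: implied baseline {value:.1f}%, relative gain {relative:.1f}%")
```

Under that assumption, the implied YOLOv7 baselines are 89.9%, 81.0%, and 30.1%. The smallest absolute gain (TinyPerson) is therefore the largest one relative to its baseline.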

FENG Tailai, ZHANG Xuesong, SONG Cunli, LI Guangyu, JIN Hua. Improved Small Object Detection Method of YOLOv7[J]. Computer Engineering and Applications, 2025, 61(10): 203-213.
