Computer Engineering and Applications, 2024, Vol. 60, Issue (1): 122-134. DOI: 10.3778/j.issn.1002-8331.2307-0004

• Special Topic on Improvements and Applications of the YOLO Series •

Research on Improving YOLOv7’s Small Target Detection Algorithm

LI Anda, WU Ruiming, LI Xudong

  1. Zhejiang Provincial Key Laboratory of Food Logistics Equipment and Technology, Zhejiang University of Science and Technology, Hangzhou 310023, China
  • Online: 2024-01-01 Published: 2024-01-01

Abstract: As deep learning has been applied ever more widely to object detection in China, detection of conventional large and medium-sized objects has made remarkable progress. However, because of the inherent limitations of convolutional networks, small object detection still suffers from missed and false detections. Taking the VisDrone2019 and FloW-Img datasets as examples, this paper studies the YOLOv7 model. In the network structure, the ELAN module of the backbone is improved: Focal NeXt blocks are fused into the long and short gradient paths of the ELAN module to strengthen the quality of the output small-target features and to increase the contextual information those features carry. The RepLKDeXt module is introduced into the head network; it not only replaces the SPPCSPC module, which simplifies the overall model structure, but also uses multiple channels, large convolution kernels, and Cat (concatenation) operations to optimize the ELAN-H structure. Finally, the SIoU loss function is introduced in place of the CIoU loss to improve the robustness of the model. The results show that the improved YOLOv7 model has fewer parameters and lower computational complexity, while its detection performance remains approximately unchanged on the VisDrone2019 dataset, where small targets are dense, and increases by 9.05 percentage points on the FloW-Img dataset, where small targets are sparse. The improvements thus further simplify the model and broaden its range of application.

Key words: YOLOv7 model, small target detection, large convolutional kernels, loss function
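The abstract names the switch from the CIoU loss to the SIoU loss but gives no formulas. As a rough illustration only, the sketch below computes both losses for a single pair of axis-aligned boxes, assuming the commonly published definitions of CIoU and SIoU rather than the authors' implementation. The function names (`iou`, `ciou_loss`, `siou_loss`), the `(x1, y1, x2, y2)` box format, and the default `theta=4.0` are assumptions made for this example and do not come from the paper.

```python
import math


def iou(a, b, eps=1e-9):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + eps)


def _geometry(pred, gt):
    """Widths, heights, centre offsets, and enclosing-box size of a box pair."""
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    dx = (gt[0] + gt[2]) / 2 - (pred[0] + pred[2]) / 2
    dy = (gt[1] + gt[3]) / 2 - (pred[1] + pred[3]) / 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    return w, h, wg, hg, dx, dy, cw, ch


def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss: 1 - IoU + centre-distance term + aspect-ratio term."""
    w, h, wg, hg, dx, dy, cw, ch = _geometry(pred, gt)
    i = iou(pred, gt, eps)
    rho2 = dx * dx + dy * dy            # squared centre distance
    c2 = cw * cw + ch * ch + eps        # squared diagonal of the enclosing box
    v = (4 / math.pi ** 2) * (math.atan(wg / (hg + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - i + v + eps)       # trade-off weight for the aspect-ratio term
    return 1 - i + rho2 / c2 + alpha * v


def siou_loss(pred, gt, theta=4.0, eps=1e-9):
    """SIoU loss: 1 - IoU + (distance cost + shape cost) / 2.

    theta=4.0 is an assumed default for this sketch, not a value from the paper.
    """
    w, h, wg, hg, dx, dy, cw, ch = _geometry(pred, gt)
    i = iou(pred, gt, eps)
    # Angle cost: 0 when the centres are axis-aligned, 1 at a 45-degree offset.
    sigma = math.hypot(dx, dy) + eps
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2
    # Distance cost, modulated by the angle cost through gamma.
    gamma = 2 - angle
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    dist = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))
    # Shape cost: relative mismatch in width and height.
    om_w = abs(w - wg) / (max(w, wg) + eps)
    om_h = abs(h - hg) / (max(h, hg) + eps)
    shape = (1 - math.exp(-om_w)) ** theta + (1 - math.exp(-om_h)) ** theta
    return 1 - i + (dist + shape) / 2


if __name__ == "__main__":
    pred, gt = (0, 0, 10, 10), (2, 1, 11, 12)
    print("CIoU loss:", ciou_loss(pred, gt))
    print("SIoU loss:", siou_loss(pred, gt))
```

Both functions return a value near zero for identical boxes and a value above 1 for non-overlapping boxes. The structural difference is that SIoU adds an angle-dependent weighting to the centre-distance penalty and treats width and height mismatch separately, whereas CIoU penalizes only the aspect-ratio difference.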