Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (19): 130-139. DOI: 10.3778/j.issn.1002-8331.2304-0385

• Pattern Recognition and Artificial Intelligence •

Improved YOLOv7 for Lead-Sealed Small Target Detection in Complex Environments

ZHANG Haibin, PEI Fei, LEI Bangjun, XIA Ping   

  1. School of Computer and Information Science, China Three Gorges University, Yichang, Hubei 443002, China
  2. Hubei Provincial Key Laboratory of Intelligent Visual Monitoring of Hydropower Engineering, Yichang, Hubei 443002, China
  • Online: 2023-10-01 Published: 2023-10-01

Abstract: Detecting small lead seals in harbor container transportation is difficult because the scenes are complex, light intensity varies, viewing distances differ, and the seals are similar in color to the background. To address this, an improved YOLOv7 method for detecting lead seals on containers is proposed. First, contextual information is integrated directly into the object detection task and combined with the path aggregation feature pyramid network (PAFPN) structure to fuse features at different scales, improving discrimination accuracy. Second, to address small-seal features vanishing during training, deformable convolution v3 is embedded in the last MPConv and E-ELAN modules of the backbone network so that it adapts to seal feature maps of different shapes and sizes; during feature fusion, this ensures that more feature maps containing shallow semantic information reach the classification network, strengthening the model's learning ability in complex scenes. Third, a self-attention mechanism (SimAM) is incorporated into the Neck to adaptively select important information from the input, further improving performance against complex and changing backgrounds. Finally, because the seals in the dataset appear at varying distances, Focal Loss replaces the cross-entropy classification loss to balance the contributions of high-quality and low-quality samples to the loss, and EIoU and CIoU localization losses with an added hyperparameter are used to improve on the CIoU loss, so that the model focuses more on the overlap between the predicted box and the ground-truth box. This makes the loss calculation more accurate and more tolerant of variation in target shape and size, improving robustness.
Experimental results show that the improved YOLOv7 algorithm achieves a mean average precision (mAP) of 81.6%, outperforming both the original network and other classical object detection networks. The average recognition time per image is 0.058 s, which meets the real-time requirements of seal detection at container ports.

Key words: lead seal, small object detection, YOLOv7, context information, deformable convolution, attention mechanism, loss function
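
The two loss terms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `focal_loss` and `eiou_loss` are hypothetical, the `alpha` and `gamma` defaults are those of the original Focal Loss paper rather than values reported here, and the EIoU shown is the standard form without the additional hyperparameter mentioned in the abstract, since its definition is not given on this page. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive area.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction.

    p is the predicted probability of the positive class, y is the label (0 or 1).
    alpha and gamma are assumed defaults, not values taken from the article.
    """
    p = min(max(p, eps), 1.0 - eps)             # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma down-weights examples that are already well classified
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def eiou_loss(box, gt):
    """Standard EIoU loss for two (x1, y1, x2, y2) boxes with positive area."""
    # Overlap term: intersection over union
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter)

    # Width and height of the smallest box enclosing both boxes
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Squared centre distance, normalised by the enclosing box diagonal
    dx = (box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2
    centre = (dx * dx + dy * dy) / (cw * cw + ch * ch)

    # Width and height differences, normalised by the enclosing box sides
    width = (w - wg) ** 2 / (cw * cw)
    height = (h - hg) ** 2 / (ch * ch)
    return 1.0 - iou + centre + width + height
```

In this sketch a confident correct prediction contributes far less to the classification loss than a confident wrong one, which is the balancing effect the abstract attributes to Focal Loss, and the EIoU value is zero only when the predicted box coincides with the ground-truth box.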