Computer Engineering and Applications (计算机工程与应用), 2023, Vol. 59, Issue (16): 223-231. DOI: 10.3778/j.issn.1002-8331.2304-0243

• Graphics and Image Processing •

Research on Improved Mask Detection Method Based on YOLOv5 Algorithm

DUAN Bichong (段必冲), MA Mingtao (马明涛)

  1. School of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132022, China
  2. College of Electrical and Information Engineering, Jilin University of Agricultural Science and Technology, Jilin 132101, China
  • Online: 2023-08-15 Published: 2023-08-15

Abstract: Existing mask detection models struggle to balance detection accuracy against detection speed, and they carry a large number of parameters. To address these problems, a mask detection algorithm based on an improved YOLOv5 is proposed. It makes four main improvements. First, the C3 modules in the YOLOv5s backbone network are replaced with the lightweight network GhostNetV2 to reduce the number of parameters. Second, the last C3 module of the YOLOv5s backbone feature-extraction network and the C3 module in the last layer of the Neck are replaced with a Swin-Transformer structure to capture more complete feature information and improve detection performance. Third, the CBAM attention mechanism is introduced so that the network focuses on key information, improving detection efficiency and accuracy. Fourth, EIoU replaces GIoU in the loss function to improve localization accuracy and speed up convergence. Experimental results on the AIZOO dataset show that the improved algorithm reaches an mAP of 96.2%, reduces Params to 6.6×10⁶, and achieves an FPS of 136. Performance on the validation dataset also improves, and the improved algorithm outperforms the other algorithms it is compared with, making it better suited to mask detection.

Key words: mask detection, GhostNetV2, Swin-Transformer, attention mechanism
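
The fourth improvement, replacing GIoU with EIoU in the loss function, can be illustrated with a minimal sketch. It uses the standard published definitions of the two losses for a single pair of boxes and is not the authors' implementation, which would operate on batched tensors inside YOLOv5. The function names `giou_loss` and `eiou_loss`, the helper `_overlap`, the `(x1, y1, x2, y2)` corner format, and the `eps` stabilizer are all assumptions made for illustration.

```python
def _overlap(pred, target):
    """Return intersection area, union area, and enclosing-box width/height.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    enc_w = max(px2, tx2) - min(px1, tx1)  # smallest enclosing box width
    enc_h = max(py2, ty2) - min(py1, ty1)  # smallest enclosing box height
    return inter, union, enc_w, enc_h


def giou_loss(pred, target, eps=1e-9):
    """GIoU loss: 1 - IoU + (enclosing area - union) / enclosing area."""
    inter, union, enc_w, enc_h = _overlap(pred, target)
    iou = inter / (union + eps)
    enc_area = enc_w * enc_h
    return 1.0 - iou + (enc_area - union) / (enc_area + eps)


def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss: 1 - IoU plus center, width, and height penalties,
    each normalized by the enclosing box's diagonal, width, or height."""
    inter, union, enc_w, enc_h = _overlap(pred, target)
    iou = inter / (union + eps)
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Squared distance between the two box centers.
    center_dist2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
                 + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    diag2 = enc_w ** 2 + enc_h ** 2
    width_diff2 = ((px2 - px1) - (tx2 - tx1)) ** 2
    height_diff2 = ((py2 - py1) - (ty2 - ty1)) ** 2
    return (1.0 - iou
            + center_dist2 / (diag2 + eps)
            + width_diff2 / (enc_w ** 2 + eps)
            + height_diff2 / (enc_h ** 2 + eps))


if __name__ == "__main__":
    # A prediction centered on the target but twice as wide.
    pred, target = (0.0, 0.0, 4.0, 2.0), (1.0, 0.0, 3.0, 2.0)
    print("GIoU loss:", giou_loss(pred, target))
    print("EIoU loss:", eiou_loss(pred, target))
```

In this sketch, a prediction that shares the target's center but is twice as wide gives a GIoU loss of about 0.5 and an EIoU loss of about 0.75. The extra 0.25 comes from the explicit width term, which is one plausible reading of why EIoU is credited with better localization and faster convergence.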