Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (17): 157-168. DOI: 10.3778/j.issn.1002-8331.2104-0200

• Pattern Recognition and Artificial Intelligence •

Lightweight Improvement of YOLOv4 Mask Detection Algorithm

YE Zixun (叶子勋), ZHANG Hongying (张红英)

  1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
  2. Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
  • Online: 2021-09-01 Published: 2021-08-30

Abstract:

To address the drawbacks of the current YOLOv4 object detection algorithm, namely its large network model, insufficient feature extraction, and sensitivity to lighting conditions, a lightweight YOLOv4-Lite network model with an optimized feature extraction network and optimized general convolution blocks is proposed. An improved MobileNetv3 replaces the original backbone feature extraction network, which reduces the number of model parameters and improves detection accuracy. Depthwise separable convolution blocks are used in place of the ordinary convolution blocks in the original network, which further reduces the parameter count. Label smoothing and cosine annealing learning rate decay are combined, and the SiLU activation function replaces the ReLU activation function in the shallow layers of MobileNetv3, which improves the convergence of the model. The Mosaic data augmentation method is optimized to improve the robustness of the model. On the face mask wearing detection task, compared with the original algorithm, the model sacrifices 1.68% mAP but improves detection speed (FPS) by about 180%.

Key words: mask detection, deep learning, YOLOv4, MobileNetv3, depthwise separable convolution
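
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the layer size (256 input channels, 512 output channels, 3 x 3 kernel), the smoothing factor, and the learning rate bounds are all assumptions chosen for illustration. The sketch compares the weight count of a standard convolution with that of a depthwise separable convolution, and gives the textbook forms of SiLU, label smoothing, and cosine annealing learning rate decay.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution (one filter per
    input channel) followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: y' = (1 - eps) * y + eps / K for K classes.
    eps=0.1 is an assumed value, not taken from the paper."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]


def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing from lr_max (step 0) down to lr_min (last step).
    The default rates are assumed values, not taken from the paper."""
    cos = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos)


if __name__ == "__main__":
    # Hypothetical layer: 256 -> 512 channels with a 3 x 3 kernel.
    std = conv_params(256, 512, 3)
    sep = dw_separable_params(256, 512, 3)
    print("standard conv weights:      ", std)
    print("depthwise separable weights:", sep)
    print("reduction factor:           ", round(std / sep, 2))

    # Two classes, e.g. "mask" and "no mask".
    print("smoothed labels:", smooth_labels([0.0, 1.0]))
    print("lr at start / middle / end:",
          cosine_lr(0, 100), cosine_lr(50, 100), cosine_lr(100, 100))
```

For a k x k kernel the reduction factor is 1 / (1/c_out + 1/k^2), so with 3 x 3 kernels and many output channels it approaches 9. This is the reason replacing ordinary convolution blocks shrinks the model.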