Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (15): 221-233. DOI: 10.3778/j.issn.1002-8331.2401-0130

• Graphics and Image Processing •

Lightweight YOLOv8 Pedestrian Detection Algorithm Using Dynamic Activation Function

WANG Xiaojun, CHEN Gaoyu, LI Xiaohang

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2024-08-01 Published: 2024-07-30

Abstract: Traditional activation functions cannot adapt to each individual feature map to achieve the best activation effect. To address this, a dynamic activation function is designed that adds a separate offset to every pixel value of a feature map, so that targets are better distinguished from the background. To help the model focus on targets, an attention mechanism is added to the backbone, improving accuracy. For pedestrian-flow monitoring and traffic-management applications such as red-light-running detection and autonomous driving, which demand real-time performance on limited hardware, channel pruning is applied to remove low-weight parameters from the model. To suit hardware acceleration, the pruning method is modified so that the number of retained channels is always an integer multiple of 8. At the inference and deployment stage, the Conv and BatchNorm weights are fused to shrink the model further, reducing the number of parameters and floating-point operations. Experiments show that the improved model outperforms the other object detection models it was compared with. Relative to the original YOLOv8 model, AP0.5:0.95 improves by 0.013, AP0.5 improves by 0.005, and the number of parameters is reduced by 4.8×10^6.
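
The pruning and weight-fusion steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The helper names (`round_kept_channels`, `select_channels`, `fuse_conv_bn`), the choice to round the kept-channel count upward, and the use of per-channel magnitude scores as the pruning criterion are all assumptions made for illustration; the abstract states only that retained channel counts are multiples of 8 and that Conv and BatchNorm weights are fused. The fusion itself uses the standard identity for folding BatchNorm into the preceding convolution.

```python
import math


def round_kept_channels(keep, total, multiple=8):
    """Snap a proposed kept-channel count to a multiple of `multiple`.

    Assumption: round upward, then clamp to the largest multiple that
    fits in `total`. Layers narrower than `multiple` are left unpruned.
    """
    if total < multiple:
        return total
    rounded = math.ceil(keep / multiple) * multiple
    max_allowed = (total // multiple) * multiple
    return max(multiple, min(rounded, max_allowed))


def select_channels(scores, prune_ratio, multiple=8):
    """Return sorted indices of channels to keep.

    `scores` holds one importance value per channel (hypothetical
    criterion: larger magnitude means more important).
    """
    total = len(scores)
    proposed = total - int(total * prune_ratio)
    keep = round_kept_channels(proposed, total, multiple)
    ranked = sorted(range(total), key=lambda i: abs(scores[i]), reverse=True)
    return sorted(ranked[:keep])


def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm statistics into the preceding convolution.

    `weight` is a list with one flattened kernel (list of floats) per
    output channel; the remaining arguments are per-channel lists.
    BN(conv(x)) = gamma * (W.x + b - mean) / sqrt(var + eps) + beta,
    so W' = W * s and b' = (b - mean) * s + beta, with
    s = gamma / sqrt(var + eps).
    """
    fused_w, fused_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_w.append([w * scale for w in w_c])
        fused_b.append((b_c - m) * scale + bt)
    return fused_w, fused_b
```

For example, asking to prune 40% of a 32-channel layer proposes 20 kept channels, which this sketch rounds up to 24. After fusion the BatchNorm layer can be dropped at inference time, because the fused convolution produces the same output with fewer stored parameters and operations.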

Key words: YOLOv8, pedestrian detection, activation function, pruning, weight fusion