Lightweight YOLOv8 Pedestrian Detection Algorithm Using Dynamic Activation Function

doi:10.3778/j.issn.1002-8331.2401-0130

Abstract

Traditional activation functions cannot adapt to individual feature maps to achieve the best activation effect. To address this, a dynamic activation function is designed that adds a separate offset to each pixel value of the feature map, so that targets are better distinguished from the background. To make the model focus more closely on targets, an attention mechanism is added to the backbone to improve accuracy. For pedestrian flow monitoring and traffic management scenarios such as red-light-running detection and autonomous driving, which demand high real-time performance on limited hardware, channel pruning is applied to remove low-weight parameters from the model. To suit hardware acceleration, the pruning method is modified so that the number of retained channels is always an integer multiple of 8. In the inference deployment phase, the Conv and BatchNorm weights are fused to further shrink the model, reducing the parameter count and the number of floating-point operations. Experiments show that the improved model outperforms the other object detection models it was compared with. Relative to the original YOLOv8 model, it improves AP0.5:0.95 by 0.013 and AP0.5 by 0.005, while reducing the number of parameters by 4.8×10^6.

Key words: YOLOv8, pedestrian detection, activation function, pruning, weight fusion
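
The two deployment-oriented steps named in the abstract, keeping the retained channel count at a multiple of 8 and fusing Conv with BatchNorm, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Ranking channels by the absolute BatchNorm scale factor, rounding the kept count upward, and the function names `keep_count_multiple_of`, `select_channels`, and `fuse_conv_bn` are all assumptions made for illustration; the abstract does not specify them. The fusion formula itself is the standard algebraic folding of an inference-time BatchNorm into the preceding convolution.

```python
import math


def keep_count_multiple_of(kept, total, multiple=8):
    """Align a pruned channel count to a multiple of `multiple`.

    Assumption: round up, then cap at the largest multiple that fits
    in `total`. Layers narrower than `multiple` are left unpruned.
    """
    if total < multiple:
        return total
    rounded_up = -(-kept // multiple) * multiple  # ceiling division
    cap = (total // multiple) * multiple
    return max(multiple, min(rounded_up, cap))


def select_channels(gammas, prune_ratio, multiple=8):
    """Return sorted indices of channels to keep.

    Assumption: channel importance is |gamma| of the BatchNorm layer,
    one common pruning criterion; the paper may use a different one.
    """
    total = len(gammas)
    kept = total - int(total * prune_ratio)
    n_keep = keep_count_multiple_of(kept, total, multiple)
    ranked = sorted(range(total), key=lambda i: abs(gammas[i]), reverse=True)
    return sorted(ranked[:n_keep])


def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BatchNorm statistics into convolution weights.

    `weight` holds one flattened kernel (a list of floats) per output
    channel; the remaining arguments hold one float per output channel.
    """
    fused_weight, fused_bias = [], []
    for kernel, b, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_weight.append([w * scale for w in kernel])
        fused_bias.append((b - m) * scale + bt)
    return fused_weight, fused_bias
```

For example, pruning 60% of a 32-channel layer would nominally leave 13 channels; under the round-up assumption, `select_channels` keeps 16 instead. After `fuse_conv_bn`, the BatchNorm layer can be dropped at inference time because the fused convolution produces the same output as the original Conv followed by BatchNorm.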

WANG Xiaojun, CHEN Gaoyu, LI Xiaohang. Lightweight YOLOv8 Pedestrian Detection Algorithm Using Dynamic Activation Function[J]. Computer Engineering and Applications, 2024, 60(15): 221-233.

王晓军, 陈高宇, 李晓航. 应用动态激活函数的轻量化YOLOv8行人检测算法[J]. 计算机工程与应用, 2024, 60(15): 221-233.
