Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (13): 138-150. DOI: 10.3778/j.issn.1002-8331.2411-0437

• Special Topic on Object Detection •

Improved Lightweight Dense Pedestrian Detection Method Based on YOLOv8

YAO Cong, FANG Qiu, GUO Xinghao   

  1. School of Mechanical and Automotive Engineering, Xiamen University of Technology, Xiamen, Fujian 361024, China
  2. School of Aerospace Engineering, Xiamen University, Xiamen, Fujian 361005, China
  • Online: 2025-07-01 Published: 2025-06-30

Abstract: To address the low detection accuracy for small targets and the high model complexity in dense pedestrian detection, a lightweight dense pedestrian detection method based on an improved YOLOv8 is proposed. The DualConv module replaces the original Conv module, which helps deeper convolutional layers extract information more effectively, reduces computational redundancy, and improves detection accuracy. C2f is improved by integrating the RepViTBlock structure with the separated and enhancement attention (SEMA) mechanism; the resulting RS-C2f structure strengthens the model’s generalization and feature fusion capabilities while reducing the parameter count. A new spatial pyramid module, SPPELAN_BiFPN, is designed to markedly improve detection accuracy for small pedestrian targets while also improving computational efficiency. Focal_Shape-IoU is adopted as the bounding box regression loss function to accelerate network convergence and improve detection accuracy for small targets. Experimental results show that the improved model raises mAP@0.5, Precision, and Recall by 2.4, 1.1, and 2.1 percentage points, respectively, on the CrowdHuman dataset and by 1.3, 1.0, and 1.7 percentage points on the WiderPerson dataset, while the parameter count decreases by 39.6%. On embedded devices, the average running time per frame is 55.1 ms, with an average precision of 90.7% and a recall of 82.9%, indicating that the improved model increases detection accuracy and speed while remaining lightweight.
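
The reported figures mix three kinds of quantities: a per-frame latency, a relative reduction in percent, and absolute gains in percentage points. The short Python sketch below only illustrates how such numbers are usually read; it is not taken from the paper. The function names are hypothetical, and the 80.0% baseline in the last call is a made-up value, since the abstract does not state baseline scores.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert an average per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def remaining_fraction(reduction_percent: float) -> float:
    """Fraction of a quantity that remains after a relative reduction in percent."""
    return 1.0 - reduction_percent / 100.0


def add_percentage_points(baseline_percent: float, gain_points: float) -> float:
    """Apply an absolute gain in percentage points to a metric given in percent."""
    return baseline_percent + gain_points


if __name__ == "__main__":
    # 55.1 ms per frame (from the abstract) corresponds to roughly 18 frames per second.
    print(f"{fps_from_latency_ms(55.1):.1f} FPS")
    # A 39.6% parameter reduction (from the abstract) leaves about 60% of the parameters.
    print(f"{remaining_fraction(39.6):.3f} of the original parameter count")
    # Hypothetical baseline of 80.0% mAP@0.5; a gain of 2.4 points is additive, not multiplicative.
    print(f"{add_percentage_points(80.0, 2.4):.1f}% mAP@0.5")
```

Read this way, 55.1 ms per frame is a throughput of about 18 frames per second on the embedded device, and the gains on CrowdHuman and WiderPerson are absolute differences in the metric rather than relative improvements.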

Key words: dense pedestrian detection, lightweight, RS-C2f, SPPELAN_BiFPN module, Focal_Shape-IoU