Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (9): 238-245. DOI: 10.3778/j.issn.1002-8331.2201-0058

• Graphics and Image Processing •

CA-YOLOv5 for Crowded Pedestrian Detection

CHEN Yixiao, Alifu·Kuerban, LIN Wenlong, YUAN Xu   

  1. School of Software, Xinjiang University, Urumqi 830046, China
  • Online: 2022-05-01 Published: 2022-05-01

面向拥挤行人检测的CA-YOLOv5

陈一潇,阿里甫·库尔班,林文龙,袁旭   

  1. 新疆大学 软件学院,乌鲁木齐 830046

Abstract: To address the high miss-detection rate and insufficient feature fusion of YOLOv5 in crowded pedestrian detection, the CA-YOLOv5 pedestrian detection algorithm is proposed. Because the original backbone network fuses fine-grained features insufficiently, the YOLOv5 backbone is rebuilt with Res2Block to improve fine-grained feature fusion and detection accuracy. Because target scale varies widely across the dataset, a coordinate attention (CA) module is introduced to enlarge the receptive field and improve the model's ability to localize targets accurately. Because the FPN structure weakens multi-scale feature expression during feature fusion, a feature enhancement module is proposed to strengthen it. Structural re-parameterization is used to reduce the model's parameter count and computation and to speed up detection. To handle the crowding that is common in pedestrian detection, an EViT module is proposed to strengthen the model's attention to local information and improve detection accuracy. Experimental results show that, on the crowded pedestrian detection task, CA-YOLOv5 reaches a detection accuracy of 84.86%, 3.75% higher than the original algorithm, at a detection speed of up to 51 FPS, giving both good accuracy and real-time performance. CA-YOLOv5 is therefore better suited to real-time pedestrian detection tasks.

Key words: deep learning, YOLOv5, crowded pedestrian detection, Res2Net
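
The abstract names structural re-parameterization but does not say which scheme CA-YOLOv5 uses. As a rough illustration of the general idea only, the sketch below assumes a RepVGG-style merge: a training-time block with a 3-tap branch, a 1-tap branch, and an identity shortcut is folded into one 3-tap kernel for inference. It works on a 1-D, single-channel signal in plain Python and leaves out batch-norm folding; the function names and kernel weights are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: RepVGG-style branch merging on a 1-D, single-channel signal.
# Assumption: the paper's actual re-parameterization scheme may differ from this one.

def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padded to keep the length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]

def multi_branch(signal, k3, k1):
    """Training-time form: 3-tap branch + 1-tap branch + identity shortcut."""
    out3 = conv1d_same(signal, k3)
    out1 = conv1d_same(signal, k1)
    return [a + b + x for a, b, x in zip(out3, out1, signal)]

def merge_branches(k3, k1):
    """Fold the 1-tap kernel and the identity into the centre tap of the 3-tap kernel."""
    merged = list(k3)
    # The 1-tap weight and the identity (weight 1) both act on the centre sample only.
    merged[1] += k1[0] + 1.0
    return merged

signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.2, -0.5, 0.1]  # hypothetical 3-tap weights
k1 = [0.7]             # hypothetical 1-tap weight

train_out = multi_branch(signal, k3, k1)
infer_out = conv1d_same(signal, merge_branches(k3, k1))

# The single merged kernel should reproduce the three-branch output up to rounding.
max_diff = max(abs(a - b) for a, b in zip(train_out, infer_out))
print(max_diff)
```

Because convolution is linear, the merged kernel gives the same output as the three branches while doing one pass instead of three, which is the sense in which re-parameterization cuts inference-time parameters and computation.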

摘要: 针对YOLOv5在拥挤行人检测任务中漏检率高、特征融合不充分等问题,提出了CA-YOLOv5行人检测算法。针对原主干网络对细粒度特征融合不充分的问题,采用Res2Block重建YOLOv5的主干网络,以提升网络的细粒度特征融合能力,提高检测精度。针对数据集目标尺度变化大的问题,引入coordinate attention(CA)模块增强感受野,增强模型对目标的精确定位能力。针对FPN结构在特征融合时导致多尺度特征表达能力下降的问题,提出特征增强模块,以增强多尺度特征的表达能力。通过结构重参数化的方法减少模型的计算量与参数量,加快目标检测速度。针对行人检测任务中普遍存在的拥挤行人问题,提出EViT模块,增强模型关注局部信息的能力,提高检测精度。实验证明,在拥挤行人检测任务中,CA-YOLOv5的检测精度达到84.86%,相较于原算法提高了3.75%,检测速度可以达到51 FPS,具有较好的检测精度与实时性。因此,CA-YOLOv5可以更好地应用于实时行人检测任务中。

关键词: 深度学习, YOLOv5, 拥挤行人检测, Res2Net