Multi-Branch Thinning Congested Pedestrian Detection Algorithm

doi:10.3778/j.issn.1002-8331.2307-0283

Abstract

Abstract: Crowded pedestrian detection is an active research topic in small-object detection. To address the missed detections caused by dense crowds and occlusion in crowded pedestrian scenes, this paper proposes an improved SSD (single shot multibox detector) object detection algorithm. First, branches are added to the plain structure of the shallow VGG (Visual Geometry Group) network by combining multi-branch thinning with batch normalization (BN); the resulting structure is named the multi-branch thinning network. It refines shallow semantic information, improves the network's generalization ability, and represents pedestrian information more fully. Second, an improved Ghost module replaces the 3×3 convolutions in the multi-branch thinning network: its cheap_operation convolution offsets the parameters added by the multi-branch structure, while its primary_conv strengthens feature extraction in the shallow layers and thereby the network's recognition ability. Finally, the Huber loss function is modified to use the L2 norm in place of the squared difference, which stabilizes training and leads to better convergence. Experiments on the Wider_Person crowded pedestrian detection dataset show that the improved SSD algorithm reaches an MAP50 of 72.9%, which is 7.4 percentage points higher than YOLO-X, 3.5 percentage points higher than the baseline algorithm, and on average 14.4 percentage points higher than the other advanced algorithms compared. These results verify the feasibility of the algorithm for pedestrian detection and show that it meets the detection requirements of occluded-pedestrian scenes.

Key words: pedestrian detection, target detection, single shot multibox detector (SSD), GhostModule
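To make the parameter-saving claim in the abstract concrete, the following is a minimal sketch that compares the weight count of a plain 3×3 convolution with that of a Ghost module. It assumes the standard GhostNet layout (a `primary_conv` that produces a fraction of the output channels, followed by a depthwise `cheap_operation` that generates the rest), not the improved variant proposed in the paper, whose details the abstract does not give. The helper names `conv_params` and `ghost_params`, the channel sizes, and the choice of `ratio=2` are illustrative assumptions; bias and BN parameters are ignored.

```python
import math


def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a bias-free 2D convolution (hypothetical helper)."""
    return (c_in // groups) * c_out * kernel * kernel


def ghost_params(c_in, c_out, kernel=3, ratio=2, cheap_kernel=3):
    """Weight count of a standard Ghost module (assumed layout, not the paper's variant)."""
    # primary_conv: an ordinary convolution that outputs only c_out / ratio channels.
    init_channels = math.ceil(c_out / ratio)
    # cheap_operation: a depthwise convolution that derives the remaining channels.
    new_channels = init_channels * (ratio - 1)
    primary = conv_params(c_in, init_channels, kernel)
    cheap = conv_params(init_channels, new_channels, cheap_kernel, groups=init_channels)
    return primary + cheap


if __name__ == "__main__":
    c_in, c_out = 64, 128  # illustrative shallow-layer channel sizes
    plain = conv_params(c_in, c_out, 3)
    ghost = ghost_params(c_in, c_out)
    print(f"plain 3x3 conv: {plain} weights")
    print(f"ghost module:   {ghost} weights")
    print(f"reduction:      {plain / ghost:.2f}x")
```

Under these assumptions the Ghost module needs roughly half the weights of the plain convolution, which is the mechanism the abstract relies on to offset the cost of the extra branches.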

YUAN Heng, WANG Jiali, ZHANG Shengchong. Multi-Branch Thinning Congested Pedestrian Detection Algorithm[J]. Computer Engineering and Applications, 2024, 60(22): 230-239.
