Safety Helmet Detection Method in Complex Environment Based on Multi-Mechanism Optimization of YOLOv8

doi:10.3778/j.issn.1002-8331.2402-0147

Abstract

Abstract: Existing safety helmet detection algorithms show low accuracy on small targets, dense targets, and complex environments in construction scenes such as building sites, tunnels, and coal mines. To address this problem, a safety helmet detection method based on multiple mechanisms is proposed, with YOLOv8n as the baseline. First, a dilation-wise residual (DWR) attention module is added to the C2f modules of the backbone, so that the network adapts more flexibly to features at different scales and thus identifies objects in the image more accurately. Second, the original Conv layers in the backbone are replaced with the deformable AKConv (arbitrary kernel convolution) module, which improves the performance of the convolutional network and makes feature extraction more efficient. In addition, the large separable kernel attention (LSKA) module is combined with the SPPF structure, which greatly strengthens the model's core feature-fusion capability. Experimental results on the Safety helmet dataset show that, compared with the original model, the improved algorithm gains 10.5 percentage points in mAP@0.5 and 3.7 percentage points in mAP@0.5-0.95, and can effectively improve the accuracy of safety helmet wearing detection in complex scenes.

Key words: safety helmet, YOLOv8n, dilation-wise residual (DWR) module, arbitrary kernel convolution (AKConv) module, large separable kernel attention (LSKA) module
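The two reported metrics differ only in how strictly a predicted box must overlap the ground truth. The sketch below is not the authors' evaluation code. It assumes the common COCO-style convention, in which mAP@0.5 counts a detection as correct at an IoU of at least 0.5 and mAP@0.5-0.95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names, the (x1, y1, x2, y2) box format, and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Return the IoU thresholds at which the prediction counts as a match."""
    score = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical example: a prediction shifted by one pixel in x and y.
gt_box = (0, 0, 10, 10)
pred_box = (1, 1, 11, 11)
print(iou(pred_box, gt_box))
print(thresholds_passed(pred_box, gt_box))
```

Under this assumption, a slightly misaligned box can count as correct at the 0.5 threshold while failing the stricter ones. That is one reason a gain in mAP@0.5 is usually larger than the gain in mAP@0.5-0.95, as with the 10.5 and 3.7 percentage points reported above.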

XIAO Zhenjiu, YAN Su, QU Haicheng. Safety Helmet Detection Method in Complex Environment Based on Multi-Mechanism Optimization of YOLOv8[J]. Computer Engineering and Applications, 2024, 60(21): 172-182.
