Lightweight Foggy Weather Object Detection Method Based on YOLOv5

doi:10.3778/j.issn.1002-8331.2308-0029

Abstract

Abstract: Object detection algorithms in foggy scenes suffer from low accuracy and high model complexity. To address both problems, a lightweight foggy-weather object detection method based on YOLOv5 is proposed. First, a receptive field attention module (RFAblock) applies an attention mechanism to the receptive field through interaction among receptive-field features, improving feature extraction. Second, the lightweight network Slimneck serves as the neck structure, reducing model parameters and complexity while maintaining accuracy. Third, the angle vector between the ground-truth box and the predicted box is introduced into the loss function, improving training speed and inference accuracy. Fourth, PNMS (precise non-maximum suppression) improves the candidate box selection mechanism, reducing the missed detection rate under vehicle occlusion. In tests on the real-world foggy dataset RTTS and the synthetic foggy dataset Foggy Cityscapes, mAP50 improves by 4.9 and 3.5 percentage points, respectively, over YOLOv5l, while the model has only 54.6% of the parameters of YOLOv5l.

Key words: object detection, deep learning, foggy scenes, lightweight, attention mechanism
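
The abstract names PNMS as the fix for missed detections under vehicle occlusion but does not define it here. As background only, the sketch below shows standard greedy non-maximum suppression in NumPy, which is the usual baseline such methods modify. It is not the authors' PNMS. The function names `iou_one_to_many` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the example boxes and thresholds are all assumptions made for illustration.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get an empty intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Standard greedy NMS (baseline, NOT the paper's PNMS).

    Returns the indices of the kept boxes, highest score first.
    """
    order = np.argsort(scores)[::-1]  # candidate indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every remaining candidate that overlaps the kept box too much.
        order = rest[overlaps <= iou_threshold]
    return keep


# Hypothetical example: boxes 0 and 1 overlap heavily (IoU = 81/119),
# as two partially occluded vehicles might; box 2 is far away.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)
print(kept_strict, kept_loose)
```

With the 0.5 threshold, box 1 is suppressed even though it could be a second, partly hidden vehicle. Raising the threshold to 0.7 keeps it, but at the cost of more duplicate boxes elsewhere. This trade-off is the kind of occlusion-related miss that the abstract says the improved candidate box selection is meant to reduce.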

LAI Jing’an, CHEN Ziqiang, SUN Zongwei, PEI Qingqi. Lightweight Foggy Weather Object Detection Method Based on YOLOv5[J]. Computer Engineering and Applications, 2024, 60(6): 78-88.
