Road Vehicle and Pedestrian Detection Based on YOLOv9 for Haar Wavelet Downsampling

doi:10.3778/j.issn.1002-8331.2406-0204

Abstract

Abstract: Against the current backdrop of intelligent and information-based technology, a YOLOv9 algorithm based on Haar wavelet downsampling (HWD), named HWD_YOLOv9, is proposed for detecting vehicle and pedestrian targets on the road in the complex environments encountered in autonomous driving. Haar wavelet downsampling reduces the spatial resolution of feature maps while preserving detail such as edges and textures as far as possible, which effectively reduces information uncertainty. The sum of the cross-entropy loss and the generalized Dice loss is used as the loss function of the network: it effectively measures the difference between probability distributions, and the Dice loss is computed pixel by pixel, which makes the network easier to optimize. Experimental results show that, on the KITTI dataset, the proposed model reaches a mean average precision (mAP) of 95.86% and a detection frame rate of 179 FPS. Compared with YOLOv9, the improved algorithm accurately identifies vehicles and pedestrians of different scales on complex roads, reduces the computational redundancy of the original detection algorithm and its missed detections of small targets, and provides visual technology support for intelligent autonomous driving.

Key words: small object detection, vehicles and pedestrians, YOLOv9, deep learning, Haar wavelet downsampling (HWD)
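
The downsampling step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a one-level 2D Haar transform applied to each 2x2 block of a (C, H, W) feature map, with the four resulting sub-bands stacked along the channel axis and then mixed by a 1x1 convolution. The function names, the tensor sizes, and the random untrained weights are hypothetical and are not taken from the paper; normalization and activation layers are omitted. The inverse transform is included only to show that halving the resolution this way discards no information.

```python
import numpy as np

def haar_downsample(x):
    """One-level 2D Haar transform of a (C, H, W) array with even H and W.

    Returns a (4C, H/2, W/2) array: the low-frequency band followed by
    three detail bands, stacked along the channel axis.
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (scaled block average)
    lh = (a - b + c - d) / 2.0  # left column minus right column
    hl = (a + b - c - d) / 2.0  # top row minus bottom row
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return np.concatenate([ll, lh, hl, hh], axis=0)

def haar_upsample(y):
    """Inverse of haar_downsample: rebuilds the full-resolution input."""
    ll, lh, hl, hh = np.split(y, 4, axis=0)
    c, h, w = ll.shape
    x = np.empty((c, 2 * h, 2 * w), dtype=y.dtype)
    x[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[:, 0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[:, 1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def pointwise_conv(y, weight):
    """1x1 convolution: mixes channels at every pixel; weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, y)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))        # hypothetical feature map
bands = haar_downsample(feat)                  # shape (32, 8, 8)
restored = haar_upsample(bands)                # shape (8, 16, 16)
weight = 0.1 * rng.standard_normal((16, 32))   # hypothetical, untrained weights
out = pointwise_conv(bands, weight)            # shape (16, 8, 8)
```

Unlike strided convolution or max pooling, the transform itself is invertible, so edge and texture detail moves into the detail channels instead of being dropped; only the subsequent channel reduction is lossy.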

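The loss described in the abstract, the sum of a cross-entropy term and a generalized Dice term, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it treats the problem as per-pixel classification over K classes, weights the two terms equally, uses the common inverse-squared-class-size weighting for the generalized Dice term, and assumes every class occurs at least once in the labels. All function and variable names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(probs, onehot, eps=1e-7):
    """Mean over pixels of -sum_k y_k * log(p_k); inputs are (K, N)."""
    log_p = np.log(np.clip(probs, eps, 1.0))
    return float(-(onehot * log_p).sum(axis=0).mean())

def generalized_dice_loss(probs, onehot, eps=1e-7):
    """1 minus the class-weighted Dice overlap, weights 1 / (class size)^2."""
    w = 1.0 / (onehot.sum(axis=1) ** 2 + eps)
    intersect = (w * (probs * onehot).sum(axis=1)).sum()
    denom = (w * (probs + onehot).sum(axis=1)).sum()
    return float(1.0 - 2.0 * intersect / (denom + eps))

def combined_loss(logits, labels, num_classes):
    """Cross-entropy plus generalized Dice loss, equally weighted (assumption)."""
    onehot = np.eye(num_classes)[labels].T   # (K, N) one-hot targets
    probs = softmax(logits, axis=0)          # (K, N) predicted probabilities
    return cross_entropy(probs, onehot) + generalized_dice_loss(probs, onehot)

labels = np.array([0, 0, 0, 1, 1, 2])        # hypothetical pixel labels
target = np.eye(3)[labels].T                 # (3, 6)
loss_good = combined_loss(20.0 * target, labels, 3)      # confident and correct
loss_uniform = combined_loss(np.zeros((3, 6)), labels, 3)  # uninformative
loss_bad = combined_loss(-20.0 * target, labels, 3)      # confident and wrong
```

The cross-entropy term penalizes each pixel's probability error, while the Dice term scores overlap per class, so rare classes such as small targets are not swamped by frequent ones.
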
LI Lin, JIN Zhixin, YU Xiaolei, WANG Anhong. Road Vehicle and Pedestrian Detection Based on YOLOv9 for Haar Wavelet Downsampling[J]. Computer Engineering and Applications, 2024, 60(20): 207-214.
