Computer Engineering and Applications (计算机工程与应用) ›› 2023, Vol. 59 ›› Issue (21): 251-257. DOI: 10.3778/j.issn.1002-8331.2206-0455

• Graphics and Image Processing •

Road Scene Object Detection Based on Prior Saliency Information

WANG Zhengqi, SHAO Jie

  1. School of Electronic and Information Engineering, Shanghai University of Electric Power, Shanghai 201306, China
  • Online: 2023-11-01 Published: 2023-11-01

Abstract: In autonomous driving, road object detection and recognition is a key component that bears directly on the driving safety of intelligent vehicles. Driving scenes contain many object categories with large differences in size, so a convolutional network cannot fully extract object location information; this is the main cause of low detection accuracy in road scenes. To address this problem, an improved algorithm based on saliency information, Sa-YOLOV5s, is proposed. First, an improved semantic segmentation network (SaNet) is used to fully extract semantic information and obtain a saliency image. Then, the saliency image is fused with convolutional-layer features at different scales to enhance the discrimination between background and objects. Finally, DIoU-NMS is used to take the positions of all detection boxes fully into account, further reducing false and missed detections. Comparative experiments against the BshapeNet+ and DIDN algorithms verify that the proposed method achieves better detection performance on the Cityscapes dataset, with mean average precision higher by 0.024 and 0.072, respectively. In terms of real-time performance, the inference speed is 33 FPS, which meets the real-time detection standard of 24 FPS.

Key words: road scene object detection, saliency information, YOLOV5
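
The abstract names DIoU-NMS as the post-processing step but gives no details, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes boxes in (x1, y1, x2, y2) corner format and a suppression threshold of 0.5; the function names `diou` and `diou_nms` are hypothetical. DIoU subtracts from the ordinary IoU the squared distance between box centres, normalised by the squared diagonal of the smallest enclosing box, so two overlapping boxes whose centres are far apart are less likely to suppress each other.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalised squared centre distance."""
    # Intersection and union areas give the plain IoU.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(a[2], b[2]) - min(a[0], b[0])
    enc_h = max(a[3], b[3]) - min(a[1], b[1])
    c2 = enc_w ** 2 + enc_h ** 2

    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    The threshold of 0.5 is an illustrative assumption, not a value
    taken from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress remaining boxes whose DIoU with the kept box is too high.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

For example, with boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]` and scores `[0.9, 0.8, 0.7]`, `diou_nms` returns `[0, 2]`: the second box nearly coincides with the first and is suppressed, while the distant third box is kept.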