Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (19): 177-183. DOI: 10.3778/j.issn.1002-8331.2206-0433

• Graphics and Image Processing •

Lightweight Semantic Segmentation of Road Scenes for Autonomous Driving

LI Shunxin (李顺新), WU Tong (吴桐)

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China
  2. Big Data Science and Engineering Research Institute, Wuhan University of Science and Technology, Wuhan 430065, China
  3. Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan 430065, China
  • Online: 2023-10-01 Published: 2023-10-01

Abstract: Existing semantic segmentation algorithms for road scenes are computationally expensive and cannot meet the real-time requirements of autonomous driving. Building on the overall structure of DeepLabV3+, this paper proposes a lightweight image semantic segmentation model with parallel feature processing that balances accuracy and real-time performance. First, MobileNetV2 is adopted as the backbone network and the upsampling process is simplified, which increases segmentation speed and reduces the number of network parameters, making the network easier to transfer and train. Second, a dual-attention mechanism is introduced and combined with the atrous spatial pyramid pooling (ASPP) module to form a parallel feature processing structure that improves segmentation accuracy. Finally, MobileNetV2 is combined with this parallel feature processing structure to extract image features. Experimental results show that, compared with traditional models, the proposed model achieves efficient and accurate image segmentation with low system overhead and few network parameters. The model reaches 73.61% mIoU on the Cityscapes dataset and processes a 512×512 image in only 25 ms.
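For readers unfamiliar with the two figures quoted in the abstract, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix and how a per-image latency converts to throughput. It is not the authors' evaluation code. The function names, the flattened label lists, and the `ignore_index` value of 255 (a common convention for unlabeled pixels in Cityscapes training labels) are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that appear at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def throughput_fps(latency_ms: float) -> float:
    """Images per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


# Toy example with three classes; the last pixel is unlabeled and ignored.
labels = [0, 0, 1, 1, 2, 255]
preds = [0, 1, 1, 1, 2, 2]
toy_miou = mean_iou(confusion_matrix(labels, preds, num_classes=3))
fps_at_25_ms = throughput_fps(25.0)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `toy_miou` is 13/18. By the same arithmetic, the reported 25 ms per 512×512 image corresponds to roughly 40 images per second, assuming images are processed one at a time.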

Key words: semantic segmentation, autonomous driving, lightweight, MobileNetV2, attention mechanism