Computer Engineering and Applications, 2020, Vol. 56, Issue 18: 157-164. DOI: 10.3778/j.issn.1002-8331.1911-0392

Improved DeepLabv2 Real-time Image Semantic Segmentation Algorithm

MA Shuhao, AN Jubai, YU Bo   

  1. College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online: 2020-09-15  Published: 2020-09-10

改进DeepLabv2的实时图像语义分割算法

马书浩,安居白,于博   

  1. 大连海事大学 信息科学技术学院,辽宁 大连 116026

Abstract:

Image semantic segmentation is an important component of computer vision perception systems. To address the slow segmentation speed of existing semantic segmentation algorithms, an improved real-time image semantic segmentation algorithm based on DeepLabv2 is proposed. Compared with DeepLabv2, the algorithm uses the lightweight convolutional neural network Xception as the encoder, adds a decoding stage based on the Feature Pyramid Network (FPN), and reduces the number of parameters in Atrous Spatial Pyramid Pooling (ASPP). Together these changes greatly compress the model and improve segmentation speed. In addition, the paper addresses the difficulty of selecting hyper-parameters for the Focal Loss function in multi-class tasks and applies the resulting selection method to improve segmentation accuracy. Experimental results on Cityscapes and Pascal VOC2012 show that the proposed algorithm achieves real-time segmentation speed with high segmentation accuracy, and that the proposed hyper-parameter selection method further improves segmentation accuracy.

Key words: semantic segmentation, Convolutional Neural Network (CNN), image segmentation, unmanned driving
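
For readers unfamiliar with the Focal Loss mentioned in the abstract, it is commonly written as FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class, gamma down-weights easy examples, and alpha_t is a per-class weight. In a multi-class setting there is one alpha per class plus gamma to choose, which is presumably the selection difficulty the abstract refers to. The sketch below shows only this standard formulation for a single pixel; it is not the hyper-parameter selection method proposed in the paper, which the abstract does not describe. The function and parameter names are illustrative assumptions.

```python
import math


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard multi-class focal loss for one pixel (illustrative sketch).

    logits: list of raw class scores for the pixel
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 reduces to cross-entropy
    alpha:  optional list of per-class weights (assumed, one per class)
    """
    # Numerically stable log-softmax for the target class.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_p_t = logits[target] - log_sum_exp
    p_t = math.exp(log_p_t)

    weight = 1.0 if alpha is None else alpha[target]
    # Well-classified pixels (p_t close to 1) contribute very little.
    return -weight * (1.0 - p_t) ** gamma * log_p_t


if __name__ == "__main__":
    # Same pixel, correct vs. incorrect confident prediction.
    print(focal_loss([5.0, 0.0, 0.0], target=0))
    print(focal_loss([0.0, 5.0, 0.0], target=0))
```

With gamma set to 0 and no alpha, the function reduces to ordinary cross-entropy, which makes it easy to compare the two losses on the same predictions.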
