Computer Engineering and Applications, 2017, Vol. 53, Issue 22: 8-15. DOI: 10.3778/j.issn.1002-8331.1708-0195


Road scene understanding based on deep convolutional neural network

WU Zongsheng (1, 2), FU Weiping (2), HAN Gaining (1)

  1. School of Computer, Xianyang Normal University, Xianyang, Shaanxi 712000, China
  2. School of Mechanical and Precision Instrumental Engineering, Xi’an University of Technology, Xi’an 710048, China
  • Online: 2017-11-15 Published: 2017-11-29




Abstract: Road scene understanding is an important and challenging environment perception task in self-driving technology. This paper presents a deep Road Scene Segmentation Network (RSSNet), a 32-layer fully convolutional network composed of a convolutional encoder network and a deconvolutional decoder network. Batch normalization layers in RSSNet prevent the vanishing gradient problem that tends to arise when training deep networks. The activation layers use the Maxout function, which further alleviates vanishing gradients and keeps the network from falling into saturation or suffering from dead neurons. Dropout is applied where appropriate to prevent over-fitting. The encoder network stores the max-pooling indices of its feature maps, and the decoder network uses them to upsample the feature maps, which preserves important edge information. Experimental results show that RSSNet greatly improves training efficiency and segmentation accuracy, effectively classifies each pixel in road scene images, segments objects smoothly, and provides useful road environment information for driverless cars.

Key words: deep learning, convolutional neural network, scene understanding, semantic segmentation
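
Two mechanisms named in the abstract, upsampling with stored max-pooling indices and the Maxout activation, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names are hypothetical, the pooling windows are taken to be non-overlapping 2 x 2, and plain Python lists stand in for the tensors a deep learning framework would use. This is not the authors' code.

```python
from typing import List, Tuple

Matrix = List[List[float]]
IndexMap = List[List[Tuple[int, int]]]


def max_pool_with_indices(x: Matrix, k: int = 2) -> Tuple[Matrix, IndexMap]:
    """Non-overlapping k x k max pooling that also records where each maximum was found."""
    rows, cols = len(x) // k, len(x[0]) // k
    pooled = [[0.0] * cols for _ in range(rows)]
    indices = [[(0, 0)] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Collect (value, position) pairs for the current k x k window.
            window = [(x[i * k + di][j * k + dj], (i * k + di, j * k + dj))
                      for di in range(k) for dj in range(k)]
            value, pos = max(window, key=lambda item: item[0])
            pooled[i][j] = value
            indices[i][j] = pos
    return pooled, indices


def max_unpool(pooled: Matrix, indices: IndexMap, out_shape: Tuple[int, int]) -> Matrix:
    """Put each pooled value back at its recorded position; every other cell stays zero."""
    out = [[0.0] * out_shape[1] for _ in range(out_shape[0])]
    for i, row in enumerate(pooled):
        for j, value in enumerate(row):
            r, c = indices[i][j]
            out[r][c] = value
    return out


def maxout(pieces: List[float]) -> float:
    """Maxout unit: the maximum over several linear pre-activations of the same input."""
    return max(pieces)


# Illustrative 4 x 4 feature map (made-up values, not data from the paper).
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 0.0, 3.0, 1.0],
]
pooled, indices = max_pool_with_indices(feature_map)   # encoder side
restored = max_unpool(pooled, indices, (4, 4))         # decoder side
```

In this sketch each maximum returns to the exact pixel it came from, which is the sense in which stored indices keep boundary locations sharp during upsampling. A real decoder would follow the sparse `restored` map with trainable convolutions to fill in the zeros.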
