Object Detection Method Based on Multi-scale Feature Fusion for Driving Scene

doi:10.3778/j.issn.1002-8331.2101-0430

Abstract

To address the low detection accuracy of convolutional neural network models for object detection in driving vision, a multi-scale feature fusion object detection method based on an improved RefineDet is proposed. First, the LFIP (Light-weight Featurized Image Pyramid) network is embedded in RefineDet, and the multi-scale feature maps generated by LFIP are fused with the main feature maps output by the ARM (Anchor Refinement Module). This improves the preliminary classification and regression of anchors in the feature layers and provides refined anchors to the ODM (Object Detection Module) for further regression and multi-class prediction. Second, a multi-branch RFB (Receptive Field Block) is embedded after the ODM to obtain receptive fields of different scales in the detection task and thereby improve the features extracted by the backbone network. Third, the activation function in the model is replaced by PReLU (Parametric Rectified Linear Unit), a nonlinear activation function with learnable parameters, to speed up convergence. Fourth, the bounding box regression loss of RefineDet is replaced by Repulsion Loss, which pulls each predicted box toward its designated target and pushes it away from neighboring ground-truth boxes and predicted boxes, improving detection accuracy under occlusion. Finally, an object detection dataset of 48 260 driving-vision images is constructed, with 38 608 images for training and 9 652 for testing (an 80/20 split), and the method is evaluated on a mainstream GPU hardware platform. The method achieves an mAP of 85.59%, which is better than RefineDet and other improved algorithms; its speed is 41.7 frames/s, which meets the application requirements of driving-scene object detection.
Experimental results show that, at the cost of a slight drop in detection speed, the proposed method improves object detection accuracy in driving vision and alleviates, to a certain extent, the problems of occluded object detection and small object detection.
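As a brief illustration of the activation function named in the abstract, the following is a minimal scalar sketch of PReLU, not the authors' implementation. PReLU passes positive inputs unchanged and scales negative inputs by a slope `a` that is learned during training. The function names and the initial slope of 0.25 are assumptions made for this example.

```python
def prelu(x: float, a: float) -> float:
    """PReLU: identity for positive inputs, learnable slope `a` otherwise."""
    return x if x > 0 else a * x


def prelu_grad_a(x: float) -> float:
    """Derivative of the PReLU output with respect to the slope `a`.

    Only negative (or zero) inputs contribute, which is how `a` gets updated
    by gradient descent.
    """
    return 0.0 if x > 0 else x


# Assumed initial slope for illustration only.
a = 0.25
outputs = [prelu(x, a) for x in (-2.0, 0.0, 3.0)]
```

With `a = 0` this reduces to the ordinary ReLU, and with a small fixed `a` it matches Leaky ReLU; the difference in PReLU is that `a` is trained together with the other weights.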

Key words: deep learning, convolutional neural network, object detection, RefineDet algorithm, Receptive Field Block (RFB), Light-weight Featurized Image Pyramid (LFIP), Parametric Rectified Linear Unit (PReLU), loss function, occluded object
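The repulsion idea described in the abstract can be sketched for a single predicted box and a single neighboring non-target ground-truth box. This is a simplified illustration under stated assumptions, not the loss used in the paper: it covers only a ground-truth repulsion term, measured by intersection over ground-truth area (IoG) and passed through a smoothed logarithmic penalty. The function names, the `(x1, y1, x2, y2)` box format, and the default `sigma = 0.5` are assumptions made for this example.

```python
import math

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def area(box: Box) -> float:
    """Area of an axis-aligned box."""
    return (box[2] - box[0]) * (box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    """Overlap area of two boxes; zero when they do not intersect."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def iog(pred: Box, gt: Box) -> float:
    """Intersection over ground-truth area, in [0, 1]."""
    return intersection_area(pred, gt) / area(gt)


def smooth_ln(x: float, sigma: float = 0.5) -> float:
    """Smoothed -ln(1 - x) penalty; linear above `sigma` (requires sigma < 1)."""
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)


def repulsion_from_gt(pred: Box, non_target_gt: Box, sigma: float = 0.5) -> float:
    """Penalty that grows as `pred` overlaps a ground-truth box it is not assigned to."""
    return smooth_ln(iog(pred, non_target_gt), sigma)
```

A predicted box that does not touch the neighboring object incurs no penalty, while one that drifts onto it is penalized in proportion to the overlap. In a full loss this term would be combined with an attraction term toward the assigned target and a repulsion term between predicted boxes.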


HUANG Tongyu, HU Binjie, ZHU Tingting, HUANG Zhewen. Object Detection Method Based on Multi-scale Feature Fusion for Driving Scene[J]. Computer Engineering and Applications, 2021, 57(14): 134-141.

黄仝宇，胡斌杰，朱婷婷，黄哲文. 面向驾驶场景的多尺度特征融合目标检测方法[J]. 计算机工程与应用, 2021, 57(14): 134-141.

