Single-Stage Object Detection with Fusion of Point Cloud and Image Feature

doi:10.3778/j.issn.1002-8331.2112-0555

Abstract

Supplementing the geometric and texture information of a point cloud with image information allows 3D objects to be detected and classified effectively. To better integrate image features into point clouds, an end-to-end deep neural network is designed. A novel fusion module, PI-Fusion (point cloud and image fusion), is proposed to enhance the semantic information of the point cloud with image features in a point-by-point manner. In addition, during point cloud downsampling, a fusion sampling strategy that combines distance farthest point sampling and feature farthest point sampling is adopted so that more points are sampled on small objects. After three rounds of downsampling on the fused image and point cloud features, a candidate point generation layer shifts the points toward the centers of the target objects. Finally, a single-stage object detection head predicts the classification confidence and the regression boxes. Experimental results on the KITTI dataset show that, compared with 3DSSD, the proposed method improves by 3.37, 1.92 and 1.58 percentage points at the easy, moderate and hard difficulty levels, respectively.

Key words: image recognition, point cloud object detection, multimodal feature fusion
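
The fusion sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the even split between the two samplers, and the use of plain Euclidean distance in feature space are assumptions made for illustration only. Distance farthest point sampling picks points that are far apart in 3D space, while feature farthest point sampling picks points that are far apart in feature space, which tends to retain points on small objects whose features differ from the background.

```python
import math


def farthest_point_sampling(vectors, k, start=0):
    """Greedy farthest point sampling: return k row indices that are far apart."""
    n = len(vectors)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(vectors)")
    chosen = [start]
    # min_dist[i] is the distance from row i to its nearest chosen row.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    min_dist[start] = -1.0  # mark as chosen so it is never picked again
    while len(chosen) < k:
        nxt = max(range(n), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist[nxt] = -1.0
        for i, v in enumerate(vectors):
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:  # never true for already-chosen rows
                min_dist[i] = d
    return chosen


def fusion_sampling(xyz, features, k):
    """Sample k points: k // 2 by feature distance, the rest by 3D distance.

    The even split is an assumption for this sketch, not a value from the paper.
    """
    k_feat = k // 2
    k_dist = k - k_feat
    d_idx = farthest_point_sampling(xyz, k_dist)
    f_idx = farthest_point_sampling(features, k_feat) if k_feat else []
    return d_idx, f_idx


# Toy input: four points on a line; point 1 has an unusual feature value.
xyz = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (9, 0, 0)]
features = [(0.0,), (5.0,), (0.1,), (0.2,)]
d_idx, f_idx = fusion_sampling(xyz, features, 4)
```

In this toy input the distance-based sampler keeps the two spatially extreme points, while the feature-based sampler keeps the point whose feature stands out even though it lies close to the start point in space. The two index lists may overlap, since each sampler runs independently.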


CAI Zhengyi, ZHAO Jieyu, ZHU Feng. Single-Stage Object Detection with Fusion of Point Cloud and Image Feature[J]. Computer Engineering and Applications, 2023, 59(9): 140-149.

