Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (9): 140-149. DOI: 10.3778/j.issn.1002-8331.2112-0555

• Pattern Recognition and Artificial Intelligence •

Single-Stage Object Detection with Fusion of Point Cloud and Image Feature

CAI Zhengyi, ZHAO Jieyu, ZHU Feng   

  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang 315211, China
  • Online: 2023-05-01 Published: 2023-05-01

融合图像特征的单阶段点云目标检测

蔡正奕,赵杰煜,朱峰   

  1. 宁波大学 信息科学与工程学院,浙江 宁波 315211

Abstract: Supplementing the geometric and texture information of point clouds with image information allows 3D objects to be detected and classified effectively. To better integrate image features into point clouds, an end-to-end deep neural network is designed, and a novel fusion module named PI-Fusion (point cloud and image fusion) is proposed, which uses image features to enhance the semantic information of the point cloud in a point-by-point manner. In addition, during point cloud downsampling, a fusion-sampling strategy that combines distance-based farthest point sampling and feature-based farthest point sampling is adopted so that more points are sampled on small objects. After three downsampling stages over the fused image and point cloud features, a candidate point generation layer shifts the points toward the centers of the target objects. Finally, a single-stage detection head predicts the classification confidence and the regression boxes. Experiments on the public KITTI dataset show that, compared with 3DSSD, the proposed method improves detection by 3.37, 1.92, and 1.58 percentage points at the easy, moderate, and hard difficulty levels, respectively.
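
The fusion-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names, the equal split between the two samplers, and the use of plain Euclidean distance in feature space are all assumptions made for illustration. It shows greedy farthest point sampling applied once to coordinates (distance-based) and once to feature vectors (feature-based), with the two index lists concatenated.

```python
import math


def farthest_point_sampling(vectors, k, start=0):
    """Greedy FPS: repeatedly pick the item farthest from those already picked."""
    selected = [start]
    # min_dist[i] is the distance from item i to its nearest selected item.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    for _ in range(k - 1):
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, math.dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected


def fusion_sampling(xyz, feats, k):
    """Hypothetical fusion sampling (assumed equal split between samplers).

    Half of the k indices come from FPS on coordinates and the rest from
    FPS on feature vectors. The two halves are sampled independently,
    so an index may appear in both.
    """
    d_idx = farthest_point_sampling(xyz, k // 2)       # distance-based FPS
    f_idx = farthest_point_sampling(feats, k - k // 2)  # feature-based FPS
    return d_idx + f_idx


if __name__ == "__main__":
    xyz = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
    feats = [(0.0,), (5.0,), (1.0,), (0.5,)]
    print(fusion_sampling(xyz, feats, 4))
```

In this toy input, point 1 is close to its neighbours in space but distinct in feature space, so only the feature-based sampler selects it. This is the intuition behind combining the two samplers to retain more points on small objects.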

Key words: image recognition, point cloud object detection, multimodal feature fusion

摘要: 使用图像信息补充三维点云的几何和纹理信息,可以对三维物体进行有效地检测与分类。为了能够更好地将图像特征融入点云,设计了一个端到端的深度神经网络,提出了一个新颖的融合模块PI-Fusion(point cloud and image fusion),使用图像特征以逐点融合的方式来增强点云的语义信息。另外,在点云下采样的过程中,使用距离最远点采样和特征最远点采样的融合采样方式,以在小目标上采样到更多的点。经过融合图像和点云特征的三次下采样之后,通过一个候选点生成层将点移动到目标物体的中心。最后,通过一个单阶段目标检测头,得出分类置信度和回归框。在公开数据集KITTI的实验表明,与3DSSD相比,此方法在简单、中等、困难难度的检测上分别提升了3.37、1.92、1.58个百分点。

关键词: 图像识别, 点云目标检测, 多模态特征融合