Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (18): 132-141. DOI: 10.3778/j.issn.1002-8331.2406-0146

• Pattern Recognition and Artificial Intelligence •

Point Cloud 3D Object Detection Method with Global Shape Relation Constraints

XIAN Shiyang, LI Zongmin, GONG Xuchao, XU Chang, ZHANG Peng, WANG Wenchao, BAI Yun, RONG Guangcai

  1. Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  2. Sinopec Shengli Petroleum Administration Co., Ltd., Dongying, Shandong 257000, China
  • Online: 2025-09-15 Published: 2025-09-15

Abstract: Voting-based methods have shown great potential in indoor 3D object detection, where the quality of the votes directly determines the quality of the detection results. However, seed points located where objects overlap in space are prone to erroneous voting: their votes are mapped near the center of the wrong target object. Since these seed points are usually continuous on the geometric surface, shape relations are introduced to mitigate this problem. Specifically, a shape relation extraction module is proposed that constructs a 2D manifold, represents shape relations by the Euclidean distance on that manifold, and then applies the shape relations to the point cloud as a constraint through matrix multiplication. To obtain geometric surface continuity information, a binary tree Transformer module is designed. The point cloud constrained by shape relations is then passed through an optimized Transformer network that captures global context, thereby learning the surface structure of objects. Comparative experiments on the ScanNet and SUN RGB-D datasets show that the proposed algorithm achieves mAP@0.25 of 65.1% and 62.7%, respectively, which is 6.5 and 5 percentage points higher than the baseline method and 0.6 and 1.1 percentage points higher than the current state-of-the-art methods.

Key words: 3D object detection, point cloud, manifold learning, Transformer, shape relation
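
The abstract's core operation, representing shape relations by Euclidean distance on a 2D manifold and applying them to the point cloud through matrix multiplication, can be illustrated with a rough sketch. The abstract does not specify how the manifold is constructed or how distances become relation weights, so everything concrete below is an assumption made for illustration: a PCA projection stands in for the 2D manifold, a Gaussian kernel with a hypothetical `sigma` turns distances into weights, and the function names `embed_2d`, `shape_relation`, and `constrain_features` are invented here. The binary tree Transformer module is not reproduced.

```python
import numpy as np


def embed_2d(points: np.ndarray) -> np.ndarray:
    """Stand-in 2D embedding (assumption): project the centered points
    onto their two leading principal axes."""
    centered = points - points.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def shape_relation(embedding: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Relation matrix from pairwise Euclidean distances in the embedding.

    The Gaussian kernel and the row normalization are assumptions,
    not details taken from the paper.
    """
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (N, N) pairwise distances
    affinity = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    return affinity / affinity.sum(axis=1, keepdims=True)


def constrain_features(features: np.ndarray, relation: np.ndarray) -> np.ndarray:
    """Apply the relation to per-point features by matrix multiplication."""
    return relation @ features


# Toy data: 8 random points with 4-dimensional features (illustrative only).
rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))
features = rng.normal(size=(8, 4))

relation = shape_relation(embed_2d(points))
constrained = constrain_features(features, relation)
```

In this sketch each row of `relation` sums to one, so every constrained feature is a weighted average that favors points lying close together in the 2D embedding. That is one plausible way for seed points that are continuous on a surface to reinforce one another before voting.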