Computer Engineering and Applications, 2022, Vol. 58, Issue 19: 142-151. DOI: 10.3778/j.issn.1002-8331.2102-0037

• Pattern Recognition and Artificial Intelligence •

基于注意力机制的点云神经网络架构搜索方法

谭台哲,黄永耀,杨卓,刘洋   

  1. 广东工业大学 计算机学院,广州 510006
  • Publication date: 2022-10-01 Release date: 2022-10-01

Attention Mechanism Based Point Cloud Neural Architecture Search Method

TAN Taizhe, HUANG Yongyao, YANG Zhuo, LIU Yang   

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  • Online:2022-10-01 Published:2022-10-01

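Abstract (translated from the Chinese): To reduce the heavy manual effort needed to design neural networks for point cloud data, a neural architecture search method based on an attention mechanism and point convolution is proposed. To fuse information from point clouds at different scales, an attention-based multi-scale fusion module is proposed. To improve point cloud processing efficiency, a feature extraction module based on point convolution is designed as a candidate operation and combined with the multi-scale fusion module to form a search unit. The neural network formed by stacking multiple search units serves as the search space, and a differentiable neural architecture search algorithm is used to find the optimal network. Experiments on the public point cloud dataset ModelNet show that the resulting network achieves leading accuracy with fewer learnable parameters, and that the method greatly reduces manual effort. Ablation experiments on the same dataset show that adding the proposed attention-based multi-scale fusion module to the baseline model improves accuracy by 1.1 percentage points.

Keywords (translated from the Chinese): point convolution, neural architecture search (NAS), 3D point cloud, multi-scale fusion, attention mechanism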

Abstract: To address the problem that designing a neural network for point cloud data requires a lot of manual intervention, a neural architecture search method based on an attention mechanism and point convolution is proposed. Firstly, to fuse information from point clouds of different scales, a multi-scale fusion module based on the attention mechanism is proposed. Secondly, to improve point cloud processing efficiency, a feature extraction module based on point convolution is designed as a candidate operation and is combined with the multi-scale fusion module to form a search unit. The neural network built by stacking multiple search units is used as the search space, and a differentiable neural architecture search algorithm is used to search for the optimal neural network. The experimental results on the public point cloud dataset ModelNet show that the neural network obtained by this method achieves competitive accuracy with fewer learnable parameters, and that the method greatly reduces the workload of manual intervention. The ablation results on this dataset show that adding the proposed attention-based multi-scale fusion module to the baseline model improves the accuracy by 1.1 percentage points.

Keywords: point convolution, neural architecture search (NAS), 3D point cloud, multi-scale fusion, attention mechanism
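
The abstract names two mechanisms without giving their details: attention-weighted fusion of features from different scales, and a differentiable search over candidate operations. The following is only a minimal sketch of the generic ideas, not the authors' implementation. It assumes a DARTS-style continuous relaxation (a softmax over architecture parameters) and a plain softmax attention over per-scale scores. All names here (`attention_fuse`, `mixed_op`, `derive_op`, `CANDIDATES`) are hypothetical, and the toy candidate functions merely stand in for the point-convolution feature extractors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(scale_features, scale_scores):
    """Fuse per-scale feature vectors using softmax attention weights.

    scale_features: equal-length feature vectors, one per scale.
    scale_scores: one raw score per scale (assumption: a real network
    would predict these from the features; here they are given).
    """
    weights = softmax(scale_scores)
    dim = len(scale_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, scale_features))
        for i in range(dim)
    ]


def mixed_op(x, candidate_ops, alphas):
    """Continuous relaxation: softmax(alphas)-weighted sum of all candidates."""
    weights = softmax(alphas)
    outputs = [op(x) for op in candidate_ops]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def derive_op(candidate_names, alphas):
    """After search, keep the candidate with the largest architecture weight."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return candidate_names[best]


# Toy candidates standing in for point-convolution feature extractors.
CANDIDATES = {
    "identity": lambda x: list(x),
    "double": lambda x: [2.0 * v for v in x],
    "zero": lambda x: [0.0 for _ in x],
}
```

During search, every candidate contributes to the output in proportion to its softmax weight, so the architecture parameters can be trained by gradient descent alongside the network weights. Afterwards, `derive_op(list(CANDIDATES), alphas)` picks the single operation to keep in the final network.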