Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (9): 148-153.DOI: 10.3778/j.issn.1002-8331.2001-0367


Improved YOLO V2 6D Object Pose Estimation Algorithm

BAO Zhiqiang, XING Yu, LYU Shaoqing, HUANG Qiongdan   

  1. School of Communication and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
  • Online: 2021-05-01  Published: 2021-04-29



To address 3D pose estimation of a target, a 6D object pose estimation algorithm based on an improved YOLO V2 is proposed, built on a deep-learning object detection model. A convolutional neural network extracts the feature information of the object from a single RGB image. On top of the 2D detection, the target's position information is mapped into three-dimensional space, and the point-to-point mapping relationship is used to match and compute the target's degrees of freedom in 3D, from which the 6D pose is estimated. The algorithm detects the target in an RGB image and predicts its 6D pose at the same time, without requiring additional post-processing. Experimental results show that the proposed algorithm outperforms other recently proposed CNN-based methods on the LineMod and Occlusion LineMod datasets. It runs at 37 frames per second on a Titan X GPU and is suitable for real-time processing.
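The point-to-point mapping the abstract describes rests on projecting known 3D object points (e.g. the corners of the object's 3D bounding box) into the image under a candidate pose; matching those projections against the network's 2D predictions recovers the 6D pose. The following is a minimal NumPy sketch of that projection step only, not the paper's implementation; the intrinsic matrix `K`, the unit-cube model, and the pose `(R, t)` are illustrative assumptions.

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy chosen for illustration only).
K = np.array([[572.4,   0.0, 325.3],
              [  0.0, 573.6, 242.0],
              [  0.0,   0.0,   1.0]])

def project_points(pts_3d, R, t, K):
    """Project Nx3 object-frame points into the image under pose (R, t)."""
    cam = pts_3d @ R.T + t          # rigid transform into the camera frame
    uv = cam @ K.T                  # apply pinhole intrinsics
    return uv[:, :2] / uv[:, 2:3]   # perspective divide -> Nx2 pixel coords

# Toy 3D model: the 8 corners of a unit cube centred at the object origin.
corners = np.array([[x, y, z] for x in (-0.5, 0.5)
                              for y in (-0.5, 0.5)
                              for z in (-0.5, 0.5)])

R = np.eye(3)                  # identity rotation (object facing the camera)
t = np.array([0.0, 0.0, 4.0])  # object placed 4 units in front of the camera

uv = project_points(corners, R, t, K)
print(uv.shape)  # (8, 2): one 2D keypoint per 3D corner
```

In a full pipeline these projected corners would be compared with the network's predicted 2D keypoints, and the pose solved from the resulting 2D-3D correspondences (e.g. with a PnP solver such as OpenCV's `cv2.solvePnP`).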

Key words: pose estimation, object detection, convolutional neural network, feature extraction


