Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (6): 207-213. DOI: 10.3778/j.issn.1002-8331.2210-0089

• Graphics and Image Processing •

Hand Pose Estimation Based on Multi-Feature Enhancement

FENG Xinxin, GAO Shu   

  1. School of Computer Science & Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China
  • Online: 2024-03-15 Published: 2024-03-15

基于多特征增强的手部姿态估计方法

奉鑫鑫,高曙   

  1. 武汉理工大学 计算机与人工智能学院,武汉 430070

Abstract: Hand pose estimation is an important research direction in computer vision and plays a key role in applications such as human-computer interaction, virtual reality, and robot control. Existing hand pose estimation methods tend to rely on a single form of feature representation. To address this, this paper proposes a method for constructing features from the connection relationships between hand keypoints, together with a keypoint feature aggregation and enhancement method based on the semantic relationships of hand motion, improving both the representation of hand features and the sharing of information among them. To handle occlusion in preprocessing steps such as hand detection and image segmentation, a hand contour feature extraction method is designed to improve preprocessing quality. Building on these multi-feature representation and enhancement methods, a deep neural network with a fully convolutional structure is constructed, which avoids the loss of spatial information caused by directly regressing 3D pose and thereby improves the accuracy of 3D hand pose estimation. On the DO, ED, and RHD datasets, the method achieves results competitive with state-of-the-art (SOTA) models, with an average AUC of 93.3%, indicating that it also generalizes well.

Key words: 3D hand pose estimation, feature enhancement, convolutional neural network, fully convolutional neural network
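
The abstract reports an average AUC but this page does not define it. In 3D hand pose estimation, AUC commonly denotes the area under the PCK (percentage of correct keypoints) curve over a range of error thresholds. The sketch below assumes that convention and a 20-50 mm threshold range, neither of which is confirmed here. The function names and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def pck_curve(pred, gt, thresholds):
    """Fraction of keypoints whose Euclidean error is within each threshold."""
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    return [sum(e <= t for e in errors) / len(errors) for t in thresholds]


def auc(thresholds, pck):
    """Trapezoidal area under the PCK curve, normalized to the range [0, 1]."""
    area = 0.0
    for i in range(1, len(thresholds)):
        width = thresholds[i] - thresholds[i - 1]
        area += 0.5 * width * (pck[i] + pck[i - 1])
    return area / (thresholds[-1] - thresholds[0])


# Toy data (hypothetical): three 3D keypoints, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
pred = [(0.0, 0.0, 25.0), (10.0, 0.0, 35.0), (0.0, 10.0, 60.0)]

# Assumed threshold range (20-50 mm); the paper's actual range is not stated here.
thresholds = [20.0, 30.0, 40.0, 50.0]

pck = pck_curve(pred, gt, thresholds)
score = auc(thresholds, pck)
```

Normalizing by the width of the threshold range keeps the score in [0, 1], so it can be read as a percentage in the same way as the figure quoted in the abstract.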
