Computer Engineering and Applications, 2024, Vol. 60, Issue (20): 254-261. DOI: 10.3778/j.issn.1002-8331.2307-0139

• Graphics and Image Processing •

Joint Keypoint Data Augmentation and Structural Prior for Occluded Human Pose Estimation

HAN Gangtao, WANG Hao, WANG Song, CHEN Enqing   

  1. School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  • Online: 2024-10-15  Published: 2024-10-15

Abstract: Human pose estimation has important applications in many fields. Existing studies focus on precisely localizing human keypoints in occlusion-free settings, while ignoring the occlusion that commonly arises during human image acquisition. To address this problem, a human pose estimation method based on keypoint data augmentation is proposed. Specifically, the data augmentation strategy generates occlusion regions of a specific number and size, centered on the visible human keypoints in the training images, to simulate scenes in which human keypoints are occluded and to improve the robustness of the network's keypoint predictions under occlusion. To improve the model's perception of the correlation between occluded keypoints and their adjacent keypoints, a loss function based on prior knowledge of human body structure is further designed: it builds connections between adjacent keypoints according to the real structure of the human body and constrains the range of predicted keypoint coordinates, thereby improving the coordinate accuracy of occluded keypoints. Results on the OCHuman test set and the COCO validation set show that, compared with the baseline network, the proposed method improves human pose estimation performance in occluded scenes without increasing the number of network parameters.
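The abstract only outlines the two components; the Python fragment below is a minimal sketch of one plausible realization. The function names occlude_keypoints and structural_prior_loss, the patch count and size, and the COCO-style skeleton list are assumptions made for illustration and are not taken from the paper.

import numpy as np
import torch

# COCO-style list of adjacent keypoint pairs (an assumption for this sketch;
# the paper derives the connections from the real human body structure).
SKELETON = [(5, 7), (7, 9), (6, 8), (8, 10), (5, 6), (11, 13), (13, 15),
            (12, 14), (14, 16), (11, 12), (5, 11), (6, 12)]

def occlude_keypoints(image, keypoints, visibility,
                      num_patches=2, patch_size=20, rng=None):
    """Keypoint-centered occlusion augmentation (illustrative only).

    Picks up to num_patches visible keypoints at random and pastes a
    square noise patch centered on each, simulating keypoint-level
    occlusion in an H x W x 3 uint8 training image.
    """
    if rng is None:
        rng = np.random.default_rng()
    img = image.copy()
    h, w = img.shape[:2]
    visible = np.flatnonzero(visibility > 0)
    if visible.size == 0:
        return img
    chosen = rng.choice(visible, size=min(num_patches, visible.size),
                        replace=False)
    half = patch_size // 2
    for idx in chosen:
        x, y = keypoints[idx].astype(int)
        x0, x1 = max(0, x - half), min(w, x + half)
        y0, y1 = max(0, y - half), min(h, y + half)
        if x0 >= x1 or y0 >= y1:  # keypoint lies outside the image
            continue
        # Fill the region around the keypoint with random noise.
        img[y0:y1, x0:x1] = rng.integers(
            0, 256, size=(y1 - y0, x1 - x0, img.shape[2]), dtype=img.dtype)
    return img

def structural_prior_loss(pred, target, skeleton=SKELETON):
    """Structure-prior loss (illustrative only).

    Compares the vector between each pair of adjacent keypoints in the
    prediction with the corresponding ground-truth vector, which constrains
    the coordinate range of occluded keypoints relative to their neighbors.
    pred and target have shape (N, K, 2).
    """
    i, j = zip(*skeleton)
    pred_bones = pred[:, list(i)] - pred[:, list(j)]
    gt_bones = target[:, list(i)] - target[:, list(j)]
    return torch.mean(torch.norm(pred_bones - gt_bones, dim=-1))

In practice such a structure term would be added to the standard heatmap or coordinate regression loss with a weighting factor, so that the augmentation and the prior act together without changing the network architecture or parameter count.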

Key words: human pose estimation, keypoint-level occlusion, data augmentation, human structure loss