Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (24): 196-204. DOI: 10.3778/j.issn.1002-8331.2208-0209

• Pattern Recognition and Artificial Intelligence •

Multi-Scale Lightweight Human Pose Estimation Incorporating Dense Connections

高坤,李汪根,束阳,王志格,葛英奎   

  1. School of Computer and Information, Anhui Normal University, Wuhu, Anhui 241002, China
  • Publication date: 2022-12-15 Release date: 2022-12-15

Multi-Scale Lightweight Human Pose Estimation with Dense Connections

GAO Kun, LI Wanggen, SHU Yang, WANG Zhige, GE Yingkui   

  1. School of Computer & Information, Anhui Normal University, Wuhu, Anhui 241002, China
  • Online:2022-12-15 Published:2022-12-15

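Abstract: To improve network performance while keeping the model lightweight, a multi-scale lightweight human pose estimation network with dense connections (LDMNet) is proposed, built on the lightweight simple baseline (LPN). The downsampling bottleneck structure is redesigned, and dense connections are combined with depthwise separable convolution and multi-scale feature extraction to form a lightweight, efficient feature extraction structure. The atrous spatial pyramid pooling module is also improved so that multiple features are extracted again. Experiments on the MPII and COCO datasets show that, compared with the baseline LPN, LDMNet raises the average accuracy by 1.9 percentage points on the MPII validation set and by 3.2 percentage points on the COCO validation set, at the cost of a small increase in parameter count and computation. Compared with the recent lightweight network LiteHRNet, the average accuracy on the COCO and MPII validation sets improves by 2.9 and 1.5 percentage points, respectively. The network thus improves recognition accuracy while remaining lightweight.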

Key words: human pose estimation, densely connected network, multi-scale feature extraction, lightweight network

Abstract: To improve network performance while keeping the model lightweight, a multi-scale lightweight human pose estimation network with dense connections (LDMNet) is proposed on the basis of the lightweight simple baseline (LPN). The downsampling bottleneck structure is redesigned, and dense connections are combined with depthwise separable convolution and multi-scale feature extraction to construct a lightweight and efficient feature extraction structure. The atrous spatial pyramid pooling module is also improved to re-extract multiple features. Experiments on the MPII and COCO datasets show that, compared with the baseline method LPN, LDMNet increases the average accuracy by 1.9 percentage points on the MPII validation set and by 3.2 percentage points on the COCO validation set, with only a slight increase in parameters and computation. In addition, compared with the recent lightweight network LiteHRNet, the average accuracy on the COCO and MPII validation sets is improved by 2.9 and 1.5 percentage points, respectively, indicating that the network effectively improves recognition accuracy while remaining lightweight.

Key words: human pose estimation, dense connection network, multi-scale feature extraction, lightweight network
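
The abstract combines dense connections with depthwise separable convolution to keep the feature extractor lightweight. The following is a minimal sketch of why that pairing saves parameters. It is not the LDMNet architecture: the function names and the block settings (64 input channels, growth rate 32, 4 layers, 3 x 3 kernels) are hypothetical values chosen only for illustration, and biases and normalization layers are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def dense_block_params(c_in, growth, layers, k, separable):
    """Total weights in a dense block and its final channel count.

    Each layer takes the concatenation of the block input and all
    earlier layer outputs, and emits `growth` new channels.
    """
    layer_params = depthwise_separable_params if separable else conv_params
    total = 0
    channels = c_in
    for _ in range(layers):
        total += layer_params(channels, growth, k)
        channels += growth  # dense connection: concatenate the new features
    return total, channels


if __name__ == "__main__":
    # Hypothetical block settings, not taken from the paper.
    standard, out_channels = dense_block_params(64, 32, 4, 3, separable=False)
    separable, _ = dense_block_params(64, 32, 4, 3, separable=True)
    print("output channels:", out_channels)
    print("standard conv weights:", standard)
    print("depthwise separable weights:", separable)
    print("reduction factor:", round(standard / separable, 2))
```

Because the input width of every layer grows with each concatenation, the cost of a dense block built from standard convolutions rises quickly. Under these assumed settings, replacing each layer with a depthwise separable one cuts the weight count by roughly a factor of seven.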