Computer Engineering and Applications, 2021, Vol. 57, Issue (2): 143-149. DOI: 10.3778/j.issn.1002-8331.2007-0276

• Pattern Recognition and Artificial Intelligence •

Research on Improved Lightweight High Resolution Human Keypoint Detection

LIU Pengkun (刘鹏坤), ZHU Chengjie (朱成杰), ZHANG Yue (张越)

  1. College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, Anhui 232000, China
  • Online: 2021-01-15 Published: 2021-01-14

Abstract:

Human keypoint detection is a pixel-level detection task, and deep learning methods usually regress keypoints from high-resolution feature-map representations to improve detection quality. Because this approach maintains high-resolution representations throughout the network, as in the High-Resolution Network (HRNet), it requires a large number of parameters and a high computational cost. To address this problem, two lightweight basic network modules, the Gattneck module and the Gattblock module, are proposed, and a lightweight human keypoint detection network, the Ghost-attention Network (GattNet), is built with HRNet as its base framework. GattNet lightens HRNet by using linear transformations to generate redundant feature maps and a channel attention mechanism to redistribute channel weights, which reduces the number of network parameters by 41.5% and the computational complexity by 36.7%. Validation on the MS COCO (Microsoft Common Objects in Context) 2017 dataset shows that GattNet effectively reduces the number of parameters and the computational cost while preserving accuracy.

Key words: deep learning, convolutional network, high-resolution representation, human keypoint detection, attention mechanism
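
The two ideas named in the abstract, generating redundant feature maps with cheap linear transformations and reweighting channels with attention, can be illustrated with a small pure-Python sketch. It is not the paper's Gattneck or Gattblock design, which the abstract does not specify. It assumes a Ghost-style layer (a primary convolution plus per-channel cheap transformations) and a squeeze-and-excitation style attention step. The function names, the `ratio` and `cheap_k` parameters, and the layer sizes are hypothetical choices made for illustration.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weight count of a Ghost-style replacement for the same layer.

    Assumption: a primary convolution produces c_out / ratio "intrinsic"
    maps, and each intrinsic map yields (ratio - 1) further maps through a
    cheap per-channel (depthwise) cheap_k x cheap_k linear transformation.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


def channel_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation style reweighting of a list of 2-D maps.

    feature_maps: one 2-D list per channel.
    w1: hidden x channels weight matrix; w2: channels x hidden weight matrix.
    Returns the rescaled maps and the per-channel attention weights.
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [[[v * a for v in row] for row in fm]
              for fm, a in zip(feature_maps, weights)]
    return scaled, weights


if __name__ == "__main__":
    standard = conv_params(64, 64, 3)
    ghost = ghost_params(64, 64, 3)
    print(f"standard: {standard}, ghost-style: {ghost}, "
          f"saving: {1 - ghost / standard:.1%}")
```

The saving printed for this single toy layer is not comparable to the 41.5% reported in the abstract, which refers to the whole network.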