Research on Multi-resolution Human Pose Estimation with Attention Mechanism

doi:10.3778/j.issn.1002-8331.2010-0317

Abstract

Directly fusing multi-resolution feature representations in human pose estimation fails to make effective use of the spatial information in the feature maps. To address this problem, a multi-resolution human pose estimation network, GCT-Nonlocal Net (GNNet), is designed on the basis of High-Resolution Net (HRNet). It combines channel-domain and spatial-domain attention mechanisms and contains improved exchange units, a Gateneck module, and a Gateblock module. An attention-based multi-resolution representation fusion method is proposed: the exchange units are improved by applying spatial attention before multi-scale fusion, so that more useful spatial information is extracted from the representation at each resolution. This makes information fusion between the different resolution representations more effective, and the final high-resolution output representation contains richer feature information. In addition, the Gateneck and Gateblock modules introduce channel attention to model channel relationships more explicitly and thereby extract channel information more efficiently. Validation on the MS COCO VAL 2017 dataset shows that, compared with HRNet, a network with state-of-the-art performance, the proposed GNNet achieves higher accuracy with a comparable number of parameters and amount of computation, improving mAP by 1.4 percentage points. The experimental results indicate that the proposed method effectively improves the fusion of multi-resolution feature representations.

Key words: convolutional neural network, human pose estimation, multi-resolution feature representation fusion, spatial attention mechanism, channel attention mechanism
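The abstract describes applying spatial attention to each resolution's representation before the multi-scale fusion, but it gives no layer-level details. The following is a minimal NumPy sketch of that idea under several assumptions that do not come from the paper: the attention is a simplified non-local (self-attention over positions) operation with the learned 1x1 projections omitted, both branches have the same number of channels, the low-resolution branch is upsampled by nearest-neighbour repetition, and fusion is an element-wise sum. The function names `spatial_attention`, `upsample_nearest`, and `fuse` are hypothetical.

```python
import numpy as np


def spatial_attention(x):
    """Simplified non-local style attention over spatial positions.

    x has shape (C, H, W). Assumption: learned projections are omitted,
    so similarity is computed directly on the input features.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                     # (C, N), N = H * W
    scores = flat.T @ flat / np.sqrt(c)            # (N, N) pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)    # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    attended = flat @ weights.T                    # position i = weighted sum over all positions
    return x + attended.reshape(c, h, w)           # residual connection


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse(high, low):
    """Apply spatial attention to each branch, then fuse at the high resolution.

    Assumption: both branches already have the same channel count.
    """
    factor = high.shape[1] // low.shape[1]
    return spatial_attention(high) + upsample_nearest(spatial_attention(low), factor)


rng = np.random.default_rng(0)
high = rng.standard_normal((4, 8, 8))   # high-resolution branch
low = rng.standard_normal((4, 4, 4))    # low-resolution branch
fused = fuse(high, low)
print(fused.shape)
```

In this sketch the attention step runs on each branch separately, so every position can draw on information from the whole feature map before the branches are summed. A trained network would use learned projections and a convolution to match channel counts between branches.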

ZHANG Yue, HUANG Yourui, LIU Pengkun. Research on Multi-resolution Human Pose Estimation with Attention Mechanism[J]. Computer Engineering and Applications, 2021, 57(8): 126-132.

