[1] MARCOS-RAMIRO A, PIZARRO D, MARRON-ROMERA M, et al. Let your body speak: communicative cue extraction on natural interaction using RGBD data[J]. IEEE Transactions on Multimedia, 2015, 17(10): 1721-1732.
[2] ELKHOLY A, HUSSEIN M E, GOMAA W, et al. Efficient and robust skeleton-based quality assessment and abnormality detection in human action performance[J]. IEEE Journal of Biomedical and Health Informatics, 2019, 24(1): 280-291.
[3] 甄昊宇, 张德. 结合自适应图卷积与时态建模的骨架动作识别[J]. 计算机工程与应用, 2023, 59(18): 137-144.
ZHEN H Y, ZHANG D. Combining adaptive graph convolution and temporal modeling for skeleton-based action recognition[J]. Computer Engineering and Applications, 2023, 59(18): 137-144.
[4] ANDRILUKA M, IQBAL U, INSAFUTDINOV E, et al. PoseTrack: a benchmark for human pose estimation and tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 5167-5176.
[5] 李博. 改进型深度迁移学习的跨镜行人追踪算法[J]. 计算机工程与应用, 2021, 57(10): 110-116.
LI B. Improved deep transfer learning algorithm for person re-identification[J]. Computer Engineering and Applications, 2021, 57(10): 110-116.
[6] 马金林, 崔琦磊, 马自萍, 等. 预加权调制密集图卷积网络三维人体姿态估计[J]. 计算机科学与探索, 2024, 18(4): 963-977.
MA J L, CUI Q L, MA Z P, et al. Pre-weighted modulated dense graph convolutional networks for 3D human pose estimation[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(4): 963-977.
[7] 王仕宸, 黄凯, 陈志刚, 等. 深度学习的三维人体姿态估计综述[J]. 计算机科学与探索, 2023, 17(1): 74-87.
WANG S C, HUANG K, CHEN Z G, et al. Survey on 3D human pose estimation of deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 74-87.
[8] 杨旭升, 吴江宇, 胡佛, 等. 基于渐进高斯滤波融合的多视角人体姿态估计[J]. 自动化学报, 2024, 50(3): 607-616.
YANG X S, WU J Y, HU F, et al. Multi-view human pose estimation based on progressive Gaussian filtering fusion[J]. Acta Automatica Sinica, 2024, 50(3): 607-616.
[9] ROGEZ G, RIHAN J, RAMALINGAM S, et al. Randomized trees for human pose detection[C]//Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008: 1-8.
[10] URTASUN R, DARRELL T. Local probabilistic regression for activity-independent human pose inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[11] TOSHEV A, SZEGEDY C. DeepPose: human pose estimation via deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 1653-1660.
[12] PAPANDREOU G, ZHU T, KANAZAWA N, et al. Towards accurate multi-person pose estimation in the wild[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 4903-4911.
[13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6000-6010.
[14] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[15] XU Y, ZHANG J, ZHANG Q, et al. ViTPose: simple vision transformer baselines for human pose estimation[C]//Advances in Neural Information Processing Systems, 2022: 38571-38584.
[16] MAO W, GE Y, SHEN C, et al. Poseur: direct human pose regression with transformers[C]//Proceedings of the European Conference on Computer Vision, 2022: 72-88.
[17] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[18] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 1-9.
[19] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the European Conference on Computer Vision, 2020: 213-229.
[20] WU B, XU C, DAI X, et al. Visual transformers: token-based image representation and processing for computer vision[J]. arXiv:2006.03677, 2020.
[21] 邓益侬, 罗健欣, 金凤林. 基于深度学习的人体姿态估计方法综述[J]. 计算机工程与应用, 2019, 55(19): 22-42.
DENG Y N, LUO J X, JIN F L. Overview of human pose estimation methods based on deep learning[J]. Computer Engineering and Applications, 2019, 55(19): 22-42.
[22] 周燕, 刘紫琴, 曾凡智, 等. 深度学习的二维人体姿态估计综述[J]. 计算机科学与探索, 2021, 15(4): 641-657.
ZHOU Y, LIU Z Q, ZENG F Z, et al. Survey on two-dimensional human pose estimation of deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(4): 641-657.
[23] ZHENG C, MENDIETA M, YANG T, et al. FeatER: an efficient network for human reconstruction via feature map-based transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 13945-13954.
[24] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020: 390-391.
[25] NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]//Proceedings of the 14th European Conference on Computer Vision, 2016: 483-499.
[26] CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 7103-7112.
[27] XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking[C]//Proceedings of the European Conference on Computer Vision, 2018: 466-481.
[28] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 5693-5703.
[29] XIONG Z, WANG C, LI Y, et al. Swin-pose: swin transformer based human pose estimation[C]//Proceedings of the 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval, 2022: 228-233.
[30] LI K, WANG S, ZHANG X, et al. Pose recognition with cascade transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 1944-1953.
[31] YUAN Y, FU R, HUANG L, et al. HRFormer: high-resolution vision transformer for dense prediction[C]//Advances in Neural Information Processing Systems, 2021: 7281-7293.
[32] YANG S, QUAN Z, NIE M, et al. TransPose: keypoint localization via transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 11802-11812.
[33] LI Y, ZHANG S, WANG Z, et al. TokenPose: learning keypoint tokens for human pose estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 11313-11322.
[34] MAO W, GE Y, SHEN C, et al. TFPose: direct human pose estimation with transformers[J]. arXiv:2103.15320, 2021.
[35] 江春灵, 曾碧, 姚壮泽, 等. 融合权重自适应损失和注意力的人体姿态估计[J]. 计算机工程与应用, 2023, 59(18): 145-153.
JIANG C L, ZENG B, YAO Z Z, et al. Human pose estimation fusing weight adaptive loss and attention[J]. Computer Engineering and Applications, 2023, 59(18): 145-153.
[36] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 7132-7141.
[37] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision, 2018: 3-19.
[38] HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 13713-13722.
[39] XU W, WAN Y. ELA: efficient local attention for deep convolutional neural networks[J]. arXiv:2403.01123, 2024.
[40] YOO J, KIM T, LEE S, et al. Enriched CNN-transformer feature aggregation networks for super-resolution[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023: 4956-4965.
[41] WU Y, HE K. Group normalization[C]//Proceedings of the European Conference on Computer Vision, 2018: 3-19.
[42] ZHANG F, ZHU X, DAI H, et al. Distribution-aware coordinate representation for human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 7093-7102.
[43] SUN X, ADAMU M J, ZHANG R, et al. Pixel-coordinate-induced human pose high-precision estimation method[J]. Electronics, 2023, 12(7): 1648.
[44] 高坤, 李汪根, 束阳, 等. 融入密集连接的多尺度轻量级人体姿态估计[J]. 计算机工程与应用, 2022, 58(24): 196-204.
GAO K, LI W G, SHU Y, et al. Multi-scale lightweight human pose estimation with dense connections[J]. Computer Engineering and Applications, 2022, 58(24): 196-204.
[45] GENG Z, SUN K, XIAO B, et al. Bottom-up human pose estimation via disentangled keypoint regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 14676-14686.
[46] XU J, LIU W, XING W, et al. MSPENet: multi-scale adaptive fusion and position enhancement network for human pose estimation[J]. The Visual Computer, 2023, 39(5): 2005-2019.
[47] DONG K, SUN Y, CHENG X, et al. Combining detailed appearance and multi-scale representation: a structure-context complementary network for human pose estimation[J]. Applied Intelligence, 2023, 53(7): 8097-8113.
[48] WANG W, XIE E, LI X, et al. Pyramid vision transformer: a versatile backbone for dense prediction without convolutions[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 568-578.
[49] LIU Z, LIN Y, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 10012-10022.