
Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (16): 160-170. DOI: 10.3778/j.issn.1002-8331.2405-0224
• Pattern Recognition and Artificial Intelligence •
CHEN Xianglong, LI Songyang, CHEN Enqing, GUO Xin, WANG Song
Online: 2025-08-15
Published: 2025-08-15
CHEN Xianglong, LI Songyang, CHEN Enqing, GUO Xin, WANG Song. Lightweight Human Pose Estimation with Joint Cross-Stage Information Fusion[J]. Computer Engineering and Applications, 2025, 61(16): 160-170.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2405-0224
| [1] | HAO Hefei, ZHANG Longhao, CUI Hongzhen, ZHU Xiaoyue, PENG Yunfeng, LI Xianghui. Review of Application of Deep Neural Networks in Human Pose Estimation [J]. Computer Engineering and Applications, 2025, 61(9): 41-60. |
| [2] | SHI Lichen, YANG Chao, LIU Xuechao, ZHOU Xingyu. Lightweight Low-Light Object Detection Algorithm Based on CDD-YOLO [J]. Computer Engineering and Applications, 2025, 61(6): 106-117. |
| [3] | WANG Yanni, HU Min, HAN Shipeng, CHEN Yixuan, LYU Hao. Human Pose Estimation with Multi-Scale and Multi-Level Feature Fusion [J]. Computer Engineering and Applications, 2025, 61(6): 199-209. |
| [4] | WANG Guoxiang, LI Changlong, SONG Junfeng, YE Zhen, JIN Heng. Image Depth Estimation Algorithm Incorporating Adaptive Sampling and Context-Aware Module [J]. Computer Engineering and Applications, 2025, 61(5): 261-268. |
| [5] | LIAO Ningsheng, CAO Tianxiu, LIU Keyan, XU Meng, ZHU Mi, GU Yuxuan, WANG Pengfei. Small Target Detection Algorithm for UAV Based on Composite Feature and Multi-Scale Fusion [J]. Computer Engineering and Applications, 2025, 61(3): 111-120. |
| [6] | XU Zhenfeng, XU Yunfeng, YU Zizhou, MEI Wei, ZHANG Yan. Multi-Scale Target Detection Algorithm for Dense Pedestrian Detection Task [J]. Computer Engineering and Applications, 2025, 61(17): 304-316. |
| [7] | JIA Xiangyu, ZHANG Yonghong, KAN Xi, ZHU Linglong, LI Xu. D3F-DET: Lightweight and Multiscale Fusion Algorithm for Pavement Defect Detection [J]. Computer Engineering and Applications, 2025, 61(17): 159-170. |
| [8] | WANG Luxue, WANG Xiaoxia, LI Xiang, CHEN Xiao. Parallel Multi-Scale Feature Recursive Learning for Low-Light Image Enhancement [J]. Computer Engineering and Applications, 2025, 61(16): 265-271. |
| [9] | LIU Yuping, SHANG Cuijuan, LI Mingming. Improved YOLOv11 Algorithm for Small Target Detection in UAVs [J]. Computer Engineering and Applications, 2025, 61(15): 124-131. |
| [10] | XIAO Jun, ZHAO Ji. Efficient Lightweight Human Pose Estimation with Progressive Feature Fusion [J]. Computer Engineering and Applications, 2025, 61(15): 218-228. |
| [11] | LUO Xianzhi, WANG Hang. Small Target Detection Algorithm for UAV Based on Cross-Scale Feature Fusion [J]. Computer Engineering and Applications, 2025, 61(14): 135-147. |
| [12] | DONG Yibing, ZENG Hui, LI Jianke, HOU Shaojie, SHI Lei. DPRT-YOLO: Real-Time Object Detector for Intelligent and Connected Vehicles in Complex Driving Environments [J]. Computer Engineering and Applications, 2025, 61(14): 148-162. |
| [13] | XIONG Gan, CHEN Cifa, ZHANG Shang. QMDF-YOLO11: Rice Pests Detection Algorithm in Complex Scenarios [J]. Computer Engineering and Applications, 2025, 61(13): 113-123. |
| [14] | XU Jingke, SUO Xianglong, ZHOU Lei. Improved YOLOv8 Algorithm for Small Target Detection in Drone Aerial Photography [J]. Computer Engineering and Applications, 2025, 61(11): 119-131. |
| [15] | FENG Tailai, ZHANG Xuesong, SONG Cunli, LI Guangyu, JIN Hua. Improved Small Object Detection Method of YOLOv7 [J]. Computer Engineering and Applications, 2025, 61(10): 203-213. |