[1] QIN Q, WANG W X, LIU Q H, et al. Classical ethnic dance action recognition based on skeleton information[J]. Computer Engineering and Applications, 2023, 59(5): 281-288.
[2] NIE X B, XIONG C, ZHU S C. Joint action recognition and pose estimation from video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 1293-1301.
[3] CHO N G, YUILLE A L, LEE S W. Adaptive occlusion state estimation for human pose tracking under self-occlusions[J]. Pattern Recognition, 2013, 46(3): 649-661.
[4] SHOTTON J, SHARP T, KIPMAN A, et al. Real-time human pose recognition in parts from single depth images[J]. Communications of the ACM, 2013, 56(1): 116-124.
[5] CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 7103-7112.
[6] QIU L, ZHANG X, LI Y, et al. Peeking into occluded joints: a novel framework for crowd pose estimation[C]//Proceedings of the 16th European Conference on Computer Vision, 2020: 488-504.
[7] KHIRODKAR R, CHARI V, AGRAWAL A, et al. Multi-instance pose networks: rethinking top-down pose estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 3122-3131.
[8] CHU Z, MI Q, MA W, et al. Part-level occlusion-aware human pose estimation[J]. Journal of Computer Research and Development, 2022, 59(12): 2760-2769.
[9] FANG H S, XIE S, TAI Y W, et al. RMPE: regional multi-person pose estimation[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2334-2343.
[10] XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking[C]//Proceedings of the European Conference on Computer Vision, 2018: 466-481.
[11] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 5693-5703.
[12] YANG S, QUAN Z, NIE M, et al. TransPose: keypoint localization via transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 11802-11812.
[13] LI Y, ZHANG S, WANG Z, et al. TokenPose: learning keypoint tokens for human pose estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 11313-11322.
[14] YUAN Y, FU R, HUANG L, et al. HRFormer: high-resolution transformer for dense prediction[J]. arXiv:2110.09408, 2021.
[15] XU Y, ZHANG J, ZHANG Q, et al. ViTPose: simple vision transformer baselines for human pose estimation[J]. arXiv:2204.12484, 2022.
[16] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[17] ZHANG S H, LI R, DONG X, et al. Pose2Seg: detection-free human instance segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 889-898.
[18] SU K, YU D, XU Z, et al. Multi-person pose estimation with enhanced channel-wise and spatial information[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 5674-5682.
[19] LI J, WANG C, ZHU H, et al. CrowdPose: efficient crowded scenes pose estimation and a new benchmark[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 10863-10872.
[20] GENG Z, SUN K, XIAO B, et al. Bottom-up human pose estimation via disentangled keypoint regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 14676-14686.
[21] WANG D, ZHANG S, HUA G. Robust pose estimation in crowded scenes with direct pose-level inference[C]//Advances in Neural Information Processing Systems, 2021: 6278-6289.
[22] WANG D, ZHANG S. Contextual instance decoupling for robust multi-person pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 11060-11068.
[23] ZHANG D, ZHENG Z, WANG T, et al. HROM: learning high-resolution representation and object-aware masks for visual object tracking[J]. Sensors, 2020, 20(17): 4807.
[24] CUBUK E D, ZOPH B, MANE D, et al. AutoAugment: learning augmentation policies from data[J]. arXiv:1805.09501, 2018.
[25] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[J]. arXiv:2004.10934, 2020.
[26] ZHANG D, ZHENG Z, LI M, et al. CSART: channel and spatial attention-guided residual learning for real-time object tracking[J]. Neurocomputing, 2021, 436: 260-272.
[27] WANG X, BO L, FUXIN L. Adaptive wing loss for robust face alignment via heatmap regression[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 6971-6981.
[28] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980-2988.
[29] WANG S, GUO X, TIE Y, et al. Discriminative patch descriptor learning with focal triplet loss function[C]//Proceedings of the 2021 IEEE International Conference on Image Processing, 2021: 3567-3571.
[30] TOMPSON J J, JAIN A, LECUN Y, et al. Joint training of a convolutional network and a graphical model for human pose estimation[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems, 2014: 1799-1807.
[31] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the 13th European Conference on Computer Vision, 2014: 740-755.
[32] HE K, CHEN X, XIE S, et al. Masked autoencoders are scalable vision learners[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 16000-16009.
[33] REDDI S J, KALE S, KUMAR S. On the convergence of Adam and beyond[J]. arXiv:1904.09237, 2019.