Review of Research on Visual Sign Language Translation Technology

doi:10.3778/j.issn.1002-8331.2503-0238

Abstract

Abstract: Visual sign language translation (VSLT) is an important bridge between the deaf and hearing communities, and it has made significant progress in recent years, driven by computer vision and deep learning. The technology aims to automatically convert the signing in sign language videos into natural language text, enabling barrier-free communication between the two communities. To give researchers a comprehensive and systematic overview of the VSLT task, this review covers three aspects. First, it surveys and categorizes existing VSLT research and discusses the characteristics of the methods and their technical evolution. Second, it describes sign language data acquisition devices, publicly available multilingual sign language datasets, and commonly used evaluation metrics. Third, starting from the current state of research and practical applications of sign language technology, it discusses the challenges facing the field and offers corresponding outlooks and recommendations.

Key words: video understanding, sign language translation, computer vision, deep learning

LYU Jiawei (吕佳威), LI Shaobin (李绍彬), ZHU Ruolin (朱若琳). Review of Research on Visual Sign Language Translation Technology[J]. Computer Engineering and Applications (计算机工程与应用), 2025, 61(24): 68-85.
