[1] PUROHIT M, PATEL M, MALAVIYA H, et al. Intelligibility improvement of dysarthric speech using MMSE DiscoGAN[C]//Proceedings of the 2020 International Conference on Signal Processing and Communications (SPCOM), 2020: 1-5.
[2] 《中国脑卒中防治报告2019》编写组. 《中国脑卒中防治报告2019》概要[J]. 中国脑血管病杂志, 2020, 17(5): 272-281.
Report on stroke prevention and treatment in China Writing Group. Brief report on stroke prevention and treatment in China, 2019[J]. Chinese Journal of Cerebrovascular Diseases, 2020, 17(5): 272-281.
[3] 徐莉, 徐明成, 夏逸婷, 等. 针灸治疗缺血性脑卒中构音障碍的疗效观察[J]. 云南中医学院学报, 2017, 40(6): 95-97.
XU L, XU M C, XIA Y T, et al. Observation of acupuncture and moxibustion in the treatment of dysarthria in ischemic stroke[J]. Journal of Yunnan University of Traditional Chinese Medicine, 2017, 40(6): 95-97.
[4] 李阿妮. 失语症患者语音信号的识别研究[D]. 西安: 西安科技大学, 2010.
LI A N. The research on recognition of aphasia speech signals[D]. Xi’an: Xi’an University of Science and Technology, 2010.
[5] 张旺. 基于语音识别的功能性构音障碍分析评估研究[D]. 兰州: 兰州理工大学, 2019.
ZHANG W. Analysis and evaluation of functional articulation disorder based on speech recognition[D]. Lanzhou: Lanzhou University of Technology, 2019.
[6] 李山路, 王泳, 甘俊英. 重录语音检测算法[J]. 信号处理, 2017, 33(1): 95-101.
LI S L, WANG Y, GAN J Y. An algorithm of speech recapture detection[J]. Journal of Signal Processing, 2017, 33(1): 95-101.
[7] FILIPPIDOU F, MOUSSIADES L. A benchmarking of IBM, Google and Wit automatic speech recognition systems[C]//Proceedings of the IFIP International Conference on Artificial Intelligence Applications and Innovations. Cham: Springer, 2020: 73-82.
[8] DE RUSSIS L, CORNO F. On the impact of dysarthric speech on contemporary ASR cloud platforms[J]. Journal of Reliable Intelligent Environments, 2019, 5(3): 163-172.
[9] GREEN J R, MACDONALD R L, JIANG P P, et al. Automatic speech recognition of disordered speech: personalized models outperforming human listeners on short phrases[C]//Proceedings of the Interspeech 2021, 2021: 4778-4782.
[10] 刘伟, 谢建志. 语音合成系统中语音库样本能量均衡方法研究[J]. 信号处理, 2017, 33(2): 229-235.
LIU W, XIE J Z. Voice energy balance method for text to speech database[J]. Journal of Signal Processing, 2017, 33(2): 229-235.
[11] DELLER J R, LIU M S, FERRIER L J, et al. The Whitaker database of dysarthric (cerebral palsy) speech[J]. The Journal of the Acoustical Society of America, 1993, 93(6): 3516-3518.
[12] MENENDEZ-PIDAL X, POLIKOFF J B, PETERS S M, et al. The Nemours database of dysarthric speech[C]//Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP), 1996: 1962-1965.
[13] RUDZICZ F, NAMASIVAYAM A K, WOLFF T. The Torgo database of acoustic and articulatory speech from speakers with dysarthria[J]. Language Resources and Evaluation, 2012, 46(4): 523-541.
[14] KIM H, HASEGAWA-JOHNSON M, PERLMAN A, et al. Dysarthric speech database for universal access research[C]//Proceedings of the Interspeech 2008, 2008: 1741-1744.
[15] NICOLAO M, CHRISTENSEN H, CUNNINGHAM S, et al. A framework for collecting realistic recordings of dysarthric speech - the homeService corpus[C]//Proceedings of the 10th International Conference on Language Resources and Evaluation, 2016: 1993-1997.
[16] 唐以廷. 汉语普通话失语症患者的特定字识别研究[D]. 汕头: 汕头大学, 2020.
TANG Y T. A study on the recognition of specific words in patients with aphasia in Mandarin Chinese[D]. Shantou: Shantou University, 2020.
[17] MARINI M, VIGANÒ M, CORBO M, et al. IDEA: an Italian dysarthric speech database[C]//Proceedings of the 2021 IEEE Spoken Language Technology Workshop (SLT), 2021: 1086-1093.
[18] TURRISI R, BRACCIA A, EMANUELE M, et al. EasyCall corpus: a dysarthric speech dataset[J]. arXiv:2104.02542, 2021.
[19] MACDONALD R L, JIANG P P, CATTIAU J, et al. Disordered speech data collection: lessons learned at 1 million utterances from Project Euphonia[C]//Proceedings of the Interspeech 2021, 2021: 4833-4837.
[20] MARIYA CELIN T A, NAGARAJAN T, VIJAYALAKSHMI P. Dysarthric speech corpus in Tamil for rehabilitation research[C]//Proceedings of the 2016 IEEE Region 10 Conference (TENCON), 2017: 2610-2613.
[21] STYLER W. Using Praat for linguistic research[D]. Boulder: University of Colorado at Boulder Phonetics Lab, 2013.
[22] XIONG F F, BARKER J, CHRISTENSEN H. Deep learning of articulatory-based representations and applications for improving dysarthric speech recognition[C]//Proceedings of the ITG Symposium on Speech Communication, 2018: 1-5.
[23] POVEY D, GHOSHAL A, BOULIANNE G, et al. The Kaldi speech recognition toolkit[C]//Proceedings of the IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.
[24] XIONG F F, BARKER J, YUE Z J, et al. Source domain data selection for improved transfer learning targeting dysarthric speech recognition[C]//Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020: 7424-7428.
[25] POVEY D, CHENG G F, WANG Y M, et al. Semi-orthogonal low-rank matrix factorization for deep neural networks[C]//Proceedings of the Interspeech 2018, 2018: 3743-3747.
[26] POVEY D, PEDDINTI V, GALVEZ D, et al. Purely sequence-trained neural networks for ASR based on lattice-free MMI[C]//Proceedings of the Interspeech 2016, 2016: 2751-2755.
[27] PARK D S, CHAN W, ZHANG Y, et al. SpecAugment: a simple data augmentation method for automatic speech recognition[C]//Proceedings of the Interspeech 2019, 2019: 2613-2617.
[28] YU J J, XIE X R, LIU S S, et al. Development of the CUHK dysarthric speech recognition system for the UA speech corpus[C]//Proceedings of the Interspeech 2018, 2018: 2938-2942.
[29] ELSKEN T, METZEN J H, HUTTER F. Neural architecture search[M]//Automated machine learning. Cham: Springer, 2019: 63-77.
[30] LIU S S, GENG M Z, HU S K, et al. Recent progress in the CUHK dysarthric speech recognition system[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 2267-2281.
[31] LIU H X, SIMONYAN K, YANG Y M. DARTS: differentiable architecture search[J]. arXiv:1806.09055, 2018.
[32] AFOURAS T, CHUNG J S, SENIOR A, et al. Deep audio-visual speech recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(12): 8717-8727.
[33] 段淑斐, 王俊芹, DINGAM Camille, 等. 基于发音空间特征的构音障碍患者的病情分级[J]. 复旦学报 (自然科学版), 2021, 60(3): 288-296.
DUAN S F, WANG J Q, DINGAM C, et al. Disease degree classification of dysarthria based on spatial features of articulation[J]. Journal of Fudan University (Natural Science), 2021, 60(3): 288-296.
[34] RUDZICZ F, HIRST G, VAN LIESHOUT P. Vocal tract representation in the recognition of cerebral palsied speech[J]. Journal of Speech, Language, and Hearing Research, 2012, 55(4): 1190-1207.
[35] SHI B W, HSU W N, LAKHOTIA K, et al. Learning audio-visual speech representation by masked multimodal cluster prediction[J]. arXiv:2201.02184, 2022.
[36] HU S J, LIU S S, XIE X R, et al. Exploiting cross domain acoustic-to-articulatory inverted features for disordered speech recognition[C]//Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 6747-6751.
[37] YUE Z J, LOWEIMI E, CVETKOVIC Z, et al. Multi-modal acoustic-articulatory feature fusion for dysarthric speech recognition[C]//Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7372-7376.
[38] KRISHNA G, CARNAHAN M, SHAMAPANT S, et al. Brain signals to rescue aphasia, apraxia and dysarthria speech recognition[C]//Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2021: 6008-6014.
[39] CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[J]. arXiv:1412.3555, 2014.
[40] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
[41] JACKS A, HALEY K, BISHOP G, et al. Automated speech recognition in adult stroke survivors: comparing human and computer transcriptions[J]. Folia Phoniatrica et Logopaedica, 2019, 71(5/6): 286-296.
[42] 李仕萍, 凌卫新, 陈卓铭, 等. 语言障碍诊断系统的设计及实现[J]. 计算机工程与应用, 2004, 40(30): 191-193.
LI S P, LING W X, CHEN Z M, et al. Design and implementation of a language disorder diagnosis system[J]. Computer Engineering and Applications, 2004, 40(30): 191-193.
[43] ENDERBY P. Frenchay dysarthria assessment[J]. International Journal of Language & Communication Disorders, 1980, 15(3): 165-173.
[44] GHIO A, POUCHOULIN G, TESTON B, et al. How to manage sound, physiological and clinical data of 2500 dysphonic and dysarthric speakers?[J]. Speech Communication, 2012, 54(5): 664-679.
[45] HAN M, CHEN F, NI Z, et al. ViLaS: integrating vision and language into automatic speech recognition[J]. arXiv:2305.19972, 2023.
[46] HE Y, SENG K P, ANG L M. Multimodal sensor-input architecture with deep learning for audio-visual speech recognition in wild[J]. Sensors, 2023, 23(4): 1834.
[47] 徐静. 基于声学特征探讨低动力型构音障碍帕金森病的量化评估方法[D]. 广州: 暨南大学, 2020.
XU J. A quantitative evaluation method for Parkinson’s disease with hypokinetic dysarthria based on acoustic features[D]. Guangzhou: Jinan University, 2020.
[48] MIN Z, WANG J. Exploring the integration of large language models into automatic speech recognition systems: an empirical study[J]. arXiv:2307.06530, 2023.
[49] 马英杰, 陈骥, 帅杰. 基于语音识别的失语症康复治疗仪软件设计与实现[J]. 生物医学工程学杂志, 2006, 23(6): 1343-1346.
MA Y J, CHEN J, SHUAI J. Design and implementation of aphasia rehabilitation software based on speech recognition[J]. Journal of Biomedical Engineering, 2006, 23(6): 1343-1346.