Computer Engineering and Applications, 2022, Vol. 58, Issue (14): 1-15. DOI: 10.3778/j.issn.1002-8331.2202-0196
• Research Hotspots and Reviews •
LI Kezi, XU Yang, ZHANG Sicong, YAN Jiale
Online: 2022-07-15
Published: 2022-07-15
李克资,徐洋,张思聪,闫嘉乐
LI Kezi, XU Yang, ZHANG Sicong, YAN Jiale. Survey on Adversarial Example Attack and Defense Technology for Automatic Speech Recognition[J]. Computer Engineering and Applications, 2022, 58(14): 1-15.
李克资, 徐洋, 张思聪, 闫嘉乐. 自动语音辨识对抗攻击和防御技术综述[J]. 计算机工程与应用, 2022, 58(14): 1-15.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2202-0196