1.College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
2.Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006, China
[1] GOODFELLOW I J,SHLENS J,SZEGEDY C.Explaining and harnessing adversarial examples[J].arXiv:1412.6572,2014.
[2] KURAKIN A,GOODFELLOW I J,BENGIO S.Adversarial examples in the physical world[M]//Artificial intelligence safety and security.[S.l.]:Chapman and Hall/CRC,2018:99-112.
[3] DONG Y,LIAO F,PANG T,et al.Boosting adversarial attacks with momentum[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,June 18-22,2018.New York:IEEE,2018:9185-9193.
[4] TAN H,ZHANG J,ZHANG H,et al.NRI-FGSM:an efficient transferable adversarial attack method for speaker recognition system[C]//Proceedings of the 23rd Annual Conference of the International Speech Communication Association,Incheon,September 18-22,2022.New York:IEEE,2022:18-22.
[5] XIAO C,LI B,ZHU J Y,et al.Generating adversarial examples with adversarial networks[J].arXiv:1801.02610,2018.
[6] JANDIAL S,MANGLA P,VARSHNEY S,et al.Advgan++:harnessing latent layers for adversary generation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops,Seoul,October 27-28,2019.New York:IEEE,2019.
[7] CARLINI N,WAGNER D.Towards evaluating the robustness of neural networks[C]//2017 IEEE Symposium on Security and Privacy(SP),San Jose,May 22-26,2017.New York:IEEE,2017:39-57.
[8] CHEN G,CHEN S,FAN L,et al.Who is real Bob? adversarial attacks on speaker recognition systems[C]//2021 IEEE Symposium on Security and Privacy(SP),San Francisco,May 24-27,2021.New York:IEEE,2021:694-711.
[9] HE K,ZHANG X,REN S,et al.Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,Las Vegas,June 27-30,2016.New York:IEEE,2016:770-778.
[10] WAIBEL A,HANAZAWA T,HINTON G,et al.Phoneme recognition using time-delay neural networks[J].IEEE Transactions on Acoustics,Speech,and Signal Processing,1989,37(3):328-339.
[11] HEO H S,LEE B J,HUH J,et al.Clova baseline system for the voxceleb speaker recognition challenge 2020[J].arXiv:2009.14153,2020.
[12] KIM S H,NAM H,PARK Y H.Temporal dynamic convolutional neural network for text-independent speaker verification and phonemic analysis[C]//IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Singapore,May 22-27,2022.New York:IEEE,2022:6742-6746.
[13] SNYDER D,GARCIA-ROMERO D,SELL G,et al.X-vectors:robust dnn embeddings for speaker recognition[C]//2018 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Calgary,April 15-20,2018.New York:IEEE,2018:5329-5333.
[14] DESPLANQUES B,THIENPONDT J,DEMUYNCK K.Ecapa-tdnn:emphasized channel attention,propagation and aggregation in tdnn based speaker verification[J].arXiv:2005.07143,2020.
[15] PRINCE S J D,ELDER J H.Probabilistic linear discriminant analysis for inferences about identity[C]//2007 IEEE 11th International Conference on Computer Vision,Rio de Janeiro,October 14-20,2007.New York:IEEE,2007:1-8.
[16] HU J,SHEN L,SUN G.Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,June 18-22,2018.New York:IEEE,2018:7132-7141.
[17] GAO S H,CHENG M M,ZHAO K,et al.Res2net:a new multi-scale backbone architecture[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,43(2):652-662.
[18] OKABE K,KOSHINAKA T,SHINODA K.Attentive statistics pooling for deep speaker embedding[J].arXiv:1803.10963,2018.
[19] SZEGEDY C,ZAREMBA W,SUTSKEVER I,et al.Intriguing properties of neural networks[J].arXiv:1312.6199,2013.
[20] LI X,ZHONG J,WU X,et al.Adversarial attacks on GMM i-vector based speaker verification systems[C]//2020 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP),Barcelona,May 4-8,2020.New York:IEEE,2020:6579-6583.
[21] LIAO J F,GU Y J,ZHANG P J,et al.Comparative research on application of adversarial samples for end-to-end speaker identification[J].Computer Engineering,2021,47(6):132-141.(in Chinese)
[22] GOODFELLOW I,POUGET-ABADIE J,MIRZA M,et al.Generative adversarial nets[C]//Advances in Neural Information Processing Systems,2014.
[23] NESTEROV Y.A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2)[C]//Doklady AN USSR,1983:543-547.
[24] RUDER S.An overview of gradient descent optimization algorithms[J].arXiv:1609.04747,2016.
[25] ZHANG Y,JIANG Z,VILLALBA J,et al.Black-box attacks on spoofing countermeasures using transferability of adversarial examples[C]//INTERSPEECH,Shanghai,October 25-29,2020.New York:IEEE,2020:4238-4242.
[26] MOOSAVI-DEZFOOLI S M,FAWZI A,FAWZI O,et al.Universal adversarial perturbations[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,Honolulu,July 21-26,2017.New York:IEEE,2017:1765-1773.
[27] NAGRANI A,CHUNG J S,ZISSERMAN A.Voxceleb:a large-scale speaker identification dataset[J].arXiv:1706.08612,2017.
[28] CHUNG J S,NAGRANI A,ZISSERMAN A.Voxceleb2:deep speaker recognition[J].arXiv:1806.05622,2018.