Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (23): 56-66. DOI: 10.3778/j.issn.1002-8331.2206-0028
TANG Yumin, FAN Jing, QU Jinshuai
Online: 2022-12-01
Published: 2022-12-01
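Abstract: Technologies for creating and manipulating multimedia content have advanced to the point where they can deliver a high degree of realism. Deepfake, a class of generative deep learning algorithms, can forge audio, images, and video, and has made considerable progress in recent years; the detection techniques that counter it are developing in step. This paper reviews common deepfake generation techniques and their associated datasets, summarizing the underlying principles and the latest methods and results. It then analyzes and summarizes deepfake detection techniques and the datasets used for detection. Finally, it summarizes and looks ahead to future research directions for deepfake generation and detection.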
TANG Yumin, FAN Jing, QU Jinshuai. Overview of Deepfake Generation and Detection[J]. Computer Engineering and Applications, 2022, 58(23): 56-66.