1. Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong 518063
2. Huazhong University of Science and Technology, Wuhan, Hubei 430070
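LI Junjie (2002‒), male, is a master's student at Huazhong University of Science and Technology and an intern at Ping An Technology (Shenzhen) Co., Ltd. His research interests include video deepfake detection and out-of-distribution sample detection.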
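WANG Jianzong (1983‒), male, Ph.D., is Deputy Chief Engineer of Ping An Technology (Shenzhen) Co., Ltd., Senior Director of Artificial Intelligence, General Manager of the Federated Learning Technology Department, and Dean of the Intelligent Finance Frontier Technology Research Institute. He was a postdoctoral researcher in artificial intelligence at the University of Florida and received his Ph.D. through a joint program between Rice University and Huazhong University of Science and Technology. He is a senior member of the China Computer Federation (CCF), a member of the CCF Big Data Expert Committee, and vice chair of the Federated Data and Federated Intelligence Technical Committee of the Chinese Association of Automation. His research interests include large models, federated learning, and deep learning.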
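ZHANG Xulong (1988‒), male, Ph.D., is a senior algorithm researcher at Ping An Technology (Shenzhen) Co., Ltd. He received his Ph.D. in computer science from Fudan University and was selected for the Youth Program of the Shanghai Oriental Talent Plan in 2023. He also serves as an external supervisor at the Research Institute of Tsinghua University in Shenzhen and at the Institute of Advanced Technology, University of Science and Technology of China, and is a member of the Federated Data and Federated Intelligence Technical Committee of the Chinese Association of Automation. He is a member of IEEE, the Chinese Association of Automation, and the China Computer Federation. His research interests include large models, embodied intelligence, cross-modal intelligent computing, and the Model Context Protocol (MCP). He has published more than 80 academic papers and has applied for and been granted more than 20 national invention patents.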
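QU Xiaoyang (1988‒), male, Ph.D., is head of the frontier machine learning algorithm group at Ping An Technology (Shenzhen) Co., Ltd., an external supervisor at Tsinghua Shenzhen International Graduate School and at the Institute of Advanced Technology, University of Science and Technology of China, and a visiting scholar at the University of Central Florida. His research covers machine learning, big data, and computer architecture, with extensive experience in speech and semantic analysis, automated machine learning, zero-shot and few-shot learning, and high-performance computing and storage. He has published nearly 50 papers in international conferences and journals in computer architecture (e.g., INFOCOM, DAC, IPDPS, TPDS) and artificial intelligence (e.g., NeurIPS, IJCAI, ICASSP, Interspeech), one of which was nominated for a best student paper award. He serves as a reviewer for several international journals, holds 70 granted patents, and has published 2 monographs.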
Received: 2025-03-31
Published online first: 2026-02-25
LI Junjie, WANG Jianzong, ZHANG Xulong, et al. Generalization challenges in video deepfake detection: methods, obstacles, and technological advances[J]. Big Data Research, DOI: 10.11959/j.issn.2096-0271.2025072.
With the rapid development of artificial intelligence, deepfake technology has become a powerful tool for generating realistic human audio, images, and videos, and its growing availability and falling cost pose serious threats to personal privacy and social trust. This paper reviews the generalization problem in video deepfake detection, providing a systematic overview of methods, challenges, and advances. It first traces the development of deepfake detection techniques and compares the performance of neural networks with that of humans on detection tasks. Turning to cross-dataset detection, it analyzes how differences in feature distributions between datasets degrade the performance of conventional deep learning models in cross-domain settings. To address this issue, researchers have proposed a range of approaches, including neural network architecture design, multi-task learning, new loss functions, data augmentation, novel training procedures, and the fusion of biometric features. In recent years, as detection research has entered a stage of stable development, it has also run into bottlenecks that call for breakthroughs. Through a systematic analysis of the latest results in the field, this paper examines the strengths and limitations of current techniques and identifies both the potential and the challenges in improving generalization and handling unknown distribution domains, offering a reference for future research.