Spatio-temporal co-attention fusion network for video splicing localization

Lin, Man; Cao, Gang; Lou, Zijie; Zhang, Chi

doi:10.1117/1.jei.33.3.033027

J. Electron. Imag.

2024

Abstract: Digital video splicing has become easy and ubiquitous. Malicious users copy some regions of a video and paste them into another video to create realistic forgeries. It is important to blindly detect such forgery regions in videos. A spatio-temporal co-attention fusion network (SCFNet) is proposed for video splicing localization. Specifically, a three-stream network is used as an encoder to capture manipulation traces across multiple frames. The deep interaction and fusion of spatio-temporal forensic features a…

Cited by 2 publications

References 30 publications

Detecting sequential video forgery using spatiotemporal attention mechanisms

Singh, Rathor, Kumar

2024

J. Electron. Imag.

Video and Audio Deepfake Datasets and Open Issues in Deepfake Technology: Being Ahead of the Curve

Akhtar, Pendyala, Athmakuri

2024

Forensic Sciences

The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are being extensively harnessed across a diverse range of domains, e.g., forensic science, healthcare, virtual assistants, cybersecurity, and robotics. On the flip side, they can also be exploited for negative purposes, like producing authentic-looking fake news that propagates misinformation and diminishes public trust. Deepfakes are audio or visual multimedia contents that have been artificially synthesized or digitally modified through the application of deep neural networks. Deepfakes can be employed for benign purposes (e.g., refinement of face pictures for optimal magazine cover quality) or malicious intentions (e.g., superimposing faces onto explicit images or videos to harm individuals, or producing fake audio recordings of public figures making inflammatory statements to damage their reputation). With mobile devices and user-friendly audio and visual editing tools at hand, even non-experts can effortlessly craft intricate deepfakes and digitally altered audio and facial features. This presents challenges to contemporary computer forensic tools and human examiners, including common individuals and digital forensic investigators. There is a perpetual battle between attackers armed with deepfake generators and defenders utilizing deepfake detectors. This paper first comprehensively reviews existing image, video, and audio deepfake databases with the aim of propelling next-generation deepfake detectors for enhanced accuracy, generalization, robustness, and explainability. Then, the paper delves deeply into open challenges and potential avenues for research in the audio and video deepfake generation and mitigation field. The aspiration for this article is to complement prior studies and assist newcomers, researchers, engineers, and practitioners in gaining a deeper understanding and in the development of innovative deepfake technologies.
