Deep Learning for Deepfakes Creation and Detection: A Survey

Nguyen, Thanh Thi; Nguyen, Quoc Viet Hung; Nguyen, Dung Tien; Nguyen, Duc Thanh; Huynh‐The, Thien; Nahavandi, Saeid; Nguyên, Thành Tâm; Pham, Quoc-Viet; Nguyen, Cuong M.

doi:10.2139/ssrn.4030341

Cited by 59 publications

(49 citation statements)

References 0 publications


“…The increased accessibility and the many controls CCVS offers could accelerate the emergence of questionable applications, such as "deepfakes" (e.g., a video created from someone's picture and an arbitrary audio) which could lead to harassment, defamation, or dissemination of fake news. On top of current efforts to automate their detection [47], it remains our responsibility to grow awareness of these possible misuses. Despite these worrying aspects, our contribution has plenty of positive applications which outweigh the potential ethical harms.…”

Section: Discussion (mentioning)

confidence: 99%

CCVS: Context-aware Controllable Video Synthesis

Moing, Ponce, Schmid

2021

Preprint


This presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones, with several new key elements for improved spatial resolution and realism: It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control. The prediction model is doubly autoregressive, in the latent space of an autoencoder for forecasting, and in image space for updating contextual information, which is also used to enforce spatio-temporal consistency through a learnable optical flow module. Adversarial training of the autoencoder in the appearance and temporal domains is used to further improve the realism of its output. A quantizer inserted between the encoder and the transformer in charge of forecasting future frames in latent space (and its inverse inserted between the transformer and the decoder) adds even more flexibility by affording simple mechanisms for handling multimodal ancillary information for controlling the synthesis process (e.g., a few sample frames, an audio track, a trajectory in image space) and taking into account the intrinsically uncertain nature of the future by allowing multiple predictions. Experiments with an implementation of the proposed approach give very good qualitative and quantitative results on multiple tasks and standard benchmarks.
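The quantizer step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the quantizer is built, so this assumes a nearest-neighbor codebook lookup in the style of vector-quantized autoencoders. The names `quantize`, `dequantize`, and `codebook` are hypothetical and are not taken from the CCVS paper.

```python
# Minimal sketch of a codebook quantizer and its inverse.
# Assumption: nearest-neighbor lookup; the CCVS abstract does not specify this.

def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))

def dequantize(index, codebook):
    """Inverse step: map a discrete token back to its latent vector."""
    return codebook[index]

# Toy codebook with three 2-D entries (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

token = quantize([0.9, 1.2], codebook)   # encoder output -> discrete token
latent = dequantize(token, codebook)     # token -> vector passed to a decoder
print(token, latent)
```

Turning latent vectors into discrete tokens is what would let a transformer treat frames, audio, or trajectories as sequences over a shared vocabulary, which is the flexibility the abstract alludes to.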


“…The rapid growth of computer vision and deep learning technology has driven the recently emerged phenomena of deepfakes (deep learning and fake), which can automatically forge images and videos that humans cannot easily recognize [29-31]. In addition, deepfake techniques offer the possibility of generating unrecognizable images of a person’s face and altering or swapping a person’s face in existing images and videos with another face that exhibits the same expressions as the original face [29]. Various deepfake attempts have been used for negative purposes, such as creating controversial content related to celebrities, politicians, companies, and even individuals to damage their reputation [30].…”

Section: Methods (mentioning)

confidence: 99%

How Can Research on Artificial Empathy Be Enhanced by Applying Deepfakes?

Yang, Rahmanti, Huang et al.

2022

J Med Internet Res


We propose the idea of using an open data set of doctor-patient interactions to develop artificial empathy based on facial emotion recognition. Facial emotion recognition allows a doctor to analyze patients' emotions, so that they can reach out to their patients through empathic care. However, face recognition data sets are often difficult to acquire; many researchers struggle with small samples of face recognition data sets. Further, sharing medical images or videos has not been possible, as this approach may violate patient privacy. The use of deepfake technology is a promising approach to deidentifying video recordings of patients’ clinical encounters. Such technology can revolutionize the implementation of facial emotion recognition by replacing a patient's face in an image or video with an unrecognizable face—one with a facial expression that is similar to that of the original. This technology will further enhance the potential use of artificial empathy in helping doctors provide empathic care to achieve good doctor-patient therapeutic relationships, and this may result in better patient satisfaction and adherence to treatment.


“…With the rise of engagement on social media platforms, many applications are now based on face-swapping technologies. [5]-[8] Many novel approaches have been introduced in recent years. Thies et al. in his work Face2Face [5] produces a real-time facial reenactment video.…”

Section: A Deepfakes Generation Methods (mentioning)

confidence: 99%

Detecting Deepfakes with Metric Learning

Kumar, Bhavsar, Verma

2020

2020 8th International Workshop on Biometrics and Forensics (IWBF)


With the arrival of several face-swapping applications such as FaceApp, SnapChat, MixBooth, FaceBlender and many more, the authenticity of digital media content is hanging on a very loose thread. On social media platforms, videos are widely circulated often at a high compression factor. In this work, we analyze several deep learning approaches in the context of deepfakes classification in high compression scenarios and demonstrate that a proposed approach based on metric learning can be very effective in performing such a classification. Using less number of frames per video to assess its realism, the metric learning approach using a triplet network architecture proves to be fruitful. It learns to enhance the feature space distance between the cluster of real and fake videos embedding vectors. We validated our approaches on two datasets to analyze the behavior in different environments. We achieved a state-of-the-art AUC score of 99.2% on the Celeb-DF dataset and accuracy of 90.71% on a highly compressed Neural Texture dataset. Our approach is especially helpful on social media platforms where data compression is inevitable.
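The triplet objective behind this kind of metric learning can be shown with a minimal sketch. It assumes the standard margin-based triplet loss over Euclidean distances. The abstract does not give the exact loss, margin, or embedding network, so the function names, the margin of 1.0, and the toy 2-D embeddings are all illustrative assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not from the paper).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it penalizes the shortfall.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy embeddings: two "real" clips close together, one "fake" clip far away.
real_a = [0.0, 0.0]
real_b = [0.0, 1.0]
fake = [3.0, 4.0]

# Well-separated triplet: the margin is already satisfied, so the loss is zero.
print(triplet_loss(real_a, real_b, fake))

# Badly ordered triplet: the "positive" is farther than the "negative".
print(triplet_loss(real_a, fake, real_b))
```

Minimizing this quantity over many (anchor, positive, negative) triplets pushes embeddings of real and fake videos into separate clusters, which is the behavior the abstract describes.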
