Recently, several in-loop filtering algorithms based on convolutional neural networks (CNNs) have been proposed to improve the efficiency of High Efficiency Video Coding (HEVC). Conventional CNN-based filters apply a single model to the whole image, which cannot adapt well to all of the image's local features. To solve this problem, an in-loop filtering algorithm based on a dynamic convolutional capsule network (DCC-net) is proposed. It embeds localized dynamic routing and dynamic segmentation algorithms into a capsule network and integrates them into the HEVC hybrid video coding framework as a new in-loop filter. The proposed method brings average BD-BR reductions of 7.9% and 5.9% under the all intra (AI) and random access (RA) configurations, respectively, as well as BD-PSNR gains of 0.4 dB and 0.2 dB. In addition, the proposed algorithm also performs well in terms of time efficiency.
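The core idea motivating the work — replacing one fixed filter with a per-region mixture of filters — can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' DCC-net: the kernel bank, the attention logits, and the per-pixel patch handling are all assumptions for demonstration; in DCC-net the region-dependent weights would come from the capsule routing.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_filter_pixel(patch, kernel_bank, attn_logits):
    """Filter one k x k patch with a region-dependent mixture of kernels.

    patch:       (k, k) local reconstruction around the target pixel
    kernel_bank: (K, k, k) bank of candidate filters
    attn_logits: (K,) region-dependent scores (hypothetical; in DCC-net
                 they would be produced by the routing network)
    """
    w = softmax(attn_logits)                      # mixture weights, sum to 1
    mixed = np.tensordot(w, kernel_bank, axes=1)  # (k, k) blended kernel
    return float((patch * mixed).sum())           # filtered pixel value

# With a strongly one-hot logit vector the mixture collapses to one kernel:
patch = np.arange(9.0).reshape(3, 3)
bank = np.stack([np.eye(3), np.ones((3, 3)) / 9.0])
out = dynamic_filter_pixel(patch, bank, np.array([50.0, 0.0]))  # ~ (patch * eye).sum()
```

Flat regions and textured regions can thus receive different effective filters from the same model, which is the adaptivity that a single global CNN filter lacks.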
With the rapid development of manipulation technologies, generating DeepFake videos is more accessible than ever. As a result, face forgery detection has become a challenging task, attracting significant attention from researchers worldwide. However, most previous work, based on convolutional neural networks (CNNs), is not sufficiently discriminative and cannot fully utilise subtle clues and similar textures during facial forgery detection. Moreover, these methods cannot simultaneously achieve high accuracy and time efficiency. To address these problems, we propose a novel framework named FPC-Net to extract meaningful, unnatural expressions in local regions. The framework combines a CNN, long short-term memory (LSTM), a channel groups loss (CG-Loss) and adaptive feature fusion to detect face forgery videos. First, the proposed method extracts spatial features with a CNN, and a channel-wise attention mechanism is employed to separate the channels; with the help of the channel groups loss, the channels are divided into two groups, each representing a specific class. Second, an LSTM is applied to learn the correlation of the spatial features. Finally, the correlated features are mapped into other latent spaces. Extensive experiments show that the detection speed of the proposed method reaches 420 FPS and that the AUC scores reach 99.7%, 99.9%, 94.7% and 82.0% on the raw Celeb-DF, raw FaceForensics++, F2F and NT datasets, respectively. The results demonstrate that the proposed framework is highly time-efficient while improving detection performance over other frame-level methods in most cases.
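The channel-grouping step can be illustrated with a small sketch. This reflects our reading of the abstract, not the authors' code: we assume the C feature channels are split into two halves, one per class, and each group's globally pooled activation acts as the logit for its class, on which a cross-entropy is computed.

```python
import numpy as np

def channel_group_logits(features):
    """Map a (C, H, W) feature map to two class logits.

    Assumed grouping: the first C//2 channels vote for class 0 ("real"),
    the rest for class 1 ("fake"); each group is globally average-pooled.
    """
    C = features.shape[0]
    g0 = features[: C // 2].mean()   # pooled activation of group 0
    g1 = features[C // 2 :].mean()   # pooled activation of group 1
    return np.array([g0, g1])

def group_loss(features, label):
    """Cross-entropy on the group logits: training pushes the channels of
    the correct class's group to respond more strongly."""
    z = channel_group_logits(features)
    z = z - z.max()                  # numerical stability
    log_p = z - np.log(np.exp(z).sum())
    return float(-log_p[label])

# A feature map whose second half responds strongly should favour class 1:
fmap = np.concatenate([np.zeros((4, 8, 8)), np.ones((4, 8, 8))])
```

Under this loss, each group of channels specialises toward evidence for one class, which is what makes the subsequent channel-wise attention separable by class.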
KEYWORDS: adaptive feature fusion, facial forgery detection

1 | INTRODUCTION

Since video synthesis has made remarkable progress and multimedia technologies have witnessed an explosion of generative models of continuously growing capability and capacity, the malicious abuse of video manipulation technology is causing great concern. Many scholars are devoted to research on video and image forgery detection and have made notable achievements. Tyagi et al. [1] provide a detailed analysis of image and video manipulation and detection techniques. Vinolin et al. [2] focus on building a 3D model of the video frame to generate light coefficients in order to detect forgeries in videos. Chen et al. [3] propose a blind detection model for image forensics based on weak feature extraction. However, the videos generated by generative adversarial networks (GANs) [4] or variational autoencoders (VAEs) [5] are too realistic to distinguish, which causes serious problems such as fake news, threats to public security and privacy violations. For example, in 2018, a realistic-looking video appeared to show the former President Barack Obama cussing another former President, Donald Trump, bringing attention to the risk of DeepFakes. Recently, the most popular term 'DeepFake' in video
We propose a novel algorithm for detecting the duplication of moving objects in video and locating forged motion sequences. First, the algorithm constructs an energy factor (EF) curve to identify the suspect frames of the video. Second, an adaptive-parameter-based Visual Background Extractor (ViBe) algorithm (APViBe) is employed for background modelling. Moreover, all motion sequences are identified by using an improved fast compressive tracking (FCT) algorithm based on an adaptive learning rate and measurement matrix (ALMFCT). Third, a similarity-analysis-based scheme (SAS) is designed to search for pairs of suspect motion sequences. Finally, the flip-invariant scale-invariant feature transform (FISIFT) algorithm is used to match the feature points of moving objects in the pairs of suspect motion sequences, based on which the forged motion sequences in the video are confirmed. Experimental results show that the proposed approach outperforms previous algorithms in computational efficiency, accuracy and robustness.
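The frame-screening stage described above can be sketched as follows. The abstract does not define the energy factor, so this sketch assumes a simple inter-frame difference energy and flags frames whose EF deviates sharply from the curve's statistics; the paper's actual EF formulation and threshold rule may differ.

```python
import numpy as np

def energy_factor_curve(frames):
    """EF_t: sum of squared differences between consecutive frames.
    frames: list of equally sized 2D grey-level arrays (assumed form of EF)."""
    return np.array([((frames[t] - frames[t - 1]) ** 2).sum()
                     for t in range(1, len(frames))])

def suspect_frames(ef, z=3.0):
    """Flag frames whose EF lies more than z standard deviations from the
    mean of the curve (a simple outlier rule, chosen for illustration)."""
    mu, sd = ef.mean(), ef.std()
    return np.where(np.abs(ef - mu) > z * sd)[0] + 1  # +1: EF[0] is frame 1

# A sequence of near-identical noisy frames with one abrupt content change:
rng = np.random.default_rng(0)
frames = [rng.normal(0.0, 0.01, (16, 16)) for _ in range(20)]
for t in range(12, 20):           # simulated tampering from frame 12 onward
    frames[t] = frames[t] + 5.0
ef = energy_factor_curve(frames)
hits = suspect_frames(ef)         # isolates the transition at frame 12
```

Only the flagged frames need to go through the more expensive background modelling and tracking stages, which is where the computational-efficiency gain comes from.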
With the rapid development of image editing technology, tampering with images has become easier. Maliciously tampered images lead to serious security problems (e.g., when used as evidence). The current mainstream methods of image tampering fall into three types: copy-move, splicing and removal. Many image tampering detection methods can only detect one of these types. Additionally, some methods learn features by suppressing image content, which can result in false positives when identifying tampered areas. In this paper, the authors propose a novel framework named the dual supervision neural network (DS-Net) to localize regions tampered by any of the three methods mentioned above. First, to extract richer multiscale information, the authors add skip connections to the atrous spatial pyramid pooling (ASPP) module. Second, a channel attention mechanism is introduced to dynamically weigh the results generated by ASPP. Finally, the authors build additional supervised branches for high-level features to further enhance their extraction before fusing them with low-level features. Extensive experiments on standard datasets show that the AUC scores reach 86.4%, 95.3% and 99.6% on the CASIA, COVERAGE and NIST16 datasets, respectively, and that the F1 scores are 56.0%, 73.4% and 82.7%, respectively. The results demonstrate that the authors' method can accurately locate tampered regions and achieves better performance across datasets than other methods of the same type.
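The channel-attention re-weighting of the ASPP outputs can be sketched as a squeeze-and-excitation-style gate. This is an illustrative sketch under our assumptions (a two-layer gate with arbitrary weights), not the authors' DS-Net implementation:

```python
import numpy as np

def channel_attention(fmap, w1, w2):
    """Squeeze-and-excitation-style gate over channels.

    fmap: (C, H, W) concatenated ASPP branch outputs (assumed layout)
    w1:   (R, C) reduction weights; w2: (C, R) expansion weights
    Returns the feature map with each channel scaled by a gate in (0, 1),
    so informative dilation rates are emphasised and others suppressed.
    """
    s = fmap.mean(axis=(1, 2))               # squeeze: global average pool
    h = np.maximum(w1 @ s, 0.0)              # excitation: FC + ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # FC + sigmoid gate per channel
    return fmap * g[:, None, None]

rng = np.random.default_rng(1)
C, R = 8, 2
fmap = rng.normal(size=(C, 16, 16))
out = channel_attention(fmap, rng.normal(size=(R, C)), rng.normal(size=(C, R)))
```

Because the gate is computed from the pooled content of the feature map itself, the weighting adapts per image rather than being fixed at training time, which matches the "dynamically weigh" behaviour the abstract describes.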