The identification of source cameras from videos, though it is a highly relevant forensic analysis topic, has been studied much less than its counterpart that uses images. In this work we propose a method to identify the source camera of a video based on camera specific noise patterns that we extract from video frames. For the extraction of noise pattern features, we propose an extended version of a constrained convolutional layer capable of processing color inputs. Our system is designed to classify individual video frames which are in turn combined by a majority vote to identify the source camera. We evaluated this approach on the benchmark VISION data set consisting of 1539 videos from 28 different cameras. To the best of our knowledge, this is the first work that addresses the challenge of video camera identification on a device level. The experiments show that our approach is very promising, achieving up to 93.1% accuracy while being robust to the WhatsApp and YouTube compression techniques. This work is part of the EU-funded project 4NSEEK focused on forensics against child sexual abuse. c https://orcid.org/0000-0003-2081-774X d https://orcid.org/0000-0001-6552-2596 * Derrick Timmerman and Guru Swaroop Bennabhaktula are both first authors.
Source camera identification is an important and challenging problem in digital image forensics. The clues of the device used to capture the digital media are very useful for Law Enforcement Agencies (LEAs), especially to help them collect more intelligence in digital forensics. In our work, we focus on identifying the source camera device based on digital videos using deep learning methods. In particular, we evaluate deep learning models with increasing levels of complexity for source camera identification and show that with such sophistication the scene-suppression techniques do not aid in model performance. In addition, we mention several common machine learning strategies that are counter-productive in achieving a high accuracy for camera identification. We conduct systematic experiments using 28 devices from the VISION data set and evaluate the model performance on various video scenarios—flat (i.e., homogeneous), indoor, and outdoor and evaluate the impact on classification accuracy when the videos are shared via social media platforms such as YouTube and WhatsApp. Unlike traditional PRNU-noise (Photo Response Non-Uniform)-based methods which require flat frames to estimate camera reference pattern noise, the proposed method has no such constraint and we achieve an accuracy of $$72.75 \pm 1.1 \%$$ 72.75 ± 1.1 % on the benchmark VISION data set. Furthermore, we also achieve state-of-the-art accuracy of $$71.75\%$$ 71.75 % on the QUFVD data set in identifying 20 camera devices. These two results are the best ever reported on the VISION and QUFVD data sets. Finally, we demonstrate the runtime efficiency of the proposed approach and its advantages to LEAs.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.