Source camera identification is an important and challenging problem in digital image forensics. Clues about the device used to capture digital media are valuable to Law Enforcement Agencies (LEAs), particularly for gathering intelligence in digital forensic investigations. In this work, we focus on identifying the source camera device from digital videos using deep learning methods. In particular, we evaluate deep learning models of increasing complexity for source camera identification and show that, for such sophisticated models, scene-suppression techniques do not improve performance. In addition, we point out several common machine learning strategies that are counter-productive for achieving high camera identification accuracy. We conduct systematic experiments using 28 devices from the VISION data set and evaluate model performance on three video scenarios: flat (i.e., homogeneous), indoor, and outdoor. We also evaluate the impact on classification accuracy when the videos are shared via social media platforms such as YouTube and WhatsApp. Unlike traditional PRNU (Photo Response Non-Uniformity) noise-based methods, which require flat frames to estimate the camera reference pattern noise, the proposed method has no such constraint, and we achieve an accuracy of $$72.75 \pm 1.1\%$$ on the benchmark VISION data set. Furthermore, we achieve state-of-the-art accuracy of $$71.75\%$$ on the QUFVD data set in identifying 20 camera devices. These two results are the best reported to date on the VISION and QUFVD data sets. Finally, we demonstrate the runtime efficiency of the proposed approach and its advantages to LEAs.