Convolutional neural networks (CNNs) have received significant attention due to their ability to adaptively learn classification features directly from data. While CNNs have driven dramatic advances in fields such as object and speech recognition, multimedia forensics is a fundamentally different problem from other deep learning applications. Little work exists to guide the design of CNN architectures for forensic tasks. Furthermore, it is still unclear which forensic tasks can be performed using CNNs. In this work, we investigate the design of CNNs for multiple multimedia forensic applications. We show that CNNs are capable of performing image manipulation detection as well as camera model identification. Through a series of experiments, we systematically examine the influence of several important CNN design choices for forensic applications, such as the use of a constrained convolutional layer or fixed high-pass filter at the beginning of the CNN, the use of a nonlinearity after the first layer, and the choice of activation and pooling functions. We show that different CNN design choices should be made for different forensic applications, and we identify design choices that maximize the performance of CNNs for manipulation detection and camera model identification.
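The abstract names two first-layer options, a fixed high-pass filter and a constrained convolutional layer, without giving their form. The sketch below illustrates one plausible reading in plain Python. The kernel values, the constraint (center weight fixed to -1, remaining weights rescaled to sum to 1), and the helper names `project_constrained` and `filter_valid` are all assumptions made for illustration, not details taken from the text.

```python
# Illustrative sketch only: the kernel values and the constraint used here
# are assumptions, not details stated in the abstract.

def project_constrained(kernel):
    """Rescale a square, odd-sized kernel so that its center weight is -1
    and its remaining weights sum to 1 (an assumed constraint)."""
    size = len(kernel)
    c = size // 2
    # Sum of all off-center weights.
    off_sum = sum(kernel[i][j]
                  for i in range(size) for j in range(size)
                  if (i, j) != (c, c))
    if off_sum == 0:
        raise ValueError("off-center weights must not sum to zero")
    out = [[kernel[i][j] / off_sum for j in range(size)] for i in range(size)]
    out[c][c] = -1.0
    return out


def filter_valid(image, kernel):
    """Slide the kernel over the image without padding or kernel flipping."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(kernel[a][b] * image[i + a][j + b]
                 for a in range(k) for b in range(k))
             for j in range(cols)]
            for i in range(rows)]


# Hypothetical fixed high-pass kernel: its weights sum to zero, so it
# suppresses flat image content and keeps only local variation.
HIGH_PASS = [[0.0, 0.25, 0.0],
             [0.25, -1.0, 0.25],
             [0.0, 0.25, 0.0]]

flat = [[5.0] * 4 for _ in range(4)]           # constant 4x4 "image"
residual_fixed = filter_valid(flat, HIGH_PASS)

learned = [[1.0, 1.0, 1.0],
           [1.0, 5.0, 1.0],
           [1.0, 1.0, 1.0]]                    # stand-in for learned weights
constrained = project_constrained(learned)
residual_constrained = filter_valid(flat, constrained)
```

Under these assumptions, both kernels sum to zero, so a constant image region produces a zero response and only deviations from the local neighborhood pass through. In a trainable layer, a projection like `project_constrained` would presumably be reapplied after each weight update, whereas the fixed filter stays unchanged.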