This paper presents the CG-1050 dataset, consisting of 100 original images, 1050 tampered images and their corresponding masks. The dataset is organized into four directories: original images, tampered images, mask images, and a description file. The directory of original images includes 15 color and 85 grayscale images. The directory of tampered images has 1050 images, each obtained through one of the following types of tampering: copy-move, cut-paste, retouching, and colorizing. The ground-truth mask for each pair of an original image and its tampered version is included in the mask directory (1380 masks). The description file lists the names of the images (i.e., original, tampered and mask), the image description, the photo location, the type of tampering, and the manipulated object in the image. With this dataset, researchers can train and validate fake image classification methods, either for labelling an image as tampered or for pixel-level forgery detection.
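To illustrate how a mask relates an original image to its tampered version, the following minimal sketch marks the pixels that differ between the two. It is only an illustration under stated assumptions, not the procedure used to build CG-1050: the function names `difference_mask` and `tampered_fraction`, the `threshold` parameter, and the toy 3x3 images are all hypothetical, and images are represented as plain nested lists of grayscale intensities to keep the example dependency-free.

```python
def difference_mask(original, tampered, threshold=0):
    """Return a binary mask (1 = changed pixel) for two equally sized grayscale images.

    Images are nested lists of integer intensities. `threshold` is a
    hypothetical tolerance, not a value taken from the dataset.
    """
    if len(original) != len(tampered) or any(
        len(r0) != len(r1) for r0, r1 in zip(original, tampered)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [1 if abs(p0 - p1) > threshold else 0 for p0, p1 in zip(r0, r1)]
        for r0, r1 in zip(original, tampered)
    ]


def tampered_fraction(mask):
    """Fraction of pixels flagged as tampered in a binary mask."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


# Toy 3x3 example (made-up values): two pixels in the middle row were altered.
original = [[10, 10, 10], [20, 20, 20], [30, 30, 30]]
tampered = [[10, 10, 10], [20, 99, 98], [30, 30, 30]]
mask = difference_mask(original, tampered)
```

A mask of this kind can serve as the target for pixel-level forgery detection, while the tampered or original label of the image serves as the target for image-level classification.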