A JPEG image is double-compressed if it underwent JPEG compression twice, each time with a different quantization matrix but with the same 8 × 8 grid. Some popular steganographic algorithms (Jsteg, F5, OutGuess) naturally produce such double-compressed stego images. Because double-compression may significantly change the statistics of DCT coefficients, it negatively influences the accuracy of some steganalysis methods developed under the assumption that the stego image was only single-compressed. This paper presents methods for detection of double-compression in JPEGs and for estimation of the primary quantization matrix, which is lost during recompression. The proposed methods are essential for construction of accurate targeted and blind steganalysis methods for JPEG images, especially those based on calibration. Both methods rely on support vector machine classifiers with feature vectors formed by histograms of low-frequency DCT coefficients.
MOTIVATION
In this paper, we consider a JPEG image double-compressed if it was compressed twice, each time with a different quantization matrix. The quantization matrix used in the first compression is called the primary quantization matrix; the one used in the subsequent (second) compression is called the secondary quantization matrix. Since a JPEG file does not keep information about its compression history, only the latest (secondary) quantization matrix is stored within the file and the primary quantization matrix is lost.
Detection of double-compression is important in steganalysis as well as in forensics because the fact that an image was double-compressed indicates that it was manipulated. By determining the double-compression history in smaller regions, we may discover traces of malicious manipulation. For example, when an object is pasted into a decompressed JPEG and the result is resaved with a different JPEG quality factor, the pasted object may exhibit different repetitive JPEG compression artifacts than the rest of the image.
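
To make the roles of the primary and secondary quantization steps concrete, the following minimal sketch quantizes synthetic values first with a primary step and then with a secondary step, and compares the resulting histogram with that of single quantization. It is only an illustration under simplifying assumptions: the Gaussian samples are a hypothetical stand-in for one low-frequency DCT mode, the steps `q1 = 5` and `q2 = 3` are arbitrary, the function names are ours, and the rounding and truncation that occur in the spatial domain during decompression are ignored.

```python
import math
import random
from collections import Counter


def quantize(value, step):
    """Return the quantized index of `value` for quantization step `step`."""
    # Round half up; exact ties do not arise for the steps used below.
    return math.floor(value / step + 0.5)


def double_quantize(value, q1, q2):
    """Quantize with primary step q1, dequantize, then requantize with q2."""
    dequantized = quantize(value, q1) * q1  # coefficient after the first compression
    return quantize(dequantized, q2)


def histograms(samples, q1, q2):
    """Histograms of single- (q2 only) and double-quantized (q1 then q2) values."""
    single = Counter(quantize(s, q2) for s in samples)
    double = Counter(double_quantize(s, q1, q2) for s in samples)
    return single, double


# Hypothetical stand-in for the values of one low-frequency DCT mode.
rng = random.Random(0)
samples = [rng.gauss(0.0, 10.0) for _ in range(10000)]
single, double = histograms(samples, q1=5, q2=3)

# Print the counts of the first few non-negative bins side by side.
for k in range(0, 7):
    print(k, single[k], double[k])
```

Under this simplified model, bins 1, 4, and 6 of the double-quantized histogram are empty, because no multiple of 5 rounds to those indices after division by 3, whereas the single-quantized histogram is populated in every nearby bin. Structure of this kind in the histograms of low-frequency coefficients is the sort of statistical change that histogram-based features can capture.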
Some steganographic algorithms (e.g., F5 [21] and OutGuess [18]) decompress the cover image to the spatial domain and then recompress it during embedding with a user-supplied or default quality factor. Unless the quantization matrices match, the resulting stego image will be double-compressed. Thus, steganalytic methods also benefit from knowledge of the stego image's compression history. This is especially true for methods that use calibration [5] to estimate the statistics of the cover image. For these methods, it is essential to adjust the calibration to mimic what happened during embedding. To do so, we need to accurately detect double-compressed images and estimate their primary quantization matrix; otherwise, the steganalytic methods may give completely misleading results.
In this paper, we address two problems: the detection of double-compression in JPEG images and the estimation of the primary quantization matrix. Even though the first problem can be understood as a subproblem of the second one, we consider them separately. This...