Advances in digital media and information technology over the past decades have made the Internet an effective medium for data exchange and communication. At the same time, data has become susceptible to mismanagement and exploitation. This has led to the emergence of Internet security frameworks such as information hiding and detection, of which steganography and steganalysis, respectively, are examples. This work focuses on addressing possible security breaches using an Internet security framework based on information hiding, together with techniques to identify the presence of a breach. The work uses a blind steganalysis technique that incorporates machine learning. It is carried out on images in the Joint Photographic Experts Group (JPEG) format because of its wide use for transmission over the Internet. Stego (embedded) images are created for evaluation by randomly embedding a text message into the image. Calibration is used to obtain an estimate of the cover (clean) image for analysis. The embedding is done with four steganographic schemes spanning the spatial and transform domains, namely LSB Matching, LSB Replacement, Pixel Value Differencing and F5. After data is embedded at random percentages, first-order, second-order, extended Discrete Cosine Transform (DCT) and Markov features are extracted for steganalysis. These features combine inter-block and intra-block dependencies; they are considered together in this paper to offset the drawbacks each has when considered separately. Dimensionality reduction is applied to the features using Principal Component Analysis (PCA). A block-based technique is applied to the images for better accuracy of results. Machine learning is incorporated by using classifiers to differentiate a stego image from a cover image.
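As an illustration of the two LSB schemes named above, the following is a minimal sketch, not the implementation used in this work. It assumes 8-bit pixel values held in a plain list and a message already converted to bits; the function names `lsb_replace`, `lsb_match` and `extract_lsb` are hypothetical.

```python
import random

def lsb_replace(pixels, bits):
    """LSB replacement: overwrite the least significant bit with the message bit."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b
    return out

def lsb_match(pixels, bits, rng=None):
    """LSB matching: when the LSB differs from the message bit, add or subtract 1."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    if rng is None:
        rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    out = list(pixels)
    for i, b in enumerate(bits):
        if (out[i] & 1) != b:
            if out[i] == 0:        # stay inside the 8-bit range
                out[i] += 1
            elif out[i] == 255:
                out[i] -= 1
            else:
                out[i] += rng.choice((-1, 1))
    return out

def extract_lsb(pixels, n):
    """Read back the first n message bits from either scheme."""
    return [p & 1 for p in pixels[:n]]
```

Both schemes are read back the same way; they differ only in how a mismatching pixel is changed, which affects the statistical traces that steganalysis features try to detect.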
A comparative study was carried out between the Support Vector Machine classifier and its evolutionary counterpart based on Particle Swarm Optimization. Cross-validation is also used in this work for better accuracy of results. Further parameters examined are four types of sampling, namely linear, shuffled, stratified and automatic, and six kernels used in classification, including dot, multiquadric, Epanechnikov, radial and ANOVA, to identify which combination yields the best result.
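The sampling types are not defined in the abstract. The sketch below assumes their usual meaning when building cross-validation folds: linear keeps the original order, shuffled permutes the samples, and stratified keeps the cover/stego ratio roughly equal in every fold. The function names are hypothetical, and automatic sampling is not sketched.

```python
import random
from collections import defaultdict

def linear_folds(n, k):
    """Contiguous folds taken in the original sample order."""
    return [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]

def shuffled_folds(n, k, seed=0):
    """Folds drawn from a random permutation of the sample indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def stratified_folds(labels, k, seed=0):
    """Folds in which each class is spread as evenly as possible."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the indices of this class round-robin across the folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```

With a data set stored as all cover images followed by all stego images, linear folds can contain a single class, which is one reason the choice of sampling can change the measured accuracy.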