Despite the huge advances in digital communications over the last decade, physical documents are still the most common medium for information transfer, especially in official contexts. However, readily available document processing devices and techniques (printers, scanners, etc.) make it easy for forgers to manipulate or imitate original documents. Verifying authenticity and detecting forgery are therefore of paramount importance to all agencies that receive printed documents. We propose an unsupervised forgery detection framework that determines whether a document is forged based on the spectroscopy of the document’s ink. The spectra of the inks of the tested documents (original and questioned) were obtained using laser-induced breakdown spectroscopy (LIBS). A single correlation matrix was then calculated over the spectra of the original and questioned documents together, and this matrix was transformed into an adjacency matrix so that it could be treated as a weighted network in the graph-theoretic sense. Clustering algorithms were then applied to the network to determine the number of clusters. The proposed approach was tested under a variety of scenarios, with different types of printing devices (e.g., inkjet printers, laser printers, and photocopiers) and different kinds of paper. The findings show that the proposed approach identifies forged documents with high accuracy and high detection speed. It also produces a visual output that is easily interpretable by non-experts, which makes it flexible for real-world application.
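The pipeline described above (spectra, correlation matrix, adjacency matrix, weighted network, cluster count) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the synthetic six-point "spectra", the Pearson correlation measure, the 0.9 threshold used to build the adjacency matrix, and the function names. In place of the clustering algorithms the article applies, the sketch simply counts connected components of the thresholded network, which is a much cruder stand-in.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(spectra):
    """Pairwise correlations over all spectra (original and questioned together)."""
    n = len(spectra)
    return [[pearson(spectra[i], spectra[j]) for j in range(n)] for i in range(n)]

def adjacency_matrix(corr, threshold):
    """Keep a weighted edge only where the correlation reaches the threshold.

    The threshold is an illustrative assumption; self-loops are dropped.
    """
    n = len(corr)
    return [
        [corr[i][j] if i != j and corr[i][j] >= threshold else 0.0 for j in range(n)]
        for i in range(n)
    ]

def count_clusters(adj):
    """Count connected components with union-find (stand-in for real clustering)."""
    n = len(adj)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] > 0:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Synthetic spectra (hypothetical values): two measurements per ink.
original_1 = [1.0, 5.0, 2.0, 8.0, 1.0, 3.0]
original_2 = [1.1, 4.9, 2.2, 7.8, 1.0, 3.1]
questioned_1 = [6.0, 1.0, 7.0, 1.0, 5.0, 2.0]
questioned_2 = [5.8, 1.2, 7.1, 0.9, 5.2, 2.1]

mixed = [original_1, original_2, questioned_1, questioned_2]
same = [original_1, original_2]

mixed_clusters = count_clusters(adjacency_matrix(correlation_matrix(mixed), 0.9))
same_clusters = count_clusters(adjacency_matrix(correlation_matrix(same), 0.9))
print(mixed_clusters, same_clusters)
```

Under these assumptions, spectra from the same ink end up in one cluster, while a questioned document printed with a different ink forms a separate cluster. Reading more than one cluster as a sign of possible forgery is this sketch's interpretation of the cluster-count criterion, not a rule stated in the article.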
Document forgery detection is an active field of research in digital forensics. One of the most common difficulties investigators face is selecting an approach that balances accuracy, complexity, cost, and ease of use. The literature includes many approaches based on either image processing techniques or spectral analysis. However, most of the available approaches have issues related to complexity and accuracy. This article proposes an unsupervised forgery detection framework that uses the correlations among the spectra of the document materials to generate a weighted network for the tested documents. The network is then clustered using several unsupervised clustering algorithms. The detection rate is measured according to the number of network clusters. Based on the obtained results, our approach provides high accuracy with the Louvain clustering algorithm, while the updated version of DBSCAN was more successful when testing many documents at the same time. Additionally, the suggested framework is simple to implement and does not require professional knowledge to use.