In this paper, we consider the problem of detecting counterfeit identity documents in images captured with smartphones. As a number of documents contain special fonts, we study the applicability of convolutional neural networks (CNNs) for checking whether the fonts used conform to those prescribed by government standards. We use multi-task learning to differentiate samples by both font and character, and compare the resulting classifier with its analogue trained for binary font classification only. We train neural networks to estimate the authenticity of the fonts used in the machine-readable zones and ID numbers of the Russian national passport, and test them on samples of individual characters acquired from 3238 images of the Russian national passport. Our results show that multi-task learning increases both the sensitivity and the specificity of the classifier. Moreover, the resulting CNNs demonstrate high generalization ability, as they correctly classify fonts that were not present in the training set. We conclude that the proposed method is sufficient for font authentication and can be used as part of a forgery detection system for images acquired with a smartphone camera.
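For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity are computed for a binary font classifier. The label convention (1 = non-conforming font, 0 = genuine font) and the function name `sensitivity_specificity` are assumptions made for illustration only; the abstract does not state which class is treated as positive.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention (not taken from the paper):
    1 = non-conforming (forged) font, 0 = genuine font.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # forged sample correctly flagged
        elif truth == 1 and pred == 0:
            fn += 1          # forged sample missed
        elif truth == 0 and pred == 0:
            tn += 1          # genuine sample correctly accepted
        else:
            fp += 1          # genuine sample wrongly flagged

    # Sensitivity: share of positive samples that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of negative samples that were accepted.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    print(sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

A classifier that improves on both numbers at once, as the abstract reports for the multi-task variant, both misses fewer forged samples and rejects fewer genuine ones.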
This work focuses on the Fast Hough Transform (FHT) algorithm proposed by M.L. Brady. We show how to modify the standard FHT to calculate sums along lines within any given range of inclination angles. We also describe a new way to visualise the Hough image, based on regrouping the accumulator space around its center. Finally, we prove that under the Brady parameterization any line is transformed into a figure of type "angle".
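As background for the abstract above, here is a minimal sketch of the commonly described recursive form of the FHT for one family of lines (those that descend by 0 to w-1 rows while crossing the image from left to right). It is not the modified algorithm proposed in the paper. The function name `fht_dyadic`, the cyclic wrap-around of rows, and the restriction to widths that are powers of two are assumptions of this sketch.

```python
import numpy as np


def fht_dyadic(img: np.ndarray) -> np.ndarray:
    """Sum an image along dyadic line patterns (illustrative sketch).

    out[s, t] is the sum over a discrete line that starts at row s in
    the first column and ends at row (s + t) mod h in the last column.
    Rows wrap around cyclically (an assumption of this sketch).
    """
    h, w = img.shape
    if w & (w - 1):
        raise ValueError("image width must be a power of two")
    if w == 1:
        # A single column: the only possible shift is t = 0.
        return img.astype(float)

    half = w // 2
    left = fht_dyadic(img[:, :half])    # partial sums over the left half
    right = fht_dyadic(img[:, half:])   # partial sums over the right half

    out = np.zeros((h, w), dtype=float)
    for t in range(w):
        t_half = t // 2          # shift accumulated inside each half
        offset = t - t_half      # row at which the right half starts, relative to s
        # np.roll with a negative shift gives rolled[s] == right[(s + offset) % h].
        out[:, t] = left[:, t_half] + np.roll(right[:, t_half], -offset)
    return out


if __name__ == "__main__":
    # A 4x4 image containing one diagonal line of ones.
    demo = np.eye(4)
    print(fht_dyadic(demo))
```

Each level of the recursion reuses the sums of the two half-width images, which is what reduces the cost from the naive O(n^3) to O(n^2 log n) for an n-by-n image.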
During document recognition in a video stream captured by a mobile device camera, the image quality of the document varies greatly from frame to frame. Sometimes the recognition system is required not only to recognize all the specified attributes of the document, but also to select a final document image of the best quality. This is necessary, for example, for archiving or for providing various services; in some countries it may be required by law. In this case, the recognition system needs to assess the quality of the frames in the video stream and choose the “best” frame. In this paper we consider a solution to this problem in which the “best” frame is one where all specified attributes are present in readable form in the document image. The method was configured on a private dataset and then tested on documents from the open MIDV-2019 dataset. The result obtained is practically applicable in recognition systems.
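One way to make the "all specified attributes are readable" criterion concrete is sketched below. This is a hypothetical illustration, not the method of the paper: it assumes each frame comes with a per-field readability score in [0, 1] (for example, a recognition confidence), scores a frame by its weakest required field, and picks the frame with the highest such score. The names `select_best_frame`, `frame_scores`, and `required_fields` are invented for this sketch.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def select_best_frame(
    frame_scores: Sequence[Dict[str, float]],
    required_fields: Sequence[str],
) -> Tuple[Optional[int], float]:
    """Pick the frame whose least readable required field is most readable.

    frame_scores[i] maps a field name to an assumed readability score
    in [0, 1]. A field missing from a frame counts as 0.0 (unreadable).
    Returns (index of the best frame, its score), or (None, 0.0) if
    there are no frames.
    """
    if not required_fields:
        raise ValueError("required_fields must not be empty")

    best_index: Optional[int] = None
    best_score = 0.0
    for index, scores in enumerate(frame_scores):
        # A frame is only as good as its worst required field.
        frame_score = min(scores.get(name, 0.0) for name in required_fields)
        if best_index is None or frame_score > best_score:
            best_index, best_score = index, frame_score
    return best_index, best_score


if __name__ == "__main__":
    # Toy scores for three frames, not data from the paper.
    frames: List[Dict[str, float]] = [
        {"name": 0.9, "number": 0.2},
        {"name": 0.7, "number": 0.8},
        {"name": 0.95},  # "number" was not found in this frame
    ]
    print(select_best_frame(frames, ["name", "number"]))
```

Taking the minimum over fields, rather than the mean, reflects the requirement that every specified attribute be readable: a frame with one illegible field is rejected even if the others are sharp.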