Abstract-A new statistical approach based on global typographical features is proposed to the widely neglected problem of font recognition. It aims at the identification of the typeface, weight, slope and size of the text from an image block without any knowledge of the content of that text. The recognition is based on a multivariate Bayesian classifier and operates on a given set of known fonts. The effectiveness of the adopted approach has been experimented on a set of 280 fonts. Font recognition accuracies of about 97 percent were reached on high-quality images. In addition, rates higher than 99.9 percent were obtained for weight and slope detection. Experiments have also shown the system robustness to document language and text content and its sensitivity to text length.
This paper presents a Convolutional Neural Network (CNN) based page segmentation method for handwritten historical document images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as one of the predefined classes. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we propose to learn features from raw image pixels using a CNN. While many researchers focus on developing deep CNN architectures to solve different problems, we train a simple CNN with only one convolution layer. We show that the simple architecture achieves competitive results against other deep architectures on different public datasets. Experiments also demonstrate the effectiveness and superiority of the proposed method compared to previous methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.