Abstract. Hidden Markov Models (HMMs) are now widely used for off-line text recognition in many languages and, in particular, Arabic. In previous work, we proposed to use columns of raw, binary image pixels directly as input to embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. The idea was to bypass feature extraction and to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. More recently, we extended the column bit vectors by means of a sliding window of adequate width to better capture image context at each horizontal position of the word image. However, these models may have a limited capability to properly model vertical image distortions. In this paper, we consider three methods of window repositioning after window extraction to overcome this limitation, in which each sliding window is translated (repositioned) so that its center is aligned with the center of mass of its pixels. Using this approach, state-of-the-art results are reported on the Arabic Printed Text Recognition (APTI) database.
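The repositioning idea can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a binary word image with ink pixels set to 1, and the function and parameter names (repositioned_windows, width) are ours. Each window is translated vertically so that its center coincides with the center of mass of the ink it contains, and the shifted window is flattened into the bit vector fed to the Bernoulli HMM.

```python
import numpy as np

def _vshift(win, shift):
    """Translate a window vertically by `shift` rows, filling exposed rows with background (0)."""
    out = np.zeros_like(win)
    if shift > 0:
        out[shift:, :] = win[:-shift, :]
    elif shift < 0:
        out[:shift, :] = win[-shift:, :]
    else:
        out = win.copy()
    return out

def repositioned_windows(image, width=9):
    """Extract sliding windows from a binary word image (H x W, ink = 1) and
    reposition each one so its vertical center aligns with the center of mass
    of its ink pixels; returns one flattened bit vector per horizontal position."""
    H, W_img = image.shape
    half = width // 2
    padded = np.pad(image, ((0, 0), (half, half)))   # zero-pad columns at the borders
    features = []
    for x in range(W_img):
        win = padded[:, x:x + width]                 # H x width window at position x
        rows, _ = np.nonzero(win)
        if rows.size:
            com = int(round(rows.mean()))            # vertical center of mass of the ink
        else:
            com = H // 2                             # empty window: leave it centered
        shift = H // 2 - com                         # translation aligning center with center of mass
        features.append(_vshift(win, shift).flatten())
    return np.asarray(features, dtype=np.uint8)
```

In this sketch only vertical repositioning is shown; the three methods considered in the paper differ in how and where the repositioning is applied, which this example does not attempt to reproduce.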