The complexity of the recognition process depends strongly on the language, the type of writing, and the vocabulary size. Our work contributes to a system for recognizing a large canonical Arabic vocabulary of decomposable words derived from tri-consonantal roots. The system relies on the collaboration of three morphological classifiers specialized in recognizing roots, schemes, and conjugations. This work addresses the first classifier: we propose a root classifier based on 101 Hidden Markov Models, used to classify 101 tri-consonantal roots. All models share the same architecture, which incorporates Arabic linguistic knowledge. The proposed system currently handles a vocabulary of 5757 words. It was trained and then tested on a total of more than 17000 samples of printed words. The results are satisfactory: the top-2 recognition rate reached 96%.
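As a rough illustration of how a bank of per-class Hidden Markov Models can produce a top-2 decision, the sketch below scores an observation sequence against each model with the forward algorithm and keeps the two most likely labels. It is not the architecture described above: the discrete two-state models, the labels `root_A`, `root_B`, `root_C`, and all probabilities are hypothetical toy values, and the function names are assumptions made for this example. "Top-2" is taken here in its usual sense, that the correct class is among the two best-scoring candidates.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    # log(0) is treated as -inf so forbidden transitions drop out cleanly.
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))) that tolerates -inf entries.
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_likelihood(obs, start, trans, emit):
    # Forward algorithm in log space for a discrete-observation HMM.
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return log_sum_exp(alpha)

def top_k_labels(obs, models, k=2):
    # Score the sequence under every model and return the k best labels.
    scores = {label: log_likelihood(obs, *params) for label, params in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy models: (start, transition, emission) per label.
# All share one left-to-right topology and differ only in emissions.
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "root_A": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "root_B": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
    "root_C": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.5, 0.5], [0.5, 0.5]]),
}

if __name__ == "__main__":
    print(top_k_labels([0, 0, 1, 1], MODELS, k=2))
```

In a real system the observation symbols would come from features extracted from the word image, and there would be one trained model per root instead of three hand-written ones.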