Handwritten digit recognition is one of the extensively studied areas in machine learning. Apart from the wider research on handwritten digit recognition on MNIST dataset, there are many other research works on various script recognition. However, it is not very common for multi-script digit recognition which encourages the development of robust and multipurpose systems. Additionally, working on multi-script digit recognition enables multi-task learning. It is evident that multi-task learning improves model performance through inductive transfer using the information contained in related tasks. Therefore, in this study multi-script handwritten digit recognition using multi-task learning is proposed to be investigated. As a specific case of demonstrating the solution to the problem, Amharic handwritten character recognition is also experimentally tested. The handwritten digits of three scripts including Latin, Arabic, and Kannada are studied to show that multi-task models with a reformulation of the individual tasks have shown promising results. In this study, a novel approach of using the individual tasks predictions was proposed to help the classification performance. These research findings have outperformed the baseline and the conventional multi-task learning models. More importantly, it avoided the need for weighting the different losses of the tasks, which is one of the challenges in multi-task learning.
Amharic is one of the official languages of the Federal Democratic Republic of Ethiopia. It is one of the languages that use an Ethiopic script which is derived from Gee'z, ancient and currently a liturgical language. Amharic is also one of the most widely used literature-rich languages of Ethiopia. There are very limited innovative and customized research works in Amharic optical character recognition (OCR) in general and Amharic handwritten text recognition in particular. In this study, Amharic handwritten word recognition will be investigated. State-of-theart deep learning techniques including convolutional neural networks together with recurrent neural networks and connectionist temporal classification (CTC) loss were used to make the recognition in an end-toend fashion. More importantly, an innovative way of complementing the loss function using the auxiliary task from the row-wise similarities of the Amharic alphabet was tested to show a significant recognition improvement over a baseline method. Such findings will promote innovative problem-specific solutions as well as will open insight to a generalized solution that emerges from problem-specific domains.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.