Improving the accuracy of Arabic text recognition in imagery requires a large, modern dataset, since data is the fuel for many modern machine learning models. This paper proposes a new dataset, the Quran Text Image Dataset (QTID), the first Arabic dataset that includes Arabic marks. It consists of 309,720 different 192x64 annotated Arabic word images, containing 2,494,428 characters in total, taken from the Holy Quran. These finely annotated images were randomly divided into training, validation, and test sets of 90%, 5%, and 5%, respectively. To analyze QTID, several dataset statistics are presented. Experimental evaluation shows that leading Arabic text recognition engines such as Tesseract and ABBYY FineReader perform poorly on word images from the proposed dataset.
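The random 90%/5%/5% division described above can be illustrated with a minimal sketch. It is not the authors' actual procedure: the function name `split_dataset`, the fixed seed, and the use of integer indices in place of word images are all assumptions made for illustration.

```python
import random


def split_dataset(items, percents=(90, 5, 5), seed=0):
    """Shuffle items and cut them into train/validation/test lists.

    Hypothetical helper: percents are whole numbers summing to 100, and
    integer arithmetic is used so the cut points are exact.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * percents[0] // 100
    n_val = n * percents[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Stand-in for the 309,720 word images: one integer index per image.
train, val, test = split_dataset(range(309_720))
print(len(train), len(val), len(test))
```

Seeding the shuffle keeps the three sets stable across runs, which matters when results on the test set are to be compared between recognition engines.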
SUMMARY
The paper presents concepts and ideas underlying an approach for consistency management in object-oriented (OO) databases. In this approach, constraints are considered first-class citizens and are stored in a meta-database called the constraints catalog. When an object is created, its constraints are retrieved from the constraints catalog and relationships between these constraints and the object are established. The structure of constraints has several features that enhance consistency management in OO database management systems and that conventional approaches do not provide in a satisfactory way. These include: monitoring object consistency at different levels of update granularity, integrity independence, and efficiency of constraints maintenance; controlling inconsistent objects; enabling and disabling constraints, globally for all objects or locally for individual objects; and declaring constraints on individual objects. All these features are provided by means of basic notations of OO data models.
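The idea of a constraints catalog can be sketched roughly as follows. This is an illustrative toy, not the paper's design: the class names `Constraint`, `ConstraintsCatalog`, and `ManagedObject`, their methods, and the `Employee` example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set


@dataclass
class Constraint:
    """A constraint stored as a first-class object (hypothetical shape)."""
    name: str
    check: Callable[[Any], bool]
    enabled: bool = True  # global switch, seen by every object linked to it


class ConstraintsCatalog:
    """Toy meta-database mapping class names to their constraints."""

    def __init__(self) -> None:
        self._by_class: Dict[str, List[Constraint]] = {}

    def declare(self, class_name: str, constraint: Constraint) -> None:
        self._by_class.setdefault(class_name, []).append(constraint)

    def constraints_for(self, class_name: str) -> List[Constraint]:
        return list(self._by_class.get(class_name, []))


class ManagedObject:
    """Object that links itself to its constraints when it is created."""

    def __init__(self, catalog: ConstraintsCatalog, class_name: str, **attrs: Any) -> None:
        for key, value in attrs.items():
            setattr(self, key, value)
        # Retrieve the class's constraints from the catalog at creation time.
        self.constraints: List[Constraint] = catalog.constraints_for(class_name)
        self.locally_disabled: Set[str] = set()

    def declare(self, constraint: Constraint) -> None:
        """Declare a constraint on this individual object only."""
        self.constraints.append(constraint)

    def disable(self, name: str) -> None:
        """Disable a constraint locally, for this object only."""
        self.locally_disabled.add(name)

    def violations(self) -> List[str]:
        """Names of active constraints that this object currently violates."""
        return [
            c.name
            for c in self.constraints
            if c.enabled and c.name not in self.locally_disabled and not c.check(self)
        ]


catalog = ConstraintsCatalog()
non_negative = Constraint("salary_non_negative", lambda obj: obj.salary >= 0)
catalog.declare("Employee", non_negative)

alice = ManagedObject(catalog, "Employee", salary=-5)
bob = ManagedObject(catalog, "Employee", salary=-1)
print(alice.violations(), bob.violations())
```

In this sketch, calling `alice.disable("salary_non_negative")` silences the constraint for `alice` alone, while setting `non_negative.enabled = False` switches it off for every object that shares it.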