“…Then, the extracted image word candidates are represented by a feature descriptor. Different types of descriptors have been proposed [1], [2], [3], [4]. Some methods describe the word image with a global representation, e.g., gradient, contextual, and convexity features [5], [6], [7], [8], features based on moments of binary images [9] or features based on the spatial distribution of shape pixels in a set of predefined image sub-regions [10].…”