“…Consequently, the ground truth for the evaluation becomes more complex and contains the location of each word in addition to its transcription. Locations can be expressed as, e.g., lines [14], bounding boxes [11], convex hulls [15], or other polygons. Our work relies only on closed polygonal lines enclosing areas and makes no further assumptions about their geometric structure.…”
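
As a minimal sketch of what a location given as a closed polygonal line could look like in practice, the Python snippet below stores a polygon as an ordered list of vertices, converts a bounding box into that form, and computes the enclosed area with the shoelace formula. The names `Polygon`, `bbox_to_polygon`, and `polygon_area` are hypothetical and do not come from the quoted work, and the area computation assumes a simple (non-self-intersecting) polygon, which is an assumption of this sketch rather than of the source.

```python
from typing import List, Tuple

# Hypothetical types for illustration; not taken from the quoted work.
Point = Tuple[float, float]
Polygon = List[Point]  # ordered vertices; the last vertex is implicitly joined to the first


def bbox_to_polygon(x_min: float, y_min: float, x_max: float, y_max: float) -> Polygon:
    """Express an axis-aligned bounding box as a closed polygon with four vertices."""
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


def polygon_area(polygon: Polygon) -> float:
    """Area enclosed by a simple closed polygon, computed with the shoelace formula."""
    n = len(polygon)
    if n < 3:
        # Fewer than three vertices cannot enclose an area.
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygonal line
        twice_area += x1 * y2 - x2 * y1
    # The sign depends on vertex orientation, so take the absolute value.
    return abs(twice_area) / 2.0


box = bbox_to_polygon(0.0, 0.0, 4.0, 2.0)
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
print(polygon_area(box), polygon_area(triangle))
```

Under this representation, bounding boxes and convex hulls are simply special cases of the same vertex list, so code that consumes word locations does not need to distinguish between them.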