“…Captions, therefore, have different characteristics from scene texts. There have been many attempts at this task, such as [4,5,6,7,8,9,10,11,12,13]. Most of them deal with static captions (i.e., captions without motion), while Zedan [10] deals with moving captions, assuming vertical or horizontal scrolling of the caption text in addition to static captions.…”