The quality of the input text image has a clear impact on the output of a scene text recognition (STR) system; however, because the main content of a text image is a character sequence carrying semantic information, effectively assessing text image quality remains a research challenge. Text image quality assessment (TIQA) can help identify hard samples, leading to a more robust STR system and to recognition-oriented text image restoration. In this paper, arguing that text image quality derives from character-level texture features and embedding robustness, we propose a learning-based fine-grained, sharp, and recognizable text image quality assessment method (FSR-TIQA), which is, to our knowledge, the first TIQA scheme. To overcome the difficulty of obtaining character positions in a text image, an attention-based recognizer is used to generate character embeddings and character images. We use the similarity distribution distance between the intra-class and inter-class similarity distributions to evaluate character embedding robustness, and Haralick features to reflect the clarity of the texture in the character region. A quality score network is then designed under a label-free training scheme to normalize the texture features and output the quality score. Extensive experiments indicate that FSR-TIQA discriminates clearly between text images of different quality on standard benchmarks and the TextZoom dataset. Our method shows good potential for analyzing dataset distributions and guiding dataset collection.
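As a rough illustration of the Haralick-style texture statistics the abstract refers to, the sketch below computes a gray-level co-occurrence matrix (GLCM) and two classic Haralick features, contrast and homogeneity, in plain numpy. The quantization level, offset, and choice of features here are assumptions for illustration, not the exact configuration used by FSR-TIQA:

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    q = (img.astype(float) / 256 * levels).astype(int)  # quantize to `levels` gray bins
    h, w = q.shape
    m = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            m[q[y, x], q[y + dy, x + dx]] += 1          # count co-occurring gray pairs
    m = m + m.T                                          # make the matrix symmetric
    return m / m.sum()

def haralick_contrast(p):
    """High when neighboring pixels differ strongly (sharp edges)."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())

def haralick_homogeneity(p):
    """High when neighboring pixels are similar (flat or blurred regions)."""
    i, j = np.indices(p.shape)
    return float((p / (1 + np.abs(i - j))).sum())
```

On a sharp character crop, strong stroke edges produce mass far from the GLCM diagonal, so contrast is high; a blurred or flat crop concentrates mass near the diagonal, raising homogeneity instead.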
Research on surface defect classification and segmentation has seen significant progress in recent years. However, few works address one-class learning for this task with a single model. Previous surface defect detection methods leave several problems unsolved: for example, training requires a large number of samples, and the models cannot classify and locate surface defects accurately. The main contribution of this work is that we summarize the overall ideas of previous research in network design and propose a multi-task model that can be trained using only a few positive samples. Experiments on the AITEX defect detection dataset [1] achieve 84.4% DR, 4.4% FAR, and 34.2% mIoU, and an ablation study on a real industrial product dataset validates the effect of different backbones on DCSNet. It is worth mentioning that DCSNet provides a solution to surface defect classification and segmentation based on one-class learning. The code will be open-sourced at https://agit.ai/wyxxx/zhengtu.
Although much progress has been made in text recognition/OCR in recent years, font recognition remains challenging. The main difficulty lies in the subtle differences between similar fonts, which are hard to distinguish. This paper proposes a novel font recognizer with a pluggable module, called the HE Block, for the font recognition task. The HE Block hides the most discriminative, easily accessible features and forces the network to consider other, more complicated features in order to handle hard examples of similar fonts. Compared with the available public font recognition systems, our proposed method does not require any interaction at the inference stage. Extensive experiments demonstrate that HENet achieves encouraging performance on both the character-level dataset Explor all and the word-level dataset AdobeVFR.
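To make the feature-hiding idea concrete, the numpy sketch below masks the highest-energy spatial locations of a feature map during training so the network must rely on less obvious cues; at inference it is a no-op, matching the abstract's claim of no extra interaction. The function name `he_block`, the spatial granularity, and the hide ratio are assumptions for illustration, not the paper's exact design:

```python
import numpy as np

def he_block(feat, hide_ratio=0.25, training=True):
    """Zero out the top `hide_ratio` fraction of spatial locations by activation
    energy, forcing later layers to use less discriminative features.
    feat: (C, H, W) feature map."""
    if not training:
        return feat                           # identity at inference time
    energy = feat.sum(axis=0)                 # (H, W) per-location activation energy
    k = max(1, int(hide_ratio * energy.size))
    thresh = np.sort(energy.ravel())[-k]      # k-th largest energy value
    mask = (energy < thresh).astype(feat.dtype)
    return feat * mask[None, :, :]            # broadcast mask over channels
```

Dropping the most salient regions in this way is related in spirit to adversarial erasing techniques: hard examples of similar fonts differ in exactly the features that survive the mask.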