Character recognition is an active research topic, and a large body of excellent work has appeared. In contrast, research on the recognition of Tangut characters is still at an early stage. Building databases and effective recognition methods for Tangut characters remains a great challenge. In this paper, a labeling method based on Multi‐Model and Multi‐Prediction (MMMP) is proposed and used to build a Tangut character database (TCD) and an enhanced database (called “TCD‐E”) covering 6077 classes; five test sets were also built for specific tasks. To recognize Tangut characters effectively and quickly, a 5‐layer end‐to‐end Tangut Characters Recognition Network (TCRNet) based on a shallow CNN is designed. Its recognition accuracy on TCD‐E reaches 97.96%. Based on TCRNet, an end‐to‐end Similar Tangut Characters Recognition Network (STCRNet) is further proposed, whose loss function combines the softmax loss function with the central loss function; its test accuracy on the similar Tangut characters test set (called “TCD‐E‐S”) is 0.70% higher than that of TCRNet. Experiments show that TCD and TCD‐E can provide data support for Tangut character recognition. The recognition accuracy of TCRNet and STCRNet surpasses the previous best results.
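The combined loss mentioned for STCRNet can be illustrated with a minimal sketch. It assumes the "central loss function" refers to the commonly used center loss, which penalizes the squared distance between a sample's feature vector and a learned center for its class. The weighting factor `lam`, the function names, and the per-sample formulation are assumptions made for illustration; the abstract does not give the exact formula or hyperparameters.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, lam=0.01):
    """Softmax loss plus lam times center loss for a single sample.

    `centers` is a list with one center vector per class; `lam` is a
    hypothetical weighting factor, not a value taken from the paper.
    """
    return softmax_cross_entropy(logits, label) + lam * center_loss(feature, centers[label])
```

The softmax term separates classes, while the center term pulls features of the same class together, which is one plausible reason such a combination helps on visually similar characters.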
Tangut characters were created by the Tangut of the Western Xia (Xi Xia) Dynasty in ancient China and are over 1000 years old. In deep-learning-based recognition studies on Tangut characters, the lack of category-complete datasets has been problematic. Data augmentation cannot produce character categories in unknown styles, whereas image generation can effectively solve the problem. In this study, we treat the generation of antique book calligraphy styles of Tangut characters as a problem of learning a mapping from existing printed styles to personalized antique book calligraphy styles. We present M-ResNet, a multi-scale feature extraction residual unit, and Tangut-CycleGAN, a model for generating Tangut characters that combines M-ResNet with a cycle-consistent adversarial network (CycleGAN). This method uses unpaired data to generate Tangut character images in the calligraphy style of ancient books. To enhance the response of the model to significant channels, a squeeze-and-excitation (SE) module is added to Tangut-CycleGAN, yielding the Tangut-CycleGAN+SE method for generating images of Tangut characters. This method is not only suitable for Tangut character image generation but can also effectively generate calligraphy with aesthetic value. In addition, we propose an overall quality discrepancy evaluation metric, FA (Fréchet inception distance + Accuracy), which combines style discrepancy and content accuracy metrics to evaluate the quality of generated character images.
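The channel reweighting performed by a squeeze-and-excitation module can be sketched in plain Python. This follows the generic SE design (global average pooling, a small two-layer bottleneck, and a sigmoid gate per channel); the layer sizes, weights, and function names are hypothetical, and the abstract does not say where in Tangut-CycleGAN the module is placed.

```python
import math


def squeeze(feature_map):
    """Global average pooling: reduce each channel (a 2D list) to one scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def dense(vec, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def excite(z, w1, b1, w2, b2):
    """Bottleneck with ReLU, then a sigmoid gate in (0, 1) for each channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(feature_map, w1, b1, w2, b2):
    """Scale every channel of `feature_map` by its learned gate."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [[[g * x for x in row] for row in ch] for ch, g in zip(feature_map, gates)]
```

In a trained network the weights would be learned, so channels that matter for the target style receive gates close to 1 and less informative channels are suppressed.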