Few-Shot Learning (FSL) is supervised learning under the restriction of only a few available training samples. FSL is a subarea in which deep learning performance falls short, because building robust deep networks requires large training datasets. Transfer learning is an accepted way to approach FSL tasks without building new deep models from scratch: the architecture and parameters of a model previously trained on a large-scale dataset are borrowed and fine-tuned for the low-data target task. In practice, however, fine-tuning pretrained models on target FSL tasks suffers from overfitting, since the few available samples are not enough to adjust the pretrained model's parameters to the best fit for the target task. In this study, we consider mitigating the overfitting problem when applying transfer learning to few-shot Handwritten Character Recognition (HCR) tasks. A data augmentation approach based on Conditional Generative Adversarial Networks (CGANs) is introduced. A CGAN is a generative model that can create artificial instances that appear realistic and are indistinguishable from the original samples. Instead of applying traditional image transformations, which are low-level, data-independent operations that produce augmented samples with only limited diversity, the CGAN generates extra samples that capture the possible variations of human handwriting. The introduced approach was evaluated by fine-tuning three pretrained models: AlexNet, VGG-16, and GoogleNet. The results show that the samples generated by the CGAN can enhance transfer learning performance in few-shot HCR tasks: model fine-tuning is achieved in fewer epochs, the model's $F_1$-score increases, and the generalization error $(E_{test})$ decreases.
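The conditioning mechanism at the heart of a CGAN can be sketched in a few lines: the class label is encoded and concatenated with the noise vector, so the generator is steered to produce samples of a chosen character class. The following is a minimal, illustrative NumPy forward pass only; all layer sizes, the 26-class alphabet, and the 28x28 output are hypothetical assumptions, not details taken from the paper.

```python
import numpy as np

def one_hot(label, num_classes):
    """Encode the target character class as a one-hot vector."""
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

def generator_forward(z, label, W1, b1, W2, b2, num_classes=26):
    """Minimal CGAN generator: condition on the label by concatenating it
    with the noise vector, then map through a small MLP to an image."""
    x = np.concatenate([z, one_hot(label, num_classes)])  # conditioning step
    h = np.maximum(0.0, W1 @ x + b1)                      # ReLU hidden layer
    img = np.tanh(W2 @ h + b2)                            # pixels in [-1, 1]
    return img.reshape(28, 28)

# Hypothetical sizes: 100-dim noise, 26 character classes, 28x28 output.
rng = np.random.default_rng(0)
noise_dim, num_classes, hidden = 100, 26, 128
W1 = rng.normal(0, 0.02, (hidden, noise_dim + num_classes))
b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.02, (28 * 28, hidden))
b2 = np.zeros(28 * 28)

z = rng.normal(size=noise_dim)
sample = generator_forward(z, label=3, W1=W1, b1=b1, W2=W2, b2=b2)
print(sample.shape)  # (28, 28)
```

In a real CGAN, the discriminator receives the same label (so it judges "realistic *and* of the right class"), and both networks are trained adversarially; only that conditioning idea is shown here.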
Deep metric learning is one of the recommended methods for supporting few/zero-shot learning with deep networks. It builds a Siamese architecture of two homogeneous Convolutional Neural Networks (CNNs) to learn a distance function that maps input data from the input space to a feature space. Instead of determining the class of each sample, the Siamese architecture deals with the scarcity of training samples by deciding whether two samples share the same class identity. The traditional Siamese architecture is built by forming two CNNs from scratch with randomly initialized weights and training them with binary cross-entropy loss. Building two CNNs from scratch is a trial-and-error, time-consuming phase, and training with binary cross-entropy loss sometimes leads to poor margins. In this paper, a novel Siamese network is proposed and applied to few/zero-shot Handwritten Character Recognition (HCR) tasks. The novelties of the proposed network are: 1) utilizing transfer learning by using the pre-trained AlexNet as the feature extractor in the Siamese architecture, since fine-tuning a pre-trained network is typically faster and easier than building one from scratch; and 2) training the Siamese architecture with contrastive loss instead of binary cross-entropy, which helps the network learn a nonlinear mapping function that places the extracted features in the vector space in an optimal way. The proposed network is evaluated on the challenging Chars74K datasets in two experiments, one testing few-shot learning and the other zero-shot learning. The recognition accuracy of the proposed network reaches 85.6% and 82% in few- and zero-shot learning, respectively. In addition, a comparison between the proposed Siamese network and the traditional Siamese CNNs is conducted. The comparison results show that the proposed network achieves higher recognition accuracy in less time, reducing the training time from days to hours in both experiments.
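The contrastive loss mentioned above has a standard closed form: pairs from the same class are pulled together by penalizing their squared distance, while pairs from different classes are pushed apart only until they exceed a margin. A minimal NumPy sketch follows; the margin value and the toy 3-D embeddings standing in for AlexNet features are illustrative assumptions, not values from the paper.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Contrastive loss for one pair of Siamese embeddings.

    same_class = 1: penalize the squared distance (pull together).
    same_class = 0: penalize only when the pair is closer than `margin`
    (push apart), giving an explicit margin that plain binary
    cross-entropy training does not enforce.
    """
    d = np.linalg.norm(emb_a - emb_b)
    return same_class * d**2 + (1 - same_class) * max(margin - d, 0.0) ** 2

# Toy 3-D embeddings standing in for extracted features.
a = np.array([0.1, 0.2, 0.3])
b = np.array([0.1, 0.2, 0.3])
c = np.array([2.0, 2.0, 2.0])

print(contrastive_loss(a, b, same_class=1))  # 0.0: identical, same class
print(contrastive_loss(a, c, same_class=0))  # 0.0: already farther than margin
print(contrastive_loss(a, c, same_class=1))  # 9.74: same class but far apart
```

Because the loss is defined on pairs rather than on class labels, the same trained distance function can score characters from classes never seen in training, which is what enables the zero-shot experiment.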