Prototypical Siamese Networks for Few-shot Learning

Wang, Junhua; Zhai, Yongping

doi:10.1109/iceiec49280.2020.9152261

Cited by 133 publications

(110 citation statements)

References 2 publications

Supporting

Mentioning

110

Contrasting

Order By: Relevance

“…Few-shot learning aims to recognize novel visual categories from a limited amount of labeled training data. Recent fewshot learning literature has been proposed for image classification [41], [43] and semantic segmentation [6], [34]. They require amounts of labeled data drawn from old classes in the real world as they are unable to decrease the imbalance domain shift in the scenarios.…”

Section: Few-shot Learningmentioning

confidence: 99%

See 1 more Smart Citation

Counterfactual Balancing Feature Alignment for Few-Shot Cross-Domain Scene Parsing

Sun¹,

Chen

2024

IEEE Access

View full text Add to dashboard Cite

Scene parsing becomes a key step to develop a visual autonomous driver. Real-world images are too expansive to annotate at scale, while few-shot cross-domain scene parsing (CSP) approaches only require a few labeled target images to train a model with source virtual data, thus, attracting more attention in the community. However, since the target training images are too few to support the cross-domain measures in statistics, it is inappropriate of resembling the spirit of conventional domain adaptation. In this paper, we reconsider this imbalance transfer learning demand as a covariate balancing issue regularly found in Rubin causal framework. We first consider the domain adaptation in pixels in the view of the average treatment effect (ATE), in which data are categorized into a treatment group or a control group in terms of the domain identity taken as the treatment. In this manner, the pair of domains could be perfectly aligned if the ATE converges to zero. It motivates Counterfactual Balance Feature Alignment (CBFA) to mitigate the crossdomain imbalance in the categories. CBFA revises existing adversarial adaptation techniques by modeling the propensity score for all pixels in their contexts, for the sake of predicting which groups they belong to. The propensity score for a pixel refers to its output of the domain discriminator and can be applied to balancing the adversarial adaptation objective. We evaluate our method on two suites of virtual-to-real scene parsing setups. Our method has obtained the new state of the art across 1-5 shot scenarios (in particular, 1-shot 56.79 in SYNTHIA-to-CITYSCAPES and 51.56 in GTA5-to-CITYSCAPES), demonstrating our motivation of building the connection between ATE and domain gap.

show abstract

Section: Few-shot Learningmentioning

confidence: 99%

“…Recent few-shot learning research [6], [34], [39], [41], [43] promote to classify objects of the categories that never appear in training, provided with only few examples of each new class. Derived from this principle, few-shot domain adaption (FADA) [30] was raised to transfer a source model toclassify target images.…”

Section: Introductionmentioning

confidence: 99%

Counterfactual Balancing Feature Alignment for Few-Shot Cross-Domain Scene Parsing

Sun¹,

Chen

2024

IEEE Access

View full text Add to dashboard Cite

show abstract

“…The model uses segmented sampled minibatch data to simulate the test task during training, which can reduce the difference between training and testing, thereby improving the generalization performance on the test set. Snell et al 28 further explored the relationship between the class embedding vectors in the embedding space, and believed that there is a prototype expression for each category, and then proposed a prototype network. In the article, the class embedding vectors are closely clustered around the class representatives, which is the mean value of the embedding vector of the support set, so the classification problem becomes the category of finding the nearest neighbor of the class prototype representative of the test image, and good results have been achieved.…”

Section: Related Workmentioning

confidence: 99%

“…Cumulative class prototype. The approach of the mean class prototype (MCP) is proposed in the prototype network 28 , which can represent this class of character images to some extent. When there is a large deviation in a certain class of a certain image, such as the target foreground is small, the background is large, the target is partially obscured or the sample image contains only part of the target, etc., the contribution of such images to the mean class prototype will have a great impact.…”

Section: Sscl Metric Spacementioning

confidence: 99%

One shot ancient character recognition with siamese similarity network

Liu

Gao

et al. 2022

Sci Rep

View full text Add to dashboard Cite

Ancient character recognition is not only important for the study and understanding of ancient history but also has a profound impact on the inheritance and development of national culture. In order to reduce the study of difficult professional knowledge of ancient characters, and meanwhile overcome the lack of data, class imbalance, diversification of glyphs, and open set recognition problems in ancient characters, we propose a Siamese similarity network based on a similarity learning method to directly learn input similarity and then apply the trained model to establish one shot classification task for recognition. Multi-scale fusion backbone structure and embedded structure are proposed in the network to improve the model's ability to extract features. We also propose the soft similarity contrast loss function for the first time, which ensures the optimization of similar images with higher similarity and different classes of images with greater differences while reducing the over-optimization of back-propagation leading to model overfitting. Specially, we propose a cumulative class prototype based on our network to solve the deviation problem of the mean class prototype and obtain a good class representation. Since new ancient characters can still be found in reality, our model has the ability to reject unknown categories while identifying new ones. A large number of experiments show that our proposed method has achieved high-efficiency discriminative performance and obtained the best performance over the methods of traditional deep learning and other classic one-shot learning.

show abstract

“…For the first time, we raise a glaucoma scenario based on prototypical neural networks (PNN) [66], which have demonstrated a high rate of performance in recent image analysis tasks, such as domain adaptation [67], noisy evaluation [68], text classification [69], etc. Note that prototypical networks are usually formulated as a baseline within the few-shot paradigm [70][71][72][73], but in this paper, we exploit the prototypical concept in the k-shot methodology to optimize the learning process for glaucoma grading.…”

Section: Contribution Of This Workmentioning

confidence: 99%