The goal of unsupervised domain adaptation (UDA) is to learn a task classifier that performs well on the unlabeled target domain by borrowing rich knowledge from a well-labeled source domain. Although remarkable breakthroughs have been achieved in learning transferable representations across domains, two bottlenecks remain to be explored further. First, many existing approaches focus primarily on the adaptation of the entire image, ignoring the fact that not all features are transferable and informative for the object classification task. Second, the features of the two domains are typically aligned without considering the class labels; this can lead the resulting representations to be domain-invariant but nondiscriminative with respect to category. To overcome these two issues, we present a novel Informative Class-Conditioned Feature Alignment (IC²FA) approach for UDA, which takes a twofold approach: informative feature disentanglement and class-conditioned feature alignment, designed to address the two challenges, respectively. More specifically, to surmount the first drawback, we cooperatively disentangle the two domains to obtain informative transferable features; here, the Variational Information Bottleneck (VIB) is employed to encourage the learning of task-related semantic representations and suppress task-unrelated information. With regard to the second bottleneck, we optimize a new metric, termed the Conditional Sliced Wasserstein Distance (CSWD), which explicitly estimates the intra-class discrepancy and the inter-class margin. The intra-class and inter-class CSWDs are minimized and maximized, respectively, to yield representations that are both domain-invariant and class-discriminative.
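As an illustration of the CSWD idea described above, the following is a minimal sketch, not the authors' implementation: the function names, the truncation-based size matching, and the hinge margin are assumptions, and the VIB term is omitted. It computes a sliced Wasserstein distance and a class-conditioned loss that minimizes the intra-class term while pushing the inter-class term apart.

```python
import torch

def sliced_wasserstein(x, y, n_proj=64):
    # Project both feature sets (n, d) onto random unit directions,
    # sort the 1-D projections, and average the absolute differences.
    theta = torch.randn(x.size(1), n_proj, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)
    px = (x @ theta).sort(dim=0).values
    py = (y @ theta).sort(dim=0).values
    return (px - py).abs().mean()

def cswd_loss(f_src, y_src, f_tgt, y_tgt, num_classes, margin=1.0):
    # Intra-class term (same class across domains) is minimized;
    # inter-class term (different classes) is pushed apart via a hinge.
    intra = f_src.new_zeros(())
    inter = f_src.new_zeros(())
    pairs = 0
    for c in range(num_classes):
        s, t = f_src[y_src == c], f_tgt[y_tgt == c]
        n = min(len(s), len(t))   # truncate so sorted shapes match (a simplification)
        if n > 1:
            intra = intra + sliced_wasserstein(s[:n], t[:n])
        for c2 in range(num_classes):
            if c2 == c:
                continue
            t2 = f_tgt[y_tgt == c2]
            m = min(len(s), len(t2))
            if m > 1:
                inter = inter + sliced_wasserstein(s[:m], t2[:m])
                pairs += 1
    return intra + torch.clamp(margin - inter / max(pairs, 1), min=0.0)
```

In practice, the target labels y_tgt would be pseudo-labels produced by the current classifier and refreshed as training proceeds, since the target domain is unlabeled.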
The capability of incrementally learning new classes and learning from a few examples is one of the hallmarks of human intelligence, and it is crucial to endow practical recognition systems with this ability. Therefore, in this paper, we conduct pioneering work on a challenging yet practical Semi-Supervised Few-Shot Class-Incremental Learning (SSFSCIL) problem, which requires CNN models to incrementally learn new classes from very few labeled samples and a large number of unlabeled samples, without forgetting previously learned ones. To address this problem, we propose a simple and efficient solution for SSFSCIL that learns novel categories using a self-training strategy in a semi-supervised manner and avoids catastrophic forgetting through distillation-based methods. Our extensive experiments on the CIFAR100, miniImageNet, and CUB200 datasets demonstrate the promising performance of the proposed method and establish baselines in this new research direction.
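To make the self-training-plus-distillation recipe concrete, here is a minimal sketch of one training step, assuming a PyTorch model; the function name, confidence threshold tau, temperature T, and weight lam are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def ssfscil_step(model, old_model, x_lab, y_lab, x_unlab,
                 tau=0.95, T=2.0, lam=1.0):
    # Supervised loss on the few labeled novel-class samples.
    logits_lab = model(x_lab)
    loss = F.cross_entropy(logits_lab, y_lab)

    # Self-training: pseudo-label unlabeled samples, keep confident ones.
    with torch.no_grad():
        probs = F.softmax(model(x_unlab), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= tau
    if mask.any():
        loss = loss + F.cross_entropy(model(x_unlab[mask]), pseudo[mask])

    # Distillation on the old-class logits to limit forgetting.
    with torch.no_grad():
        old_logits = old_model(x_lab)            # frozen previous-session model
    n_old = old_logits.size(1)
    kd = F.kl_div(F.log_softmax(logits_lab[:, :n_old] / T, dim=1),
                  F.softmax(old_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return loss + lam * kd
```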
Given a model well trained on a large-scale base dataset, Few-Shot Class-Incremental Learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples while avoiding overfitting and without catastrophically forgetting previously encountered classes. Semi-supervised learning, which harnesses freely available unlabeled data to compensate for limited labeled data, boosts performance in numerous vision tasks and can heuristically be applied to tackle issues in FSCIL, yielding Semi-supervised FSCIL (Semi-FSCIL). So far, very limited work has focused on the Semi-FSCIL task, leaving unresolved the question of how well semi-supervised learning adapts to FSCIL. In this paper, we focus on this adaptability issue and present a simple yet efficient Semi-FSCIL framework named Uncertainty-aware Distillation with Class-Equilibrium (UaD-CE), encompassing two modules, UaD and CE. Specifically, when incorporating unlabeled data into each incremental session, we introduce the CE module, which employs class-balanced self-training to prevent easy-to-classify classes from gradually dominating pseudo-label generation. To distill reliable knowledge from the reference model, we further implement the UaD module, which combines uncertainty-guided knowledge refinement with adaptive distillation. Comprehensive experiments on three benchmark datasets demonstrate that our method improves the adaptability of semi-supervised learning techniques to FSCIL tasks. The code is available at https://github.com/yawencui/UaD-CE.
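The two modules can be sketched as follows; this is an illustrative rendering of the class-balanced selection (CE) and uncertainty-weighted distillation (UaD) ideas, not the released implementation (see the linked repository for the actual code).

```python
import math
import torch
import torch.nn.functional as F

def class_balanced_pseudo_labels(logits, k):
    # CE module sketch: keep at most k confident samples per predicted
    # class so easy-to-classify classes cannot dominate self-training.
    probs = F.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)
    keep = []
    for c in pseudo.unique():
        idx = (pseudo == c).nonzero(as_tuple=True)[0]
        top = conf[idx].topk(min(k, idx.numel())).indices
        keep.append(idx[top])
    keep = torch.cat(keep)
    return keep, pseudo[keep]

def uncertainty_weighted_kd(student_logits, teacher_logits, T=2.0):
    # UaD module sketch: down-weight distillation where the reference
    # (teacher) model is uncertain, measured by its predictive entropy.
    p_t = F.softmax(teacher_logits / T, dim=1)
    entropy = -(p_t * p_t.clamp_min(1e-8).log()).sum(dim=1)
    w = 1.0 - entropy / math.log(p_t.size(1))   # 1 = certain, 0 = uniform
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1), p_t,
                  reduction="none").sum(dim=1)
    return (w * kd).mean() * (T * T)
```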
Remote photoplethysmography (rPPG), which aims at measuring heart activity and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction needed for rPPG modeling. In this paper, we propose two end-to-end video-transformer-based architectures, namely PhysFormer and PhysFormer++, to adaptively aggregate both local and global spatio-temporal features for rPPG representation enhancement. As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal-difference-guided global attention and then refine the local spatio-temporal representation against interference. To better exploit temporal contextual and periodic rPPG clues, we also extend PhysFormer to the two-pathway, SlowFast-based PhysFormer++ with temporal difference periodic and cross-attention transformers. Furthermore, we propose label distribution learning and a curriculum-learning-inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and PhysFormer++ and alleviate overfitting. Comprehensive experiments on four benchmark datasets show our superior performance in both intra- and cross-dataset testing. Unlike most transformer networks, which need pretraining on large-scale datasets, the proposed PhysFormer family can easily be trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community.
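As a rough illustration of temporal-difference-guided attention: the actual PhysFormer modules use temporal difference convolutions inside the attention and a SlowFast two-pathway design, so the class below is a hypothetical simplification in which queries and keys come from frame-to-frame differences while values carry the raw features.

```python
import torch
import torch.nn as nn

class TemporalDifferenceAttention(nn.Module):
    # Queries and keys are built from frame-to-frame feature differences,
    # so the attention emphasizes quasi-periodic temporal variation
    # (rPPG-like cues) rather than static appearance.
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (batch, frames, dim)
        zero = torch.zeros_like(x[:, :1])        # frame 0 has no predecessor
        diff = torch.cat([zero, x[:, 1:] - x[:, :-1]], dim=1)
        q, k = self.q_proj(diff), self.k_proj(diff)
        out, _ = self.attn(q, k, x)              # values are raw features
        return out
```

For example, for a 160-frame clip of 128-dimensional tokens, TemporalDifferenceAttention(128)(torch.randn(2, 160, 128)) returns features of the same shape.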