MobileFAN: Transferring deep hidden representation for face alignment

Zhao, Yang; Liu, Yifan; Shen, Chunhua; Gao, Yongsheng; Xiong, Shengwu

doi:10.1016/j.patcog.2019.107114

Cited by 40 publications

(20 citation statements)

References 31 publications

“…To obtain part-level features from the images, additional bounding boxes are used to mark the desire regions and train the feature detection model. Many strategies [28] have been proposed to detect similar objects (features) in images including YOLOv5 [12] and Fast R-CNN [29]. The complete training process of the feature detection module consists of the following two steps.…”

Section: A Feature Detection Module (mentioning)

confidence: 99%

Mask-Guided Feature Extraction and Augmentation for Ultra-Fine-Grained Visual Categorization

Pan, Zhang, et al. 2021

2021 Digital Image Computing: Techniques and Applications (DICTA)

Self Cite

While fine-grained visual categorization (FGVC) has advanced greatly in recent years, ultra-fine-grained visual categorization (Ultra-FGVC) remains understudied. FGVC aims at classifying objects from the same species (very similar categories), while Ultra-FGVC targets the more challenging problem of classifying images at an ultra-fine granularity where even human experts may fail to identify the visual difference. The challenges for Ultra-FGVC come mainly from two aspects: first, Ultra-FGVC often suffers from overfitting due to the lack of training samples; second, the inter-class variance among images is much smaller than in normal FGVC tasks, which makes it difficult to learn discriminative features for each class. To address these challenges, this paper proposes a mask-guided feature extraction and feature augmentation method that extracts discriminative and informative regions of images, which are then used to augment the original feature map. The advantage of the proposed method is that the feature detection and extraction model requires only a small number of target region samples with bounding boxes for training; it can then automatically locate the target area for a large number of images in the dataset with high detection accuracy. Experimental results on two public datasets, in comparison with ten state-of-the-art benchmark methods, consistently demonstrate the effectiveness of the proposed method both visually and quantitatively.

“…Existing methods for this task include random forest [1], [26] and cascaded shape regression [27], [28]. Convolutional neural networks (CNN) based regression has become a recently popular approach for keypoint localization [18], [29]- [31]. Two CNN-based architecture designs have emerged: direct coordinate regression [30], [32] and heatmap regression [29], [33], where the latter usually outperforms the former, due to the advantage of preserving higher spatial resolution for accurate localization.…”

Section: Related Work, A. Anatomical Landmark Detection (mentioning)

confidence: 99%

“…One feasible solution to mitigate the challenge is to utilize backbone CNN architectures (e.g. VGG16 [14] or ResNet50 [15]) trained on a large-scale and diverse image dataset, such as VGGFace2 [16], which can be fine-tuned or specialized with additional task-specific layers to promote the optimization of facial anatomical landmark detection [17], [18]. Finetuning, as a common paradigm in transfer learning [19], aims to benefit the target task by providing a good initialization, but it can require exhaustive tuning or a set of ad-hoc hyper-parameters to achieve good performance [20], [21].…”

Section: Introduction (mentioning)

confidence: 99%

Facial Anatomical Landmark Detection using Regularized Transfer Learning with Application to Fetal Alcohol Syndrome Recognition

Fu, Jiao, Suttie, et al. 2021

Preprint

Fetal alcohol syndrome (FAS) caused by prenatal alcohol exposure can result in a series of cranio-facial anomalies, and behavioral and neurocognitive problems. Current diagnosis of FAS is typically done by identifying a set of facial characteristics, which are often obtained by manual examination. Anatomical landmark detection, which provides rich geometric information, is important to detect the presence of FAS-associated facial anomalies. This imaging application is characterized by large variations in data appearance and limited availability of labeled data. Current deep learning-based heatmap regression methods designed for facial landmark detection in natural images assume availability of large datasets and are therefore not well suited for this application. To address this restriction, we develop a new regularized transfer learning approach that exploits the knowledge of a network learned on large facial recognition datasets. In contrast to standard transfer learning, which focuses on adjusting the pre-trained weights, the proposed learning approach regularizes the model behavior. It explicitly reuses the rich visual semantics of a domain-similar source model on the target task data as an additional supervisory signal for regularizing landmark detection optimization. Specifically, we develop four regularization constraints for the proposed transfer learning, including constraining the feature outputs from classification and intermediate layers, as well as matching activation attention maps at both the spatial and channel levels. Experimental evaluation on a collected clinical imaging dataset demonstrates that the proposed approach can effectively improve model generalizability under limited training samples, and compares favorably with other approaches in the literature.

“…In computer vision, fine-grained visual categorization (FGVC) aims to classify the objects with small inter-class variances in which a clear difference may exist for different species, and has been extensively studied and made considerable progress in the past years [28], [10], [27], [29], [5], [26]. Ultra-fine-grained visual categorization (ultra-FGVC), however, focuses on classifying objects with more similar patterns among categories under a same class, and has been understudied [22], [20].…”

Section: Introduction (mentioning)

confidence: 99%

A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization

Sun, Zhang, et al. 2021

2021 Digital Image Computing: Techniques and Applications (DICTA)

Self Cite

Fine-grained visual categorization (FGVC), which aims at classifying objects with small inter-class variances, has been significantly advanced in recent years. However, ultra-fine-grained visual categorization (ultra-FGVC), which targets identifying subclasses with extremely similar patterns, has not received much attention. In ultra-FGVC datasets, the samples per category are always scarce as the granularity moves down, which leads to overfitting problems. Moreover, the difference among different categories is too subtle to distinguish even for professional experts. Motivated by these issues, this paper proposes a novel compositional feature embedding and similarity metric (CECS). Specifically, in the compositional feature embedding module, we randomly select patches in the original input image, and these patches are then replaced by patches from the images of different categories or masked out. The replaced and masked images are then used to augment the original input images, which provides more diverse samples and thus largely alleviates the overfitting problem resulting from limited training samples. Besides, learning with diverse samples forces the model to learn not only the most discriminative features but also other informative features in the remaining regions, enhancing the generalization and robustness of the model. In the compositional similarity metric module, a new similarity metric is developed to improve the classification performance by narrowing the intra-category distance and enlarging the inter-category distance. Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
