Despite various success in computer vision with facial images (e.g., face detection, recognition, and generation), facial expression recognition is still a challenging problem yet to be solved. This is because of simple but fundamental bottlenecks: 1) no global agreement on different facial expressions, 2) significant dataset biases that prevent cross-dataset analysis for a large-scale study, and 3) high class imbalance in in-the-wild datasets that causes inconsistency in predicting expressions in images using a machine learning algorithm. To tackle these issues, we propose a novel Deep Learning approach via adaptive cross-dataset scheme. We combine multiple in-the-wild datasets to secure sufficient training samples while minimizing dataset bias using ideas of reversal gradients to retain generality. For this, we introduce a flexible objective function that can control for skewed label distributions in the dataset. Incorporating these ideas, together with the ResNet pipeline as a backbone, we carried extensive experiments to validate our ideas using three independent in-the-wild facial expression datasets, which first confirmed bias from different datasets and yielded improved performance on facial expression recognition using the multi-site dataset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.