Facial expression recognition, as part of an affective computing system, is usually judged to be successful based on solid performance metrics. These metrics, however, depend significantly on the affective context in which the system is evaluated: a facial expression recognition model that performs excellently on the dataset it was trained on might fail drastically when assessed in a different scenario. Such performance drops occur because most facial perception models rely on an extreme notion of generalization, aiming at a universal emotion perception system. Given recent findings on the non-universality of emotional perception, generalizing facial encoders seems not to be the optimal path to take. Exploiting transfer learning to adapt specific facial features to specific scenarios could therefore address this problem. This paper proposes and investigates a Spatial Transformer Plugin (STN) that rearranges different facial encoders towards particular affective representations from different scenarios. We evaluate our model on eight facial expression recognition datasets (AffectNet and the derived MaskedAffectNet, OMG-Emotion, FERPlus, ElderReact, EmoReact, FABO, and JAFFE) and obtain competitive performance with much less training effort than state-of-the-art models. Beyond performance alone, we introduce the STN as a mechanism towards a non-universal emotional perception system and discuss how it rearranges learned perception features to address specific characteristics of each investigated dataset.
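At its core, a spatial transformer rearranges features by sampling them through a learned affine transform. As a minimal sketch of that sampling step (pure Python, not the paper's implementation; the function name, the fixed transform matrices, and the tiny feature map are illustrative assumptions), the grid-sampling that lets an STN warp an encoder's feature map could look like:

```python
import math

def affine_grid_sample(fmap, theta):
    """Warp a 2D feature map through a 2x3 affine matrix `theta`
    (STN-style): inverse-map each output cell through `theta` in
    normalized [-1, 1] coordinates, then bilinearly sample the input."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]

    def clamp(v, hi):
        # border-clamp indices so samples outside the map reuse edge values
        return max(0, min(hi, v))

    for i in range(h):
        for j in range(w):
            # normalized output coordinates in [-1, 1]
            y = -1.0 + 2.0 * i / (h - 1)
            x = -1.0 + 2.0 * j / (w - 1)
            # map through theta to source coordinates
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            # back to pixel space
            fx = (xs + 1.0) * (w - 1) / 2.0
            fy = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(fx), math.floor(fy)
            wx, wy = fx - x0, fy - y0
            # bilinear interpolation over the four neighboring cells
            v = 0.0
            for dy, ay in ((0, 1.0 - wy), (1, wy)):
                for dx, ax in ((0, 1.0 - wx), (1, wx)):
                    v += ay * ax * fmap[clamp(y0 + dy, h - 1)][clamp(x0 + dx, w - 1)]
            out[i][j] = v
    return out
```

In an actual STN, `theta` would be predicted per input by a small localization network rather than fixed, which is what allows the plugin to adapt a frozen encoder's features to a new scenario without retraining the encoder itself.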