Automatic spotting and classification of facial Micro-Expressions (MEs) in 'in-the-wild' videos is a topic of great interest in different fields involving sentiment analysis. Unfortunately, automatic spotting also represents a great challenge due to MEs' quick temporal evolution and the lack of correctly annotated videos captured in the wild: the former makes MEs difficult to grasp, while the latter results in a scarcity of real examples of spontaneous expressions in uncontrolled contexts. This paper proposes a novel but very simple spotting method that mainly exploits the perceptual characteristics of MEs. Specifically, the contribution is twofold: i) a distinguishing feature for MEs is defined in a domain that can capture and represent their perceptual stimuli, thus providing a suitable input for a standard binary classifier; ii) a proper numerical strategy is developed to augment the training set used to build the classification model. The rationale is that, since MEs are visible to a human observer almost regardless of the specific context, it stands to reason that they carry some sort of perceptual signature that activates pre-attentive vision. In this work this fingerprint is called the Perceptual Emotional Signature (PES) and is modelled using the well-known Structural SIMilarity index (SSIM), a measure based on visual perception. A machine-learning-based classifier is then appropriately trained to recognize PESs. For this purpose, a suitable numerical strategy is applied to augment the training set; it mainly exploits error propagation rules in accordance with perceptual sensitivity to noise. The whole procedure is called PESMESS (Perceptual Emotional Signature of Micro-Expressions via SSIM and SVM). Preliminary studies show that SSIM can effectively guide the detection of MEs by identifying frames that contain PESs.
Localization of PESs is accomplished using a properly trained Support Vector Machine (SVM) classifier that benefits from very short input feature vectors. Various tests on different benchmark databases, containing both 'simulated' and 'in-the-wild' videos, confirm the promising effectiveness of PESMESS when trained on appropriately perception-based augmented feature vectors.
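To make the pipeline concrete, the sketch below illustrates the general idea of an SSIM-driven, SVM-based spotter: a per-frame dissimilarity signal (1 - SSIM between consecutive frames), short sliding-window feature vectors over that signal, a noise-jitter augmentation of the training set, and a binary SVM. This is a minimal, self-contained illustration on synthetic data; the window length, the augmentation noise level, the SVM kernel, and all variable names are assumptions for demonstration, not the exact settings of PESMESS.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim
from sklearn.svm import SVC

def ssim_series(frames):
    """1 - SSIM between consecutive grayscale frames in [0, 1]:
    peaks suggest a rapid, perceptually visible change (a PES candidate)."""
    return np.array([1.0 - ssim(frames[i], frames[i + 1], data_range=1.0)
                     for i in range(len(frames) - 1)])

def windows(series, w=5):
    """Very short sliding-window feature vectors over the SSIM signal."""
    return np.array([series[i:i + w] for i in range(len(series) - w + 1)])

# Toy video: constant frames, with a localized intensity bump in frame 3.
frames = [np.full((32, 32), 0.5) for _ in range(6)]
bump = frames[3].copy()
bump[8:16, 8:16] += 0.3
frames[3] = bump
d = ssim_series(frames)  # near zero everywhere except around the bump

# Synthetic training windows: "neutral" is flat, "ME" has a central spike.
rng = np.random.default_rng(0)
neutral = rng.normal(0.05, 0.01, (40, 5))
me = neutral.copy()
me[:, 2] += 0.3
X = np.vstack([neutral, me])
y = np.array([0] * 40 + [1] * 40)

# Augmentation stand-in: jittered copies with small perturbations, loosely
# mimicking a perception-aware noise model (the paper's error-propagation
# strategy is more principled than plain Gaussian jitter).
X_aug = np.vstack([X, X + rng.normal(0, 0.005, X.shape)])
y_aug = np.concatenate([y, y])

clf = SVC(kernel="rbf").fit(X_aug, y_aug)
```

A trained classifier of this shape can then slide over `windows(ssim_series(video))` and flag windows whose SSIM profile resembles a micro-expression onset.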