Human-computer interaction increasingly relies on Machine Learning (ML) models, such as Deep Neural Networks (DNNs), trained on usually large datasets [1,2,3,4]. The ubiquitous application of DNNs in security-critical tasks, such as face identity recognition [5,6], speaker verification [7,8], voice-controlled systems [9,10,11], or signal forensics [12,13,14,15], requires high reliability from these computational models. However, it has been demonstrated that such models can be fooled by adding malicious, quasi-imperceptible perturbations to an input sample.
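In one common formalization of this vulnerability (the classifier $f$, perturbation budget $\epsilon$, and $\ell_p$ norm below are illustrative notation, not fixed by this passage), an adversarial example is obtained from a correctly classified input $x$ by finding a perturbation $\delta$ such that
\[
  \|\delta\|_p \le \epsilon \quad \text{and} \quad f(x + \delta) \neq f(x),
\]
so that $x + \delta$ is misclassified while remaining quasi-imperceptibly close to the original sample $x$.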