“…Many of the SIC systems proposed in the literature are based on classical machine learning algorithms, such as linear discriminant analysis [3,5,6], support vector machines (SVM) [7,8,9], or random forests [10], which take as input different types of hand-crafted features, such as the average of the mel-frequency cepstrum coefficients (MFCC) [8,11], the average of the mel-frequency delta-energy coefficients [12], the intensity and frequency of the maximum values of the modulation spectrum [5], the quotient between low and high modulation energies [3,6], the average energy of the modulation spectrogram [8], or features derived from the output of an automatic speech recognizer [9,13].…”
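
Two of the hand-crafted features listed above can be illustrated with a minimal sketch. It is not the implementation used in any of the cited works. It assumes that per-frame coefficient vectors (for example MFCCs) and a frame-level energy envelope have already been computed elsewhere. The function names `average_frames` and `modulation_energy_ratio`, and the 10 Hz boundary `split_hz` between "low" and "high" modulation frequencies, are hypothetical choices made only for illustration.

```python
import cmath
import math


def average_frames(frames):
    """Average per-frame coefficient vectors into one utterance-level vector."""
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[k] for frame in frames) / n for k in range(dim)]


def modulation_energy_ratio(envelope, frame_rate_hz, split_hz=10.0):
    """Quotient between low and high modulation energy of an energy envelope.

    split_hz is a hypothetical boundary, not a value taken from the cited work.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]  # remove the DC component
    low = 0.0
    high = 0.0
    # Plain DFT over the positive modulation frequencies (bin 0 is skipped).
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energy = abs(coeff) ** 2
        freq_hz = k * frame_rate_hz / n
        if freq_hz < split_hz:
            low += energy
        else:
            high += energy
    return low / high if high > 0 else math.inf


# Two frames of two coefficients each collapse to one two-dimensional vector.
print(average_frames([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```

Either output is a fixed-length descriptor of a whole utterance, which is the form that classifiers such as an SVM or a random forest expect as input. The direct DFT is quadratic in the envelope length and is used here only to keep the sketch dependency-free.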