“…The number of acoustic parameters proven to contain emotional information is still increasing. Generally, the most commonly used features can be divided into three groups: prosodic features (e.g., fundamental frequency, energy, speed of speech) [22], quality characteristics (e.g., formants, brightness) [23] and spectral characteristics (e.g., mel-frequency cepstral coefficients) [24, 25]. The final feature vector is based on their statistics, such as mean, maximum, minimum, change rate, kurtosis, skewness, zero-crossing rate, variance, etc. [26, 27].…”
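
As an illustration of how such a statistics-based feature vector might be assembled, the following is a minimal sketch and not the method of the cited works. It assumes the frame-level contours (for example fundamental frequency or energy per frame) have already been extracted by some front end. The function names `contour_statistics` and `feature_vector`, the definition of change rate as the mean absolute frame-to-frame difference, and the computation of zero-crossing rate on the mean-removed contour are all assumptions made for this example.

```python
import math
from statistics import fmean

# Fixed order of statistics inside the final vector (an assumption of this sketch).
STAT_ORDER = ("mean", "max", "min", "variance", "skewness",
              "kurtosis", "change_rate", "zero_crossing_rate")


def contour_statistics(contour):
    """Summarize one frame-level contour (e.g. F0 or energy per frame)."""
    n = len(contour)
    if n < 2:
        raise ValueError("need at least two frames")

    mean = fmean(contour)
    # Population (biased) central moments of order 2, 3 and 4.
    m2 = fmean((x - mean) ** 2 for x in contour)
    m3 = fmean((x - mean) ** 3 for x in contour)
    m4 = fmean((x - mean) ** 4 for x in contour)

    # Skewness and (non-excess) kurtosis are undefined for a constant
    # contour; this sketch reports 0.0 in that case (an assumption).
    skewness = m3 / math.sqrt(m2) ** 3 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0

    # Change rate, assumed here to be the mean absolute difference
    # between consecutive frames.
    change_rate = fmean(abs(b - a) for a, b in zip(contour, contour[1:]))

    # Zero-crossing rate of the mean-removed contour: fraction of
    # adjacent frame pairs that lie on opposite sides of the mean.
    centered = [x - mean for x in contour]
    crossings = sum((a < 0) != (b < 0) for a, b in zip(centered, centered[1:]))

    return {
        "mean": mean,
        "max": max(contour),
        "min": min(contour),
        "variance": m2,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "change_rate": change_rate,
        "zero_crossing_rate": crossings / (n - 1),
    }


def feature_vector(contours):
    """Concatenate the statistics of every contour, sorted by contour name."""
    vector = []
    for name in sorted(contours):
        stats = contour_statistics(contours[name])
        vector.extend(stats[key] for key in STAT_ORDER)
    return vector


if __name__ == "__main__":
    # Hypothetical per-frame values, for illustration only.
    example = {
        "f0_hz": [110.0, 115.0, 123.0, 118.0, 109.0],
        "energy": [0.20, 0.35, 0.50, 0.30, 0.10],
    }
    print(feature_vector(example))
```

With two contours and eight statistics each, the resulting vector has 16 entries. Its length is fixed regardless of utterance duration, which is why such summaries are convenient as classifier input.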