“…Mel-frequency cepstral coefficients (MFCCs) and other spectral transformations [37], [38], [36], [39], [40], [41], [42], [43], [44], glottal-source-related features (jitter and shimmer) [45], [46], the difference between the low-pass and band-pass profiles of the Teager Energy Operator (TEO) [47], [48], and non-linear features [49], [50] have all been proposed as model input features. Gaussian mixture models (GMMs), support vector machines, and deep neural networks have been used in conjunction with these features for hypernasality evaluation from word- and sentence-level data [51], [52], [53], [54]. Recently, end-to-end neural networks that take MFCC frames as input and produce hypernasality assessments as output have also been proposed [55].…”