“…Originally, there were 95 local prosodic features computed for each word position. After several studies on voice and speech assessment, however, a relevant core set of 33 features had been defined for further processing [16], including statistics about pauses, energy, duration and F0. The 33 local features per word were then averaged with respect to different conditions, for example over all words, over all nouns, or over all verbs and nouns.…”
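
The averaging step described in the passage can be illustrated with a minimal sketch. It is not the authors' implementation: the part-of-speech tags, the field names `pos` and `features`, the condition names, and the three-element feature vectors (standing in for the 33 local features) are all assumptions made for illustration.

```python
from statistics import fmean

# Hypothetical input: one entry per word, with an assumed part-of-speech tag
# and a local prosodic feature vector. Real vectors would hold 33 values;
# three made-up values are used here to keep the example short.
words = [
    {"pos": "NOUN", "features": [0.2, 1.0, 110.0]},
    {"pos": "VERB", "features": [0.4, 3.0, 130.0]},
    {"pos": "DET",  "features": [0.0, 2.0, 90.0]},
    {"pos": "NOUN", "features": [0.6, 2.0, 150.0]},
]

# Each condition selects the subset of words to average over
# (names are illustrative, mirroring the examples in the text).
CONDITIONS = {
    "all_words": lambda w: True,
    "nouns": lambda w: w["pos"] == "NOUN",
    "verbs_and_nouns": lambda w: w["pos"] in {"NOUN", "VERB"},
}

def average_features(words, conditions):
    """Average each local feature over the words matching each condition."""
    result = {}
    for name, keep in conditions.items():
        selected = [w["features"] for w in words if keep(w)]
        if not selected:
            # No word satisfies the condition, so no average is defined.
            result[name] = None
            continue
        # zip(*selected) groups the values feature by feature across words.
        result[name] = [fmean(column) for column in zip(*selected)]
    return result

averages = average_features(words, CONDITIONS)
for name, vector in averages.items():
    print(name, vector)
```

Under these assumptions, each condition yields one averaged vector with the same length as the per-word vector, so the number of resulting values is the number of local features times the number of conditions.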