“…Non-verbal information indeed translates the slow temporal structure of speech with a stable precedence (van Wassenhove et al., 2005; Chandrasekaran et al., 2009; Biau and Soto-Faraco, 2013; Biau et al., 2015): on the one hand, the syllabic rate is reflected by the corresponding mouth aperture and jaw constraints at theta frequency (Hickok and Poeppel, 2007; Giraud and Poeppel, 2012; Peelle and Davis, 2012). On the other hand, a speaker’s prosody (i.e., modulations of amplitude, duration, and pitch accents in the speech envelope) occurs at a slower 1–3 Hz delta rate and also correlates with body movements (Munhall et al., 2004; Krahmer and Swerts, 2007; Wagner et al., 2014; Biau et al., 2016, 2017). Interestingly, speech-related motion perception has been shown to activate the speech network as well (Macaluso et al., 2004; Skipper, 2014; Biau et al., 2016).…”