Although infants show sophisticated speech perception abilities, it is not clear whether they rely on the same acoustic information as adults. When perceiving speech in quiet, adults mainly use the slowest temporal envelope, or amplitude modulation (AM), cues (< 16 Hz), whereas when perceiving speech in noise they rely more on the faster AM and frequency modulation (FM) cues. The present study investigated how newborns process the slow and fast AM cues to discriminate phonemes in quiet. We combined near-infrared spectroscopy (NIRS) and electroencephalography (EEG) to assess newborns’ brain responses to infrequent deviant syllables that differed from frequent standard syllables in a single phoneme (a consonant). The syllables were presented in three conditions, with the speech signal vocoded to preserve (i) both AM and FM cues, (ii) both fast and slow AM cues while reducing FM cues, or (iii) only the slowest AM cues (< 8 Hz) while reducing fast AM and FM cues. Newborns detected the consonant change in all three conditions, as indexed by their electrophysiological response, suggesting that they are able to encode phonemes without the FM and fast AM cues. The slowest temporal cues of speech are thus sufficient for newborns to discriminate phonetic contrasts in quiet, as in adults. Yet newborns do not process slow and fast envelope information in the same way: as shown by NIRS, these cues activate different neural areas.